Re: Summary of responses so far and proposal moving forward [Was Re: [tcpm] Is this a problem?]

David Borman wrote:
> Ok, I haven't chimed in yet on this conversation.
I am glad you finally did :-)
>
> While I agree with the document on the identification of the problem, 
> I disagree with the proposed solution (changing TCP to time out 
> connections in persist state).  Having a connection stay in persist 
> state for long periods of time (i.e., zero window probes continue to 
> be ACKed) by itself is not a bad thing.  That is how TCP was designed 
> to work.  Connections can survive through lots of adversity.  If a 
> connection is stuck because it is waiting for user action and the user 
> walked away and went home for the day, he should be able to come back 
> the next morning and do what needs to be done, and then the connection 
> will continue.
My understanding of RFC 1122 comes from this passage, which I quote from
the RFC itself:

> A TCP MAY keep its offered receive window closed
>             indefinitely.  As long as the receiving TCP continues to
>             send acknowledgments in response to the probe segments, the
>             sending TCP MUST allow the connection to stay open.
>
>             DISCUSSION:
>                  It is extremely important to remember that ACK
>                  (acknowledgment) segments that contain no data are not
>                  reliably transmitted by TCP.  If zero window probing is
>                  not supported, a connection may hang forever when an
>                  ACK segment that re-opens the window is lost.
>   
This tells me that the concern was with ACKs getting lost in the 
network, and that this is why the connection needs to be kept open. The 
point we make in the draft is that when the ACKs are being received 
reliably, the need to keep the connection open just to make sure the ACK 
has made it to the other end goes away. That is why we request changing 
the language to say that, *in the case of reliable ACKs*, TCP MAY tear 
the connection down if it is not able to service existing or new 
connections.

We seem to agree on the user scenario you describe above.  That is why 
we make it clear in the draft that we should not tear down a connection 
just because it has been open for a long time. Where only one or a few 
connections are being held open this way, the solution will not tear 
them down. The problem arises when lots of users (or attackers) do the 
same.
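
To illustrate the intent, here is a rough sketch of that policy. It is 
not text from the draft and not any real stack's code: the names (Conn, 
pick_victims, bytes_limit) are hypothetical, and "longest in persist 
state first" is only one possible ordering, assumed here for the example.

```python
from dataclasses import dataclass


@dataclass
class Conn:
    conn_id: int
    in_persist: bool       # peer advertises a zero window, probes are ACKed
    persist_secs: float    # time spent in persist state so far
    buffered_bytes: int    # send-buffer memory held by this connection


def pick_victims(conns, bytes_in_use, bytes_limit):
    """Return ids of persist-state connections to abort.

    Nothing is returned unless memory is actually exhausted, so a
    connection is never torn down merely for being open a long time.
    """
    victims = []
    if bytes_in_use < bytes_limit:
        return victims
    # Assumed ordering: reclaim the longest-stalled connections first.
    stalled = sorted((c for c in conns if c.in_persist),
                     key=lambda c: c.persist_secs, reverse=True)
    for c in stalled:
        if bytes_in_use < bytes_limit:
            break
        victims.append(c.conn_id)
        bytes_in_use -= c.buffered_bytes
    return victims
```

With plenty of memory free, a connection that has been stalled overnight 
is left alone; only under exhaustion are persist-state connections 
reclaimed, and connections that are making progress are never candidates.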
>
> As has already been stated, the issue is what should the OS do when it 
> runs out of resources.  TCP implementations typically oversubscribe 
> their resources, and run into problems when all the open connections 
> try to use up all the resources that they've been told they can use.  
> In this situation, the OS has to figure out some way to free up 
> resources.  There may be some things it can do without killing 
> connections (e.g., flush TCP resequencing queues), but usually that 
> won't be sufficient if you have a runaway or malicious source that is 
> causing the resource problem in the first place.  In this situation, 
> anything the OS decides to do, including killing TCP connections, is 
> at the discretion of the OS, and I don't that view as violating any 
> RFC.  You're out of resources, you have to do something.  This is not 
> a TCP protocol issue, it is an OS implementation issue.
True. But it is TCP's insistence on keeping the connection open that 
causes the OS to run out of resources in the first place, even when the 
reason for keeping the connection open (unreliable ACKs) may not apply.

I know there is very little support for this argument, but for someone 
reading the RFC there is enough confusion about whether the connection 
can be cleared or not. Why not change the MUST to a MAY for reliable ACKs?

People on this mailing list have been arguing over whether connections 
can even be cleared, and this is the tcpm mailing list!! 
Everybody here is a TCP expert. Does that not point to a problem in the 
language of the RFC?

/mahesh