Re: [tcpm] Congestion control in face of ICMP unreachable messages

On Wed, Sep 12, 2007 at 08:23:32PM +0200, Daniel Schaffrath wrote:
> 
> On 2007/09/07  , at 02:43, Ted Faber wrote:
> 
> >Acting in no official capacity, I say:
> Sorry, I don't get this.

I'm one of the co-chairs of the TCPM working group.  I'm just having a
public conversation with you and neither you nor observers of this public
conversation should take it as being an official position of this august
WG.

Because tempers fray and IETF participants are somewhat averse to people
throwing their weight around, those of us who don't want to have those
kinds of arguments try to be specific about when we're speaking ex
cathedra. :-)

> >If TCP has gotten to the point where a retransmission timer has gone off
> >not only has the transmission of the packet to be retransmitted failed,
> >but enough other packets have been lost that the fast retransmit
> >algorithm (3 dupacks) has also not happened (yes, the window needs to
> >have grown to 4...).  In short, there's something very wrong with the
> >communication between the endpoints, and drastic action is called for.
> >Specifically, the sender acts as though the connection is almost new -
> >window is as small as possible and the ssthresh is halved.
> >
> >Your text above sounds like you're saying that if a sender has heard the
> >network say that there's a connectivity problem between the sender and
> >receiver (ICMP destination unreachable) the sender is entitled to react
> >less conservatively, even though all the evidence is that the link is
> >congested.
> From my understanding the evidence is that just some link is (was)  
> not working and that is not the same as a link is congested. Even,  
> there is some slight indication that there is no congestion in the  
> network (or at least the very part of it) as the ICMP message (as  
> well as the segment eliciting the ICMP message) was not dropped and  
> got to the TCP source.

So, the TCP stack really just knows that packets are not being
acknowledged, and the current design interprets that as congestion.
That's a conservative approach, and one people can and do argue about.
If you'd like to argue about that interpretation of loss, the end-to-end
mailing list might be a better place to do so.  I mean we'll talk about
it, too, but I think of it as a bigger picture issue.

Still and all, one could say that the ICMPs count as extra evidence that
a connection isn't congested but confused and should be treated
differently.  I'd recommend against treating them this way because:

	* ICMP operates on a different timescale than TCP.  It may take
	  a router longer to decide that there's a host unreachable
	  situation than the TCP stack would.  It may also take a router
	  longer to detect the opposite.
	* ICMP messages are really easy to spoof.

I don't think you gain very much, either.  If your application believes
that the connection has stuttered - a route flap or something - the
easiest way to fix it may be to reconnect.  I'll bet you do this a
couple times a day with your browser.  Hitting reload on a slow loading
page does exactly this.  Other applications may value continuity of
connection more.

I think that kind of semantics - whether it's better for my application
to retry or to stick it out on a slow connection - is known by the app
and shouldn't be encoded in the transport.  I realize that not going
into exponential backoff is a pretty small case of a change in
semantics, but it complicates things and it's not intuitive to me that
it's worth the cost.

> >Explain to me why ICMP messages indicating that your packets are not
> >being delivered indicate you should not slow down.
>
> I am not saying you shouldn't slow down. After RTO of course you  
> should reset cwnd and halve ssthresh. I was just thinking of skipping  
> (or delaying) doubling RTO if the retransmission after RTO does not
> simply vanish in the network but is replied to with an ICMP
> (host/net) unreachable message. This seems reasonable to me if my
> above finding is true.

I think it's extra complexity for minimal gain.

But, I haven't done anything to validate that opinion.  Do you have some
evidence either way?
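
To make sure we're comparing the same two behaviors, here's a rough
Python sketch of what a sender does when the retransmission timer fires,
with your proposal as an optional switch.  This is an illustration, not
any real stack's code.  The names (SenderState, on_rto_expiry,
skip_backoff_on_icmp) and the SMSS and MAX_RTO values are my
assumptions; the ssthresh and cwnd rules follow RFC 2581 and the timer
doubling follows RFC 2988 as I read them.

```python
from dataclasses import dataclass

SMSS = 1460      # sender maximum segment size in bytes (assumed value)
MAX_RTO = 60.0   # cap on the RTO in seconds (assumed; RFC 2988 permits a cap >= 60 s)


@dataclass
class SenderState:
    cwnd: int          # congestion window, bytes
    ssthresh: int      # slow start threshold, bytes
    rto: float         # current retransmission timeout, seconds
    flight_size: int   # bytes sent but not yet acknowledged


def on_rto_expiry(state, icmp_unreachable_seen=False, skip_backoff_on_icmp=False):
    """Apply the timeout response to `state` and return it.

    skip_backoff_on_icmp is the hypothetical variant under discussion:
    leave the RTO alone if an ICMP host/net unreachable answered the
    retransmission.  It is off by default, which is today's behavior.
    """
    # Standard part: halve ssthresh (with a floor) and collapse the window.
    state.ssthresh = max(state.flight_size // 2, 2 * SMSS)
    state.cwnd = SMSS

    # Back off the timer unless the proposed exception applies.
    if not (skip_backoff_on_icmp and icmp_unreachable_seen):
        state.rto = min(state.rto * 2, MAX_RTO)
    return state


if __name__ == "__main__":
    s = SenderState(cwnd=10 * SMSS, ssthresh=65535, rto=1.0, flight_size=10 * SMSS)
    print(on_rto_expiry(s))
```

Note that the window handling is identical in both cases; the only
thing the proposal changes is whether the timer doubles.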

> 
> >>Another thing is that RTO needs to expire to reach slow-start
> >>recovery. If my understanding of RFC 1122 is correct, it's still
> >>standard compliant to continue sending after having received the
> >>first appropriate ICMP message. Maybe that's a bit too aggressive (as
> >>opposed to slow-start recovery after an "ICMP induced" RTO).
> >
> >I believe your understanding of RFC 1122 is incorrect.  I don't believe
> >that receiving an ICMP message of any kind has any effect on the TCP
> >congestion control algorithms whatsoever (other than aborting the
> >connection in the case above).  ICMP messages are not part of a TCP
> >connection, and receiving one does not count as receiving a TCP
> >packet, much less an ACK.  After all, an ICMP message may well be coming
> >from an intermediate gateway.
> Maybe my paragraph above was not clear. I am not saying that RFC 1122  
> suggests ICMP messages to have effect on cwnd. I am saying that from  
> my understanding RFC 1122 allows  a TCP to continue sending segments  
> after having received an ICMP host/net unreachable message. This  
> seems to me as a waste of bandwidth as the arrival of the ICMP  
> message just indicated that right now there is no route to the  
> destination. Doesn't that sound reasonable? But maybe continuing  
> sending makes sense together with some other RFC I am not aware of  
> but someone else on the list.

I think I was unclear.  There are several kinds of destination
unreachable ICMP messages - subtypes of the message.  RFCs 792 and 1122
list them.  Some of them indicate a hard error, and any transport,
including TCP, that gets one is expected to shut down the connection.
If a host gets a destination unreachable message of the subtype
"protocol unreachable" the host SHOULD abort the connection that caused
it (that's the RFC 1122 wording).  That message indicates a long term
error.  If the destination unreachable message indicates an error that
may not be so long term, the connection should continue to transmit,
and if the transport later decides to terminate the connection (e.g. a
TCP timeout) it should report the ICMP error as well.

That second choice is there because the subtypes in question could be
caused by a single misrouted packet or something and are not strong
evidence that the connection is impossible.
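
If it helps, here's that split as a small Python sketch.  The code
numbers and names are from RFC 792, and the soft/hard grouping is my
reading of RFC 1122 section 4.2.3.9, which words the hard case as
"SHOULD abort".  The Connection class, its method names and the error
strings are made up for illustration and don't correspond to any real
stack.

```python
from enum import Enum
from typing import Optional

ICMP_DEST_UNREACHABLE = 3  # ICMP type number, per RFC 792

# Codes treated as soft errors: keep the connection, remember the error.
SOFT_CODES = {0: "net unreachable", 1: "host unreachable", 5: "source route failed"}

# Codes treated as hard errors: abort the connection.
# (Code 4 later got its own handling for path MTU discovery in RFC 1191;
# this sketch ignores that and follows the RFC 1122 grouping.)
HARD_CODES = {2: "protocol unreachable", 3: "port unreachable",
              4: "fragmentation needed and DF set"}


class Action(Enum):
    ABORT = "abort"
    CONTINUE = "continue"
    IGNORE = "ignore"


class Connection:
    """Hypothetical per-connection state, just enough to show the rule."""

    def __init__(self):
        self.open = True
        self.last_soft_error: Optional[str] = None

    def on_icmp(self, icmp_type, code):
        if icmp_type != ICMP_DEST_UNREACHABLE:
            return Action.IGNORE
        if code in HARD_CODES:
            self.open = False            # long term error: give up now
            return Action.ABORT
        if code in SOFT_CODES:
            self.last_soft_error = SOFT_CODES[code]  # keep sending, keep a note
            return Action.CONTINUE
        return Action.IGNORE             # codes this sketch doesn't know about

    def on_timeout_abort(self):
        """Transport gave up on its own; report the ICMP error along with it."""
        self.open = False
        if self.last_soft_error:
            return f"connection timed out (last ICMP error: {self.last_soft_error})"
        return "connection timed out"


if __name__ == "__main__":
    conn = Connection()
    print(conn.on_icmp(ICMP_DEST_UNREACHABLE, 1))
    print(conn.on_timeout_abort())
```

Nothing in there touches cwnd, ssthresh or the RTO; the soft error only
gets recorded so it can be handed to the application if the connection
eventually times out.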

Make sense?

-- 
Ted Faber
http://www.isi.edu/~faber           PGP: http://www.isi.edu/~faber/pubkeys.asc
Unexpected attachment on this mail? See http://www.isi.edu/~faber/FAQ.html#SIG