[Tmrg] Now, where were we...?

krasnoj at gmx.at (Stefan Hirschmann) Thu, 19 November 2009 20:05 UTC

From: krasnoj at gmx.at (Stefan Hirschmann)
Date: Thu, 19 Nov 2009 21:05:22 +0100
Subject: [Tmrg] Now, where were we...?
In-Reply-To: <BE0E1358-7C27-46A8-AF1E-D8D7CC834A52@ifi.uio.no>
References: <BE0E1358-7C27-46A8-AF1E-D8D7CC834A52@ifi.uio.no>
Message-ID: <4B05A502.4050402@gmx.at>

Michael Welzl wrote:
> Hi,
> 
> This prompts me to ask a question that I've been pondering
> ever since a hallway conversation that I had with Stanislav
> Shalunov at the Stockholm IETF:
> 
> 
>> 2. How reliable are implicit congestion indicators?  The prevailing
>> wisdom in the IETF seems to be that "ECN=loss = congestion, delay =
>> noise, nothing else is useful for congestion control".  What criteria
>> would "delay" have to satisfy in order to be a useful indicator of
>> congestion?  Should we listen to the average delay, the frequency with
>> which delay exceeds a threshold, or the jitter?
> 
> Can delay ever be worse as a congestion indicator than
> loss is?

Yes. It can be wrong in two ways:


If there is physical corruption and the data link layer retransmits the 
affected frames, any correlation between congestion and delay is purely 
coincidental.

Error 1: Data link retransmissions (due to physical corruption / 
checksum errors) increase, so the delay increases. Reality: the state of 
congestion is unchanged, but the sender assumes that congestion increased.

Error 2: There are fewer data link retransmissions than at the beginning 
of the connection. This leads to the wrong assumption that the congestion 
(the queuing delay) has also decreased.
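
To make the two errors concrete, here is a toy model (a sketch only, not a 
measurement). It assumes the RTT seen by the sender is the sum of a base RTT, 
the true queuing delay, and a fixed cost per link-layer retransmission. The 
function names and all numbers are made up for illustration.

```python
# Toy model: every constant below is a hypothetical value chosen for illustration.

def measured_rtt_ms(base_rtt_ms, queue_delay_ms, retransmissions, retx_cost_ms):
    """RTT seen by the sender: propagation + queuing + link-layer repair time."""
    return base_rtt_ms + queue_delay_ms + retransmissions * retx_cost_ms


def inferred_queue_delay_ms(rtt_ms, min_rtt_ms):
    """What a delay-based sender assumes: everything above min RTT is queuing."""
    return rtt_ms - min_rtt_ms


BASE_RTT_MS = 50
QUEUE_DELAY_MS = 10  # true queuing delay, held constant in all three samples
RETX_COST_MS = 5     # assumed cost of one link-layer retransmission

# Same congestion state each time, only the retransmission count changes.
rtt_start = measured_rtt_ms(BASE_RTT_MS, QUEUE_DELAY_MS, 2, RETX_COST_MS)
rtt_more_retx = measured_rtt_ms(BASE_RTT_MS, QUEUE_DELAY_MS, 6, RETX_COST_MS)
rtt_fewer_retx = measured_rtt_ms(BASE_RTT_MS, QUEUE_DELAY_MS, 0, RETX_COST_MS)

for label, rtt in [("start", rtt_start),
                   ("more retransmissions (Error 1)", rtt_more_retx),
                   ("fewer retransmissions (Error 2)", rtt_fewer_retx)]:
    guess = inferred_queue_delay_ms(rtt, BASE_RTT_MS)
    print(f"{label}: RTT={rtt} ms, inferred queuing={guess} ms, "
          f"true queuing={QUEUE_DELAY_MS} ms")
```

In all three samples the true queuing delay is identical, yet the inferred 
value moves up and down with the retransmission count.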

In my opinion, delay should only be considered as a limiting factor, but 
never as an increasing factor: cwnd = min( f(delay), g(loss) ).
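
A minimal sketch of what I mean. Note that f and g are not specified above, 
so both functions here are stand-ins I picked for illustration: g(loss) is a 
plain AIMD step, and f(delay) is a cap that tolerates roughly 
`target_queue_pkts` packets in the queue. Only the min() is the actual point.

```python
def loss_window(cwnd, loss_detected):
    """g(loss): hypothetical AIMD step, window in packets."""
    if loss_detected:
        return max(cwnd / 2.0, 1.0)  # multiplicative decrease
    return cwnd + 1.0                # additive increase, one packet per RTT


def delay_window(cwnd, base_rtt_ms, rtt_ms, target_queue_pkts=4.0):
    """f(delay): hypothetical cap, scaled down as RTT grows above the base RTT."""
    return cwnd * base_rtt_ms / rtt_ms + target_queue_pkts


def effective_window(cwnd, base_rtt_ms, rtt_ms, loss_detected):
    """cwnd = min(f(delay), g(loss)): delay may only limit, never increase."""
    g = loss_window(cwnd, loss_detected)
    f = delay_window(g, base_rtt_ms, rtt_ms)
    return min(f, g)


# Low delay: the loss-based window wins, delay does not push cwnd any higher.
print(effective_window(100.0, base_rtt_ms=50, rtt_ms=50, loss_detected=False))
# High delay: the delay-based cap limits the window.
print(effective_window(100.0, base_rtt_ms=50, rtt_ms=100, loss_detected=False))
```

With this structure a misleadingly low delay sample (Error 2) cannot inflate 
cwnd beyond what the loss-based part allows. A misleadingly high sample 
(Error 1) can still reduce it unnecessarily.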



Another bad effect is due to the burstiness of TCP:
If the queue at a bottleneck is empty and a traffic burst arrives, the 
first packet of the burst has a lower RTT than the last packets of the 
burst (the first packet is forwarded immediately, while the following 
packets see an increasing queuing delay). A delay-based approach can 
therefore already reduce cwnd even if cwnd = 10% of the BDP.
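
A small worked example of the burst effect, as a sketch under simplifying 
assumptions: the whole window arrives back to back at an empty bottleneck 
queue, and each packet takes a fixed service time. The link rate, base RTT 
and the 5 ms threshold are hypothetical values.

```python
def burst_rtts_ms(burst_size, base_rtt_ms, service_time_ms):
    """RTT of each packet in a back-to-back burst hitting an empty queue.

    Packet i waits for the i packets ahead of it, so the first packet sees
    no queuing delay and the last one sees (burst_size - 1) service times.
    """
    return [base_rtt_ms + i * service_time_ms for i in range(burst_size)]


BASE_RTT_MS = 100.0
SERVICE_TIME_MS = 1.0                    # assumed bottleneck: 1000 packets/s
BDP_PKTS = BASE_RTT_MS / SERVICE_TIME_MS  # packets in flight to fill the pipe
CWND_PKTS = int(0.10 * BDP_PKTS)          # cwnd = 10% of the BDP

rtts = burst_rtts_ms(CWND_PKTS, BASE_RTT_MS, SERVICE_TIME_MS)
extra_delay_ms = rtts[-1] - rtts[0]

DELAY_THRESHOLD_MS = 5.0  # hypothetical trigger of a delay-based scheme
print(f"cwnd={CWND_PKTS} pkts, first RTT={rtts[0]} ms, last RTT={rtts[-1]} ms")
print("delay-based reduction triggered:", extra_delay_ms > DELAY_THRESHOLD_MS)
```

The link is only 10% utilised here, but the last packet of the burst still 
shows several milliseconds of self-inflicted queuing delay.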


Cheers,
Stefan