[Tmrg] Now, where were we...?

michawe at ifi.uio.no (Michael Welzl) Thu, 19 November 2009 22:01 UTC

From: michawe at ifi.uio.no (Michael Welzl)
Date: Thu, 19 Nov 2009 23:01:05 +0100
Subject: [Tmrg] Now, where were we...?
In-Reply-To: <4B05A502.4050402@gmx.at>
References: <BE0E1358-7C27-46A8-AF1E-D8D7CC834A52@ifi.uio.no> <4B05A502.4050402@gmx.at>
Message-ID: <4B05C021.5000105@ifi.uio.no>


>> This prompts me to ask a question that I've been pondering
>> ever since a hallway conversation that I had with Stanislav
>> Shalunov at the Stockholm IETF:
>>> 2. How reliable are implicit congestion indicators?  The prevailing
>>> wisdom in the IETF seems to be that "ECN=loss = congestion, delay =
>>> noise, nothing else is useful for congestion control".  What criteria
>>> would "delay" have to satisfy in order to be a useful indicator of
>>> congestion?  Should we listen to the average delay, the frequency with
>>> which delay exceeds a threshold, or the jitter?
>> Can delay ever be worse as a congestion indicator than
>> loss is?
> Yes. It can be wrong in two ways:
> If there is physical corruption and the data link retransmits the
> signals, any correlation between congestion and delay is just random.
> Error 1: Data link retransmissions (due to physical corruption /
> checksum errors) increase, and so the delay increases. Reality: same
> state of congestion, but the sender assumes that congestion increased.
> Error 2: There are fewer data link retransmissions (compared to the
> beginning of the connection). This leads to the wrong assumption that
> the congestion (the queuing delay) is also lower.
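
To make the two error cases above concrete, here is a minimal sketch in
Python. It assumes a toy model in which every RTT sample is the sum of a
fixed base RTT, the queuing delay, and a fixed cost per data link
retransmission. The function names and the numbers (in milliseconds) are
made up for illustration and come from neither poster.

```python
def measured_rtt(base_rtt, queue_delay, link_retx, retx_cost):
    """RTT sample in ms under the toy model (all arguments are assumptions)."""
    return base_rtt + queue_delay + link_retx * retx_cost


def inferred_queue_delay(rtt_sample, min_rtt_seen):
    """What a delay-based sender takes for queuing delay: RTT above the minimum seen."""
    return rtt_sample - min_rtt_seen


BASE_RTT = 100   # ms, hypothetical
RETX_COST = 10   # ms per data link retransmission, hypothetical

# Error 1: the queue stays at 20 ms, but retransmissions go from 0 to 3.
before = inferred_queue_delay(measured_rtt(BASE_RTT, 20, 0, RETX_COST), BASE_RTT)
after = inferred_queue_delay(measured_rtt(BASE_RTT, 20, 3, RETX_COST), BASE_RTT)
print("Error 1: inferred queuing delay", before, "->", after, "ms, real queue unchanged")

# Error 2: the queue grows from 20 ms to 40 ms, but retransmissions drop from 3 to 1.
start = measured_rtt(BASE_RTT, 20, 3, RETX_COST)
later = measured_rtt(BASE_RTT, 40, 1, RETX_COST)
print("Error 2: RTT", start, "->", later, "ms, although the real queue doubled")
```

In this model the sender cannot tell the two delay components apart from
RTT samples alone, which is the ambiguity both error cases rely on.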

I know about these, but the difference between delay and loss doesn't
seem to be huge here: both errors will lead to loss beyond a certain
level of delay, and hence cause the same misinterpretation. So my
question is, can delay really be arguably worse?

(Of course we could easily construct a case where increasing delay
(incipient congestion) appears, but loss (heavy congestion) does not...
but isn't that making the same, possibly wrong, decision, just using a
finer granularity and hence reacting earlier?)

> In my opinion, delay should only be considered as a limiting factor,
> but never as an increasing factor: cwnd = min( f(delay), g(loss) ).
> Another bad effect is due to the burstiness of TCP:
> If the queue of a bottleneck is empty and a traffic burst arrives, the
> first packet of the burst has a lower RTT than the last packets of the
> burst (the first packet is forwarded immediately, the other packets
> see an increasing queuing delay). A delay-based approach can already
> lead to a reduction of cwnd even if cwnd = 10% of the BDP.
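
The burst effect described above can be sketched in a few lines of
Python. This is a toy calculation, assuming a back-to-back burst that
hits an empty FIFO bottleneck queue with a fixed per-packet service
time. burst_queuing_delays, cwnd_limit and all numbers are hypothetical,
and f(delay) and g(loss) are left as plain inputs because the thread
does not define them.

```python
def burst_queuing_delays(burst_size, service_time_ms):
    """Queuing delay of each packet in a back-to-back burst arriving at an empty queue.

    Packet i has to wait for the i packets in front of it to be transmitted.
    """
    return [i * service_time_ms for i in range(burst_size)]


def cwnd_limit(delay_window, loss_window):
    """cwnd = min(f(delay), g(loss)): the delay term can only cap the window."""
    return min(delay_window, loss_window)


# Hypothetical path: 100 ms base RTT, 1 ms per packet at the bottleneck,
# so the BDP is 100 packets and a cwnd of 10 packets is 10% of it.
delays = burst_queuing_delays(10, 1)
print("first packet waits", delays[0], "ms, last packet waits", delays[-1], "ms")

# With the min() rule, a delay-derived window of 8 caps a loss-derived window of 10.
print("cwnd =", cwnd_limit(8, 10))
```

Within a single burst the RTT samples here differ by 9 ms although the
window is far below the BDP, which is the kind of spread a delay-based
scheme could misread as a building queue.
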
... but a shorter queue would cause the same effect with a loss based