Re: Comment on draft-ietf-tsvwg-byte-pkt-congest-05.txt

Bob Briscoe <bob.briscoe@bt.com> Wed, 30 November 2011 12:49 UTC

Message-Id: <201111301249.pAUCnL8u032441@bagheera.jungle.bt.co.uk>
X-Mailer: QUALCOMM Windows Eudora Version 7.1.0.9
Date: Wed, 30 Nov 2011 12:49:20 +0000
To: Joe Touch <touch@isi.edu>, Fred Baker <fred@cisco.com>
From: Bob Briscoe <bob.briscoe@bt.com>
Subject: Re: Comment on draft-ietf-tsvwg-byte-pkt-congest-05.txt
In-Reply-To: <4EC0E434.1090107@isi.edu>
References: <CA11D312-2132-49F0-A3CF-0A3049010BF8@cisco.com> <4EC0E434.1090107@isi.edu>
Cc: tsvwg list <tsvwg@ietf.org>

Joe & Fred,

At 09:49 14/11/2011, Joe Touch wrote:
>Hi, Fred,
>
>On 11/13/2011 11:52 PM, Fred Baker wrote:
>...

(I've added this context, given I've taken so long to reply - sorry):
 >    If the resource is bit-congestible, the implementation SHOULD measure
 >    the length of the queue in bytes.  If the resource is packet-
 >    congestible, the implementation SHOULD measure the length of the
 >    queue in packets.

>>I think the statement is fine as far as it goes, but it has two
>>remaining issues. One is the question of when an interface is
>>bit/byte congestible and when it is packet-congestible; I'm sure that
>>means something specific to the authors, but will be far less clear
>>to an implementer - every interface has a rate, and few if any
>>measure their rates in packets per unit time, so specifically what is
>>in view as a "packet-congestible interface"?
>
>Some examples in the text might be useful.


We did give examples at the point where we define 
'packet-congestible' earlier (pasted below).

/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\
1.1. Terminology and Scoping

    Bit-congestible vs. Packet-congestible:

       Examples of packet-congestible resources are route look-up engines
       and firewalls, because load depends on how many packet headers
       they have to process.  Examples of bit-congestible resources are
       transmission links, radio power and most buffer memory, because
       the load depends on how many bits they have to transmit or store.
/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\
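As a minimal sketch of how an implementer might apply that definition, the unit used to measure the queue can follow from how the resource is classified. The names below (Congestible, Packet, queue_length) are hypothetical and are not taken from the draft.

```python
from dataclasses import dataclass
from enum import Enum


class Congestible(Enum):
    BIT = "bit"        # e.g. transmission link, most buffer memory
    PACKET = "packet"  # e.g. route look-up engine, firewall


@dataclass
class Packet:
    size_bytes: int


def queue_length(queue, resource):
    """Measure the queue in the unit that matches the congested resource."""
    if resource is Congestible.BIT:
        # Load depends on how many bits must be sent or stored.
        return sum(p.size_bytes for p in queue)
    # Load depends on how many headers must be processed.
    return len(queue)


queue = [Packet(40), Packet(1500), Packet(40)]
link_len = queue_length(queue, Congestible.BIT)        # bytes queued
lookup_len = queue_length(queue, Congestible.PACKET)   # packets queued
```

The same three packets give very different readings depending on which resource is the bottleneck, which is why the classification has to come first.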

>Bit/byte congestion:
>         - storage that assigns space based on the packet size
>         - interfaces whose output bandwidth is overloaded
>         - processing that is length-dependent (e.g., full packet
>         hashes or encryption)
>
>Packet congestion:
>         - storage that assigns space on a per-packet basis,
>         i.e., that has fixed slots (for max packets)
>         - interfaces whose output slots are overloaded
>         (with fixed slots per packet)
>         - processing that is per-packet dependent (e.g.,
>         header processing, header hashing, tunneling)
>
>It is useful to note that there are some variants of these, e.g., 
>where there are a small number of different buffer sizes (coarsely 
>per-bit/byte), or ones with combined bottlenecks (there can be 
>multiple places where resources are under contention!)

[BB]: To help implementers interpret the guidance in more complex 
cases like these, we discuss both as specific examples in the text. 
Search for 'buffer carving' and 'hybrid forwarding system' respectively.


>It's useful to explain what to do when there are combined 
>bottlenecks in both places (alleviating only one might not suffice). 
>I wonder what the algorithm should be in that case...

[BB]: To avoid making the draft too complicated, we fudged a bit by 
saying the following:

/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\
    If we now imagine a hybrid forwarding system with transmission delay
    largely dependent on the byte-size of packets but buffers of one MTU
    per packet, it should strictly require a more complex algorithm to
    determine the probability of congestion.  It should be treated as two
    resources in sequence, where the sum of the byte-sizes of the packets
    within each packet buffer models congestion of the line while the
    length of the queue in packets models congestion of the queue.  Then
    the probability of congesting the forwarding buffer would be a
    conditional probability--conditional on the previously calculated
    probability of congesting the line.
/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\

Given Fred's request for clearer advice to implementers, I would like 
to replace the above para with the following (Jukka might be able to 
shorten it - he's better than me at brevity):

"If we now consider a hybrid forwarding system with transmission 
congestion dependent on the byte-size of packets but fixed-size 
buffers of one MTU per packet, the queue length can still simply be 
measured in bytes, based on the following reasoning.

When an interface buffers packets, we need to ask what is the root 
cause of the congestion: the line or the buffer? In this case, the 
line is the root of the congestion. Packets temporarily back-up into 
the buffer because more bits are arriving than departing. Even though 
a run of small packets would fill MTU-sized buffers more quickly, the 
line will also drain them more quickly. Therefore, the queue is 
always merely a symptom of how congested the line is. Certainly, once 
the buffer gets full, we could say the buffer has become congested 
too. But reducing the arriving bit-rate would clear congestion of the 
line, which in turn would empty the buffer.

Theoretically, congestion in this hybrid could be treated as the 
probability of congestion of the buffer conditional on congestion of 
the line, where the sum of bytes in the buffers measures congestion 
of the line while the length of the queue in packets measures 
congestion of the queue. In practice, however, just measuring the 
bytes queued gets at the root cause: congestion of the line.

If the queue were building up behind forwarding look-ups or a 
firewall, then the problem would be the number of packet headers, 
not bits. But if it is a queue into a transmission line, the 
problem is bits.

In summary, for these hybrid cases, whether to measure the queue 
length in bytes or packets depends primarily on whether the process 
in front of the queue is bit-congestible (e.g. a transmission line) 
or packet-congestible (e.g. forwarding look-ups). The internals of 
the buffer can usefully be ignored.
"
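A rough worked example of the reasoning in the proposed text, under assumed numbers (a 10 Mb/s line and a buffer of 100 MTU-sized slots, neither of which comes from the draft). A run of small packets fills the slots just as completely as a run of large ones, but the line drains it far sooner, so the byte count tracks the backlog on the line while the slot count does not.

```python
LINE_RATE_BPS = 10_000_000   # assumed line rate: 10 Mb/s
SLOTS = 100                  # assumed buffer: 100 slots of one MTU each


def slots_used(packet_sizes):
    """Each packet occupies one MTU-sized slot, whatever its size."""
    return len(packet_sizes)


def drain_time_s(packet_sizes):
    """Time the line needs to transmit everything currently queued."""
    return sum(packet_sizes) * 8 / LINE_RATE_BPS


small = [40] * SLOTS     # buffer full of 40-byte packets
large = [1500] * SLOTS   # buffer full of 1500-byte packets

for name, q in (("small", small), ("large", large)):
    print(f"{name}: {slots_used(q)} slots, {sum(q)} bytes, "
          f"{drain_time_s(q) * 1000:.1f} ms to drain")
```

Both queues occupy all 100 slots, yet the small-packet queue holds 4,000 bytes (about 3.2 ms of line time) against 150,000 bytes (120 ms) for the large-packet queue.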


>>The other is that it
>>only really applies to interfaces to a unique and non-blocking
>>channel, like a serial line. When the congestible interface or queue
>>is on a shared resource such as the fabric in an input-queued switch,
>>an 802.11 interface, etc, the mathematics really involves two queues
>>- the interface queues (in a statistical sense if not a hardware
>>version) for access to the channel, and the packet is enqueued toward
>>the interface, and both arrival/departure distributions are
>>important.
>>
>>Consider, for example, an AQM implementation on a WiFi channel. We
>>have all been on busy 802.11 networks; we have this experience in our
>>own history at the IETF. The dynamics are fairly strange. At the IP
>>layer, sessions are generally between hosts on the WiFi network and
>>hosts somewhere else. At the WiFi layer the AP is a proxy for the
>>remote hosts, and is therefore in essence an extremely busy member of
>>the WiFi system that otherwise operates in an approximation to a
>>round-robin fashion. The effect is to build a bufferbloat scenario
>>into the AP. Due to the shared nature of the medium and the fact that
>>retransmissions are driven by absolute time (an RTO happens after a
>>certain point in time), what one is really looking for is a packet's
>>queue occupancy being measured in time. Bytes are a reasonable analog
>>to time (which is what makes bytes reasonable for a bit-congestible
>>interface) for noncontested interfaces, but a shared interface can
>>fail to emit a transmission for an arbitrary time interval. So I
>>might find myself, if some variant on Blue is in use, bumping the
>>mark/drop probability when the queue becomes full and *also* any time
>>one dequeues a packet that has been waiting longer than some time
>>interval.
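A minimal sketch of the Blue variant Fred describes, for illustration only. It assumes a Blue-style marking probability that is bumped when the queue overflows and also when a dequeued packet has waited longer than a threshold. The class name, parameter names and default values are all hypothetical, and this is not an algorithm from the draft.

```python
import random
from collections import deque


class SojournBlue:
    def __init__(self, capacity_pkts, max_sojourn_s,
                 increment=0.02, decrement=0.002, freeze_s=0.1):
        self.capacity = capacity_pkts
        self.max_sojourn_s = max_sojourn_s
        self.increment = increment
        self.decrement = decrement
        self.freeze_s = freeze_s          # minimum gap between updates
        self.p = 0.0                      # mark/drop probability
        self.last_update = float("-inf")
        self.queue = deque()              # (enqueue_time, packet)

    def _adjust(self, delta, now):
        # As in Blue, change p at most once per freeze interval.
        if now - self.last_update >= self.freeze_s:
            self.p = min(1.0, max(0.0, self.p + delta))
            self.last_update = now

    def should_mark(self):
        return random.random() < self.p

    def enqueue(self, pkt, now):
        if len(self.queue) >= self.capacity:
            self._adjust(self.increment, now)   # queue full: bump p
            return False
        self.queue.append((now, pkt))
        return True

    def dequeue(self, now):
        if not self.queue:
            self._adjust(-self.decrement, now)  # idle: decay p
            return None
        enqueued_at, pkt = self.queue.popleft()
        if now - enqueued_at > self.max_sojourn_s:
            self._adjust(self.increment, now)   # waited too long: bump p
        return pkt


aqm = SojournBlue(capacity_pkts=2, max_sojourn_s=0.05)
accepted = [aqm.enqueue(name, now=0.0) for name in ("a", "b", "c")]
p_after_overflow = aqm.p
first = aqm.dequeue(now=0.2)   # "a" has waited 0.2 s, over the threshold
p_after_sojourn = aqm.p
```

The time-based trigger is what covers the shared-medium case: on a contended channel a short queue in bytes can still mean a long wait, so the sojourn check fires even when the queue never fills.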

Yes, you have to find something that approximates to a model of the 
sequence of congested resources. In byte-pkt we decided not to go 
deeply into this as another example. We felt the previous hybrid 
example had already covered the byte-vs-pkt issue sufficiently, and a 
wireless example might say a lot more about how to design a wireless 
AQM, but not a lot more about byte-v-pkt. Instead we referred to a 
Mobicom paper by Vasilios Siris on this 
<http://tools.ietf.org/html/draft-ietf-tsvwg-byte-pkt-congest-05#ref-ECNFixedWireless> 
(see Section 4.1.2. Congestion Measurement without a Queue).

That reference focuses on CDMA, but I chose it because it combines 
many forms of congestion you can get on an e2e fixed/wireless path. 
You can find other papers, including one on correctly signalling WiFi 
congestion here:
<http://www.ics.forth.gr/netlab/future_wireless.html>


>This is the combined example I suggested above AFAICT.

Yup.


Bob


>Joe
>
>>As to the "SHOULD NOT" regarding configuration, this probably makes a
>>lot of sense in academic circles. Speaking as a vendor, I'm going to
>>do what my customers tell me to. I'm happy enough with a default
>>configuration following this recommendation, but if my customer wants
>>it different, I'm not going to slap his hands.
>>
>>As a side note, I would generally suggest that we step aside from
>>"RED" and talk about "AQM". Blue, SFQ-Blue, and AVP are AQM
>>algorithms that are likely superior to RED from an operational
>>viewpoint.
>
>________________________________________________________________
>Bob Briscoe,                                BT Innovate & Design