Re: [conex] byte vs packet counting

Bob Briscoe <bob.briscoe@bt.com> Sun, 20 November 2011 19:56 UTC

Return-Path: <bob.briscoe@bt.com>
X-Original-To: conex@ietfa.amsl.com
Delivered-To: conex@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 0682221F8672 for <conex@ietfa.amsl.com>; Sun, 20 Nov 2011 11:56:32 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -1.843
X-Spam-Level:
X-Spam-Status: No, score=-1.843 tagged_above=-999 required=5 tests=[AWL=-0.845, BAYES_50=0.001, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_LOW=-1]
Received: from mail.ietf.org ([12.22.58.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id AiYRiwKpn5vA for <conex@ietfa.amsl.com>; Sun, 20 Nov 2011 11:56:30 -0800 (PST)
Received: from smtp4.smtp.bt.com (smtp4.smtp.bt.com [217.32.164.151]) by ietfa.amsl.com (Postfix) with ESMTP id 5890B21F8663 for <conex@ietf.org>; Sun, 20 Nov 2011 11:56:30 -0800 (PST)
Received: from i2kc06-ukbr.domain1.systemhost.net ([193.113.197.70]) by smtp4.smtp.bt.com with Microsoft SMTPSVC(6.0.3790.4675); Sun, 20 Nov 2011 19:56:27 +0000
Received: from cbibipnt05.iuser.iroot.adidom.com ([147.149.196.177]) by i2kc06-ukbr.domain1.systemhost.net with Microsoft SMTPSVC(6.0.3790.4675); Sun, 20 Nov 2011 19:56:27 +0000
Received: From bagheera.jungle.bt.co.uk ([132.146.168.158]) by cbibipnt05.iuser.iroot.adidom.com (WebShield SMTP v4.5 MR1a P0803.399); id 1321818986587; Sun, 20 Nov 2011 19:56:26 +0000
Received: from MUT.jungle.bt.co.uk ([10.142.208.72]) by bagheera.jungle.bt.co.uk (8.13.5/8.12.8) with ESMTP id pAKJuJSQ007421; Sun, 20 Nov 2011 19:56:21 GMT
Message-Id: <201111201956.pAKJuJSQ007421@bagheera.jungle.bt.co.uk>
X-Mailer: QUALCOMM Windows Eudora Version 7.1.0.9
Date: Sun, 20 Nov 2011 19:56:14 +0000
To: Matt Mathis <mattmathis@google.com>
From: Bob Briscoe <bob.briscoe@bt.com>
In-Reply-To: <CAH56bmD2fh3sm4mozh17K2C+K0Pxyw7vRvykCo9Xt-jeEP36ZQ@mail.gmail.com>
References: <201111171402.pAHE26tB006646@bagheera.jungle.bt.co.uk> <CAH56bmBhg6VpDMKye+GO_=gN-MAjskTMaAznhYSJm3N4R6Zjtg@mail.gmail.com> <CAH56bmD2fh3sm4mozh17K2C+K0Pxyw7vRvykCo9Xt-jeEP36ZQ@mail.gmail.com>
Mime-Version: 1.0
Content-Type: multipart/alternative; boundary="=====================_632751959==.ALT"
X-Scanned-By: MIMEDefang 2.56 on 132.146.168.158
X-OriginalArrivalTime: 20 Nov 2011 19:56:27.0386 (UTC) FILETIME=[7D3FCDA0:01CCA7BE]
Cc: ConEx IETF list <conex@ietf.org>
Subject: Re: [conex] byte vs packet counting
X-BeenThere: conex@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: Congestion Exposure working group discussion list <conex.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/conex>, <mailto:conex-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/conex>
List-Post: <mailto:conex@ietf.org>
List-Help: <mailto:conex-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/conex>, <mailto:conex-request@ietf.org?subject=subscribe>
X-List-Received-Date: Sun, 20 Nov 2011 19:56:32 -0000

Matt,

[Adding the ConEx list back in]

I think we're both now agreeing that the /goal/ should be 
byte-balance for ConEx auditing. Correct?

I am also in full agreement that this goal isn't always easy to meet 
if retrofitting ConEx onto an existing transport like TCP. It works 
if the packet sizes are regular, but if they are lumpy, the sender is 
likely to understate or overstate. And I agree that these aren't 
necessarily just corner-cases (well, I guess there is a large set of 
cases with regular packet sizes, but the corner is also quite large).

You ask what I think we should do about this. In my view, we should:
a) state what the goal should be for all transports (byte-balance)
b) give advice on how to implement a TCP sender to do as well as it can
c) run experiments and see whether the outcome is good enough, or if 
further protocol mechanism is needed.

Concerning (b) advice on the best that TCP implementations can do:
- the TCP receiver (if optionally supporting ConEx) should suppress 
ACK delay whenever it detects congestion and instead give immediate feedback
- the TCP sender can use ACK sequence numbers to improve its guess of 
the required size of a re-echo-loss or re-echo-ECN
- If the receiver is not ConEx-aware, a sender that knows it is 
sending lumpy packet sizes can introduce additional credit to cover 
any mistakes.
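To make (b) concrete, here is a rough sketch of the sender-side bookkeeping (the class and method names are my own illustrative assumptions, not from any ConEx draft or TCP stack):

```python
# Hypothetical sketch of the sender-side advice above. A real TCP
# implementation would hang this state off its retransmission queue.

class ConexSenderState:
    """Tracks in-flight segment sizes so a loss/ECN signal can be
    converted into a byte count for re-echo marking."""

    def __init__(self):
        self.in_flight = {}    # start seq -> segment size in bytes
        self.credit_bytes = 0  # extra credit sent to cover mis-estimates

    def on_send(self, seq, size):
        self.in_flight[seq] = size

    def bytes_acked(self, ack_seq):
        """Use the cumulative ACK number to work out exactly which
        bytes left the network, and return their total size."""
        acked = [s for s in self.in_flight if s < ack_seq]
        return sum(self.in_flight.pop(s) for s in acked)

    def re_echo_bytes(self, marked_seq):
        """Best guess of the size of the congestion-marked segment;
        if the mark is ambiguous, fall back to the largest in-flight
        segment so the sender errs towards overstating."""
        if marked_seq in self.in_flight:
            return self.in_flight[marked_seq]
        return max(self.in_flight.values(), default=0)

    def pad_credit(self, recent_sizes):
        """With a non-ConEx receiver and lumpy sizes, add credit to
        cover the worst-case understatement."""
        if recent_sizes:
            self.credit_bytes += max(recent_sizes) - min(recent_sizes)
```

The erring-high fallback in re_echo_bytes is one possible policy, chosen here because understatement is what the auditor punishes.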

If a ConEx sender does unintentionally understate bytes of 
congestion, the auditor will discard some packets, then the sender 
will redress the balance with some more re-echo-loss and it should 
all heal itself, albeit having suffered some losses in the process.
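To show the self-healing loop in miniature, here is a toy byte-balance auditor (the structure is mine, purely illustrative; a real auditor would keep per-flow state and apply some smoothing before discarding):

```python
# Toy sketch of a byte-balance audit function. It compares bytes of
# congestion the sender has declared (re-echo/credit) against bytes
# of congestion actually observed, and discards when in deficit.

class ByteBalanceAuditor:
    def __init__(self):
        self.declared = 0   # bytes of re-echo/credit seen from sender
        self.congested = 0  # bytes of loss/ECN marking observed

    def forward(self, size, re_echo_bytes=0, congestion_marked=False):
        """Return True to forward the packet, False to discard it.
        A discard itself becomes a loss the sender must re-echo,
        which is how the balance heals after an understatement."""
        if congestion_marked:
            self.congested += size
        self.declared += re_echo_bytes
        if self.declared < self.congested:
            return False  # sender is in byte-deficit: discard
        return True
```

In this sketch an understating sender loses packets until its re-echoed bytes catch up with the observed congestion bytes, matching the healing behaviour described above.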

Do you think this is sufficient (at least to run experiments to test 
whether it is)?



Bob

At 20:55 18/11/2011, Matt Mathis wrote:
>So here is a pathology: I have an application that sends 15001 bytes 
>(10 segments + 1 byte), followed by 1 byte beacons every 10 seconds 
>for 100 seconds and then the entire pattern repeats (total of 
>10*1500 + 10*1 segments every 100 seconds).
>
>Now suppose there are ECN marks near the end of the burst on every 
>repeat.   How does this get signaled, and what is the correct sender 
>response?   Due to delayed ACKs it may be ambiguous exactly which 
>segment was marked (e.g. was it 1500 bytes or 1 byte?)   Even if the 
>sender knows for sure that it was the 1500 byte segment that was 
>marked, it has to wait until the next burst to send the proper feedback.
>
>The problem is the ACKs carry the count of the number of marked 
>segments, not marked bytes.  Also the re-feed itself has no size, 
>except the length of the segments carrying it.
>
>What do you think should happen in this case?
>(BTW modern persistent web protocols, such as SPDY, do stuff like 
>this all the time, although perhaps not as extreme as my 
>example.   BGP also has streaming data with very irregular message sizes.)
>
>This would be completely clear if the ACKs and re-feedback were 
>counts of bytes.
>
>Thanks,
>--MM--
>The best way to predict the future is to create it.  - Alan Kay
>
>
>
>On Thu, Nov 17, 2011 at 7:07 PM, Matt Mathis 
><<mailto:mattmathis@google.com>mattmathis@google.com> wrote:
>
>I'm sorry I did not make myself clear. I understand and agree with 
>your argument.
>
>The problem is when the forward path is carrying lumpy data: highly 
>irregular segment sizes, typical of streaming media, p-http, or BGP 
>at the onset of congestion. When there is an ECN mark, the 
>ack only indicates that a segment was marked, not its size. If 
>the sender has a choice of segments to re-feedback, it has no 
>idea which to choose.  Worse, it 
>may not have a choice.   Furthermore, under pathological conditions 
>it will persistently get it wrong.
>
>I can explain better when I have a real keyboard.
>On Nov 17, 2011 10:02 PM, "Bob Briscoe" 
><<mailto:bob.briscoe@bt.com>bob.briscoe@bt.com> wrote:
>Matt,
>
>This issue has come up in at least two threads recently: 
>"Congestion" vs. "Congestion Volume" and "byte-counting in conex-destopt"
>
>Consider the buffer into the high speed line you were talking about 
>this morning - it holds packets in equal MTU-sized packet buffers, 
>whatever the packet size.
>
>When it buffers packets, we need to ask what exactly is congested: 
>the line or the buffer? In this case, the line is congested. Packets 
>are temporarily in the buffer because more bits arrived than 
>departed. The queue is merely a symptom of how congested the line 
>is. Certainly, once the buffer gets full, we could say the buffer 
>has become congested too. But the root of the problem is how fast 
>the line can carry away the bits.
>
>If the queue was building up behind forwarding look-ups or a 
>firewall, then the problem would be the number of packet headers, 
>not bits. But if it's a queue into a transmission line, the problem is bits.
>
>The point is, whether to count bits or packets depends on the 
>process /in front/ of the queue (whether it's a bit transmission line 
>or processing packet headers). The internals of the buffer itself 
>are irrelevant.
>
>Then the question is how prevalent are per-packet processes as 
>sources of congestion? Answer: There seems to be good reason why 
>per-packet congestion will remain in the minority relative to per-byte...
>
>During the process of writing draft-ietf-byte-pkt-congest, all the 
>machine design folk who were consulted said that machines are 
>generally designed so that any per-packet-processing can cope with a 
>workload at line rate consisting mostly of small packets.
>
>IOW, machine designs tend to use bit-congestion to protect the 
>packet-processor from congestion.
>
>____________________
>For those who prefer an example, assume:
>MTU,       M = 1,500B = 12,000b
>Line rate, X =    48Gbps
>
>[The numbers are chosen to make the maths easy, not to reflect 
>typical scenarios. I'm going to work in bits not bytes from now on.]
>
>Imagine at some time that 260 of these fixed-size packet buffers are 
>full, with 10 large and 250 small packets.
>
>packet size       | S    | 12,000b  |   480b   |
>no. pkts buffered | N    |     10   |   250    |
>buffer space used | NM   |    120kb |     3Mb  |
>pkt bits buffered | NS   |    120kb |   120kb  |
>time to drain all | NS/X |  2,500ns | 2,500ns  |
>
>Although each small packet takes up the same space in the buffer as 
>a large packet, it's faster to get rid of it. 250 small packets take 
>as long to drain as 10 large packets.
>
>
>Bob
>
>
>________________________________________________________________
>Bob Briscoe,                                BT Innovate & Design
>
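The drain-time figures in the quoted example check out; a quick sanity check, using only the MTU and line rate stated above:

```python
# Sanity check of the quoted buffer example:
# MTU M = 1,500 B = 12,000 b; line rate X = 48 Gbps.
M_BITS = 12_000
LINE_RATE = 48 * 10**9

def drain_stats(n_pkts, pkt_bits):
    """Return (buffer space used in bits, bits queued, drain time in ns)
    for n_pkts packets of pkt_bits each, held in fixed MTU-sized buffers."""
    space = n_pkts * M_BITS          # each packet occupies a full buffer
    bits = n_pkts * pkt_bits         # actual bits awaiting transmission
    drain_ns = bits * 10**9 // LINE_RATE
    return space, bits, drain_ns

# 10 large and 250 small packets queue the same 120 kb and drain in the
# same 2,500 ns, though the small ones occupy 25x the buffer space.
print(drain_stats(10, 12_000))   # (120000, 120000, 2500)
print(drain_stats(250, 480))     # (3000000, 120000, 2500)
```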

________________________________________________________________
Bob Briscoe,                                BT Innovate & Design