Re: [conex] byte-counting in conex-destopt

Mirja Kühlewind <mirja.kuehlewind@ikr.uni-stuttgart.de> Wed, 09 November 2011 16:59 UTC

From: Mirja Kühlewind <mirja.kuehlewind@ikr.uni-stuttgart.de>
To: conex@ietf.org
Date: Wed, 09 Nov 2011 17:59:31 +0100

Hi John,

I guess this issue needs some more discussion (at the next meeting). For now, 
my understanding is that the abstract mechanism draft says that ConEx bits 
should be accounted byte-wise, and I am just trying to reflect that here.

See below for some more points.

Mirja


On Monday 07 November 2011 18:05:15 John Leslie wrote:
> internet-drafts@ietf.org <internet-drafts@ietf.org> wrote:
> > http://www.ietf.org/internet-drafts/draft-ietf-conex-destopt-01.txt
>
>    In this post I treat the byte-count language added.
> ]
> ] If the X bit is set, all three other bit (L, E, C) MAY be set. When
> ] ever one if this bits is set,
>
>    I'm guessing the intent is "Whenever one of these bits is set".
>
> ] the number of bytes carried by this IP packet (incl. IP header)
> ] SHOULD be accounted when determining congestion or credit information.
>
>    Implicit byte-count, IMHO, is a bad idea. Inevitably it leads to
> different parties counting differently. And, we have a synchronization
> problem too, where sometimes we're counting bytes in this packet and
> sometimes we're counting bytes in a previous packet (at least one RTT
> ago).
>
>    There's plenty of room for an _explicit_ byte count; and IMHO if
> we want to go down this path of auditing congestion volume in bytes,
> we owe it to implementors to be explicit.
What do you mean by 'explicit'? Do you want to put the number of bytes in the 
CDO? 

>
> ] In IPv6 the length can easily be calculated by the value given in
> ] the Payload Length header field (payload length + option space)
> ] plus a fixed value of 40 Bytes for the IP header itself.
>
>    Yes, this is explicit about counting bytes-on-the-wire at IP layer;
> but even so, I see room for confusion: I'd reword slightly (again,
> if we want to go down this path at all).
Feel free to make a text proposal here! It's all open for discussion and 
re-wording!
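
A minimal sketch of the length calculation from the quoted draft text, assuming 
the packet size is taken from the IPv6 Payload Length field plus the fixed 
40-byte header. The function name is hypothetical, and jumbograms (where the 
Payload Length field is zero) are not handled:

```python
IPV6_FIXED_HEADER_BYTES = 40  # the fixed IPv6 header is always 40 bytes


def ipv6_packet_bytes(payload_length: int) -> int:
    """Return the IP-layer size of an IPv6 packet in bytes.

    payload_length is the value of the Payload Length header field, which
    covers extension headers (option space) plus the upper-layer payload.
    """
    # Payload Length is a 16-bit field; reject values that cannot occur.
    if not 0 <= payload_length <= 0xFFFF:
        raise ValueError("Payload Length must fit in 16 bits")
    return payload_length + IPV6_FIXED_HEADER_BYTES


print(ipv6_packet_bytes(1460))
```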

>
> ] In principle all of these three bits (L, E, C) MAY be set in the
> ] same packet. In this case the packet size MUST be accounted more
> ] than once for each respective ConEx information counter.
>
>    This is confusing. There's no justification given for keeping
> three separate counters, nor an algorithm for how to relate them.
I thought the justification was given in the abstract mechanism draft. 
Basically these signals are not related...
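
As a rough sketch of what "accounted more than once" could mean, assuming one 
independent byte counter per signal. The class and field names are made up for 
illustration and do not come from any draft:

```python
from dataclasses import dataclass


@dataclass
class ConExCounters:
    """Hypothetical per-flow byte counters, one per ConEx signal."""
    total_bytes: int = 0
    loss_bytes: int = 0    # bytes in packets with L set
    ecn_bytes: int = 0     # bytes in packets with E set
    credit_bytes: int = 0  # bytes in packets with C set

    def account(self, packet_bytes: int, x: bool = False, l: bool = False,
                e: bool = False, c: bool = False) -> None:
        self.total_bytes += packet_bytes
        if not x:
            # Assumption: L, E and C are only meaningful when X is set.
            return
        # The same packet size is added to every counter whose bit is set,
        # so a packet with L and C set is counted once for each.
        if l:
            self.loss_bytes += packet_bytes
        if e:
            self.ecn_bytes += packet_bytes
        if c:
            self.credit_bytes += packet_bytes


counters = ConExCounters()
counters.account(1500, x=True, l=True, c=True)
print(counters)
```

The counters are never combined here, which matches the point that the 
signals are not related to each other.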

>
> ] In practice loss and ECN marks can not occur at the same time,
>
>    Actually, this isn't quite true. Loss must be detected by
> timeout or funky ACKs, while ECN marks are known at packet receipt.
> Thus, both loss and ECN could be detected at the same time.
That's true! I will change this. It was just a quick first effort to get 
some text in here.

>
> ] so there should usually be a way to signal the respective ConEx
> ] information in different packets.
>
>    "Where there's a will, there's a way..."
>
> ] In many cases if congestion occurs the sender will not sent
> ] additional credit bit,
>
>    I think there's a typo there, but I'm not sure what it is...
>
> ] but if e.g. a sender assumes losses because of an audit function
>
>    I'm not clear what is intended by "losses because of an audit
> function": some proposed audit functions may actually drop packets
> due to the audit information...
Yes, and the receiver/sender will not be able to tell whether a loss was 
caused by congestion or by the audit function dropping something.

>
> ] or needs to maintain a certain sending rate to make an application
> ] layer service work, the occurrence of credit bits (c) in parallel
> ] to congestion exposure bit (L, E) is reasonable.
>
>    I'm not sure I'd go as far as "reasonable", but it's certainly
> "plausible".
>
>    However, there's really no reason to believe that a sender will
> desire _exactly_ the same byte-count of "credit" as "loss".
No, a sender should try to have at least as much credit as the loss it 
expects. I guess usually there will be (slightly) more credit.

>
>    Again, there's plenty of room for an explicit byte-count _if_
> we want to go down that path.
>
> ] If a network node extracts the ConEx information from a connection,
> ] this node is usually supposed to hold this information byte-wise,
> ] e.g. comparing the total number of bytes sent with the number of
> ] bytes sent with ConEx congestion mark (L, E) to determine the current
> ] whole path congestion level.
>
>    There's that "congestion level" term again. I'm hoping we can
> avoid it...
I would actually prefer 'congestion level' over plain 'congestion', to make 
clear that 'congestion volume' is not what is meant here. But of course I 
will align with the use case draft...

>
>    I remain unconvinced that _I_ want to go down the path of counting
> bytes. I'd argue that in the case of packet loss, we're tending to
> an active-queue-management paradigm where drop is correlated to packet
> count. (Agreed, for ECN there's a "cost" correlated to bytes allowed
> through with ECN marks.) It's _hard_ to cure ISPs of thinking in
> terms of percent of packets lost when they consider "congestion".
If you drop every _packet_ with the same probability, that's perfectly right. 

Assume two senders that send the same amount of data (e.g. 15 kbyte), but 
sender A sends only 10 packets and sender B sends 100. Now there is a 
congestion (level) of 10%, so 10% of all packets get marked.

sender A -> 1 packet marked
sender B -> 10 packets marked

But both have the same amount of data/bytes marked!

packet size of A: 1500 byte
packet size of B: 150 byte

sender A -> 1 * 1500byte = 1500byte
sender B -> 10 * 150byte = 1500byte

If all packets are equally sized (as implicitly assumed above) and you count 
the number of marked packets, you will see a congestion level of 
1/10 = 10/100 = 10% on both connections.

Now let's calculate the congestion (level) on a per-byte basis:

(1500 * 1)/(1500 * 10) = 10%
(150 * 10)/(150 * 100) = 10%

Works as well! Great!

The only problem arises if you do not have equally sized packets. In this 
case byte-wise accounting still gives the right congestion level, but 
packet-wise accounting does not.
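
To make the difference concrete, here is a small sketch that computes both 
ratios for a list of packets. The function name and the mixed-size flow are 
invented for illustration; only sender A's numbers come from the example 
above:

```python
def congestion_levels(packets):
    """Return (packet-wise, byte-wise) congestion level.

    packets is a list of (size_in_bytes, marked) tuples.
    """
    total_packets = len(packets)
    marked_packets = sum(1 for _, marked in packets if marked)
    total_bytes = sum(size for size, _ in packets)
    marked_bytes = sum(size for size, marked in packets if marked)
    return marked_packets / total_packets, marked_bytes / total_bytes


# Sender A from the example: 10 packets of 1500 bytes, one of them marked.
sender_a = [(1500, i == 0) for i in range(10)]

# Hypothetical flow with unequal sizes: only a small packet carries the mark.
mixed = [(1500, False)] * 9 + [(150, True)]

print(congestion_levels(sender_a))
print(congestion_levels(mixed))
```

For sender A both ratios agree. For the mixed flow the packet-wise ratio is 
still 1/10, while the byte-wise ratio is 150/13650, so the two accounting 
methods diverge as soon as marked and unmarked packets differ in size.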


>    The argument in favor of counting bytes seems to rest on "academic
> purity", which IMHO will always lose to "pragmatic" in the RealWorld.
I understand this. I would still like to state in the draft that ConEx is 
defined byte-wise (because it's the only way to do this correctly, and 
otherwise there are ways to cheat by playing with the packet sizes), but also 
make clear in the draft that in the usual case of equally sized packets it 
doesn't make a difference. Thus it might be sufficient in many cases to count 
packets (depending on what you are going to use the ConEx information for).

>
> ] When equally sized packets can be assumed accounting the number of
> ] packets (and comparing the total number to marked once) should
> ] deliver the same result.
>
>    One can _always_ "assume"... :^(
>
> ] But a network node MUST be aware that this estimation can be quite
> ] wrong
>
>    (That's not a correct use of 2119 MUSTard.)
>
>    "wrong" is in the eye of the beholder. We can at most talk of
> "compliant" with a standard. (And I don't believe this is the right
> document for that.)
>
> ] and thus is not reliable if e.g. different sized packed are send.
>
>    (Presumably "sent" is intended.)
>
>    "reliable" doesn't seem the right word here.
>
> ====
>
>    Let me argue a bit about the difficulties of synchronizing byte
> counts.
>
>    If we insist on implicit byte counts, at some point we will force
> the sending of small packets to get byte counts right. 
No, ConEx is not intended to change the packet size. In the TCP mod draft we 
currently say, basically, that you can send more ConEx-marked bytes than 
needed and store this information (for a certain time), so that you have in 
effect sent those marked bytes in advance. In fact there is no need to get 
the ConEx-marked bytes absolutely accurate (just make sure to send enough 
bytes that the audit will be happy).
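
One way to picture this is a running balance of bytes still to be marked. 
This is only a sketch under my own assumptions: the class and method names are 
hypothetical, it is not the algorithm from the TCP mod draft, and the expiry 
of stored surplus after "a certain time" is left out:

```python
class MarkedByteBalance:
    """Hypothetical sender-side balance of ConEx bytes still to be marked."""

    def __init__(self) -> None:
        # Positive: congestion bytes not yet signalled.
        # Negative: bytes already marked in advance.
        self.owed = 0

    def on_congestion(self, nbytes: int) -> None:
        """Record nbytes of newly detected congestion (loss or ECN)."""
        self.owed += nbytes

    def should_mark(self) -> bool:
        """Mark the next packet only while bytes are still owed."""
        return self.owed > 0

    def on_marked_packet_sent(self, packet_bytes: int) -> None:
        """A whole packet was marked; any surplus is kept as advance."""
        self.owed -= packet_bytes


balance = MarkedByteBalance()
balance.on_congestion(150)           # 150 bytes of congestion to expose
balance.on_marked_packet_sent(1500)  # a full-sized packet gets marked
balance.on_congestion(1000)          # covered by the surplus sent earlier
print(balance.owed, balance.should_mark())
```

The packet size never changes: a full-sized packet is marked, and the surplus 
simply offsets later congestion.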

> (IMHO this 
> will lead to greater packet drops.) And, we can't drive the byte
> count near zero because of IPv6 overhead. Indeed, the sender will
> have to guess at IPv6 overhead, because the OS composes the actual
> packet. Worse yet, no amount of jawboning in RFCs will prevent
> middleboxes from fiddling with packets. :^( :^( :^(
>
>    The smaller the packet, the greater the problem of miscounting
> bytes-on-the-wire. And we can't count bytes-on-the-wire accurately
> except at the node in question. Even if we could count accurately,
> we don't know an appropriate estimate of byte-overhead per packet
> _beyond_ bytes-on-the-wire.
>
>    Necessarily, any useful audit function must try to match actual
> congestion to congestion-expected marks. These _cannot_ happen at
> the identical time: typically they're one RTT apart. But in the
> 'Net we work in, RTT is unknowable except to end-points. We must,
> therefore, work in approximations.
We are still working with approximations. The RTT is a totally different 
issue for the audit device...

>
>    IMHO, approximating as packet-count is much easier and not
> significantly less useful. I'd really appreciate if we could stop
> arguing "academic purity" and instead list problems of utility.
>
> --
> John Leslie <john@jlc.net>