Re: [iccrg] draft-briscoe-iccrg-prague-congestion-control: CE-marked bytes or packets?

Sebastian Moeller <moeller0@gmx.de> Thu, 11 August 2022 17:36 UTC

Archived-At: <https://mailarchive.ietf.org/arch/msg/iccrg/i81k2W2ASPg_WS5z4jIlMqdv_ww>

Dear Neal,

Over in tsvwg we have a draft on ECN encapsulation (https://datatracker.ietf.org/doc/html/draft-ietf-tsvwg-ecn-encap-guidelines-17) whose section 4.6 touches on the same issue: goal #2 in particular, together with its example implementation, aims to propagate CE marks in a way that roughly conserves the number of marked bytes*. It would be great if you could have a look at that section and maybe post comments on the tsvwg list.

Thanks in advance
	Sebastian

*) The proposed method is clearly incomplete; it reads as if it was devised in a text editor rather than being a description of implemented, working code. That fits your observation that even TCP Prague uses packet-marking rather than byte-marking logic. My personal preference (for what it is worth) would be not to include in an RFC incomplete methods and examples that have never been put to a real test.



> On Aug 11, 2022, at 16:43, Neal Cardwell <ncardwell=40google.com@dmarc.ietf.org> wrote:
> 
> Another argument against byte-counting ECN responses:
> 
> (4) The fact that Prague updates its EWMA alpha once per round trip seems to suggest that it is trying to maintain a time-weighted marking probability, since it is effectively taking one "frac" sample per round trip. If byte-based weighting were really preferable, then presumably the EWMA alpha should be updated per equal volume of data delivered, e.g. once per 100 KBytes of data delivered. Since Prague updates the EWMA alpha once per round trip, the EWMA will diverge wildly from a byte weighting taken across round trips if the volumes of data (in bytes) in each round trip diverge widely, which is quite common for web or RPC traffic.
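
To illustrate point (4), here is a minimal sketch, not taken from the draft or from any implementation, that compares a per-round-trip EWMA with the byte-weighted marking fraction taken across all round trips. The gain g = 1/16, the function names, and the traffic pattern (small fully marked round trips alternating with large unmarked ones) are assumptions chosen purely for illustration.

```python
# Sketch only: per-round-trip EWMA of "frac" versus the byte-weighted
# marking fraction across round trips. All names and numbers are
# illustrative assumptions, not taken from the Prague draft or code.

G = 1 / 16  # assumed EWMA gain


def per_round_ewma(rounds, alpha=0.0, g=G):
    """Update alpha once per round trip from that round's byte fraction."""
    for acked_bytes, marked_bytes in rounds:
        if acked_bytes == 0:
            continue  # nothing ACKed in this round, no sample
        frac = marked_bytes / acked_bytes
        alpha += g * (frac - alpha)
    return alpha


def byte_weighted_fraction(rounds):
    """Fraction of all ACKed bytes that were marked, across all rounds."""
    acked = sum(a for a, _ in rounds)
    marked = sum(m for _, m in rounds)
    return marked / acked if acked else 0.0


# Hypothetical RPC-like pattern: a 1 KB round trip that is fully marked,
# followed by a 100 KB round trip with no marks, repeated 50 times.
rounds = [(1_000, 1_000), (100_000, 0)] * 50

print(f"per-round EWMA alpha:   {per_round_ewma(rounds):.3f}")
print(f"byte-weighted fraction: {byte_weighted_fraction(rounds):.3f}")
```

With this pattern the per-round EWMA settles at roughly 0.48 while the byte-weighted fraction is roughly 0.01, because every round trip contributes one equally weighted sample regardless of how many bytes it carried.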
> 
> best regards,
> neal
> 
> 
> 
> On Wed, Aug 10, 2022 at 6:11 PM Neal Cardwell <ncardwell@google.com> wrote:
> Re:
>   https://datatracker.ietf.org/doc/html/draft-briscoe-iccrg-prague-congestion-control-01
> 
> and the passages:
> 
> "2.3.2.  Moving Average of ECN Feedback
> ...it measures the fraction, frac, of ACKed bytes that carried ECN
> feedback over the previous round trip. ...
> 
> 2.4.3.  Additive Increase and ECN Feedback
> ...a Prague CC applies additive increase irrespective of its CWR
> state, but only for bytes that have been ACK'd without ECN feedback.
> ...  This approach reduces additive increase as the marking
> probability increases..."
> 
> I was curious about the design choice to specify that the algorithm
> reacts to the fraction of *bytes* that have been CE-marked instead of
> the fraction of *packets*. IMHO it would be useful for the document to
> outline the motivation.
> 
> Apologies if I have missed this in previous e-mail discussions or
> presentations. I may well have. :-)
> 
> I can imagine a number of potential reasons why it could be
> advantageous to react to the fraction of packets CE-marked rather than
> the fraction of bytes CE-marked:
> 
> (1) AFAICT byte counters distort the path's ECN marking probability
> more than using packet counters. For example, suppose we have a round
> trip with 100 packets sent at roughly uniform intervals across the
> round trip time:
> 
> o  99 packets of 1 byte each, all CE-marked
> o  1 packet of 1000 bytes that was not CE-marked
> 
> Then the byte-based Prague "frac" ("the fraction, frac, of ACKed bytes
> that carried ECN feedback over the previous round trip") is:
> 
>   99 bytes / 1099 bytes ~= .09
> 
> Whereas the fraction of ACKed packets that carried ECN feedback is:
> 
>   99 packets / 100 packets = .99
> 
> So in this toy example there is a >10x difference in the CE "frac"
> signal depending on whether bytes or packets are counted.
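
The arithmetic of this toy example can be reproduced with a short sketch. The function name and the (size, marked) tuple layout are assumptions made for illustration only; they do not come from the draft or from the Linux code.

```python
# Sketch only: byte-based versus packet-based CE fraction for one round
# trip. The data layout (size_bytes, ce_marked) is an illustrative choice.


def ce_fractions(packets):
    """Return (byte_frac, packet_frac) for a list of (size_bytes, ce_marked)."""
    total_bytes = sum(size for size, _ in packets)
    marked_bytes = sum(size for size, ce in packets if ce)
    total_packets = len(packets)
    marked_packets = sum(1 for _, ce in packets if ce)
    return marked_bytes / total_bytes, marked_packets / total_packets


# The toy round trip from the text: 99 CE-marked packets of 1 byte each
# and one unmarked packet of 1000 bytes.
round_trip = [(1, True)] * 99 + [(1000, False)]

byte_frac, packet_frac = ce_fractions(round_trip)
print(f"byte-based frac:   {byte_frac:.3f}")    # 99 / 1099
print(f"packet-based frac: {packet_frac:.3f}")  # 99 / 100
```

The two values differ by a factor of about 11, which matches the ">10x" figure quoted below.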
> 
> And given that these packets were spaced uniformly across the round
> trip, 99% of the time the bottleneck had excess queuing. This 99%
> number is well reflected in a packet-based "frac", but seems to imply
> that the byte-based "frac" approach dramatically underestimates the
> probability that a packet will encounter excessive queuing, aka the
> packet CE marking probability.
> 
> The Prague draft in section 1 mentions:
> 
> " The Prague CC is a particular instance of a scalable congestion control. ...
> For a scalable congestion control B=1, so its response function takes
> the form cwnd = K/p. ...
> p:  Steady-state probability of drop or marking"
> 
> So Prague is defined as a scalable congestion control, which has a
> response function that is a function of the probability of ECN
> marking. But AFAICT the "frac" mentioned in the Prague spec is a
> byte-weighted number, and by contrast the fraction of *packets*
> CE-marked is a much better estimate of the probability of a packet
> being CE-marked (which is my interpretation of the somewhat ambiguous
> "probability of drop or marking").
> 
> (2) The current Linux TCP reference implementation of TCP Prague does
> not actually use bytes; it uses packets. Likewise, DCTCP and BBRv2 use
> packets rather than bytes. So AFAIK the real-world deployment
> experience with shallow-threshold ECN thus far is almost entirely with
> packet-based algorithms rather than byte-based algorithms. It seems
> risky to specify Prague with a byte-based approach that has not been
> tested, especially given that the byte-based and packet-based
> algorithms can measure massively different signals in some cases (see
> (1) above).
> 
> (3) AFAIK byte counters are not available when relying on the AccECN
> ACE field if there is ACK loss, since the CE marks counted in the ACE
> field cannot be properly matched against the size of segments that
> were already ACKed and freed. So in environments where only the ACE
> field is available then this would imply that TCP Prague cannot be
> used (since Prague is specified only in bytes). This would seem to
> significantly limit the utility of the ACE field and/or byte-based
> Prague, in such scenarios. If Prague were defined in terms of packets
> then it seems that perhaps it could be more likely to be useful in
> paths that only support the ACE field and strip out the AccECN option?
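
One way to see the ambiguity in point (3) is the following sketch. It assumes that the ACE field is a 3-bit counter of CE-marked packets that wraps modulo 8, as in the AccECN draft; the function names and the bounding approach are hypothetical and are not part of any specification or implementation. Given only an ACE delta and the sizes of the newly ACKed segments, the sender can count marked packets but can only bound the number of marked bytes.

```python
# Sketch only: what an ACE delta does and does not reveal. Assumes a
# 3-bit CE packet counter that wraps modulo 8; names are hypothetical.

ACE_MODULUS = 8  # 3-bit field


def ce_packets_since(prev_ace, new_ace):
    """Smallest CE packet count consistent with two successive ACE values.

    If ACKs were lost, the true count could be larger by a multiple of 8.
    """
    return (new_ace - prev_ace) % ACE_MODULUS


def marked_bytes_bounds(segment_sizes, ce_packets):
    """Lower and upper bound on marked bytes when only the count is known."""
    n = min(ce_packets, len(segment_sizes))
    ordered = sorted(segment_sizes)
    low = sum(ordered[:n])                   # marks hit the smallest segments
    high = sum(ordered[len(ordered) - n:])   # marks hit the largest segments
    return low, high


# Hypothetical example: four newly ACKed segments, ACE moved from 6 to 0.
sizes = [1, 1, 1000, 1448]
marked = ce_packets_since(6, 0)
low, high = marked_bytes_bounds(sizes, marked)
print(f"CE-marked packets: {marked}")
print(f"CE-marked bytes:   between {low} and {high}")
```

In this example the packet count is unambiguous (2, barring a counter wrap) while the marked byte count could be anywhere from 2 to 2448, so a packet-based "frac" is directly computable from the ACE field but a byte-based one is not.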
> 
> 
> In summary, if byte counting is considered preferable, IMHO it would
> be good to document in this draft why this is so, change the Linux TCP
> Prague code to use the byte-based approach, and then for the
> definition of "p" in the draft to specify that it means the
> probability that a payload "byte" is CE marked rather than leaving the
> bytes/packets distinction ambiguous.
> 
> best regards,
> neal