Re: [iccrg] draft-briscoe-iccrg-prague-congestion-control: CE-marked bytes or packets?

Neal Cardwell <ncardwell@google.com> Thu, 11 August 2022 14:43 UTC

From: Neal Cardwell <ncardwell@google.com>
Date: Thu, 11 Aug 2022 10:43:14 -0400
Message-ID: <CADVnQykhS7XpCN3ntODEsRJH_ch9V3M-zdr1yUXqhaK2SH1Uew@mail.gmail.com>
To: Bob Briscoe <ietf@bobbriscoe.net>, "Tilmans, Olivier (Nokia - BE/Antwerp)" <olivier.tilmans@nokia-bell-labs.com>, "De Schepper, Koen (Koen)" <koen.de_schepper@nokia.com>
Cc: iccrg IRTF list <iccrg@irtf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/iccrg/ano1wTVm14VuRCYWI3bXs_4f5ig>

Another argument against byte-counting ECN responses:

(4) The fact that Prague updates its EWMA alpha once per round trip
suggests that it is trying to maintain a time-weighted marking
probability, since it effectively takes one "frac" sample per round
trip. If byte-based weighting were really preferable, then presumably
the EWMA alpha should be updated per equal volume of data delivered,
e.g. once per 100 KBytes of data delivered. Since Prague updates the
EWMA alpha once per round trip, the EWMA will diverge widely from a
byte-weighted average taken across round trips whenever the volumes of
data (in bytes) in each round trip differ widely, which is quite common
for web or RPC traffic.
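
To illustrate point (4), here is a minimal Python sketch. It assumes a
DCTCP-style update, alpha += g * (frac - alpha), applied once per round
trip, with g = 1/16 chosen purely for illustration. The function names
and the traffic pattern are hypothetical and are not taken from the
draft or from any implementation.

```python
# Illustrative sketch only: names and the gain value are assumptions,
# not taken from the Prague draft or any implementation.

def ewma_per_round(rounds, gain=1 / 16, alpha=0.0):
    """Update alpha once per round trip from that round's byte-based frac.

    rounds: iterable of (acked_bytes, ce_marked_bytes), one per round trip.
    """
    for acked_bytes, marked_bytes in rounds:
        if acked_bytes == 0:
            continue  # no feedback this round, so no sample
        frac = marked_bytes / acked_bytes
        alpha += gain * (frac - alpha)
    return alpha


def byte_weighted_fraction(rounds):
    """Fraction of all ACKed bytes that were CE-marked, across all rounds."""
    total_acked = sum(acked for acked, _ in rounds)
    total_marked = sum(marked for _, marked in rounds)
    return total_marked / total_acked


# Hypothetical RPC-like pattern: a large unmarked burst, then a small
# fully marked one, repeated 100 times.
rounds = [(1_000_000, 0), (1_000, 1_000)] * 100

alpha = ewma_per_round(rounds)
overall = byte_weighted_fraction(rounds)
print(f"per-round EWMA alpha:    {alpha:.3f}")
print(f"byte-weighted fraction:  {overall:.4f}")
```

With this pattern the per-round EWMA settles at about 0.5, because each
round trip counts equally, while the byte-weighted fraction over the
same rounds is about 0.001.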

best regards,
neal



On Wed, Aug 10, 2022 at 6:11 PM Neal Cardwell <ncardwell@google.com> wrote:

> Re:
>
> https://datatracker.ietf.org/doc/html/draft-briscoe-iccrg-prague-congestion-control-01
>
> and the passages:
>
> "2.3.2.  Moving Average of ECN Feedback
> ...it measures the fraction, frac, of ACKed bytes that carried ECN
> feedback over the previous round trip. ...
>
> 2.4.3.  Additive Increase and ECN Feedback
> ...a Prague CC applies additive increase irrespective of its CWR
> state, but only for bytes that have been ACK'd without ECN feedback.
> ...  This approach reduces additive increase as the marking
> probability increases..."
>
> I was curious about the design choice to specify that the algorithm
> reacts to the fraction of *bytes* that have been CE-marked instead of
> the fraction of *packets*. IMHO it would be useful for the document to
> outline the motivation.
>
> Apologies if I have missed this in previous e-mail discussions or
> presentations. I may well have. :-)
>
> I can imagine a number of potential reasons why it could be
> advantageous to react to the fraction of packets CE-marked rather than
> the fraction of bytes CE-marked:
>
> (1) AFAICT byte counters distort the path's ECN marking probability
> more than using packet counters. For example, suppose we have a round
> trip with 100 packets sent at roughly uniform intervals across the
> round trip time:
>
> o 99 packets of 1 byte each, all CE-marked
> o 1 packet of 1000 bytes that was not CE-marked
>
> Then the byte-based Prague "frac" ("the fraction, frac, of ACKed bytes
> that carried ECN feedback over the previous round trip") is:
>
>   99 bytes / 1099 bytes ~= .09
>
> Whereas the fraction of ACKed packets that carried ECN feedback is:
>
>   99 packets / 100 packets = .99
>
> So in this toy example there is a >10x difference in the CE "frac"
> signal depending on whether bytes or packets are counted.
>
> And given that these packets were spaced uniformly across the round
> trip, the bottleneck had excess queuing roughly 99% of the time. A
> packet-based "frac" reflects this 99% figure well, whereas the
> byte-based "frac" seems to dramatically underestimate the probability
> that a packet will encounter excessive queuing, aka the packet CE
> marking probability.
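
The toy example above can be checked with a short Python sketch. The
tuple layout and function names are hypothetical, chosen only for this
illustration.

```python
# (payload_bytes, ce_marked) for each ACKed packet in the round trip.
packets = [(1, True)] * 99 + [(1000, False)]


def frac_by_bytes(pkts):
    """Fraction of ACKed bytes that carried CE feedback."""
    marked = sum(size for size, ce in pkts if ce)
    total = sum(size for size, _ in pkts)
    return marked / total


def frac_by_packets(pkts):
    """Fraction of ACKed packets that carried CE feedback."""
    marked = sum(1 for _, ce in pkts if ce)
    return marked / len(pkts)


print(f"byte-based frac:   {frac_by_bytes(packets):.2f}")
print(f"packet-based frac: {frac_by_packets(packets):.2f}")
```

This prints 0.09 for the byte-based value and 0.99 for the packet-based
value, matching the arithmetic above.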
>
> The Prague draft in section 1 mentions:
>
> " The Prague CC is a particular instance of a scalable congestion control.
> ...
> For a scalable congestion control B=1, so its response function takes
> the form cwnd = K/p. ...
> p:  Steady-state probability of drop or marking"
>
> So Prague is defined as a scalable congestion control, which has a
> response function that is a function of the probability of ECN
> marking. But AFAICT the "frac" mentioned in the Prague spec is a
> byte-weighted number, and by contrast the fraction of *packets*
> CE-marked is a much better estimate of the probability of a packet
> being CE-marked (which is my interpretation of the somewhat ambiguous
> "probability of drop or marking").
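
As a rough sketch of why the choice of "p" matters for the response
function cwnd = K/p, the Python below plugs both estimates from the toy
example in (1) into that formula. The value of K is a hypothetical
placeholder; only the ratio between the two results is meaningful.

```python
K = 2.0  # hypothetical constant; the draft's actual K is not assumed here


def steady_state_cwnd(p, k=K):
    """Scalable response function cwnd = K/p for marking probability p."""
    return k / p


p_bytes = 99 / 1099    # byte-based estimate from the toy example
p_packets = 99 / 100   # packet-based estimate from the toy example

ratio = steady_state_cwnd(p_bytes) / steady_state_cwnd(p_packets)
print(f"cwnd(byte-based p) / cwnd(packet-based p) = {ratio:.2f}")
```

Under these assumptions the byte-based estimate implies a steady-state
cwnd roughly 11 times larger than the packet-based one.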
>
> (2) The current Linux TCP reference implementation of TCP Prague does
> not actually use bytes; it uses packets. Likewise, DCTCP and BBRv2 use
> packets rather than bytes. So AFAIK the real-world deployment
> experience with shallow-threshold ECN thus far is almost entirely with
> packet-based algorithms rather than byte-based algorithms. It seems
> risky to specify Prague with a byte-based approach that has not been
> tested, especially given that the byte-based and packet-based
> algorithms can measure massively different signals in some cases (see
> (1) above).
>
> (3) AFAIK byte counters are not available when relying on the AccECN
> ACE field if there is ACK loss, since the CE marks counted in the ACE
> field cannot be properly matched against the size of segments that
> were already ACKed and freed. So in environments where only the ACE
> field is available, this would imply that TCP Prague cannot be used
> (since Prague is specified only in bytes). This would seem to
> significantly limit the utility of the ACE field and/or byte-based
> Prague in such scenarios. If Prague were defined in terms of packets,
> then it seems it could be more useful on paths that only support the
> ACE field and strip out the AccECN option?
>
>
> In summary, if byte counting is considered preferable, IMHO it would
> be good to document in this draft why this is so, change the Linux TCP
> Prague code to use the byte-based approach, and have the definition of
> "p" in the draft specify that it means the probability that a payload
> "byte" is CE-marked, rather than leaving the bytes/packets distinction
> ambiguous.
>
> best regards,
> neal
>