[iccrg] draft-briscoe-iccrg-prague-congestion-control: CE-marked bytes or packets?
Neal Cardwell <ncardwell@google.com> Wed, 10 August 2022 22:11 UTC
From: Neal Cardwell <ncardwell@google.com>
Date: Wed, 10 Aug 2022 18:11:02 -0400
Message-ID: <CADVnQykxwaqZTGXR-ZMYLEem0rKfAcT7KkHYgsF4dBdWvi2k4w@mail.gmail.com>
To: Bob Briscoe <ietf@bobbriscoe.net>, "Tilmans, Olivier (Nokia - BE/Antwerp)" <olivier.tilmans@nokia-bell-labs.com>, "De Schepper, Koen (Koen)" <koen.de_schepper@nokia.com>
Cc: iccrg IRTF list <iccrg@irtf.org>
Content-Type: text/plain; charset="UTF-8"
Archived-At: <https://mailarchive.ietf.org/arch/msg/iccrg/ePpX9rC9mTc_RPOZrIaa2i2KRWk>
Subject: [iccrg] draft-briscoe-iccrg-prague-congestion-control: CE-marked bytes or packets?
Re: https://datatracker.ietf.org/doc/html/draft-briscoe-iccrg-prague-congestion-control-01

and the passages:

  "2.3.2. Moving Average of ECN Feedback
  ...it measures the fraction, frac, of ACKed bytes that carried ECN feedback over the previous round trip.
  ...
  2.4.3. Additive Increase and ECN Feedback
  ...a Prague CC applies additive increase irrespective of its CWR state, but only for bytes that have been ACK'd without ECN feedback.
  ...
  This approach reduces additive increase as the marking probability increases..."

I was curious about the design choice to specify that the algorithm reacts to the fraction of *bytes* that have been CE-marked instead of the fraction of *packets*. IMHO it would be useful for the document to outline the motivation. Apologies if I have missed this in previous e-mail discussions or presentations. I may well have. :-)

I can imagine a number of potential reasons why it could be advantageous to react to the fraction of packets CE-marked rather than the fraction of bytes CE-marked:

(1) AFAICT byte counters distort the path's ECN marking probability more than using packet counters.

For example, suppose we have a round trip with 100 packets sent at roughly uniform intervals across the round trip time:

  o 99 packets of 1 byte each, all CE-marked
  o 1 packet of 1000 bytes that was not CE-marked

Then the byte-based Prague "frac" ("the fraction, frac, of ACKed bytes that carried ECN feedback over the previous round trip") is:

  99 bytes / 1099 bytes ~= .09

Whereas the fraction of ACKed packets that carried ECN feedback is:

  99 packets / 100 packets = .99

So in this toy example there is a >10x difference in the CE "frac" signal depending on whether bytes or packets are counted. And given that these packets were spaced uniformly across the round trip, 99% of the time the bottleneck had excess queuing.
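
The arithmetic in the toy example can be checked with a few lines of Python. This is only an illustrative sketch, not code from the draft or from any Prague implementation; the names `byte_frac` and `packet_frac` and the `(size, ce_marked)` tuple layout are assumptions made for the example.

```python
from fractions import Fraction

# One round trip from the toy example: (payload size in bytes, CE-marked?).
# The tuple layout is a hypothetical representation chosen for this sketch.
packets = [(1, True)] * 99 + [(1000, False)]

def byte_frac(pkts):
    """Fraction of ACKed *bytes* that carried CE feedback."""
    total_bytes = sum(size for size, _ in pkts)
    marked_bytes = sum(size for size, ce in pkts if ce)
    return Fraction(marked_bytes, total_bytes)

def packet_frac(pkts):
    """Fraction of ACKed *packets* that carried CE feedback."""
    marked_pkts = sum(1 for _, ce in pkts if ce)
    return Fraction(marked_pkts, len(pkts))

fb = byte_frac(packets)    # 99/1099
fp = packet_frac(packets)  # 99/100
print(f"byte-based frac   = {float(fb):.3f}")
print(f"packet-based frac = {float(fp):.3f}")
print(f"ratio             = {float(fp / fb):.2f}x")
```

With these inputs the two fractions are 99/1099 and 99/100, so the packet-based signal is 10.99 times the byte-based one.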
This 99% number is well reflected in a packet-based "frac", but seems to imply that the byte-based "frac" approach dramatically underestimates the probability that a packet will encounter excessive queuing, aka the packet CE marking probability.

The Prague draft in section 1 mentions:

  "The Prague CC is a particular instance of a scalable congestion control.
  ...
  For a scalable congestion control B=1, so its response function takes the form cwnd = K/p.
  ...
  p: Steady-state probability of drop or marking"

So Prague is defined as a scalable congestion control, which has a response function that is a function of the probability of ECN marking. But AFAICT the "frac" mentioned in the Prague spec is a byte-weighted number, and by contrast the fraction of *packets* CE-marked is a much better estimate of the probability of a packet being CE-marked (which is my interpretation of the somewhat ambiguous "probability of drop or marking").

(2) The current Linux TCP reference implementation of TCP Prague does not actually use bytes; it uses packets. Likewise, DCTCP and BBRv2 use packets rather than bytes. So AFAIK the real-world deployment experience with shallow-threshold ECN thus far is almost entirely with packet-based algorithms rather than byte-based algorithms. It seems risky to specify Prague with a byte-based approach that has not been tested, especially given that the byte-based and packet-based algorithms can measure massively different signals in some cases (see (1) above).

(3) AFAIK byte counters are not available when relying on the AccECN ACE field if there is ACK loss, since the CE marks counted in the ACE field cannot be properly matched against the size of segments that were already ACKed and freed. So in environments where only the ACE field is available, this would imply that TCP Prague cannot be used (since Prague is specified only in bytes). This would seem to significantly limit the utility of the ACE field and/or byte-based Prague in such scenarios.
If Prague were defined in terms of packets, then it seems it could be more likely to be useful on paths that only support the ACE field and strip out the AccECN option?

In summary, if byte counting is considered preferable, IMHO it would be good to document in this draft why this is so, change the Linux TCP Prague code to use the byte-based approach, and have the definition of "p" in the draft specify that it means the probability that a payload "byte" is CE-marked rather than leaving the bytes/packets distinction ambiguous.

best regards,
neal