Re: [iccrg] editorial comments on draft-briscoe-iccrg-prague-congestion-control-02

Sebastian Moeller <moeller0@gmx.de> Sat, 05 August 2023 08:37 UTC

Content-Type: text/plain; charset="utf-8"
Mime-Version: 1.0 (Mac OS X Mail 16.0 \(3696.120.41.1.4\))
From: Sebastian Moeller <moeller0@gmx.de>
In-Reply-To: <176686bb-b75a-a545-5ab7-6a9cc6ce097a@bobbriscoe.net>
Date: Sat, 05 Aug 2023 10:37:19 +0200
Cc: Neal Cardwell <ncardwell@google.com>, iccrg IRTF list <iccrg@irtf.org>, Greg White <g.white@cablelabs.com>, "De Schepper, Koen (Koen)" <koen.de_schepper@nokia.com>, Vidhi Goel <vidhi_goel@apple.com>
Content-Transfer-Encoding: quoted-printable
Message-Id: <7553D74D-5050-4E4A-947A-7CC59F97E584@gmx.de>
References: <CADVnQynoZxSX1biBDkGV-PV5zQP4vgxuG=9t8HfNm80_q+zdeg@mail.gmail.com> <176686bb-b75a-a545-5ab7-6a9cc6ce097a@bobbriscoe.net>
To: Bob Briscoe <ietf=40bobbriscoe.net@dmarc.ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/iccrg/-qqx_9y29tCCuNbiWODQD5i6UE8>
Subject: Re: [iccrg] editorial comments on draft-briscoe-iccrg-prague-congestion-control-02
Precedence: list

Question below, prefixed [SM]

> On Aug 5, 2023, at 03:47, Bob Briscoe <ietf=40bobbriscoe.net@dmarc.ietf.org> wrote:
[...]
>> -----------
>> A system wide option is available to disable AccECN negotiation, but the Prague CC module will always override this setting, as it depends on AccECN. Then, solely in this case, AccECN will only be active for TCP flows using the Prague CCA.
>> -->
>> A system-wide sysctl is available to enable or disable AccECN negotiation. However, the Prague CC module overrides this sysctl and will always enable AccECN negotiation, since it depends on AccECN (i.e., when the system-wide sysctl disables AccECN negotiation, TCP flows using the Prague CCA will still attempt AccECN negotiation).
> 
> [BB] Yes, it was badly worded. I've had another go myself:
> 
> A system-wide option is available to enable or disable AccECN negotiation. However, TCP flows using the Prague CCA module depend on AccECN; so they  always ignore this system-wide sysctl and enable AccECN negotiation anyway. 

[SM] This seems to violate the principle of least surprise. If there is a toggle to disable AccECN system-wide, it needs to be honored. TCP Prague could maybe write a message to the kernel/system log noting that AccECN is unavailable and that TCP Prague might fail somehow. Alternatively, at least introduce another sysctl to disable the use of TCP Prague completely. The administrator of a system should be in control, and that means the administrator can also put the system in odd states. 

> 
>> 
>> -----------
>> A Prague CCA triggers update of its moving average once per RTT by recording the packet it sent after the previous update, then watching for the ACK of that packet to return.
>> -->
>> A Prague CCA triggers update of its moving average ECN mark rate once per rtt_virt [see Section 2.4.4].
>> 
> 
> [BB] Thx for catching this.
> 
>> -----------
>> To maintain its moving average, it measures the fraction, frac, of ACKed bytes or ACKed packets
>> -->
>> [IMHO the spec should specify whether the CCA is measuring using bytes or packets, since the answers may be very different depending on the approach, leading to unfairness between implementations with different approaches. I would argue for using the fraction of packets marked (as IIRC  I have argued on some IETF mailing list or another). And Linux TCP Prague is already doing this.]
> 
> [BB] Agreed that this ought to say just packets, to document what Linux Prague uses. 
> 
> If packet sizes were independently and identically distributed (IID), on average any differences would cancel out, 'cos the distribution of packet sizes is in both the top and bottom of the fraction. That assumes all L4S AQMs mark packets independently of size, which is currently true (and recommended by RFC7141).
> 
> Nonetheless, if packet sizes do vary, they would very likely not be IID. For instance, if one end was sending ECN-capable pure ACKs, it would be likely to be sending a lot in a row, not just randomly. Then measuring bytes would be the right thing (adding a nominal header size to each packet if an exact one were not available).

	[SM] Why? If you want rate fairness (as your "right thing" seems to imply) then just use a rate-equalizing scheduler... 

> BTW, I do remember you raising this on a list somewhere. I meant to reply, and I guess it's still in my todo list somewhere - I'll dig it out.
> 
> If we conclude that RFC7141 is OK on this point, then we'll need to write something in the future work section under congestion metrics about this (and we'll have to implement it).

	[SM] I have mentioned before that I for one consider RFC7141 to be wrong in this passage:
" When a transport detects that a packet has been lost or congestion
   marked, it SHOULD consider the strength of the congestion indication
   as proportionate to the size in octets (bytes) of the missing or
   marked packet."

A flow should try to get as veridical an estimate of a congestion event as possible and react to that best estimate of the congestion. If, as RFC7141 recommends, congestion marking does not take packet size into account, then neither should the receiver of the congestion signal.

Sidenote: RFC7141 has been published for 9.5 years and has argued for this odd dichotomy between encoding congestion signals and interpreting congestion signals since the first draft in 2007. The fact that apparently ZERO implementations of the recommended approach exist, let alone see quantitatively relevant use over the internet, IMHO really should end that folly. Protocol stacks should not make up congestion signals, but simply respond appropriately to the best congestion estimate they can reasonably maintain.
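
To illustrate how far the two metrics from the "ACKed bytes or ACKed packets" discussion above can diverge when packet sizes are not IID, here is a minimal C sketch. The record type and function names are hypothetical and are not taken from the draft or from tcp_prague.c; the sketch only counts marks and does not model any AQM.

```c
#include <assert.h>
#include <stddef.h>

/* One ACKed packet as seen by the sender (hypothetical record type). */
struct acked_pkt {
    unsigned int size_bytes; /* size of the packet in bytes */
    int ce_marked;           /* non-zero if the packet carried a CE mark */
};

/* Fraction of ACKed packets that were CE-marked. */
double frac_marked_packets(const struct acked_pkt *p, size_t n)
{
    size_t marked = 0;

    assert(n > 0);
    for (size_t i = 0; i < n; i++)
        if (p[i].ce_marked)
            marked++;
    return (double)marked / (double)n;
}

/* Fraction of ACKed bytes that were carried in CE-marked packets. */
double frac_marked_bytes(const struct acked_pkt *p, size_t n)
{
    unsigned long long marked = 0, total = 0;

    for (size_t i = 0; i < n; i++) {
        total += p[i].size_bytes;
        if (p[i].ce_marked)
            marked += p[i].size_bytes;
    }
    assert(total > 0);
    return (double)marked / (double)total;
}
```

For two unmarked 1500-byte packets followed by two CE-marked 64-byte packets, the packet-based fraction is 0.5 while the byte-based fraction is 128/3128, roughly 0.04, so two implementations that pick different metrics would respond very differently to the same marks.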

[...]

>> -----------
>> Also, integer rounding bias ought to be removed from the multiplicative decrease calculation.
>> -->
>> [I would suggest spelling out how to do this correctly to increase the odds that this is implemented correctly by implementors that can't look at the GPL tcp_prague.c reference code]
> 
> [BB] I introduced a pseudocode name for the carry variable into the previous sentence, then added the pseudocode below:
>     "... delay can be made significantly less jumpy by tracking a fractional value, cwnd_carry, alongside the integer window and carrying over any fractional remainder to the next reduction." ... Specifically:
> 
> #define ONE_CWND (1LL << 20)        /* Must be signed */
> #define MAX_ALPHA (1ULL << 20)
> 
> /* On CE feedback, calculate the reduction in cwnd */
>     /* Adding MAX_ALPHA to the numerator effectively adds 1/2 
>      *  which compensates for integer division always rounding down */
>     reduction = (alpha * cwnd * ONE_CWND + MAX_ALPHA) / MAX_ALPHA / 2;
>     cwnd_carry -= reduction;
> 
> /* Round reduction into whole segments and carry the remainder */
>     if (cwnd_carry <= -ONE_CWND) {
>         cwnd_carry += ONE_CWND;
>         cwnd = max(cwnd - 1, MIN_CWND);
>         ssthresh = cwnd;
>     }
> 
> 
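
For readers who want to experiment with the carry mechanism quoted above, here is a minimal, self-contained C transcription of that pseudocode. The struct cc_state, the function name on_ce_feedback, the value MIN_CWND = 2 and the overflow guard are assumptions added for illustration; they are not taken from the draft or from tcp_prague.c.

```c
#include <assert.h>
#include <stdint.h>

#define ONE_CWND  (1LL << 20)  /* one segment in fixed point; must be signed */
#define MAX_ALPHA (1LL << 20)  /* alpha == MAX_ALPHA means 100% marking */
#define MIN_CWND  2            /* assumed floor, not specified in the quote */

struct cc_state {              /* hypothetical container for the variables */
    int64_t cwnd;              /* congestion window in whole segments */
    int64_t ssthresh;          /* slow-start threshold in segments */
    int64_t cwnd_carry;        /* fractional window in units of 1/ONE_CWND */
};

/* On CE feedback: subtract alpha/2 of cwnd, carrying the fractional part. */
void on_ce_feedback(struct cc_state *s, int64_t alpha)
{
    int64_t reduction;

    assert(alpha >= 0 && alpha <= MAX_ALPHA);
    assert(s->cwnd >= 0 && s->cwnd < (1LL << 22)); /* keeps product in 64 bits */

    /* Adding MAX_ALPHA before the two divisions adds 1/2 to the result,
     * which compensates for integer division always rounding down. */
    reduction = (alpha * s->cwnd * ONE_CWND + MAX_ALPHA) / MAX_ALPHA / 2;
    s->cwnd_carry -= reduction;

    /* As in the quoted pseudocode, at most one whole segment is removed
     * per call; the remainder stays in cwnd_carry for the next call. */
    if (s->cwnd_carry <= -ONE_CWND) {
        s->cwnd_carry += ONE_CWND;
        s->cwnd = (s->cwnd - 1 > MIN_CWND) ? s->cwnd - 1 : MIN_CWND;
        s->ssthresh = s->cwnd;
    }
}
```

With cwnd = 8 and alpha = MAX_ALPHA / 8, each call asks for exactly half a segment, so cwnd stays at 8 after the first call and drops to 7 only on the second, when the carried halves add up to one whole segment.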
>> 
>> 
>> -----------
>> Example functions for the virtual RTT are:
>> 	• rtt_virt = max(srtt, RTT_VIRT_MIN);
>> 	• rtt_virt = srtt + AdditionalRTT;
>> where RTT_VIRT_MIN and AdditionalRTT are constants. The current default is rtt_virt = max(srtt, 25ms), which addresses the main Prague requirement for when the RTT is smaller than typical.
>> -->
>> The virtual RTT, rtt_virt is computed as:
>> 	• rtt_virt = max(srtt, RTT_VIRT_MIN);
>> where RTT_VIRT_MIN = 25ms. This addresses the Prague requirement for Reduced RTT-Dependence when the RTT is smaller than typical public Internet RTTs.
> 
> [BB] The fluffiness is because this is a case where implementations might differ, so I've made it clearer what the Linux implementation does but also left in the other example. Also the constants depend on the deployment environment. Specifically:
> 
> Example functions that implementations might use for the virtual RTT are:
>     rtt_virt = max(srtt, RTT_VIRT_MIN);
>     rtt_virt = srtt + AdditionalRTT;
> where the parameters RTT_VIRT_MIN or AdditionalRTT would be set for a particular deployment environment.
> 
> The Linux implementation of Prague uses the first example and, for the public Internet, it sets RTT_VIRT_MIN=25ms. Thus, Linux Prague defines 
> rtt_virt = max(srtt, 25ms), which addresses the Prague requirement for Reduced RTT-Dependence when the RTT is smaller than typical public Internet RTTs.

	[SM] I still think this is not a general solution*. It really just takes the edge off TCP Prague's increased RTT bias compared to other TCPs, and it only makes a difference if the TCP Prague flow has an RTT below 25ms. If we look at competition between, say, TCP Prague @25ms RTT and TCP Prague @160ms, we will still see a considerably larger RTT bias than between TCP Cubic @25ms and TCP Cubic @160ms.

*) IIRC this was only introduced as a countermeasure after some testing (https://github.com/heistp/l4s-tests) demonstrated a quite noticeably increased RTT bias over TCP Cubic:
https://camo.githubusercontent.com/0ca81a2fabe48e8fce0f98f8b8347c79d27340684fe0791a3ee6685cf4cdb02e/687474703a2f2f7363652e646e736d67722e6e65742f726573756c74732f6c34732d323032302d31312d3131543132303030302d66696e616c2f73312d6368617274732f727474666169725f63635f71646973635f31306d735f3136306d732e737667

Especially the middle section with the FIFO. This test used 10ms versus 160ms RTT, and making Prague always act as if it had at least 25 ms RTT ameliorated the issue somewhat, but did not solve it in general. I find it odd to find a section "Reduced RTT-Dependence" in this draft given that TCP Prague essentially comes with a noticeably increased RTT bias (at least the default implementation for Linux). Big fan of truth in advertising.... 
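
As a small illustration of why the 25ms floor only helps flows below 25ms, here is a minimal C sketch of rtt_virt = max(srtt, RTT_VIRT_MIN) as quoted above. The function names and the use of microseconds are assumptions for illustration, and the ratio of virtual RTTs is only a rough proxy for RTT bias, not a throughput prediction.

```c
#include <assert.h>

#define RTT_VIRT_MIN_US 25000u  /* 25 ms floor, as quoted for Linux Prague */

/* rtt_virt = max(srtt, RTT_VIRT_MIN), all times in microseconds. */
unsigned int rtt_virt_us(unsigned int srtt_us)
{
    return srtt_us > RTT_VIRT_MIN_US ? srtt_us : RTT_VIRT_MIN_US;
}

/* Ratio of the virtual RTTs of a long-RTT flow and a short-RTT flow. */
double rtt_virt_ratio(unsigned int srtt_long_us, unsigned int srtt_short_us)
{
    assert(srtt_short_us > 0);
    return (double)rtt_virt_us(srtt_long_us) /
           (double)rtt_virt_us(srtt_short_us);
}
```

For 10ms versus 160ms the raw RTT ratio of 16 shrinks to a virtual ratio of 6.4, whereas for 25ms versus 160ms the ratio is 6.4 with or without the floor, i.e. the floor changes nothing in that case.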


>
