Re: [tcpm] CUBIC rfc8312bis / WGLC Issue 1 (Was: Re: ProceedingCUBICdraft- thoughts and late follow-up)

Bob Briscoe <ietf@bobbriscoe.net> Fri, 19 August 2022 17:10 UTC

Message-ID: <423e29e5-4f92-4d9c-455c-f676533cc367@bobbriscoe.net>
Date: Fri, 19 Aug 2022 18:10:31 +0100
To: Markku Kojo <kojo@cs.helsinki.fi>, Yoshifumi Nishida <nsd.ietf@gmail.com>
Cc: "tcpm@ietf.org Extensions" <tcpm@ietf.org>, Gorry Fairhurst <gorry@erg.abdn.ac.uk>, Lars Eggert <lars@eggert.org>, tcpm-chairs <tcpm-chairs@ietf.org>
References: <alpine.DEB.2.21.2206061517230.7292@hp8x-60.cs.helsinki.fi> <alpine.DEB.2.21.2206141739100.7292@hp8x-60.cs.helsinki.fi> <CAAK044QqfB1_gnDLNKNd15XskrC1FWhxfmytw8xvSu9uCHFRWQ@mail.gmail.com> <alpine.DEB.2.21.2206200339080.7292@hp8x-60.cs.helsinki.fi> <CAAK044QiP-EvLR0MAbpfH274+M0KyhO9v_qop1tBKcUW6EVBZw@mail.gmail.com> <alpine.DEB.2.21.2206301512430.7292@hp8x-60.cs.helsinki.fi>
From: Bob Briscoe <ietf@bobbriscoe.net>
In-Reply-To: <alpine.DEB.2.21.2206301512430.7292@hp8x-60.cs.helsinki.fi>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tcpm/zzUJ3cx01nHmxoMfa45U5uU9Vv4>

Markku,


On 04/07/2022 01:09, Markku Kojo wrote:
> Hi Yoshi,
>
> CC'ing also Bob as we haven't heard his view on my explanation w.r.t 
> why the model for determining alpha is incorrect. Bob, my explanation 
> is available at
>
> https://mailarchive.ietf.org/arch/msg/tcpm/bds-h_a6-NliTjx-ZqUSaFpSSnA/

[BB] Recently I responded to this email about CUBIC's Reno-Friendly 
region, but I didn't respond to your new points below about high-BDP 
competition.


>
> On Tue, 21 Jun 2022, Yoshifumi Nishida wrote:
>
>> Hi Markku,
>>
>> I think the important point is "any potentially incorrect behaviour 
>> that later gets observed must of
>> course be reconsidered and corrected."  I believe all people in this 
>> community will agree with it.
>> So, I think the gap in the discussion is whether we publish the doc 
>> and fix it when we see some issues or
>> we don't publish the doc until we can think there's no potential issue.
>>
>> If we chose the prior one and we found some issues right after we 
>> publish the doc, then I will admit the
>> decision was a mistake.
>> But, I think the possibility for it is very low as we probably will 
>> need to monitor extensively as you
>> mentioned.
>
> When I said
>
>  "any potentially incorrect behaviour that later gets observed must of
>  course be reconsidered and corrected."
>
> I referred to RFC 9002. I believe we all agree on that as you said.
>
> However, what I meant was that we should not repeat any mistake done 
> with publishing some earlier document and use that as an excuse for 
> repeating the mistake instead of fixing or documenting possible issues 
> with a draft before publishing it.
>
>> In the end, we all know evaluating CC is not an easy task at all and 
>> CUBIC is not an exception. Hence, I
>> think
>> the publishing and fixing later strategy isn't a bad idea here. OTOH, 
>> If we choose the latter strategy, I am
>> concerned
>> it may take years for publishing and it won't be good for the community.
>
> What is important is that we look into the issues that have been 
> raised and separately decide how to handle each of them. Now all 
> discussions have been at very general level not tackling the actual 
> issues at all. Each of the issues requires (potentially) a different 
> resolution.
>
>> So, if we want to hold this process, I would like to see some solid 
>> evidence for the negative impacts in the
>> current CUBIC logics.
>> Some might say the people who propose CUBIC should prove it's safe, 
>> which I can agree with when many people
>> are skeptical about it.
>> But, from my point of view, the current situation is the opposite. 
>> Hence, I think the people who oppose
>> publishing should prove its risk.
>
> Having data to prove something is always important. Regarding the 
> issue 1 that we are discussing in this thread:
>
> 1) The original paper that tried to validate the model for determining
>   the AI factor alpha failed its validation attempt.
>
> 2) Bob and I have carefully described why the model is incorrect (my
>   explanation corrected Bob's analysis for the tail-drop case and
>   complemented Bob's analysis for the AQM case).
>
> 3) In the tail-drop case, which is the classical case and quite likely
>   still the default for most bottlenecks in the Internet, when a Reno
>   CC and CUBIC flow are competing, the CUBIC sender will opt out roughly
>   every second cwnd reduction and thereby have a significant negative
>   impact on the competing Reno CC flow.
>   I thought you agreed my explanation of the problem was correct for the
>   tail-drop case?
>   Do you think that opting out every second cwnd reduction is not worth
>   addressing in any way when the correct operation would be to reduce
>   cwnd every time like a competing TCP-compatible flow does? Do you think
>   that it does not have significant enough impact? For some evidence, see
>   the next item.
>
> 4) For some evidence simply look at the original CUBIC paper [HRX08].
>    It clearly reveals that CUBIC dominates Reno TCP (SACK TCP) in the
>    regions where SACK TCP alone is able to fully utilize the available
>    bandwidth. In Figure 10 (c), in all cases up until 200 Mbps, SACK TCP
>    is able to fully utilize the bottleneck link. However, when SACK TCP
>    competes with CUBIC, CUBIC steals bandwidth from SACK TCP. Even in
>    the case with 400 Mbps, where SACK TCP leaves less than 10% of the
>    link capacity underutilized, CUBIC steals the bandwidth from SACK TCP,
>    leaving SACK TCP only less than 10%. Also in Fig 10 (a), with 40-160 ms
>    RTT, CUBIC steals clearly more capacity from SACK TCP than the
>    capacity SACK TCP alone is unable to utilize in these cases.

[BB] Here you have now switched to complaining about the high BDP region.

Given the above, I can think of two reasons why the IETF might still 
want to make CUBIC stds track:

 1. Lack of robustness of Reno to noise (occasional losses due to short
    flows or transmission errors)
 2. The higher the harm ratio, the rarer the scenario in which it occurs


      1. Robustness to noise

The experiments you point to in the original CUBIC paper were conducted 
on a testbed. Although it is possible to find noise-free conditions over 
some paths on the Internet, it is more common to experience noise. In a 
noisy environment (e.g. with losses due to occasional short flows, or 
variations in capacity), Reno does not manage to hold on to the share of 
capacity shown in the Reno-v-Reno cases of those figures. Under even 
slightly noisy conditions, CUBIC is better at using the capacity that 
Reno cannot use.


      2. Prevalence and severity

IIRC, the RFCs that discuss 'fairness' or harm do not mention 
prevalence. I think this is a mistake. If the worst cases of relative 
performance are in the least likely scenarios, I think it is reasonable 
that they should be given less weight in our decision-making.

For instance, it is rare to find a path that is high enough BDP for 
CUBIC to switch into true CUBIC mode but still low enough bandwidth for 
starvation to be a problem. High RTT flows are already fairly rare. The 
chances of a high RTT Reno and a high RTT CUBIC flow competing over a 
sustained period will be more than doubly rare, because there is (now) 
little Reno deployed. When such rare events do occur, as long as the 
Reno flow makes decent progress, I believe we do not need to be overly 
concerned.

For instance, eye-balling the charts you point to, the lowest absolute 
throughput of a Reno flow is about 20 Mb/s (in the 400 Mb/s, 160 ms case). 
That's certainly not 'fair' relative to 360 Mb/s for CUBIC, but 20 Mb/s 
is still decent progress.

If we plot harm ratios by eyeballing Fig 10a, we see that harm becomes 
high as RTT becomes less typical:

Link: 400 Mb/s    Reno throughput [Mb/s]    Harm
RTT [ms]          vs. Reno    vs. CUBIC     ratio
 10               178         200           -12%
 20               180         160            11%
 40               181          98            46%
 80               156          50            68%
160               130          20            85%


The harm ratio is defined as "How much Reno loses when competing against 
CUBIC relative to when competing against another Reno". For example, in 
the last row, it's (130 - 20)/130.

It's hard to judge harm against prevalence, but I think it's reasonable 
to say that 46% and 68% harm are OK, whereas 85% would not be OK if it 
were typical. But given that RTT=160ms is fairly uncommon, and the 
chances of a long-running Reno flow meeting a long-running CUBIC flow 
are even less common, it seems acceptable to me.

[BTW, you haven't pointed out Fig 10b) where, at low RTT (10ms), as long 
as capacity is not low (e.g. 10 Mb/s), Reno does better against CUBIC 
than against itself (negative harm), which is useful data regarding the 
Reno-Friendly regime without an AQM.]

(Note: I've not included the 400ms RTT case given in the paper, as I 
think that goes beyond the scope of terrestrial networks - we can 
expect that new CCs will be needed there.)


      3. Process

You are pointing out results in a paper that were well-known to those of 
us working on this before we started the process of moving CUBIC to the 
stds track. If I were a chair, I would not allow concerns to hold up the 
WG's progress when they have been well-known for 14 years but are now 
posted /after/ WGLC has ended.

You have pointed out that RFC5681 says "a TCP MUST NOT be more 
aggressive than the following algorithms allow", which it inherited from 
RFC2581. However, the WG is proposing a stds track RFC so, if the WG 
chooses to, it can update that aspect of RFC5681 to match the more 
liberal fairness advice in RFC5033 for alternative (especially high 
speed) congestion controls, or perhaps the quantified advice in RFC8085, 
which allows a flow to "compete fairly within an order of magnitude" (I 
assume that means a rate ratio of up to 10:1, but it's not totally clear).


>
>    Also the draft cites paper [HLRX07] in section 5.2 and says
>
>     "Our test results in [HLRX07] indicate that CUBIC uses the spare
>      bandwidth left unused by existing Reno TCP flows in the same
>      bottleneck link without taking away much bandwidth from the existing
>      flows."
>
>    If you look at the Fig 3 a in [HLRX07], it clearly indicates that Reno
>    TCP (SACK TCP) is able to fully utilize the available link capacity
>    like
>    all other variants with 20 and 40 ms RTT and almost fully with 80 ms
>    RTT. However, if we look at the Fig 12 a, where SACK TCP is clearly
>    shown to be friendly to itself with the above RTTs, we see that
>    CUBIC steals a notable amount of link capacity from SACK TCP with the
>    same RTTs.
>
>    Do you think that the above text claiming "that CUBIC uses the spare
>    bandwidth left unused by existing Reno TCP flows in the same
>    bottleneck link without taking away much bandwidth from the existing
>    flows" is a correct statement and can be published as an objective
>    statement?

[BB] This paper is about precisely what I was just saying about noise.
I agree with you that text is incorrect. It should say

     "The test results in [HLRX07] indicate that, *in typical cases with
      a degree of background traffic,* CUBIC uses the spare
      bandwidth left unused by existing Reno TCP flows in the same
      bottleneck link without taking away much bandwidth from the existing
      flows."

Let's tabulate those results using the 40ms case ('cos Fig 3b doesn't 
give a point at 20ms). In the staggered runs, flow B arrives after 
flow A (Reno):

40ms case in Figs 3 & 12 of [HLRX07]:

Scenario          Flow B   Utilization     Rate ratio       Lower (Reno)   Harm
                           (lone flow B)   [lower/higher]   flow's util.   ratio
No noise          Reno     94%             95%              46%             0%
No noise          CUBIC    95%             19%              15%            67%
Background noise  Reno     82%             88%              38%             0%
Background noise  CUBIC    93%             38%              26%            33%


Note: the noise used for the utilization results was different from that 
for the friendliness, but let's do as you have and pretend these results 
are comparable.

So in the noise case, Reno loses 33% of what it would have got if 
competing with another Reno flow.
And in the less likely case of no noise, Reno loses 67% of ---ditto---.

Are you saying we should be concerned by those figures? I'm not 
concerned by either.


>
> Given all the above, do you think we can ignore the issue 1 and just 
> hide it?

[BB] This is not Issue 1, which was about the Reno-Friendly equation. 
You are raising an old point about the CUBIC regime at high BDP. Surely 
you understand that the WG is entitled not to have to reconsider old 
questions if they are raised after WGLC. However, I do think the wording 
you've pointed out regarding the [HLRX07] reference is worth changing, 
as suggested above.
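(For readers joining the thread, a minimal sketch of the two regimes in question, using the time-based Reno-friendly estimate from RFC 8312 with the standard constants C=0.4 and beta_cubic=0.7. This illustrates the published equations only; it is not the bis draft's normative pseudocode, whose per-ACK form differs in detail.)

```python
C = 0.4      # CUBIC constant, per RFC 8312
BETA = 0.7   # CUBIC multiplicative-decrease factor

def w_cubic(t: float, w_max: float) -> float:
    """Genuine CUBIC window (segments), t seconds after a reduction."""
    k = (w_max * (1 - BETA) / C) ** (1.0 / 3.0)   # time to regain w_max
    return C * (t - k) ** 3 + w_max

def w_est(t: float, w_max: float, rtt: float) -> float:
    """Reno-friendly estimate (the disputed model), RFC 8312 Section 4.2."""
    alpha = 3 * (1 - BETA) / (1 + BETA)           # ~0.53 for BETA = 0.7
    return w_max * BETA + alpha * (t / rtt)

def cwnd(t: float, w_max: float, rtt: float) -> float:
    # CUBIC stays in the Reno-friendly region while w_est dominates, and
    # switches to the genuine (more aggressive) cubic curve as soon as
    # w_cubic overtakes w_est.
    return max(w_cubic(t, w_max), w_est(t, w_max, rtt))
```

Both curves start from w_max * BETA at t = 0; the dispute is over whether w_est tracks what a competing Reno flow would actually achieve.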



Bob

>
> Thanks,
>
> /Markku
>
>> -- 
>> Yoshi
>>
>> On Mon, Jun 20, 2022 at 4:42 PM Markku Kojo <kojo@cs.helsinki.fi> wrote:
>>       Hi Yoshi,
>>
>>       On Wed, 15 Jun 2022, Yoshifumi Nishida wrote:
>>
>>       > Hi Markku,
>>       >
>>       > Thanks for the response. Yes, you got valid points. But, I 
>> still have some comments.
>>       >
>>       > First thing I would like to clarify is that we acknowledge 
>> the model used for CUBIC has not
>>       been validated as
>>       > you pointed out.
>>
>>       Note that the model is not only unvalidated but it is also 
>> *incorrect*,
>>       that is, it does not perform as intended. And, the reason for the
>>       incorrect behaviour is different with a bottleneck at a 
>> tail-drop router
>>       and with a bottleneck at an AQM router.
>>
>>       > However, at the same time, I believe it doesn't mean the 
>> model has significant threats to the
>>       Internet. We've
>>       > never seen such evidence even though CUBIC has been widely 
>> deployed for a long time.
>>
>>       This seems to be something that many don't quite understand: we 
>> cannot see
>>       any such evidence unless somebody measures the impact to the other
>>       competing traffic. It is not visible by observing the behaviour 
>> of a CUBIC
>>       sender that people deploying CUBIC are interested in and are 
>> likely
>>       to monitor quite extensively. The early measurements published 
>> along with
>>       the original CUBIC paper already show this problem as I have 
>> pointed out
>>       but the draft claims the opposite. AFAIK since then no published
>>       measurements have been carried out.
>>
>>       > I am personally thinking
>>       > that we will need to see tangible evidence for the threats to 
>> leave out the fact that it has
>>       been widely
>>       > used.
>>
>>       No, I don't think so. The responsibility of showing that there 
>> are no
>>       threats (or that CUBIC behaves as intended) is for those 
>> proposing an
>>       alternative CC algorithm.
>>
>>       > The second thing I would like to mention is that I am not 
>> sure how many drafts have been passed
>>       through the
>>       > RFC5033 process.
>>       > For example, RFC8985, RFC9002 are congestion control related 
>> standard docs, but in my
>>       understanding, the
>>       > process had not been applied to them.
>>
>>       Sorry for being unclear. I meant all stds track TCP congestion 
>> control
>>       RFCs.
>>
>>       RFC8985 is a loss-detection algorithm; it does not specify any new
>>       congestion control algos nor does it modify any existing CC 
>> algos as the
>>       RFC clearly states. Indeed, it has a potential to make TCP quite
>>       aggressive. Therefore, IMO tcpm wg should be very careful in 
>> reviewing
>>       any congestion control algo proposal that possibly employs RACK 
>> for loss
>>       detection.
>>
>>       I did not have cycles to closely follow the wg discussions on 
>> RFC 9002, so
>>       I cannot tell how thoroughly it was evaluated before publishing.
>>       Quite likely not quite to the extent that RFC 5033 requires. 
>> While it
>>       follows quite closely and for most part the current stds track TCP
>>       behaviour (NewReno / Reno CC), it has elements that potentially 
>> are more
>>       aggressive than current stds track TCP CC algos. If it was not 
>> evaluated
>>       for all parts as RFC 5033 requires, it is a mistake that has 
>> happened and
>>       any potentially incorrect behaviour that later gets observed 
>> must of
>>       course be reconsidered and corrected.
>>
>>       > Some may say that because these proposals are not big 
>> threats, but from my point of view, they
>>       are more
>>       > aggressive than NewReno in some ways.
>>       > I am not sure what's the clear differences between CUBIC 
>> draft and them. I personally haven't
>>       seen very solid
>>       > evidence that they are not unfair to the current standards.
>>       > We may need to redefine or enhance the process in the future, 
>> but at this point, I personally
>>       don't have a
>>       > strong reason to set a high bar only for this draft. Because 
>> I believe all docs should be
>>       treated equally.
>>
>>       IMO, if the IETF made a mistake with one (or some) earlier 
>> published RFCs
>>       that must never be used as an excuse to repeat the same mistake.
>>
>>       > Hence, describing the fact that the CUBIC draft hasn't passed 
>> the RFC5033 process in the doc
>>       looks
>>       > sufficient to me.
>>
>>       What would be the rational reason to hide the fact that the 
>> model CUBIC
>>       uses to determine alpha is incorrect?
>>
>>       Thanks,
>>
>>       /Markku
>>
>>       > Thanks,
>>       > --
>>       > Yoshi
>>       >
>>       >
>>       > On Tue, Jun 14, 2022 at 8:02 AM Markku Kojo 
>> <kojo@cs.helsinki.fi> wrote:
>>       >       Hi Yoshi,
>>       >
>>       >       I moved your comment and the discussion on your reply 
>> under this thread on
>>       >       the Issue 1 (see below)
>>       >
>>       >       On Tue, 14 Jun 2022, Markku Kojo wrote:
>>       >
>>       >       > Hi all,
>>       >       >
>>       >       > this thread starts the discussion on the issue 1: the 
>> incorrect model for
>>       >       > determining CUBIC alpha for the congestion avoidance 
>> (CA) phase (Issue 1 a)
>>       >       > and the inadequate validation of a proper constant C 
>> for the CUBIC window
>>       >       > increase function (Issue 1 b).
>>       >       >
>>       >       >
>>       >       > Issue 1 a)
>>       >       > ----------
>>       >       >
>>       >       > The model that CUBIC uses to be fair to Reno CC (in 
>> Reno-friendly region) is
>>       >       > unvalidated and actually incorrect.
>>       >       >
>>       >       > A more detailed description of the issue:
>>       >       >
>>       >       > The original paper manuscript that CUBIC bases its 
>> behaviour in the
>>       >       > Reno-friendly region did a preliminary attempt to 
>> validate the model but
>>       >       > failed (and the paper never got published). This is 
>> the only known attempt to
>>       >       > validate the model and even this failed validation 
>> attempt was quite light,
>>       >       > consisting of only a couple of network settings and 
>> obviously did not use any
>>       >       > replications for the results shown in the paper. 
>> Hence, even the statistical
>>       >       > validity of the results remains questionable. Results 
>> were shown only for a
>>       >       > setting with AQM enabled at the bottleneck router. 
>> The results for a
>>       >       > tail-drop case are missing in the paper manuscript.
>>       >       >
>>       >       > The report (creno.pdf, see a pointer to the doc in 
>> the email pointed to
>>       >       > below) that Bob wrote provides some explanation why 
>> the model does not give
>>       >       > correct results and thereby the resulting behaviour 
>> presented in the original
>>       >       > paper notably deviates from that of Reno CC. The 
>> email that I wrote to the wg
>>       >       > list
>>       >       >
>>       >       > 
>> https://mailarchive.ietf.org/arch/msg/tcpm/bds-h_a6-NliTjx-ZqUSaFpSSnA/
>>       >       >
>>       >       > complements Bob's explanation for the AQM case and 
>> corrects Bob's analysis
>>       >       > for the tail-drop case, explaining why the model is 
>> incorrect for the
>>       >       > traditional and still today prevailing tail-drop 
>> router case.
>>       >       >
>>       >       > Consequently, the use of the incorrect model results 
>> in unknown behaviour of
>>       >       > CUBIC when in the Reno-friendly region. Moreover, it 
>> is quite likely that the
>>       >       > behaviour is different with different AQM 
>> implementations at the bottleneck,
>>       >       > resulting in even more random behavior. This alone is 
>> very problematic and
>>       >       > becomes more problematic when considering how moving 
>> out from the
>>       >       > Reno-friendly region is specified: when the genuine 
>> CUBIC formula gives a
>>       >       > larger cwnd than the cwnd that the Reno-friendly 
>> model gives, CUBIC moves to
>>       >       > the genuine CUBIC mode that is significantly more 
>> aggressive than Reno CC.
>>       >       >
>>       >       > Therefore, if the incorrect model gives too low cwnd 
>> for mimicked Reno CC,
>>       >       > CUBIC moves too early to the genuine CUBIC mode and 
>> becomes too aggressive 
>>       >       > too early even though it should behave equally 
>> aggressive as Reno CC. On the
>>       >       > other hand, if the incorrect model gives too large 
>> cwnd, CUBIC is too
>>       >       > aggressive throughout the Reno-friendly region.
>>       >       > In summary, if the model is not correct, it results 
>> in more aggressive
>>       >       > behaviour than Reno CC no matter which direction the 
>> model fails.
>>       >       >
>>       >       > And very importantly: some people have suggested that 
>> CUBIC should replace
>>       >       > the current stds track CC algos and become the 
>> default. The behaviour of Reno
>>       >       > CC is very thoroughly studied and very well 
>> understood. If we replace it with
>>       >       > *unknown* behaviour, how can we anymore specify what 
>> is the correct and
>>       >       > allowed aggressiveness for any upcoming CC when the 
>> behaviour of the new
>>       >       > default itself is unknown, making comparative 
>> analysis of other CCs against
>>       >       > CUBIC in the Reno-friendly region very difficult? The 
>> behaviour is assumed to
>>       >       > be the same as Reno CC but the actual behaviour is 
>> random, it may be 2 times
>>       >       > or 8 times more aggressive than Reno, for example.
>>       >
>>       >
>>       >       On Tue, 7 Jun 2022, Yoshifumi Nishida wrote:
>>       >
>>       >       > Hi Markku,
>>       >       >
>>       >       > Thanks for the detailed feedback. This is very useful.
>>       >       > One thing I would like to clarify is that we’ve 
>> already acknowledged the
>>       >       > TCP friendly  model in the draft has some unsolved 
>> discussions. But, I
>>       >       > believe our > current > consensus is to not change 
>> the logics for it in
>>       >       > the current draft as it will require long term 
>> evaluations.
>>       >       >
>>       >       > So, I would like to check if you’re suggesting we 
>> should update the
>>       >       > draft against it or you have some ideas to address 
>> these issues in
>>       >       > some ways (e.g adding more clarification in the 
>> draft, mentioning it in
>>       >       > the write-up, etc)
>>       >       >
>>       >       > Thanks,
>>       >       > --
>>       >       > Yoshi
>>       >
>>       >       I think the problem is even trickier because it is hard 
>> to see how it
>>       >       would be possible to correct the model that is based on 
>> wrong
>>       >       assumptions. This said, it is important for the wg to 
>> consider whether it
>>       >       is ready to suggest publishing a congestion control 
>> algorithm that is not
>>       >       correct and has not been validated. And, if the answer 
>> is yes, how to
>>       >       justify it and what would be the appropriate status for 
>> the RFC as well
>>       >       as the way forward after publishing the draft.
>>       >
>>       >       I fully sympathize with those who have deployed CUBIC and 
>> understand that
>>       >       there is a pressure to publish the draft with no 
>> modifications to what has
>>       >       been implemented. However, RFC 5033 was written 
>> specifically to avoid this
>>       >       kind of situation where a CC algo has been (widely) 
>> deployed and only
>>       >       then brought to IETF standardization. It is 
>> understandable that those who
>>       >       have deployed the CC algo would be very reluctant to 
>> modify the algo. On
>>       >       the other hand, AFAIK all current stds track CC algos 
>> have had various
>>       >       issues that have been brought up during the 
>> standardization process but
>>       >       these issues have been resolved before publishing the 
>> draft. So, why should we
>>       >       make an exception? IMO, wide deployment cannot 
>> be the answer
>>       >       because it does not automatically reveal the negative 
>> impact to other
>>       >       traffic but specific comparative measurements must be 
>> carried out.
>>       >       Also, why should the IETF set a precedent for any 
>> future congestion
>>       >       control drafts, implying that it is ok to first deploy 
>> a CC algo and then
>>       >       bring it to IETF and use the (wide) deployment as an 
>> argument against
>>       >       modifying it regardless of whatever issues it might have?
>>       >
>>       >       So, I don't have a good answer. IMO, if the draft is 
>> published with
>>       >       unresolved issues, the draft itself must clearly 
>> identify and document
>>       >       the issues and give some kind of justification and a 
>> clear way forward.
>>       >       That is, we must ensure there is an initiative set and 
>> path to follow in
>>       >       order to correct any shortcomings in a published RFC. 
>> Otherwise, the
>>       >       issues are very likely ignored and forgotten forever.
>>       >
>>       >       Thanks,
>>       >
>>       >       /Markku
>>       >
>>       >
>>       >
>>       >       > Issue 1 b)
>>       >       > ----------
>>       >       >
>>       >       > Another issue related to operating in the 
>> Reno-friendly region is the
>>       >       > question when CUBIC should operate in the 
>> Reno-friendly region and when it
>>       >       > may move out of it. Obviously CUBIC should stay in 
>> the Reno-friendly region
>>       >       > when Reno CC would be able to fully utilize the 
>> available network capacity.
>>       >       > In practice, this is specified by selecting the value 
>> for constant C in the
>>       >       > formula that is used to determine cwnd in the 
>> "genuine" CUBIC mode. However,
>>       >       > selecting a proper value for C has not been properly 
>> validated in a wide
>>       >       > range of environments as required in RFC 5033.
>>       >       >
>>       >       > Preliminary validation of constant C has been done 
>> for the original CUBIC
>>       >       > paper. That is good enough for a scientific paper but 
>> not adequate for an
>>       >       > IETF stds track algo. There seems to be no additional 
>> evaluation since the
>>       >       > timeframe of the CUBIC paper publication around 15 
>> years ago. Particularly,
>>       >       > there seems to be no evaluation with AQM at the 
>> bottleneck router or with a
>>       >       > buffer-bloated bottleneck router, not to mention many 
>> other network
>>       >       > environments. Nor is there any data available for a 
>> non-SACK TCP sender.
>>       >       >
>>       >       > The evaluation of 1 a) and 1 b) must be done 
>> separately. Otherwise, it is 
>>       >       > very hard to tell whether any deviations are due to 
>> the incorrect model or
>>       >       > incorrect value of C. The original CUBIC paper and 
>> some other papers show
>>       >       > that CUBIC is not fair to Reno CC in certain network 
>> conditions where Reno CC
>>       >       > has no problems in utilizing the available network 
>> capacity; instead, CUBIC
>>       >       > steals capacity from Reno CC.
>>       >       >
>>       >       > Thanks,
>>       >       >
>>       >       > /Markku
>>       >       >
>>       >
>>       >
>>       >
>>
>>
>>

-- 
________________________________________________________________
Bob Briscoe                               http://bobbriscoe.net/