Re: [tcpm] 2nd WGLC for draft-ietf-tcpm-rfc8312bis

Markku Kojo <kojo@cs.helsinki.fi> Thu, 17 February 2022 14:41 UTC

Date: Thu, 17 Feb 2022 16:41:29 +0200
From: Markku Kojo <kojo@cs.helsinki.fi>
To: Yoshifumi Nishida <nsd.ietf@gmail.com>
cc: "tcpm@ietf.org Extensions" <tcpm@ietf.org>, tcpm-chairs <tcpm-chairs@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tcpm/TkLSTJGq6W8yo3jfQIpB8MXI7QI>

Hi Yoshi,

On Tue, 15 Feb 2022, Yoshifumi Nishida wrote:

> Hi Markku,
> 
> Thanks for the comments. I think these are very valid points. 
> However, I would like to check several things as a co-chair and a doc shepherd before we
> discuss the points you've raised.
> 
> In my understanding (please correct me if I'm wrong), when this draft was adopted as a WG
> item, the goal of the doc was some minor updates to RFC8312 which include more
> clarifications, minor changes and bug fixes.
> However, if we try to address your concerns, I think we'll need to invent a new version of
> CUBIC, something like CUBIC++ or NewCUBIC, in the end.
> I won't deny the value of such a doc, but this seems not to be what we agreed on
> beforehand.
> If we proceed in this direction, I think we will need to check the WG consensus on whether
> this should be a new goal for the doc.
> 
> So, I would like to check if this is what you intend for the doc or you think we can
> address your points while aligning with the original goal.
> Also, if someone has opinions on this, please share.

I think it is important that we remember the status of RFC 8312 and the 
decades-long process that has been followed in the tsv area for new 
TCP congestion control algorithms that have been proposed and submitted 
to the IETF. In order to ensure that new cc algos are safe and fair, the 
process that has been followed for all current stds track TCP cc algos 
has required that the cc algo is first accepted and published as an 
experimental RFC, and only once enough supportive experimental evidence 
has been gathered has the doc become a candidate to be forwarded to stds 
track. We have even agreed on a relatively strict evaluation process to 
follow when cc algos are brought to the IETF to be published as 
experimental:

https://www.ietf.org/about/groups/iesg/statements/experimental-congestion-control/

RFC 8312 was published as "Informational" and if I recall correctly the 
idea was "just to publish what's out there" for the benefit of the 
community. RFC 8312 was never really evaluated, particularly not in the 
way new cc algos are supposed to be as per the agreed process.

I do not recall what/how exactly was agreed when rfc8312bis was launched 
but I would be very interested to hear the justification why this doc 
does not need to follow the process mentioned above, and why we would 
instead propose that the IETF publish a non-evaluated Informational doc 
"with minor updates", i.e., without actual evaluation, as a stds track 
RFC. If the target really remains PS then the bar should be even 
higher than what is described for experimental docs in the above process 
document, i.e., what we have followed for experimental docs to be moved 
to stds track.

The only justification that I have heard has been "because CUBIC has long 
and wide deployment experience", "the Internet has not melted", or 
"we should have noticed if there were problems". We must, however, 
understand that in order to have a noticeable bad impact CUBIC would have 
to cause some sort of congestion collapse. Congestion collapse, however, 
is not an issue with CUBIC nor with any other CC algo that applies an RTO 
mechanism together with a correctly implemented Karn's algo that retains 
the backed-off RTO until an Ack is received for a new (not rexmitted) data 
packet. The issue is fairness to competing traffic. This cannot be 
observed by deploying and measuring the performance and behaviour of CUBIC 
alone. CUBIC being more aggressive than current stds track TCP CC would 
just give good performance results that one running CUBIC would be happy 
with. One must evaluate CUBIC's impact on the competing (Reno CC) traffic 
in a range of environments, which requires carefully designed active 
measurements with thoroughly analyzed results (as required by the above 
process document, RFC 5033 and RFC 2914). What we seem to be missing is 
this evidence on CUBIC's impact, and that is something the IETF must focus 
on, not just on whether CUBIC can achieve better performance than other 
existing CCs. The latter has been shown in many publications and is the 
major focus in many scientific papers proposing new algos.
I appreciate a lot that CUBIC has been implemented/developed and 
deployed for a long time, and I wonder whether those deploying CUBIC have 
unpublished results the wg could review before taking the decision?
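
To make the RTO point concrete, here is a minimal sketch of exponential 
RTO backoff combined with Karn's algorithm. The class name RtoTimer, the 
1 second base RTO and the 60 second cap are assumptions chosen for 
illustration, RTT estimation is elided, and the code is not taken from 
any particular stack:

```python
class RtoTimer:
    """Hypothetical illustration of RTO backoff with Karn's algorithm."""

    MAX_RTO = 60.0  # assumed upper bound on the RTO, in seconds

    def __init__(self, base_rto: float = 1.0) -> None:
        # RTO derived from RTT samples; the estimator itself is elided.
        self.base_rto = base_rto
        # Multiplier applied on top of base_rto after timeouts.
        self.backoff = 1

    @property
    def rto(self) -> float:
        """Current (possibly backed-off) retransmission timeout."""
        return min(self.base_rto * self.backoff, self.MAX_RTO)

    def on_timeout(self) -> None:
        """Retransmission timer expired: double the RTO."""
        self.backoff *= 2

    def on_ack(self, acked_segment_was_retransmitted: bool) -> None:
        """Karn's rule: retain the backed-off RTO until an ACK arrives
        for a segment that was never retransmitted."""
        if not acked_segment_was_retransmitted:
            self.backoff = 1


if __name__ == "__main__":
    timer = RtoTimer(base_rto=1.0)
    timer.on_timeout()
    timer.on_timeout()
    print("RTO after two timeouts:", timer.rto)
    timer.on_ack(acked_segment_was_retransmitted=True)
    print("RTO after ACK of rexmitted data:", timer.rto)
    timer.on_ack(acked_segment_was_retransmitted=False)
    print("RTO after ACK of new data:", timer.rto)
```

Under persistent loss the sender in this sketch keeps spacing its 
retransmissions further apart regardless of which window-increase 
function the CC uses, which is why collapse is not the relevant risk 
here while fairness still is.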

I suggest that everyone carefully read RFC 2914 Sec 3.2 and particularly 
what it says about more aggressive (than RFC 5681) congestion control 
algorithms:

  Some of these may fail to implement
  the TCP congestion avoidance mechanisms correctly because of poor
  implementation [RFC2525].  Others may deliberately be implemented
  with congestion avoidance algorithms that are more aggressive in
  their use of bandwidth than other TCP implementations; this would
  allow a vendor to claim to have a "faster TCP".  The logical
  consequence of such implementations would be a spiral of increasingly
  aggressive TCP implementations, or increasingly aggressive transport
  protocols, leading back to the point where there is effectively no
  congestion avoidance and the Internet is chronically congested.

And:

  It is convenient to divide flows into three classes: (1) TCP-
  compatible flows, (2) unresponsive flows, i.e., flows that do not
  slow down when congestion occurs, and (3) flows that are responsive
  but are not TCP-compatible.  The last two classes contain more
  aggressive flows that pose significant threats to Internet
  performance,

As I have tried to point out there are several features of CUBIC where 
it is likely to be (or to me it seems it obviously is) more aggressive 
than what is required to be TCP-compatible. I'm not aware of evidence 
presented to tcpm (or IETF/IRTF) which shows the opposite (and I am happy 
to be educated on what I have missed).
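
One of those features, discussed under #85 in the earlier review quoted 
below, is the use of the same multiplicative-decrease (MD) factor after 
a loss in slow start as in congestion avoidance. A rough worked example, 
under the simplifying assumptions that cwnd doubles every RTT in slow 
start, so that it can be up to twice the path capacity (pipe plus 
bottleneck buffer) when the first loss is detected, and that the MD 
factors are 0.5 for Reno-style CC and 0.7 for CUBIC. The function name 
and the overshoot values are illustrative only:

```python
def excess_over_capacity(overshoot: float, md_factor: float) -> float:
    """Fraction by which cwnd still exceeds path capacity after the
    multiplicative decrease.

    overshoot is cwnd_at_loss / capacity, assumed to lie between 1.0
    and 2.0 at the end of slow start. A negative result means the
    reduced window is below the capacity.
    """
    return overshoot * md_factor - 1.0


if __name__ == "__main__":
    for overshoot in (1.0, 1.5, 2.0):
        for label, md_factor in (("MD 0.5", 0.5), ("MD 0.7", 0.7)):
            excess = excess_over_capacity(overshoot, md_factor)
            print(f"overshoot {overshoot:.1f}x, {label}: "
                  f"{excess:+.0%} relative to capacity")
```

In the worst case of this simple model (cwnd at twice the capacity), a 
factor of 0.5 brings the window back to the capacity while 0.7 leaves it 
40% above, which appears consistent with the "1-40% larger cwnd" range 
mentioned in the quoted review.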

You may take my comments to be a part of the expert review phase 
performed by the IRTF/ICCRG for CUBIC. I'm not requesting to modify this 
doc to CUBIC++ (or something) but it seems to me that this would be 
necessary if this doc intends to become published as PS. For experimental, 
I think it would need some additional updates and should record the areas 
of uncertainty and where more experimentation (clearly) is required.

Thanks,

/Markku

> Thanks,
> --
> Yoshi
> 
> 
> On Fri, Feb 11, 2022 at 9:34 PM Markku Kojo <kojo@cs.helsinki.fi> wrote:
>       Hi Yoshi, all,
>
>       It seems to me that many issues that I raised have been solved, thanks.
>       However, there are still a number of important issues that have not been
>       addressed adequately. At least the following:
>
>       #135 on W_max. Yoshi's observation was correct that this is not resolved:
>       the co-authors and original developers of CUBIC (@lisongxu and
>       @sangtaeha) agreed in their last message that Wmax needs different
>       treatment for slow start and congestion avoidance and plan a comprehensive
>       (new) evaluation of it. This is obviously an open issue but the issue
>       was closed?
>
>       #85 (& #86 with basically same issue and these two were combined) This
>       (#85) is about ECN but the major issue is on using the same MD
>       factor in slow start and in congestion avoidance when using
>       loss-based CC. This (#85) remained closed even though I provided a
>       thorough explanation why it is wrong and against the original theory and
>       design by Van Jacobson, against the congestion control principles (RFC
>       2914) and two co-authors agreed on this in their same last message to
>       #135 when they agreed on Wmax needing rework. This is an important issue
>       that the wg should consider very carefully because it is not only
>       updating RFC 5681 but also in conflict with RFC 2914. How can
>       tcpm (and IETF) suggest and allow one CC algo to not follow congestion
>       control principles as set in RFC 2914 while requiring all other CCs to
>       follow RFC 2914 guidelines?
>       The current draft does not provide any justification for using the same
>       MD factor in slow start as in congestion avoidance. Nor am I
>       aware of any experimental data that would support this change.
>       The fact that CUBIC has been long deployed does not alone provide any
>       supporting evidence because CUBIC is likely to give good performance as
>       it is overaggressive and thereby unfair to competing traffic and users
>       tend to be happy when measuring the performance of the sending CUBIC
>       only, not the competing traffic that is badly impacted. HyStart++ is
>       suggested as mitigation to the problem but it cannot; HyStart++ is only
>       applicable during initial slow start, not during slow start after RTO!
>       That is, the "SHOULD use HyStart++" text in Sec 4.10 is impossible
>       to implement as I have pointed out in my comments earlier. Using a proper
>       MD factor in slow start is even more important if loss is detected
>       during an RTO recovery because the sender is likely to face heavy
>       congestion in such a case and it is very bad if the sender continues
>       sending with overaggressive rate, stealing the capacity from and causing
>       harm to coexisting flows. In addition, as I have explained, HyStart++
>       does not remove the problem even for the initial slow start as it is not
>       shown to work always. Instead, the results with the HyStart++ draft show
>       that it reduces rexmits by 50% and RTOs by only 36%, meaning that there is
>       likely to be a notable percentage of cases when a sender is still
>       in slow start when first loss is detected (i.e., HyStart++ had no effect)
>       and a significant number of cases where a CUBIC sender is
>       overaggressive continuing with a 1-40% larger cwnd than what is the
>       available capacity. Note also that any delay-based heuristics like
>       HyStart++ are known to work poorly in various wireless environments where
>       link delay tends to vary a lot. We may come up with some other MD factor
>       than 0.5 when in slow start and HyStart++ is in use, but that is
>       experimental, if not research, and definitely not ready for stds track.
>
>       #114, #132, and #143 w.r.t flightsize vs. cwnd. The current text
>       does not quite correctly reflect what stacks that use cwnd do.
>       I'll comment in #143 separately.
>
>       #96 & #98: The text added does not address the problems raised which are
>       also evidenced in the paper pointed by Bob in #96. Even though CUBIC has
>       been modified a bit after the paper was published, it does not
>       automatically mean that the problem has been shown resolved: experimental
>       evidence is required but not provided. The fact that CUBIC does not
>       change MD factor for fast convergence is the root of the problem
>       evidenced in the paper and remains so in the algo specified in this
>       draft. This is also a significant problem when competing with Reno CC
>       because CUBIC behaves much more aggressively than Reno CC when there is
>       sudden congestion and all competing flows must converge fast down to a
>       small fraction of the current cwnd to be fair to each other. This again
>       cannot be evidenced not to be a problem by long deployment experience
>       unless experimental data that measures the impact on competing traffic is
>       presented to back the claims. Adjusting just Wmax for fast convergence is
>       not enough and is even likely to be ineffective because there tend to be
>       several losses when sudden congestion is hit, and particularly if NewReno
>       is in use the sender stays several RTTs in Fast Recovery being
>       overaggressive and then possibly continues at the same rate in CA which
>       is unlikely to reach even close to Wmax before a new loss hits
>       the sender again. That is, lower Wmax and lower additive increase factor
>       do not compensate for the use of a larger MD factor when sudden congestion is
>       encountered.
>
>       #93 & #94 & (#89) Sec 5.3 still does not address any difficult
>       environments, in particular buffer-bloated paths (nor does Sec 5.4).
>       We need evidence (results) that show CUBIC is fair towards other CCs
>       (Reno) also in such environments. Note that CUBIC's decision to leave
>       Reno-friendly region is based on the size of cwnd which tends to be
>       incorrect with buffer-bloated bottlenecks because with huge buffers the
>       cwnd can be many times larger than what is actually needed to fully
>       utilize the available network bit-rate. Therefore, Reno CC has no problem
>       in fully utilizing such bottleneck links and CUBIC must stay in
>       Reno-friendly region longer but it leaves it too early because the same C
>       is used as with non-bloated environments. We lack experiments showing
>       CUBIC follows the congestion control principles and is fair to current
>       standard TCP CC; to my understanding no experiments with buffer-bloated
>       bottlenecks are cited to back up the claims even though buffer bloat is
>       very well known to be a common (difficult) environment in today's
>       Internet.
>
>       #90 The current text on applying undo (a response to detected false
>       fast rexmit) does not provide a correct result if someone implements it.
>       I have explained the problems there in github but seem to have not
>       replied to the latest comments by Neal. I'll reply and try to explain more.
>       Again one major problem here is that the draft suggests a new response
>       algo for false fast rexmits but does not provide any experimental data to
>       support it. Long deployment experience has been suggested as
>       justification but again without any carefully evaluated experimental
>       data and evidence there is no meat. The issue is important to solve but is
>       not specific to CUBIC. Instead, it is a general problem for all TCP CC
>       variants. IMHO, this is not ready for standards track but deserves a draft
>       of its own so that it can be carefully evaluated and discussed on the
>       list. AFAIK there has been no discussion on this on the tcpm list, so
>       those probably interested and having experience are likely to be unaware
>       that this is part of the CUBIC draft.
>
>       #88 The problem with correctness of the AIMD model and setting alpha for
>       CUBIC requires further consideration. Bob provided an analysis that
>       leaves things still open. It seems that I never had time to review and
>       comment on the analysis and clarify why the model does not work. I'll do that
>       separately as it is important to ensure CUBIC behaves fairly as intended
>       for the Reno-friendly region.
>
>       Best regards,
>
>       /Markku
>
>       On Mon, 31 Jan 2022, Yoshifumi Nishida wrote:
>
>       > Hello,
>       >
>       > After some discussions among chairs, we decided to run the 2nd WGLC on
>       > draft-ietf-tcpm-rfc8312bis in consideration of the importance of the draft.
>       > We'll be grateful if you could send your feedback to the ML. The WGLC runs
>       > until *Feb 11*.
>       >
>       > If interested, you can check in-depth past discussions in the following URL.
>       > https://github.com/NTAP/rfc8312bis/
>       >
>       > Thank you so much!
>       > --
>       > tcpm co-chairs
>       >
>       >
>       > On Wed, Jan 26, 2022 at 2:50 AM Lars Eggert <lars@eggert.org> wrote:
>       >       Hi,
>       >
>       >       this -06 version rolls in all the changes requested during (and after)
>       >       WGLC ended.
>       >
>       >       I'll leave it up to the chairs to decide if another WGLC is warranted
>       >       or the document can progress as-is.
>       >
>       >       Thanks,
>       >       Lars
>       >
>       >
>       >       > On 2022-1-26, at 11:12, internet-drafts@ietf.org wrote:
>       >       >
>       >       >
>       >       > A New Internet-Draft is available from the on-line Internet-Drafts
>       >       > directories.
>       >       > This draft is a work item of the TCP Maintenance and Minor
>       >       > Extensions WG of the IETF.
>       >       >
>       >       >        Title           : CUBIC for Fast and Long-Distance Networks
>       >       >        Authors         : Lisong Xu
>       >       >                          Sangtae Ha
>       >       >                          Injong Rhee
>       >       >                          Vidhi Goel
>       >       >                          Lars Eggert
>       >       >       Filename        : draft-ietf-tcpm-rfc8312bis-06.txt
>       >       >       Pages           : 35
>       >       >       Date            : 2022-01-26
>       >       >
>       >       > Abstract:
>       >       >   CUBIC is a standard TCP congestion control algorithm that uses a
>       >       >   cubic function instead of a linear congestion window increase
>       >       >   function to improve scalability and stability over fast and long-
>       >       >   distance networks.  CUBIC has been adopted as the default TCP
>       >       >   congestion control algorithm by the Linux, Windows, and Apple
>       >       >   stacks.
>       >       >
>       >       >   This document updates the specification of CUBIC to include
>       >       >   algorithmic improvements based on these implementations and recent
>       >       >   academic work.  Based on the extensive deployment experience with
>       >       >   CUBIC, it also moves the specification to the Standards Track,
>       >       >   obsoleting RFC 8312.  This also requires updating RFC 5681, to
>       >       >   allow for CUBIC's occasionally more aggressive sending behavior.
>       >       >
>       >       >
>       >       > The IETF datatracker status page for this draft is:
>       >       > https://datatracker.ietf.org/doc/draft-ietf-tcpm-rfc8312bis/
>       >       >
>       >       > There is also an HTML version available at:
>       >       > https://www.ietf.org/archive/id/draft-ietf-tcpm-rfc8312bis-06.html
>       >       >
>       >       > A diff from the previous version is available at:
>       >       > https://www.ietf.org/rfcdiff?url2=draft-ietf-tcpm-rfc8312bis-06
>       >       >
>       >       >
>       >       > Internet-Drafts are also available by rsync at
>       >       > rsync.ietf.org::internet-drafts
>       >       >
>       >       >
>       >       > _______________________________________________
>       >       > tcpm mailing list
>       >       > tcpm@ietf.org
>       >       > https://www.ietf.org/mailman/listinfo/tcpm