Re: [tcpm] Concluding WGLC for draft-ietf-tcpm-rfc8312bis-03

Yoshifumi Nishida <nsd.ietf@gmail.com> Sun, 05 September 2021 10:47 UTC

References: <CAAK044SjMmBnO8xdn2ogWMZTcecXoET1dmZqd6Dt3WzOUi359A@mail.gmail.com> <alpine.DEB.2.21.2108300740560.5845@hp8x-60.cs.helsinki.fi>
In-Reply-To: <alpine.DEB.2.21.2108300740560.5845@hp8x-60.cs.helsinki.fi>
From: Yoshifumi Nishida <nsd.ietf@gmail.com>
Date: Sun, 5 Sep 2021 03:47:02 -0700
Message-ID: <CAAK044SA=MFTUmZktDWA05ijg4QuAG1E_niSSc3a5v=-6zdg4A@mail.gmail.com>
To: Markku Kojo <kojo@cs.helsinki.fi>
Cc: "tcpm@ietf.org Extensions" <tcpm@ietf.org>
Content-Type: multipart/alternative; boundary="00000000000055ee1b05cb3d41f8"
Archived-At: <https://mailarchive.ietf.org/arch/msg/tcpm/GqRiEgcT0VkBpn_3OUzF1-sxYQ8>
Subject: Re: [tcpm] Concluding WGLC for draft-ietf-tcpm-rfc8312bis-03

Hi Markku,

Thanks for the detailed review. These are indeed very good points. Thank
you so much!
Personally, I would like to discuss point 4 on the ML, as it seems a bigger
and more fundamental point than the others.
I believe the equations for the AIMD TCP calculation came from "A Comparison
of Equation-Based and AIMD Congestion Control", and it seems to me your point
is that the draft doesn't use the equations properly.
However, I'm not yet sure in what way they are used incorrectly.
Could you elaborate on this a bit more?

Thanks,
--
Yoshi

On Mon, Aug 30, 2021 at 9:33 AM Markku Kojo <kojo@cs.helsinki.fi> wrote:

> Hi Yoshi, all,
>
> On Wed, 18 Aug 2021, Yoshifumi Nishida wrote:
>
> > Hi, The chairs think the 8312bis draft is in good shape and is ready for
> > submission.
> > We know we had some discussions on how to handle the ABC draft in the
> > draft during the last WG meeting and there are ongoing related
> > discussions on the ML.
> > However, we still think we can proceed with the draft as the relevance
> > between the draft and the current discussions is not very high.
> >
> > If anyone has different thoughts on this, please let us know.
> > Otherwise, we will prepare a writeup soon and submit it to IESG.
>
> My sincere apologies as this comes very late in the process, but I was not
> able to follow the IETF mailing list at the time of WGLC nor during
> the recent weeks, and I couldn't attend IETF 111.
>
> I took a look at the draft and some of its statements seem to be in
> conflict with other Standards Track RFCs. I believe the tcpm
> wg should look more closely at these to avoid publishing conflicting
> guidance in different Standards Track RFCs. In addition, there are some
> issues I am concerned about.
>
> 1. ECN
>
> a) When ECE arrives and would result in cwnd < 2 MSS, the draft
>     modifies RFC 3168 by setting a lower bound of 2 MSS for cwnd (only
>     ssthresh is supposed to have a lower bound of 2 MSS).
>     This is in conflict with RFC 3168, RFC 5033, and RFC 2914 which
>     require "full backoff", that is, a sender must continue decreasing
>     its sending rate as long as congestion persists. This is a fundamental
>     property for any congestion control mechanism. For ECN, RFC 3168
>     (sec 6.1.2) requires that cwnd is halved until the minimum cwnd
>     of one MSS is reached, and then the sender continues reducing
>     the sending rate by using a timer with exponential backoff, if more
>     ECN-Echo packets keep on arriving.
>
>     This implementation bug has long been present in Linux and in
>     other stacks as well and should get corrected ASAP with
>     appropriate advice in all published RFCs, instead of replicating
>     the bug in the RFC series.
>
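
As a rough illustration of point 1a), the effect of the two lower bounds can
be sketched as follows. This is a toy model, not text from the draft or from
RFC 3168: it assumes one cwnd halving per RTT in which ECE arrives, the
function name cwnd_after_ece is hypothetical, and the timer-based backoff
that applies once cwnd has reached one MSS is not modelled.

```python
def cwnd_after_ece(cwnd_mss: float, marked_rtts: int, floor_mss: float) -> float:
    """Halve cwnd once per RTT in which ECE arrives, never below floor_mss.

    Toy model: all values are in units of MSS. floor_mss is the assumed
    lower bound on cwnd (1 for the RFC 3168 behaviour described above,
    2 for the lower bound the review attributes to the draft).
    """
    for _ in range(marked_rtts):
        cwnd_mss = max(cwnd_mss / 2.0, floor_mss)
    return cwnd_mss


if __name__ == "__main__":
    for floor in (1.0, 2.0):
        final = cwnd_after_ece(16.0, marked_rtts=6, floor_mss=floor)
        print(f"floor={floor:.0f} MSS -> cwnd after 6 marked RTTs: {final:.0f} MSS")
```

In this model, N ECN-capable flows with a 2 MSS floor cannot take their
aggregate window below 2*N MSS, however long the marking persists.
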
> b) RFC 8311 (sec 4.1) allows modifying the TCP-sender response to
>     ECE for experimental purposes only. Has there been any discussion
>     with tsvwg about the fact that modifying the TCP response to ECE in
>     CUBIC is in conflict with RFC 8311, as CUBIC is currently intended to
>     become a Standards Track RFC?
>
> c) ABE (RFC 8511) is currently the only experimental RFC to modify
>     the TCP-sender response to ECE. ABE allows modifying multiplicative
>     decrease factor only for AIMD TCP and only when ECE arrives in
>     congestion avoidance, that is, not when the sender is in slow-start.
>
>     Applying a decrease factor of 0.7 (or higher) when a congestion
>     signal arrives and ends the initial slow start would be
>     ill-advised because it extends the convergence time from
>     the slow-start overshoot. ABE has found that using a larger decrease
>     factor yields performance improvement when applied in congestion
>     avoidance, but not otherwise. Do we have data that would support
>     different findings with CUBIC?
>
> 2. Slow-Start Overshoot w/ loss-based congestion control
>
>     The larger decrease factor of 0.7 also seems inadvisable if
>     used in the initial slow start with loss-based congestion
>     control (w/ Not-ECT traffic); packets start getting dropped
>     when a TCP sender has increased cwnd in slow start such that
>     the available network bandwidth and buffering capacity at the
>     bottleneck is filled, but the TCP sender continues sending
>     more packets for one RTT doubling cwnd and hence also the number
>     of packets inflight before the congestion signal reaches the sender.
>     Now, even if the sender uses the standard decrease factor of 0.5,
>     the cwnd gets reduced only to a value equal to the cwnd just
>     before (or around) the congestion point. That is, the network is
>     still full when the sender enters fast recovery but we do not
>     expect more drops during fast recovery in a deterministic model.
>     Only in congestion avoidance after the recovery, the sender
>     increases cwnd again and gets a packet drop that takes the
>     sender to a normal sawtooth cycle in an ideal case. So, the
>     convergence time from slow-start is expected to be fast, though
>     in reality loss recovery does not always work ideally with
>     so many drops in a window of data.
>
>     However, if the sender applies a decrease factor of 0.7, it
>     continues in fast recovery with a cwnd 40% higher than
>     the available network capacity. This is very likely to result in
>     a significant number of packet losses during fast recovery, and
>     very likely to result in loss of retransmissions. So, it is no
>     wonder that so many people have been very concerned about the
>     slow-start overshoot and the problems it creates.
>     It is very obvious that applying a decrease factor of 0.7 in
>     the initial slow start is likely to extend the convergence
>     time from the slow-start overshoot significantly. Or, do we
>     have data that shows that such concern is unnecessary?
>     Also, a number of new loss-recovery mechanisms have been
>     introduced, maybe mainly because of this?
>     I would hesitate to recommend a decrease factor of 0.7 when
>     a congestion event occurs during the initial slow start.
>
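
The arithmetic behind the 40% figure in point 2 can be sketched as below.
This is a deterministic toy model built only from the description above:
drops begin when cwnd reaches the path capacity (bandwidth-delay product
plus bottleneck buffer), and the sender doubles cwnd for one more RTT
before the congestion signal arrives. The function names and the example
capacity of 100 packets are assumptions made for illustration.

```python
def cwnd_after_overshoot(capacity_pkts: float, beta: float) -> float:
    """cwnd right after the multiplicative decrease that ends slow start.

    Assumes cwnd has doubled to 2 * capacity_pkts by the time the loss
    is detected, then is multiplied by the decrease factor beta.
    """
    cwnd_at_detection = 2.0 * capacity_pkts
    return beta * cwnd_at_detection


def excess_over_capacity(capacity_pkts: float, beta: float) -> float:
    """Fraction by which the reduced cwnd still exceeds the capacity."""
    return cwnd_after_overshoot(capacity_pkts, beta) / capacity_pkts - 1.0


if __name__ == "__main__":
    for beta in (0.5, 0.7):
        excess = excess_over_capacity(100.0, beta)
        print(f"beta={beta}: cwnd exceeds capacity by {excess:.0%}")
```

Under these assumptions the excess depends only on beta (it equals
2*beta - 1), not on the capacity chosen for the example.
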
>
> 3. RACK (and QUIC)
>
>     The draft states that RACK (and QUIC loss detection) can be used
>     with CUBIC to detect losses. However, it seems to have gone
>     unnoticed that RACK may also detect loss of a retransmission in
>     which case the congestion control response is required to be taken
>     twice, i.e., ssthresh and cwnd must be lowered again (MUST in
>     RFC 5681 Sec. 4.3). Once RACK got published all new congestion
>     controls and updates to existing RFCs must include this essential
>     congestion control response, if the congestion control mechanism
>     intends to use RACK for loss detection.
>
>     This draft does not have any such requirement nor does it specify
>     how this is done?
>
> 4. Fairness to AIMD congestion control
>
>     The equation on page 12 to derive increase factor α_cubic that
>     intends to achieve the same average window as AIMD TCP seems to
>     have its origins in a preliminary paper that states that the
>     authors do not have an explanation to the discrepancy between
>     their AIMD model and experimental results, which clearly deviate.
>     It seems to have gone unnoticed that the equation assumes equal
>     drop probability for the different values of the increase factor
>     and multiplicative decrease factor but the drop probability
>     changes when these factors change. The equations for the drop
>     probability / the # of packets in one congestion epoch
>     are available in the original paper and one can easily verify
>     this. Therefore, the equations used in CUBIC are not correct
>     and seem to underestimate _W_est_ for AIMD TCP, resulting in
>     moving away from AIMD-Friendly region too early. This gives
>     CUBIC an unjustified advantage over AIMD TCP, particularly in
>     environments with a low level of statistical multiplexing. With
>     a high level of multiplexing, the drop probability goes higher and
>     differences in the drop probabilities tend to get small. On the
>     other hand, with such high level of competition, the theoretical
>     equations may not be that valid anymore.
>
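
For reference, the relation under discussion in point 4 can be sketched as
below, assuming the draft's equation is the one given in RFC 8312:
AVG_W_aimd = sqrt(alpha * (1 + beta) / (2 * (1 - beta) * p)) and
alpha_cubic = 3 * (1 - beta_cubic) / (1 + beta_cubic). The function names
are hypothetical. Note that the sketch uses the same drop probability p for
both parameter sets, which is exactly the assumption questioned above; it
does not model how p itself changes with alpha and beta.

```python
import math


def aimd_avg_window(alpha: float, beta: float, p: float) -> float:
    """Average window (in segments) of AIMD(alpha, beta) at drop probability p."""
    return math.sqrt(alpha * (1.0 + beta) / (2.0 * (1.0 - beta) * p))


def alpha_matching_reno(beta: float) -> float:
    """Increase factor that equals the Reno (alpha=1, beta=0.5) average window,
    under the assumption that p is identical for both parameter sets."""
    return 3.0 * (1.0 - beta) / (1.0 + beta)


if __name__ == "__main__":
    beta_cubic = 0.7
    alpha_cubic = alpha_matching_reno(beta_cubic)
    p = 0.01  # arbitrary example value
    print(f"alpha_cubic = {alpha_cubic:.3f}")
    print(f"AIMD(1, 0.5) average window:       {aimd_avg_window(1.0, 0.5, p):.2f}")
    print(f"AIMD(alpha_cubic, 0.7) avg window: {aimd_avg_window(alpha_cubic, beta_cubic, p):.2f}")
```

The two average windows coincide only because p is held fixed; if the drop
probability differs between the two parameter sets, as argued in point 4,
this equality no longer follows from the formula.
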
>
> 5. Contribution to buffer bloat and slower convergence due to
>     larger decrease factor
>
>     This draft uses a larger cwnd decrease factor, resulting in a larger
>     average cwnd and buffer occupancy. This means that it is
>     likely to contribute significantly to buffer bloat, particularly
>     when also considering the use of a concave increase function at the
>     beginning of congestion avoidance, which keeps the cwnd close
>     to the maximum most of the time, as carefully explained in the draft.
>     This means that CUBIC also keeps buffer-bloated router queues
>     very efficiently full at all times.
>
>     Currently the draft does mention the slower convergence speed
>     as the only side effect for the larger decrease factor and does
>     not discuss the contribution to buffer bloat. It would be
>     important to assess this together with measurement data to
>     back up any observations.
>
>     Do we have data in different environments, including buffer-bloated
>     environments that show how much effect CUBIC has compared to
>     AIMD TCP?
>     And how does the larger decrease factor impact convergence speed,
>     particularly in buffer-bloated environments?
>     Many people have complained that window-based (TCP) congestion
>     control drives buffer bloat. Of course, the current standard
>     AIMD TCP also tends to fill the buffer-bloated queues, but it is
>     unlikely to do so as effectively as CUBIC. This would be good to
>     understand better.
>
> 6. Citing Experimental RFCs as if being a part of CUBIC
>
>     The draft says that CUBIC MAY implement DSACK [RFC3708], limited slow
>     start [RFC3742], [RFC7661] and hybrid slow start [cites a paper].
>     Aren't the first three down references? Not sure if it is appropriate
>     for a Stds Track document to cite experimental work or a paper like
>     this even though it's a MAY.
>
>
> 7. Discussion
>
> I regret to say that the discussion in Sec 5 brings up surprisingly
> little data to back up the claims that are made. Despite the long
> deployment experience that is emphasised in the draft, there is
> little evidence (measurement data) summarised and cited to back up the
> claims. "There is a long deployment experience" does not provide
> any evidence as such. There should be a lot of studies with measurement
> data accumulated over the years that would support the assertions in the
> doc. Or are there?
>
> Sec 5.1
>
> In this subsection, one should show the impact of CUBIC when
> competing with AIMD TCP. The numbers in the tables are derived from
> analytical models that give the average window size with fixed random
> loss probabilities and unlimited bandwidth. That is not the same as when
> flows are competing at the same congested bottleneck that builds a queue.
> Loss probabilities for different flows are likely to be different
> especially at lower levels of statistical multiplexing.
>
> The first para of sec 5.1 does not sound true. Simply looking at the
> original CUBIC paper [HRX08] reveals that CUBIC dominates AIMD TCP (SACK
> TCP) in the regions where SACK TCP alone is able to fully utilize the
> available bandwidth (Figure 10 c up until 200 Mbps, and to some extent in
> Fig 10 a with 40 ms delay). And in all cases where SACK TCP alone is not
> able to utilize all available b/w, CUBIC steals multiple times more b/w
> from SACK TCP than what SACK TCP is not able to utilize. Figures 5 and 6
> tell the same story. Has something changed and/or is there possibly data
> that provides alternative evidence?
>
> In addition, the recommended value for constant C and the two alternative
> values presented in the draft are the same as in the original paper. It
> would be interesting to see if there has been any experimentation with
> different values and what might be the outcome?
>
> Sec 5.3
>
> Any experimental data to summarize and cite?
>
> Sec 5.4
>
> The text correctly states that CUBIC fills queues faster than AIMD TCP
> and increases the risk of standing queues. Then it proposes queue sizing
> and AQM as a solution, which is odd. Applying AQM to keep the queues
> shorter of course decreases the RTT (delay) seen but it does not help
> with standing queues (they remain standing but are just shorter).
>
> Sec 5.5
>
> Setting a lower bound of 2 MSS for cwnd with ECN may result in symptoms of
> congestion collapse under certain specific conditions, e.g., if the actual
> (physical) queue size is very large and there is a mix of ECN-capable
> (ECT) and not ECN-capable flows. When the number of ECN-capable flows
> increases, they start starving the not ECN-capable flows, as ECT flows stop
> responding to congestion and start increasing the queue such that the AQM
> has to drop almost all not-ECT packets.
>
> Sec 5.6
>
> Competing CUBIC flows will converge but it happens very slowly and
> requires a large amount of data to send, i.e., short flows are
> unlikely to live long enough to converge. This seems to be the case at
> least according to the results in the original paper [HRX08, Fig 4 b].
> Summarizing and citing some performance data would be very useful and much
> more convincing.
>
> Sec 5.8
>
> The MUST NOT requirement would be much better placed with other
> specifications in Sec 4 and would benefit from more accurate description.
>
> Sec 5.9
>
> The statement made here is not convincing and is likely to be incorrect.
> E.g., CUBIC with larger decrease factor would most likely release
> capacity notably slower than AIMD TCP if there is sudden congestion.
>
>
> 8. The draft says in the intro that CUBIC is to be regarded as *current
> standard* for TCP congestion control. It sounds a bit like it would
> obsolete RFC 5681 which is not the intent. RFC 5681 still has its
> specific role as the document that gives the baseline and generic
> guidelines for TCP (and other) congestion control.
> Instead, I think this document should articulate very carefully its role
> among the congestion control algorithms. How, I am not sure. Maybe
> simply as an alternative for RFC5681 congestion avoidance and
> multiplicative decrease.
>
> Please note also that when specifying these algorithms this document is
> in direct conflict with a MUST in RFC5681 which says: "however, a TCP
> MUST NOT be more aggressive than the following algorithms allow (that is,
> MUST NOT send data when the value of cwnd computed by the following
> algorithms would not allow the data to be sent)."
> Therefore, the draft should make this differentiation very clear, maybe
> already in the abstract, and justify the deviations much better than it
> currently does (accompanied with evidence = data). This is very important
> in order to make a convincing case why it is ok for this doc to deviate
> from the current Standards Track TCP normative statements.
>
> Misc comments:
>
> _epoch_start_: needs a more accurate and consistent definition of when
> exactly the epoch starts. Is it when a congestion event occurs or when
> the TCP sender enters congestion avoidance for the first time after a
> congestion event? If it is different in different scenarios, that would
> be good to present systematically.
>
> On many occasions:
>
>   "(upon receiving) an ACK" -> "(upon receiving) a new ACK"
>
> On page 13:
>
>    " the sender MAY employ a Fast
>      Recovery algorithm to gradually adjust the congestion window to its
>      new reduced _ssthresh_ value."
>
> I assume this is aiming at saying that something similar to PRR MAY be
> used to reduce cwnd. This, however, is somewhat vaguely said and using
> "Fast Recovery" here is misleading. We need to remember also that it
> might not be trivial to get it right. So, dunno whether it would be useful
> to drop this.
>
> Best regards,
>
> /Markku