Re: [tcpm] CUBIC rfc8312bis / WGLC Issue 2
Yoshifumi Nishida <nsd.ietf@gmail.com> Tue, 19 July 2022 09:37 UTC
From: Yoshifumi Nishida <nsd.ietf@gmail.com>
Date: Tue, 19 Jul 2022 02:37:10 -0700
Message-ID: <CAAK044TTg1p8ebJ9yd7uEES+KQskVFYw=wHimj9qrSJXDTASUA@mail.gmail.com>
To: Neal Cardwell <ncardwell@google.com>
Cc: Vidhi Goel <vidhi_goel=40apple.com@dmarc.ietf.org>, Markku Kojo <kojo=40cs.helsinki.fi@dmarc.ietf.org>, "tcpm@ietf.org Extensions" <tcpm@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tcpm/FXQXoVOPiTQh13ORQbgghBTNk7Y>
Subject: Re: [tcpm] CUBIC rfc8312bis / WGLC Issue 2
Hi folks,

I think I understand this issue, but I'm personally not sure how bad it
is. It looks like a rather pathological case to me, and I don't think it
can cause congestion collapse, as this is still a multiplicative decrease.
It seems to me that this is a kind of shooting oneself in the foot, a
suboptimal case.

However, there are some advantages in the current logic. I'm not sure we
should sacrifice better results to address some rare cases. I think we
will need more analysis of the pros and cons.

Thanks,
--
Yoshi

On Wed, Jul 13, 2022 at 7:17 AM Neal Cardwell <ncardwell@google.com> wrote:

> Hi Markku and TCPMers,
>
> My understanding of Markku's concern here is that in slow start the cwnd
> can continue to grow in response to ACKs after the lost packet was sent, so
> that the cwnd is often twice the level of in-flight data at which the loss
> happened, by the time the loss is detected. So the cwnd ends up at 2 * 0.7
> = 1.4x the level at which losses happened, which causes an unnecessary
> follow-on round with losses, in order to again cut the cwnd, this time
> to 1.4 * 0.7 = 0.98x of the level that causes losses, which is likely to
> finally fit in the network path.
>
> However, there are two technical issues with this concern, as expressed in
> the proposed draft text in this thread:
>
> (1) The analysis for slow-start is not correct for the very common case
> where the flow is application-limited in slow-start, in which case the cwnd
> would not grow at all between the packet loss and the time the loss is
> detected. So the text is needlessly strict in this case.
>
> (2) For CUBIC the problematic dynamic (of cwnd growth between loss and
> loss detection exceeding the multiplicative decrease) can also occur
> outside of slow-start, in congestion avoidance. The CUBIC cwnd growth in
> congestion avoidance can be up to 1.5x per round trip.
> So after a packet loss the cwnd could grow by 1.5x before loss detection
> and then be cut in response to loss by 0.7, causing the ultimate cwnd to
> be 1.5 * 0.7 = 1.05x the volume of in-flight data at the time of the
> packet loss. This would likely cause an unnecessary follow-on round of
> packet loss due to failing to cut cwnd below the level that caused loss.
> So the problem is actually wider than slow-start.
>
> AFAICT a complete/general fix for this issue is to record the volume of
> inflight data at the point of each packet transmission, and then use
> that metric as the baseline for the multiplicative decrease when packet
> loss is detected, rather than using the current cwnd as the baseline.
> This is the approach that BBRv2 uses. Perhaps there are other, simpler
> approaches as well.
>
> I also agree with Vidhi's concern, that a change to the multiplicative
> decrease changes the algorithm substantially. To ensure that the draft/RFC
> is not recommending something that has unforeseen significant negative
> consequences, we shouldn't make such a significant change to the text until
> we get experience w/ the new variation.
>
> best regards,
> neal
>
>
> On Tue, Jul 12, 2022 at 6:08 PM Vidhi Goel
> <vidhi_goel=40apple.com@dmarc.ietf.org> wrote:
>
>> Hi Markku,
>>
>> I emailed about this to other co-authors and we think that this change is
>> completely untested for Cubic and that it could be considered for a
>> future version of Cubic, not the current rfc8312bis.
>> To change Beta from 0.7 to 0.5 during slow-start, we would at least need
>> some experience either from lab testing or deployment since all current
>> deployments of Cubic for both TCP and QUIC use 0.7 as Beta during slow
>> start. Since a lot of implementations currently use hystart(++) along with
>> Cubic, we don’t see any high risk of overaggressive sending rate and that
>> is what the current rfc8312bis suggests as well.
>> In fact, changing Beta from 0.7 to 0.5 can still be aggressive without
>> using hystart.
>>
>> Thanks,
>> Vidhi
>>
>> > On Jul 11, 2022, at 5:55 PM, Markku Kojo
>> > <kojo=40cs.helsinki.fi@dmarc.ietf.org> wrote:
>> >
>> > Hi all,
>> >
>> > below please find proposed text to solve the Issue 2 a). I will propose
>> > text to solve 2 b) once we have come to conclusion with 2 a). For
>> > description and arguments for issues 2 a) and 2 b), please see the
>> > original issue descriptions below.
>> >
>> > Sec 4.6. Multiplicative Decrease
>> >
>> > Old:
>> > The parameter Beta__cubic_ SHOULD be set to 0.7, which is different
>> > from the multiplicative decrease factor used in [RFC5681] (and
>> > [RFC6675]) during fast recovery.
>> >
>> >
>> > New:
>> > If the sender is not in slow start when the congestion event is
>> > detected, the parameter Beta__cubic_ SHOULD be set to 0.7, which
>> > is different from the multiplicative decrease factor used in
>> > [RFC5681] (and [RFC6675]).
>> > This change is justified in the Reno-friendly region during
>> > congestion avoidance because a CUBIC sender compensates the higher
>> > multiplicative decrease factor than that of Reno by applying
>> > a lower additive increase factor during congestion avoidance.
>> >
>> > However, if the sender is in slow start when the congestion event is
>> > detected, the parameter Beta__cubic_ MUST be set to 0.5 [Jacob88].
>> > This results in the sender continuing to transmit data at the maximum
>> > rate that the slow start determined to be available for the flow.
>> > Using Beta__cubic_ with a value larger than 0.5 when the congestion
>> > event is detected in slow start would result in an overaggressive send
>> > rate where the sender injects excess packets into the network and
>> > each such packet is guaranteed to be dropped or force a packet from
>> > a competing flow to be dropped at a tail-drop bottleneck router.
>> > Furthermore, injecting such undelivered packets creates a danger of
>> > congestion collapse (of some degree) "by delivering packets through
>> > the network that are dropped before reaching their ultimate
>> > destination." [RFC 2914]
>> >
>> >
>> > [Jacob88] V. Jacobson, Congestion avoidance and control, SIGCOMM '88.
>> >
>> > Thanks,
>> >
>> > /Markku
>> >
>> > On Tue, 14 Jun 2022, Markku Kojo wrote:
>> >
>> >> Hi all,
>> >>
>> >> this thread starts the discussion on the issue 2: CUBIC is specified
>> >> to use an incorrect multiplicative-decrease factor for a congestion
>> >> event that occurs when operating in slow start. And, applying
>> >> HyStart++ does not remove the problem, it only mitigates it in some
>> >> percentage of cases.
>> >>
>> >> I think it is useful to discuss this in two phases: 2 a) and 2 b)
>> >> below.
>> >> For anyone commenting/arguing on the part 2 b), it is important to
>> >> first acknowledge whether (s)he thinks the original design and logic
>> >> by Van Jacobson is correct. If not, one should explain why Van's
>> >> design logic is incorrect.
>> >>
>> >> Issue 2 a)
>> >> ----------
>> >>
>> >> To begin with, let's put aside a potential use of HyStart++ (also
>> >> assume a tail-drop router unless otherwise mentioned).
>> >>
>> >> The use of an MD factor larger than 0.5 is against the theory and
>> >> original design by Van Jacobson as explained in the congavoid paper
>> >> [Jacob88]. Any MD factor value larger than 0.5 will result in sending
>> >> extra packets during Fast Recovery following the congestion event
>> >> (drop). All extra packets will be dropped at a tail-drop bottleneck
>> >> (if a lonely flow).
>> >>
>> >> Note that at the time when the drop becomes signalled at the TCP
>> >> sender, the size of the cwnd is double the available network capacity
>> >> that slow start determined for the flow. That is, using MD=0.5 is
>> >> already as aggressive as possible, leaving no slack.
>> >> Therefore, if MD=0.7 is used, the TCP sender enters fast recovery
>> >> with cwnd that is 40% larger than the determined network capacity and
>> >> all excess packets are guaranteed to be dropped, or even worse, the
>> >> excess packets are likely to force packets of competing flows to be
>> >> unfairly dropped.
>> >>
>> >> Moreover, if NewReno loss recovery is in use, a CUBIC sender will
>> >> operate overaggressively for a very long time. For example, if the
>> >> available network capacity for the flow is 100 packets, cwnd will have
>> >> value 200 when the congestion is signalled and the CUBIC sender enters
>> >> fast recovery with cwnd=140 and injects 40 excess packets for each of
>> >> the subsequent 100 RTTs it stays in fast recovery, forcing 4000
>> >> packets to be inevitably and totally unnecessarily dropped.
>> >>
>> >> Even worse, this behaviour of sending 'undelivered packets' is against
>> >> the congestion control principles as it creates a danger of congestion
>> >> collapse (of some degree) "by delivering packets through the network
>> >> that are dropped before reaching their ultimate destination."
>> >> [RFC 2914]
>> >>
>> >> Such undelivered packets unnecessarily eat capacity from other flows
>> >> sharing the path before the bottleneck.
>> >>
>> >> RFC 2914 emphasises:
>> >>
>> >> "This is probably the largest unresolved danger with respect to
>> >> congestion collapse in the Internet today."
>> >>
>> >> It is very easy to envision a realistic network setup where this
>> >> creates a degree of congestion collapse where a notable portion of
>> >> useful network capacity is wasted due to the undelivered packets.
>> >>
>> >>
>> >> [Jacob88] V. Jacobson, Congestion avoidance and control, SIGCOMM '88.
>> >>
>> >>
>> >> Issue 2 b)
>> >> ----------
>> >>
>> >> The CUBIC draft suggests that HyStart++ should be used *everywhere*
>> >> instead of the traditional Slow Start (see section 4.10).
>> >>
>> >> Although the draft does not say it, seemingly the authors suggest
>> >> using HyStart++ instead of traditional Slow Start in order to avoid
>> >> the problem of over-aggressive behaviour discussed above. This,
>> >> however, has several issues.
>> >>
>> >> First, it is directly in conflict with the HyStart++ specification,
>> >> which says that HyStart++ should be used only for the initial Slow
>> >> Start. However, the overaggressive behaviour after slow start is also
>> >> a potential problem with slow start during an RTO recovery; in case of
>> >> sudden congestion that reduces available capacity for a flow down to a
>> >> fraction of the currently available capacity, it is very likely that
>> >> an RTO occurs. In such a case the RTO recovery in slow start
>> >> inevitably overshoots and it is crucial for all flows not to be
>> >> overaggressive.
>> >>
>> >> Second, the experimental results for initial slow start in the
>> >> HyStart++ draft suggest that while HyStart++ achieves good results,
>> >> it is unable to exit slow start early and avoid overshoot in a
>> >> significant percentage of cases.
>> >>
>> >> Given the above issues, the CUBIC draft must require that MD of 0.5 is
>> >> used when the congestion event occurs while the sender is (still) in
>> >> slow start. The use of MD=0.5 is an obvious stumble in the original
>> >> CUBIC and the original CUBIC authors have already acknowledged this.
>> >> It seems also obvious that instead of correcting the actual problem
>> >> (use of MD other than 0.5), HyStart and HyStart++ have been proposed
>> >> to address the design mistake. While HyStart++ is a useful method also
>> >> when used with MD=0.5, when used alone it only mitigates the impact of
>> >> the actual problem rather than solves the problem.
>> >>
>> >> What should be done for the cases where HyStart++ exits slow start but
>> >> is not able to avoid (some level of) overshoot and dropped packets is
>> >> IMO an open issue.
>> >> Resolving it requires additional experiments and it should be
>> >> resolved separately when we have more data. For now, when we do not
>> >> have enough data and understanding of the behaviour, we should IMO
>> >> follow the general IETF guideline "be conservative in what you send"
>> >> and specify that MD = 0.5 should be used for a congestion event that
>> >> occurs for a packet sent in slow start.
>> >>
>> >> Thanks,
>> >>
>> >> /Markku
>> >>
>> >
>> > _______________________________________________
>> > tcpm mailing list
>> > tcpm@ietf.org
>> > https://www.ietf.org/mailman/listinfo/tcpm
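
The arithmetic quoted in this thread (2 * 0.7 = 1.4x, 1.5 * 0.7 = 1.05x, and
the 100-packet NewReno example) can be checked with a short Python sketch.
It is an illustration only, under simplifying assumptions that are not in
the original messages: cwnd is counted in whole packets, a single flow
shares a tail-drop bottleneck, and cwnd does not grow between successive
decreases. All function names are hypothetical. The last function merely
mirrors the idea Neal describes, taking the in-flight volume recorded at
transmission time as the baseline for the decrease; it is not BBRv2 code.

```python
BETA_CUBIC = 0.7  # decrease factor recommended in the draft text quoted above
BETA_HALF = 0.5   # decrease factor proposed for losses detected in slow start


def post_loss_ratio(growth: float, beta: float) -> float:
    """cwnd after one decrease, relative to the in-flight level at the loss.

    growth is the factor by which cwnd grew between sending the lost
    packet and detecting the loss (2.0 in slow start, 1.0 if app-limited).
    """
    return growth * beta


def decreases_until_fit(growth: float, beta: float) -> int:
    """Count decreases until cwnd is at or below the level that caused loss.

    Assumption for illustration: cwnd does not grow between decreases.
    """
    if not 0.0 < beta < 1.0:
        raise ValueError("beta must be strictly between 0 and 1")
    ratio = growth * beta
    count = 1
    while ratio > 1.0:
        ratio *= beta
        count += 1
    return count


def excess_packets(capacity: int, beta: float, recovery_rtts: int) -> int:
    """Packets sent beyond capacity during recovery, per Markku's example.

    Assumes cwnd is twice the capacity when the loss is detected and that
    the sender keeps the reduced cwnd for recovery_rtts round trips.
    """
    cwnd_at_loss = 2 * capacity
    cwnd_in_recovery = round(cwnd_at_loss * beta)
    return max(0, cwnd_in_recovery - capacity) * recovery_rtts


def decrease_from_inflight(inflight_at_send: float, beta: float) -> float:
    """Hypothetical variant: decrease relative to in-flight data at send time."""
    return inflight_at_send * beta


if __name__ == "__main__":
    print(f"{post_loss_ratio(2.0, BETA_CUBIC):.2f}")   # slow start: 1.40
    print(f"{post_loss_ratio(1.5, BETA_CUBIC):.2f}")   # congestion avoidance: 1.05
    print(decreases_until_fit(2.0, BETA_CUBIC))        # 2 (1.4x, then 0.98x)
    print(decreases_until_fit(2.0, BETA_HALF))         # 1 (exactly 1.0x, no slack)
    print(excess_packets(100, BETA_CUBIC, 100))        # 4000
    print(excess_packets(100, BETA_HALF, 100))         # 0
```

With a growth factor of 1.0 (the application-limited case in Neal's point
(1)), post_loss_ratio returns 0.7, so a single decrease already fits under
this model. With the in-flight baseline, a loss at 100 packets in flight
gives a cwnd of about 70 packets however far cwnd grew before detection.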