Re: [tcpm] Linux doesn’t implement RFC3465
Yuchung Cheng <ycheng@google.com> Fri, 30 July 2021 23:05 UTC
From: Yuchung Cheng <ycheng@google.com>
Date: Fri, 30 Jul 2021 16:04:52 -0700
Message-ID: <CAK6E8=c8oMX4YSjOssd9+PGZEzFHVNwXSajqro3aJ3vns5SbOg@mail.gmail.com>
To: Vidhi Goel <vidhi_goel=40apple.com@dmarc.ietf.org>
Cc: Mark Allman <mallman@icir.org>, Neal Cardwell <ncardwell@google.com>, Extensions <tcpm@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tcpm/oxoqnrjuXnqjIexqoTCMs81ZKgc>
Subject: Re: [tcpm] Linux doesn’t implement RFC3465
github repo sounds good to me.

On Fri, Jul 30, 2021 at 1:26 PM Vidhi Goel <vidhi_goel=40apple.com@dmarc.ietf.org> wrote:

> Hi Mark, Yuchung, Neal,
>
> Given that we have some ideas / suggestion text on improving 3465, how do
> you think we should proceed?
> Mark, do you want to start a GitHub repo or such with some of the changes
> already suggested so far and others can review and / or contribute to the
> bis draft?
>
> Thanks,
> Vidhi
>
> On Jul 30, 2021, at 11:04 AM, Yuchung Cheng <ycheng=40google.com@dmarc.ietf.org> wrote:
>
> On Thu, Jul 29, 2021 at 6:03 PM Vidhi Goel <vidhi_goel=40apple.com@dmarc.ietf.org> wrote:
>
>>>> Well, perhaps. L=2 was designed to exactly counteract delayed ACKs.
>>>> So, it isn't exactly a new magic number. We could wave our hands
>>>> and say "5 seems OK" or "10 seems OK" or whatever. And, I am sure
>>>> we could come up with something that folks felt was fine. However,
>>>> my feeling is that if we want to worry about bursts then let's worry
>>>> about bursts in some generic way. And, if you have some way to deal
>>>> with bursts then L isn't needed. And, if you don't have a way to
>>>> deal with bursts then a conservative L seems fine. But, perhaps
>>>> putting the effort into a generic mechanism instead of cooking yet
>>>> another magic number we need to periodically refresh is probably a
>>>> better way to spend effort.
>>>>
>>>
>>> Yes, I very much agree that "putting the effort into a generic mechanism
>>> instead of cooking yet another magic number we need to periodically refresh
>>> is probably a better way to spend effort."
>>>
>>> I agree that defining such a number doesn’t fully solve the problem but
>>> it gives some recommendation for implementations that don’t do pacing. So,
>>> defining a somewhat less restrictive value for L (5 or 10) would be a last
>>> resort for implementations that don’t pace.
>>>
>> How about putting a number 10, and also put all the rationales to follow
>> to decide a higher or lower value. It's never one-size-fits-all.
>>
>> That sounds great. Something along the lines of,
>>
>> "This document RECOMMENDS using mechanisms like Pacing to control how
>> many bytes are sent to the network at a given point in time. But if it is
>> not possible to implement pacing, an implementation MAY implicitly pace its
>> traffic by applying a limit L to the increase in congestion window per ACK
>> during slow start. In modern stacks, acknowledgments are aggregated for
>> various reasons (CPU optimization, reducing network load, etc.). Hence it is
>> common for a sender to receive an aggregated ACK that acknowledges more
>> than 2 segments. For example, a stack that implements GRO could aggregate
>> packets up to 64Kbytes or ~44 segments before passing on to the TCP layer
>> and this would result in a single ACK to be generated by the TCP stack.
>> Given that an initial window of 10 packets in current deployments has
>> been working fine, the draft makes a recommendation to set L=10 during slow
>> start. This would mean that with every ACK, we are probing for a new
>> capacity by sending 10 packets in addition to the previously discovered
>> capacity. Implementations MAY choose to set a lower limit if they believe
>> an increase of 10 is too aggressive."
>>
>> Does this sound like what we would like to say?
>>
> Thanks for taking a shot. I would put more description on Pacing to ensure
> better implementation. How about:
> "Pacing here refers to spreading packet transmissions following a rate based
> on the congestion window and round trip." with a citation of
> https://datatracker.ietf.org/doc/html/rfc7661#section-4.4.2
>
> I would also refer to IW RFC 6928 in case it gets increased / updated a
> few years later.
> Hmm maybe we should also move RFC6928 to the standards track :-)
>
>> -
>> Vidhi
>>
>> On Jul 29, 2021, at 1:47 PM, Yuchung Cheng <ycheng=40google.com@dmarc.ietf.org> wrote:
>>
>> On Thu, Jul 29, 2021 at 1:19 PM Vidhi Goel <vidhi_goel=40apple.com@dmarc.ietf.org> wrote:
>>
>>>> Well, perhaps. L=2 was designed to exactly counteract delayed ACKs.
>>>> So, it isn't exactly a new magic number. We could wave our hands
>>>> and say "5 seems OK" or "10 seems OK" or whatever. And, I am sure
>>>> we could come up with something that folks felt was fine. However,
>>>> my feeling is that if we want to worry about bursts then let's worry
>>>> about bursts in some generic way. And, if you have some way to deal
>>>> with bursts then L isn't needed. And, if you don't have a way to
>>>> deal with bursts then a conservative L seems fine. But, perhaps
>>>> putting the effort into a generic mechanism instead of cooking yet
>>>> another magic number we need to periodically refresh is probably a
>>>> better way to spend effort.
>>>>
>>>
>>> Yes, I very much agree that "putting the effort into a generic mechanism
>>> instead of cooking yet another magic number we need to periodically refresh
>>> is probably a better way to spend effort."
>>>
>>> I agree that defining such a number doesn’t fully solve the problem but
>>> it gives some recommendation for implementations that don’t do pacing. So,
>>> defining a somewhat less restrictive value for L (5 or 10) would be a last
>>> resort for implementations that don’t pace.
>>>
>> How about putting a number 10, and also put all the rationales to follow
>> to decide a higher or lower value. It's never one-size-fits-all.
>>
>> Also I believe it's time to move ABC into the standards track, in the era
>> of (bigger and bigger) stretch ACKs.
>>
>>> Thanks,
>>> Vidhi
>>>
>>> On Jul 29, 2021, at 8:19 AM, Neal Cardwell <ncardwell@google.com> wrote:
>>>
>>> On Thu, Jul 29, 2021 at 10:06 AM Mark Allman <mallman@icir.org> wrote:
>>>
>>>>
>>>> >> (b) If there is no burst mitigation then we have to figure out
>>>> >> if L is still useful for this purpose and whether we want to
>>>> >> retain it. Seems like perhaps L=2 is sensible here. L was
>>>> >> never meant to be some general burst mitigator. However,
>>>> >> ABC clearly *can* aggravate bursting and so perhaps it makes
>>>> >> sense to have it also try to limit the impact of the
>>>> >> aggravation (in the absence of some general mechanism).
>>>> >
>>>> > Even if recommending a static L value, IMHO L=2 is a bit
>>>> > conservative.
>>>>
>>>> Well, perhaps. L=2 was designed to exactly counteract delayed ACKs.
>>>> So, it isn't exactly a new magic number. We could wave our hands
>>>> and say "5 seems OK" or "10 seems OK" or whatever. And, I am sure
>>>> we could come up with something that folks felt was fine. However,
>>>> my feeling is that if we want to worry about bursts then let's worry
>>>> about bursts in some generic way. And, if you have some way to deal
>>>> with bursts then L isn't needed. And, if you don't have a way to
>>>> deal with bursts then a conservative L seems fine. But, perhaps
>>>> putting the effort into a generic mechanism instead of cooking yet
>>>> another magic number we need to periodically refresh is probably a
>>>> better way to spend effort.
>>>>
>>>
>>> Yes, I very much agree that "putting the effort into a generic mechanism
>>> instead of cooking yet another magic number we need to periodically refresh
>>> is probably a better way to spend effort."
>>>
>>>>
>>>> >> - During slow starts that follow RTOs there is a general
>>>> >> problem that just because the window slides by X bytes
>>>> >> doesn't say anything about the *network*, as that sliding can
>>>> >> happen because much of the data was likely queued for the
>>>> >> application on the receiver. So, e.g., you can RTO and send
>>>> >> one packet and get an ACK back that slides the window 10
>>>> >> packets. That doesn't mean 10 packets left. It means one
>>>> >> packet left the network and nine packets are eligible to be
>>>> >> sent to the application. So, it is not OK to set the cwnd to
>>>> >> 1+10 = 11 packets in response to this ACK. Here L should
>>>> >> exist and be 1.
>>>> >
>>>> > AFAICT this argument only applies to non-SACK connections. For
>>>> > connections with SACK (the vast majority of connections over the
>>>> > public Internet and in datacenters), it is quite feasible to
>>>> > determine how many packets really left the network (and Linux TCP
>>>> > does this; see below).
>>>>
>>>> If you have an accurate way to figure out how many of the ACKed
>>>> bytes left the network and how many were just buffered at the
>>>> receiver then I see no problem with increasing based on byte count
>>>> as you do in the initial slow start.
>>>>
>>>> (I don't remember what the paper you cite says, but my guess is it's
>>>> often the case that L=1 is a reasonable substitute for something
>>>> complicated here. But, perhaps I am running the simulation in my
>>>> head wrong ... it has been a while, admittedly!)
>>>>
>>>> > Yes, offload mechanisms are so pervasive in practice,
>>>>
>>>> I am trying to build a mental model here. How pervasive would you
>>>> guess these are? And, where in the network? I have assumed that
>>>> they are for sure pervasive in data centers and server farms, but
>>>> not for the vast majority of Internet-connected devices.
>>>>
>>>
>>> From my impression looking at public Internet traces, aggregation
>>> mechanisms that cause TCP ACKs for more than 2 segments are very common. I
>>> suspect that's because the majority of public Internet traffic these days
>>> has a bottleneck that is either wifi, cellular, or DOCSIS, and all of these
>>> have a shared medium with a large latency overhead for L2 MAC control of
>>> who gets to speak next. So a lot of batching happens, both in big batches of
>>> data that arrive at the client in the same L2 medium time slot, and big
>>> batches of ACKs that accumulate while the client waits (often several
>>> milliseconds, sometimes even tens of milliseconds) for its chance to send a
>>> big stretch ACK or batch of ACKs.
>>>
>>> This brings up a related point: even if there is some ABC-style per-ACK
>>> L limit on cwnd increases, the time structure of most public Internet ACK
>>> streams is massively bursty because of these aggregation mechanisms
>>> inherent in L2 behavior on most public Internet bottlenecks (wifi,
>>> cellular, DOCSIS). So even if there is a limit L that limits the per-ACK
>>> behavior to be smooth, if there is no pacing of data segments then the data
>>> transmit time structure will still be bursty because the ACK arrivals these
>>> days are very bursty.
>>>
>>> best regards,
>>> neal
>>>
>>> _______________________________________________
>>> tcpm mailing list
>>> tcpm@ietf.org
>>> https://www.ietf.org/mailman/listinfo/tcpm
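
For readers following the numbers in this thread, here is a minimal Python sketch of an ABC-style slow-start increase with a per-ACK limit L, plus the pacing interval implied by spreading cwnd over one round trip. It is an illustration only and is not taken from Linux or any other stack. The function names, the 1460-byte SMSS, the 100 ms RTT, and the use of `None` to mean "no limit" are all assumptions made for the example; L is expressed in segments, as in the discussion above.

```python
SMSS = 1460  # assumed sender maximum segment size, in bytes


def slow_start_cwnd(cwnd, bytes_acked, limit_l=None, smss=SMSS):
    """Return cwnd (bytes) after one ACK in slow start, ABC-style.

    cwnd grows by the newly acknowledged bytes, capped at limit_l
    segments per ACK. limit_l=None means no cap (hypothetical knob).
    """
    if limit_l is None:
        return cwnd + bytes_acked
    return cwnd + min(bytes_acked, limit_l * smss)


def pacing_interval(cwnd, rtt, smss=SMSS):
    """Seconds between segments when cwnd is spread over one RTT."""
    return smss * rtt / cwnd


# A stretch ACK covering one 64 KB GRO aggregate, starting from IW10.
acked_segments = 65536 // SMSS
stretch_ack = acked_segments * SMSS
iw10 = 10 * SMSS
for limit in (2, 10, None):
    new_cwnd = slow_start_cwnd(iw10, stretch_ack, limit)
    print(f"L={limit}: cwnd grows from 10 to {new_cwnd // SMSS} segments")

# Slow start after an RTO: one segment sent, the ACK slides 10 segments.
after_rto = slow_start_cwnd(1 * SMSS, 10 * SMSS, limit_l=1)
print(f"after RTO with L=1: cwnd = {after_rto // SMSS} segments")

# Pacing 20 segments over an assumed 100 ms RTT.
print(f"pacing interval: {pacing_interval(20 * SMSS, 0.1) * 1000:.1f} ms")
```

Under these assumptions the 64 KB aggregate covers 44 full segments, so a single stretch ACK takes cwnd from 10 to 12 segments with L=2, to 20 with L=10, and to 54 with no limit. In the post-RTO case, L=1 yields a cwnd of 2 segments rather than the 11 that the discussion above argues against.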