Re: [tcpm] Linux doesn’t implement RFC3465

Yuchung Cheng <ycheng@google.com> Mon, 02 August 2021 22:03 UTC

References: <78EF3761-7CAF-459E-A4C0-57CDEAFEA8EE@apple.com> <CADVnQynkBxTdapXN0rWOuWO3KXQ2qb6x=xhB35XrMU38JkX2DQ@mail.gmail.com> <601D9D4F-A82C-475A-98CC-383C1F876C44@apple.com> <54699CC9-C8F5-4CA3-8815-F7A21AE10429@icsi.berkeley.edu> <DF5EF1C7-0940-478A-9518-62185A79A288@apple.com> <E150D881-4AB3-4AEA-BE0C-1D4B47B2C531@icir.org> <CADVnQynjE+D-OSvdOVROjT3y1cnHHWqdNQSmphLAJ+HsBTUAJQ@mail.gmail.com> <A1B50403-2405-4348-9626-025D255DEAE7@icir.org> <CADVnQykM8p-bVz_oPrje1yNh9_7_isAUL+wnQWDoY9Gs18sLPQ@mail.gmail.com> <11FE4818-87E7-4FD8-8F45-E19CD9A3366A@apple.com> <CAK6E8=fFWAE_NSr45i2mdh6NmYDusUFW3GYGtuo-FcL07sox9A@mail.gmail.com> <D6B865F7-9865-4B6F-986B-F44ABE5F12B0@apple.com> <756432D9-4331-454D-82EB-346CF54A355E@icir.org>
In-Reply-To: <756432D9-4331-454D-82EB-346CF54A355E@icir.org>
From: Yuchung Cheng <ycheng@google.com>
Date: Mon, 02 Aug 2021 15:02:52 -0700
Message-ID: <CAK6E8=c5A1K_B8FAeD=PtrwRgph_4Dy3Je8CrUW_P=J=MQqNtQ@mail.gmail.com>
To: Mark Allman <mallman@icir.org>
Cc: Vidhi Goel <vidhi_goel@apple.com>, Neal Cardwell <ncardwell@google.com>, Extensions <tcpm@ietf.org>
Content-Type: multipart/alternative; boundary="0000000000004651de05c89abd40"
Archived-At: <https://mailarchive.ietf.org/arch/msg/tcpm/4n6zblhCDk5nPYnvJy0OxfPIoAk>
Subject: Re: [tcpm] Linux doesn’t implement RFC3465

The fact is that Linux congestion control moved to an infinite L back in 2013*
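
For readers less familiar with the notation: in RFC 3465, L caps how much cwnd may grow per ACK during slow start, roughly cwnd += min(N, L*SMSS), where N is the number of newly acknowledged bytes. Below is a minimal sketch of that rule, not taken from the Linux source. The function name, the SMSS value of 1460 bytes, and expressing L in segments are assumptions made for illustration only.

```python
import math

SMSS = 1460  # assumed sender maximum segment size, in bytes


def slow_start_increase(cwnd, bytes_acked, limit_segments=math.inf, smss=SMSS):
    """Return cwnd (bytes) after one ACK in slow start.

    Applies cwnd += min(N, L * SMSS), where N is bytes_acked and L is
    limit_segments. Leaving L at infinity removes the cap, so cwnd grows
    by exactly the number of bytes the ACK newly acknowledged.
    """
    limit_bytes = limit_segments * smss
    return cwnd + min(bytes_acked, limit_bytes)


if __name__ == "__main__":
    cwnd = 100 * SMSS          # hypothetical starting window
    stretch_ack = 44 * SMSS    # one ACK covering ~44 segments (the GRO case)
    for limit in (2, 10, math.inf):
        grown = slow_start_increase(cwnd, stretch_ack, limit)
        print(f"L={limit}: cwnd grows by {(grown - cwnd) // SMSS} segments")
```

With a stretch ACK covering 44 segments, the capped variants grow cwnd by only 2 or 10 segments, while the uncapped variant grows it by all 44.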

On Mon, Aug 2, 2021 at 11:03 AM Mark Allman <mallman@icir.org> wrote:

>
> >  “This document RECOMMENDS using mechanisms like Pacing to control
> >  how many bytes are sent to the network at a point of time. But if
> >  it is not possible to implement pacing, an implementation MAY
> >  implicitly pace its traffic by applying a limit L to the
> >  increase in congestion window per ACK during slow start. In
> >  modern stacks, acknowledgments are aggregated for various reasons
> >  (CPU optimization, reducing network load, etc.). Hence it is common
> >  for a sender to receive an aggregated ACK that acknowledges more
> >  than 2 segments. For example, a stack that implements GRO could
> >  aggregate packets up to 64Kbytes or ~44 segments before passing
> >  on to the TCP layer and this would result in a single ACK being
> >  generated by the TCP stack. Given that an initial window of 10
> >  packets in current deployments has been working fine, the draft
> >  makes a recommendation to set L=10 during slow start. This would
> >  mean that with every ACK, we are probing for a new capacity by
> >  sending 10 packets in addition to the previously discovered
> >  capacity. Implementations MAY choose to set a lower limit if they
> >  believe an increase of 10 is too aggressive."
> >
> > Does this sound like what we would like to say?
>
> Not really, IMO.  I think a few things here ...
>
>   - I agree that if pacing is in play that we don't need to worry
>     about an L.
>
>   - I think the above L=10 reasoning is at best pretty weak.  Just
>     because IW=10 works OK once / connection does not mean
>     continually sending "10 more" will work out OK.  It may.  It may
>     not.  But, the above sort of coupling between IW=10 and L=10
>     seems highly tenuous without any sort of data.
>
>   - The real issue with picking a number is that it is so hard to
>     reason about because the behavior All Depends.  E.g., consider
>     something like IW=10.  We know that will allow 10 or fewer
>     segments to be pumped into the network when the connection
>     starts.  That's pretty easy to reason about / understand.  But,
>     with L=10 we might have bursts anywhere from 2 packets to
>     cwnd+10 packets on every ACK---depending on how the ACKs are
>     stretched.  And, cwnd isn't a constant.  So, if the idea is to
>     somehow limit bursting then sometimes we're limiting to X and
>     others to Y and still others to Z.  It's an inconsistent mess.
>     Making L something arbitrary without evidence seems like a bad
>     path to me.
>
>   - Of course, by making L=10 a MAY we're effectively saying "no L,
>     anywhere" anyways.  If we're going to define an L it should be a
>     SHOULD unless there is pacing.
>
>   - Somewhat related to the above, it isn't clear what "10 more" is
>     more than in qualitative terms, as well.  Say an ACK rolls in
>     that covers 10 packets.  How were those packets sent?  Were they
>     sent as a back-to-back burst so "10 more" is in fact a back-to-back
>     burst that is 10 more than previously (or, 2x)?  Or, did we send
>     those 10 packets in 5 little bursts of 2 packets each so that
>     "10 more" is actually 18 more than the previous burst size---a
>     10x increase?  Of course, it could be BOTH for the same ACK!
>     I.e., the segments were sent 2 at a time and aggregated
>     somewhere in the middle.  This lack of clarity again makes the
>     choice of L feel pretty arbitrary.
>
> I agree we can elide L if pacing is in place.  But, twiddling with L
> by feel is crude and will produce an inconsistent approach to bursts
> that doesn't seem to me to be particularly helpful because we don't
> really grok the implications.
>
> allman
>
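
To put numbers on the "10 more" bullet quoted above, here is a small worked sketch. It assumes the sender is in slow start with a full window, so an ACK covering k segments releases k segments plus the cwnd growth min(k, L). The function name and this simplified burst model are assumptions for illustration, not something specified in the thread or in any RFC.

```python
def burst_after_ack(acked_segments, limit_segments):
    """Segments the sender may emit back-to-back in response to one ACK.

    Simplified model: the ACK slides the window by acked_segments, and
    slow start grows cwnd by min(acked_segments, limit_segments).
    """
    released = acked_segments
    growth = min(acked_segments, limit_segments)
    return released + growth


if __name__ == "__main__":
    new_burst = burst_after_ack(acked_segments=10, limit_segments=10)
    # Compare against two ways the 10 acked packets might have been sent.
    for label, previous_burst in (("one burst of 10", 10), ("bursts of 2", 2)):
        extra = new_burst - previous_burst
        factor = new_burst / previous_burst
        print(f"{label}: new burst {new_burst}, {extra} more, {factor:g}x")
```

Under this model the same ACK yields a burst of 20 segments, which is 10 more (2x) if the original packets went out as one burst of 10, but 18 more (10x) if they went out two at a time.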