Re: [tcpm] Linux doesn’t implement RFC3465

Mark Allman <> Mon, 02 August 2021 18:03 UTC

From: "Mark Allman" <>
To: "Vidhi Goel" <>
Cc: "Yuchung Cheng" <>, "Neal Cardwell" <>, Extensions <>
Date: Mon, 02 Aug 2021 14:03:03 -0400

>  "This document RECOMMENDS using mechanisms like Pacing to control
>  how many bytes are sent to the network at a point of time. But if
>  it is not possible to implement pacing, an implementation MAY
>  implicitly pace their traffic by applying a limit L to the
>  increase in congestion window per ACK during slow start. In
>  modern stacks, acknowledgments are aggregated for various reason,
>  CPU optimization, reducing network load etc. Hence it is common
>  for a sender to receive an aggregated ACK that acknowledges more
>  than 2 segments. For example, a stack that implements GRO could
>  aggregate packets up to 64Kbytes or ~44 segments before passing
>  on to the TCP layer and this would result in a single ACK to be
>  generated by the TCP stack. Given that an initial window of 10
>  packets in current deployments has been working fine, the draft
>  makes a recommendation to set L=10 during slow start. This would
>  mean that with every ACK, we are probing for a new capacity by
>  sending 10 packets in addition to the previously discovered
>  capacity. Implementations MAY choose to set a lower limit if they
>  believe an increase of 10 is too aggressive."
> Does this sound like what we would like to say?

Not really, IMO.  I think a few things here ...

  - I agree that if pacing is in play that we don't need to worry
    about an L.

  - I think the above L=10 reasoning is at best pretty weak.  Just
    because IW=10 works OK once per connection does not mean
    continually sending "10 more" will work out OK.  It may.  It may
    not.  But, the above sort of coupling between IW=10 and L=10
    seems highly tenuous without any sort of data.

  - The real issue with picking a number is that it is so hard to
    reason about because the behavior All Depends.  E.g., consider
    something like IW=10.  We know that will allow 10 or fewer
    segments to be pumped into the network when the connection
    starts.  That's pretty easy to reason about / understand.  But,
    with L=10 we might have bursts anywhere from 2 packets to
    cwnd+10 packets on every ACK---depending on how the ACKs are
    stretched.  And, cwnd isn't a constant.  So, if the idea is to
    somehow limit bursting then sometimes we're limiting to X and
    others to Y and still others to Z.  It's an inconsistent mess.
    Making L something arbitrary without evidence seems like a bad
    path to me.

  - Of course, by making L=10 a MAY we're effectively saying "no L,
    anywhere" anyways.  If we're going to define an L it should be a
    SHOULD unless there is pacing.

  - Somewhat related to the above, it also isn't clear, even in
    qualitative terms, what "10 more" is more than.  Say an ACK
    rolls in that covers 10 packets.  How were those packets sent?
    Were they sent as a back-to-back burst so "10 more" is in fact a
    back-to-back burst that is 10 more than previously (or, 2x)?
    Or, did we send those 10 packets in 5 little bursts of 2 packets
    each so that "10 more" is actually 18 more than the previous
    burst size---a 10x increase?  Of course, it could be BOTH for the
    same ACK!
    I.e., the segments were sent 2 at a time and aggregated
    somewhere in the middle.  This lack of clarity again makes the
    choice of L feel pretty arbitrary.
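
The burst arithmetic in the bullets above can be made concrete with a small sketch. It is not from any stack or draft: the function name burst_on_ack is hypothetical, and it assumes a non-paced sender in slow start that always has data ready, counts in whole segments, and grows cwnd by min(acked, L) per ACK. Under those assumptions one ACK lets the sender release the segments it acknowledges plus the cwnd increase.

```python
def burst_on_ack(acked_segments: int, limit: int) -> int:
    """Segments a non-paced slow-start sender may release on one ACK.

    Assumptions (illustrative only): the sender is cwnd-limited and always
    has data queued, the ACK frees `acked_segments` of window, and cwnd
    grows by min(acked_segments, limit) for that ACK.
    """
    if acked_segments < 1 or limit < 1:
        raise ValueError("acked_segments and limit must be positive")
    return acked_segments + min(acked_segments, limit)


L = 10      # per-ACK increase limit under discussion
cwnd = 40   # hypothetical window; it is not a constant in practice

# Same L, very different bursts depending on how the ACKs are stretched.
for acked in (1, 2, 10, cwnd):
    print(f"ACK covers {acked:2d} segments -> burst of "
          f"{burst_on_ack(acked, L)} segments")

# Two readings of "10 more" for an ACK that covers 10 segments.
new_burst = burst_on_ack(10, L)
sent_as_one_burst = 10   # the 10 segments went out back-to-back
sent_in_pairs = 2        # the 10 segments went out 2 at a time
print(new_burst - sent_as_one_burst, new_burst / sent_as_one_burst)
print(new_burst - sent_in_pairs, new_burst / sent_in_pairs)
```

With an unstretched ACK the burst is 2 segments, and with a single ACK covering the whole window it is cwnd+10, which is the range described above. The last two lines show why "10 more" is ambiguous: the same 20-segment burst is either 2x or 10x the previous burst, depending on how the acknowledged segments were originally sent.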

I agree we can elide L if pacing is in place.  But, twiddling with L
by feel is crude and will produce an inconsistent approach to bursts
that doesn't seem to me to be particularly helpful because we don't
really grok the implications.