Re: [tcpm] Linux doesn’t implement RFC3465

"Mark Allman" <> Wed, 27 November 2019 13:14 UTC

From: "Mark Allman" <>
To: "Vidhi Goel" <>
Cc: "Neal Cardwell" <>,
Date: Wed, 27 Nov 2019 08:14:26 -0500

> +Mark Allman

Just to clear it up, I *was* at BBN long ago when the ABC document
was written.  It's a cool place to work.  I recommend it.  But, I
now hang out at ICSI.

> I believe that ABC was written to solve the problem with ACK
> counting by counting the number of bytes acknowledged for
> misbehaving receivers. Limiting the increase to 2*MSS was a good
> solution to avoid bursts at the time.

The main motivation behind ABC was to counteract delayed ACKs.  The
common approach at the time was to just bump cwnd by one MSS every
time an ACK rolled in.  So, if an ACK covered two segments because
the receiver was delaying the ACKs then the growth rate during slow
start was 1.5x per RTT instead of the 2x that was really
envisioned.  During congestion avoidance the growth was 1 MSS every
2 RTTs instead of the envisioned every RTT.
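
As a rough illustration of the slow start arithmetic above, here is a minimal sketch of one round trip under each counting scheme. The function name, the MSS value, and the simplified model (the whole window is sent and ACKed within one RTT, no loss) are assumptions made for the example, not taken from any real stack.

```python
MSS = 1460  # bytes; an assumed, typical segment size


def slow_start_round(cwnd, segs_per_ack, byte_counting, limit_segs=2):
    """Return cwnd (in bytes) after one idealized RTT of slow start.

    segs_per_ack  -- full segments covered by each ACK (2 models delayed ACKs)
    byte_counting -- True: grow by the bytes ACKed, capped at limit_segs * MSS
                     False: grow by one MSS per ACK that arrives
    """
    segs_in_flight = cwnd // MSS
    acked = 0
    while acked < segs_in_flight:
        # The final ACK of the round may cover fewer segments.
        covered = min(segs_per_ack, segs_in_flight - acked)
        if byte_counting:
            cwnd += min(covered, limit_segs) * MSS
        else:
            cwnd += MSS
        acked += covered
    return cwnd
```

Starting from 4 segments with one ACK per two segments, ACK counting ends the round at 6 segments (1.5x) while byte counting ends it at 8 segments (2x).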

A secondary motivation was to counteract these ACK division attacks
that Savage taught us about.  I.e., we could ACK an MSS-sized packet
one byte at a time and the sender would then increase the cwnd by
MSS*MSS bytes in the prevalent ACK counting scheme (i.e., cwnd would
get bumped by MSS bytes for every ACK).
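
To make the ACK division arithmetic concrete, here is a small sketch comparing the total slow start increase under the two schemes. The function name and the 1460-byte MSS are hypothetical, and the byte counting branch leaves out the per-ACK limit because each divided ACK is far below it.

```python
def cwnd_growth_from_acks(ack_sizes, mss, byte_counting):
    """Total slow start cwnd increase, in bytes, for a series of ACKs.

    ack_sizes[i] is the number of previously unacknowledged bytes that
    the i-th ACK covers.
    """
    if byte_counting:
        # Growth tracks the data actually acknowledged.
        return sum(ack_sizes)
    # ACK counting: every ACK is worth a full MSS, whatever it covers.
    return mss * len(ack_sizes)


mss = 1460
divided = [1] * mss  # one MSS-sized segment ACKed one byte at a time
```

With these inputs, ACK counting grows the cwnd by mss * mss bytes for a single segment's worth of data, while byte counting grows it by mss bytes no matter how the receiver splits its ACKs.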

The limit has two roots ...

(1) The limit is important in slow starts that follow an RTO.  As
    the RFC discusses, in this case we might retransmit a single
    packet and this will cause the receiver's window to slide a
    great deal.  Therefore, an ACK may indicate that a ton of data
    has left the network, but that isn't really the case.  So, we
    don't want to increase the cwnd based on all the new bytes
    ACKed.

    I have since mostly decided that this use of L is crude.
    Probably there is a more elegant way to do it by using the
    scoreboard and the SACK information to get a better
    understanding of what left the network and when.  That said, in
    this case L is simple and probably about right most of the
    time.

(2) I think there was some general conservatism about bursts and
    using L everywhere quelled some of the worry.  Here L=2 was used
    to exactly offset delayed ACKs.
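
The effect of the limit in case (1) can be sketched as a per-ACK cap. This is only an illustration: the function name is hypothetical, L is given in segments, and the 50-segment cumulative ACK is a made-up example of a window sliding after one retransmission fills a hole.

```python
def abc_increase(bytes_acked, mss, limit_segments=None):
    """Slow start cwnd increase, in bytes, for one ACK under byte counting.

    limit_segments is the limit L in units of MSS; None means no limit.
    """
    if limit_segments is None:
        return bytes_acked
    return min(bytes_acked, limit_segments * mss)


mss = 1460
big_ack = 50 * mss  # one ACK after a retransmission covers 50 segments
```

Here abc_increase(big_ack, mss, 2) gives 2 * mss, whereas with no limit the same ACK would open the window by all 50 segments at once, even though most of that data left the network long before the ACK arrived.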

> I agree that increasing the congestion window and controlling the
> burst rate are orthogonal issues.

Yes.  In fact, we did subsequent research on mitigating bursts
because we never really thought of ABC as somehow the way to control
bursts (research papers available, but never went into RFCs).

And, in a world that leverages stretch ACKs as a routine I think
Linux's approach of not using an L may well be correct.  Documenting
that and the reasoning behind it in modern networks seems useful to
me.