Re: [tcpm] Linux doesn’t implement RFC3465

Vidhi Goel <vidhi_goel@apple.com> Wed, 28 July 2021 21:13 UTC

From: Vidhi Goel <vidhi_goel@apple.com>
Date: Wed, 28 Jul 2021 14:12:56 -0700
To: Mark Allman <mallman@icir.org>, "tcpm@ietf.org Extensions" <tcpm@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tcpm/6QivvFL-B7MGIUgcFWY9aEE4xoQ>

Resurrecting the 3465 thread.

In the TCPM meeting at IETF 111, we discussed the L=2 limit, which is a MUST in RFC 3465. This is a very strict requirement, and stacks like Linux already don't follow it.
> The basic principle in the Linux code is that the sender should use the ACKs to learn about the capacity of the path (both in volumetric and rate dimensions), and should not ignore that information. This allows the sender to quickly grow and achieve high throughput, even in the presence of stretch ACKs, which are pervasive, due to TSO/GSO, GRO/LRO, etc.
> 
> Considering bursts is important, but that can be tackled as an orthogonal issue. Bursts are avoided in the Linux TCP ecosystem by the combination of TSO autosizing, pacing, TSQ, and fair queueing.

As Neal described, the congestion controller should use the information in stretch ACKs when increasing its congestion window, so that cwnd tracks the available link capacity.
Burstiness is an orthogonal issue that can be solved by pacing.

QUIC loss recovery (RFC 9002) also follows this approach.
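
To make the difference concrete, here is a minimal sketch of one slow-start round with and without the limit L. It is not taken from Linux or any other stack; the names (SMSS, slow_start_increase, one_round), the segment size, and the assumption that each stretch ACK covers 16 segments are all made up for illustration.

```python
SMSS = 1460  # assumed sender maximum segment size, in bytes


def slow_start_increase(cwnd, bytes_acked, limit_segments=None):
    """Byte-counting slow-start increase for one ACK.

    limit_segments plays the role of L from RFC 3465 (in segments);
    None means no cap, i.e. cwnd grows by everything the ACK covers.
    """
    if limit_segments is None:
        return cwnd + bytes_acked
    return cwnd + min(bytes_acked, limit_segments * SMSS)


def one_round(cwnd, segments_per_ack, limit_segments=None):
    """Send one window, then process the ACKs that come back for it.

    Each ACK is assumed to cover segments_per_ack segments (a stretch ACK).
    """
    remaining = cwnd // SMSS  # segments sent in this round
    while remaining > 0:
        covered = min(segments_per_ack, remaining)
        cwnd = slow_start_increase(cwnd, covered * SMSS, limit_segments)
        remaining -= covered
    return cwnd


start = 32 * SMSS
capped = one_round(start, segments_per_ack=16, limit_segments=2)
uncapped = one_round(start, segments_per_ack=16)
print(capped // SMSS, uncapped // SMSS)  # prints "36 64"
```

With L=2 the window only grows from 32 to 36 segments in this round, while the uncapped version doubles as slow start intends. The sketch says nothing about how the extra window is sent; that is where pacing comes in.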

Mark,
As a lot of transport and congestion control drafts reference RFC 3465, do you think we should update this RFC to reflect the current deployment? This would also be useful for someone who is just starting with a new implementation.

Thanks,
Vidhi

> On Nov 27, 2019, at 5:14 AM, Mark Allman <mallman@icir.org> wrote:
> 
> 
>> +Mark Allman
> 
> Just to clear it up, I *was* at BBN long ago when the ABC document
> was written.  It's a cool place to work.  I recommend it.  But, I
> now hang out at ICSI.
> 
>> I believe that ABC was written to solve the problem with ACK
>> counting by counting the number of bytes acknowledged for
>> misbehaving receivers. Limiting the increase to 2*MSS was a good
>> solution to avoid bursts at the time.
> 
> The main motivation behind ABC was to counteract delayed ACKs.  The
> common approach at the time was to just bump cwnd by one MSS every
> time an ACK rolled in.  So, if an ACK covered two segments because
> the receiver was delaying the ACKs then the growth rate during slow
> start was 1.5x per RTT instead of the 2x that was really
> envisioned.  During congestion avoidance the growth was 1 MSS every
> 2 RTTs instead of the envisioned every RTT.
> 
> A secondary motivation was to counteract these ACK division attacks
> that Savage taught us about.  I.e., we could ACK an MSS-sized packet
> one byte at a time and the sender would then increase the cwnd by
> MSS*MSS bytes in the prevalent ACK counting scheme (i.e., cwnd would
> get bumped by MSS bytes for every ACK).
> 
> The limit has two roots ...
> 
> (1) The limit is important in slow starts that follow an RTO.  As
>    the RFC discusses, in this case we might retransmit a single
>    packet and this will cause the receiver's window to slide a
>    great deal.  Therefore, an ACK may indicate that a ton of data
>    has left the network, but that isn't really the case.  So, we
>    don't want to increase the cwnd based on all the new bytes
>    ACKed.
> 
>    I have since mostly decided that this use of L is crude.
>    Probably there is a more elegant way to do it by using the
>    scoreboard and the SACK information to get a better
>    understanding of what left the network and when.  That said, in
>    this case L is simple and probably about right most of the
>    time.
> 
> (2) I think there was some general conservativeness to bursts and
>    using L everywhere quelled some of the worry.  Here L=2 was used
>    to exactly offset delayed ACKs.
> 
>> I agree that increasing the congestion window and controlling the
>> burst rate are orthogonal issues.
> 
> Yes.  In fact, we did subsequent research on mitigating bursts
> because we never really thought of ABC as somehow the way to control
> bursts (research papers available, but never went into RFCs).
> 
> And, in a world that leverages stretch ACKs as a routine I think
> Linux's approach of not using an L may well be correct.  Documenting
> that and the reasoning behind it in modern networks seems useful to
> me.
> 
> allman
>
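
The two motivations Mark gives for ABC (delayed ACKs and ACK division) can be checked with a small sketch. This is only an illustration under assumed values: MSS, the function names, and the starting window are hypothetical and not from any implementation.

```python
MSS = 1460  # assumed segment size, in bytes


def ack_counting(cwnd, bytes_acked):
    """Pre-ABC behavior: bump cwnd by one MSS per ACK, whatever it covers."""
    return cwnd + MSS


def byte_counting(cwnd, bytes_acked, limit_segments=2):
    """ABC-style slow-start increase, capped at L = limit_segments * MSS."""
    return cwnd + min(bytes_acked, limit_segments * MSS)


def slow_start_round(cwnd, segments_per_ack, increase):
    """One RTT of slow start: the receiver ACKs every segments_per_ack segments."""
    remaining = cwnd // MSS  # segments sent in this round
    while remaining > 0:
        covered = min(segments_per_ack, remaining)
        cwnd = increase(cwnd, covered * MSS)
        remaining -= covered
    return cwnd


def ack_division(cwnd, increase):
    """A receiver ACKs a single MSS-sized segment one byte at a time."""
    for _ in range(MSS):
        cwnd = increase(cwnd, 1)
    return cwnd


start = 10 * MSS

# Delayed ACKs (one ACK per two segments): 1.5x vs 2x growth per RTT.
print(slow_start_round(start, 2, ack_counting) / start)   # 1.5
print(slow_start_round(start, 2, byte_counting) / start)  # 2.0

# ACK division: growth from acknowledging one segment byte by byte.
print(ack_division(start, ack_counting) - start)   # MSS * MSS bytes
print(ack_division(start, byte_counting) - start)  # MSS bytes
```

Note that L=2 is exactly enough to offset delayed ACKs here, which matches point (2) above; it is the stretch-ACK case, where one ACK covers many more than two segments, that the cap penalizes.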