[tcpm] Seeking WG opinions on ACKing ACKs with good cause (was: Possible error in accurate-ecn)

Bob Briscoe <ietf@bobbriscoe.net> Fri, 12 March 2021 10:54 UTC

From: Bob Briscoe <ietf@bobbriscoe.net>
To: Yoshifumi Nishida <nsd.ietf@gmail.com>, tcpm IETF list <tcpm@ietf.org>
Cc: "Scheffenegger, Richard" <rs.ietf@gmx.at>, Mirja Kuehlewind <ietf@kuehlewind.net>, Yoshifumi Nishada <nishida@sfc.wide.ad.jp>
References: <47df9b8b-515e-d40d-3473-599b0a3e3876@bobbriscoe.net> <6031BE2B-4D33-426F-BA17-DDF15CF821DE@kuehlewind.net> <060c8bd8-d64b-3e46-7874-742e35e6d114@bobbriscoe.net> <221e58f3-ada0-c880-db72-d98af84fedb8@gmx.at> <bd6ab65d-ccd5-9fa9-58be-6d9fea4af870@bobbriscoe.net> <CAAK044QgF4pz5Wamnxkobthou5ac4_LBxh8=nBYWyOxQUtcW-Q@mail.gmail.com> <8151fdef-ae78-80f3-adfc-d40db878ac8e@gmx.at> <CAAK044RhdAYexcGRj_XDkdY_o6JqB0DDo1X0H2AeFkRcsb0i4A@mail.gmail.com>
Message-ID: <48c5910d-5340-acd6-8fd9-fff1b7758310@bobbriscoe.net>
Date: Fri, 12 Mar 2021 10:54:01 +0000
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.7.1
MIME-Version: 1.0
In-Reply-To: <CAAK044RhdAYexcGRj_XDkdY_o6JqB0DDo1X0H2AeFkRcsb0i4A@mail.gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tcpm/xudSM54FV2HRyzF9fbrj34-0ST8>

Yoshi, tcpm list,

As promised we set up a small design team on the single issue of 
occasionally ACKing pure ACKs if they carry new info (ECN marking at the 
IP layer). The design team has two solutions and everyone would be 
prepared to accept either, but preferences differ. So we're seeking 
wider opinions (more opinions obviously won't narrow the choices, but at 
least the WG can then make a decision informed by those who care).

To understand the question, you can either read the rest of this email 
or look at the 3 slides for the tcpm meeting, which are briefer and 
include pictures:
https://datatracker.ietf.org/meeting/110/materials/slides-110-tcpm-draft-ietf-tcpm-accurate-ecn-00#page=6

Here's the current draft text in question (see here for context 
https://tools.ietf.org/html/draft-ietf-tcpm-accurate-ecn-14#section-3.2.2.5 
):


            3.2.2.5.1 Data Receiver Safety Procedures

    An AccECN Data Receiver:

    o  SHOULD immediately send an ACK whenever a data packet marked CE
       arrives after the previous packet was not CE.

    o  MUST immediately send an ACK once 'n' CE marks have arrived since
       the previous ACK, where 'n' SHOULD be 2 and MUST be in the range 2
       to 6 inclusive.

Only the second bullet is in question. Here is the proposed diff for 
each alternative, then we explain:

Alternative A

    o  MUST immediately send an ACK once 'n' CE marks have arrived since
-     the previous ACK,
+     the previous ACK and there is outstanding data to acknowledge,
       where 'n' SHOULD be 2 and MUST be in the range 2 to 6 inclusive.


Alternative B

    o  MUST immediately send an ACK once 'n' CE marks have arrived since
-     the previous ACK, where 'n' SHOULD be 2 and MUST be in the range 2
+     the previous ACK, where 'n' SHOULD be 3 and MUST be in the range 3
       to 6 inclusive.


Extra guidance text would be required in each case too (see the end).
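
As a minimal sketch (not draft text), the second bullet and the two 
alternatives can be written as one predicate. The names AckRule, 
must_ack_now and has_unacked_data are hypothetical, and each rule is 
shown with its SHOULD value of 'n':

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AckRule:
    n: int                          # CE marks that force an immediate ACK
    require_outstanding_data: bool  # Alt A's extra condition


CURRENT = AckRule(n=2, require_outstanding_data=False)
ALT_A = AckRule(n=2, require_outstanding_data=True)
ALT_B = AckRule(n=3, require_outstanding_data=False)


def must_ack_now(rule, ce_since_last_ack, has_unacked_data):
    """Return True if the second bullet requires an immediate ACK."""
    if ce_since_last_ack < rule.n:
        return False
    if rule.require_outstanding_data and not has_unacked_data:
        return False
    return True
```

With two CE-marked pure ACKs received and no data to acknowledge, only 
the current wording triggers an ACK. With three, Alt B triggers one and 
Alt A still does not.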

Background:
AccECN is a change to the TCP wire protocol that requires the packet 
count of congestion feedback to include any congestion experienced (CE) 
arriving on Pure ACKs (amongst other things). AccECN doesn't require 
Pure ACKs to be ECN-capable, but allows for them to be. Similarly, 
AccECN doesn't require any congestion response to CE on pure ACKs, but 
having the feedback information there allows a response to be added with 
a one-ended update, if desired/necessary. Basically the data receiver is 
a 'dumb reflector'.

The above two bullets were designed to ensure that an ACK is triggered 
a) on the first sign of congestion, and b) frequently enough for the 
count of CE markings to be fed back using the 3-bit ACE field before it 
wraps, even if an occasional pure ACK is lost.
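
To illustrate the wrap concern, here is a minimal sketch of what a 
3-bit counter looks like to the peer decoding it. The function names 
and the counter values are assumptions for illustration only:

```python
ACE_BITS = 3
ACE_MODULUS = 1 << ACE_BITS  # the 3-bit field counts modulo 8


def ace_after(prev_ace, ce_marks):
    """ACE value fed back after 'ce_marks' more CE marks have arrived."""
    return (prev_ace + ce_marks) % ACE_MODULUS


def ace_delta(prev_ace, new_ace):
    """Smallest CE count consistent with two successive ACE values."""
    return (new_ace - prev_ace) % ACE_MODULUS
```

With n = 2, losing one ACK means 4 marks arrive between the two ACKs 
that get through, and ace_delta(5, ace_after(5, 4)) still reads 4. If 
10 marks accumulated before any feedback, the delta would read 2, 
because the field has wrapped once.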

We then realized that the wording could require an ACK to be triggered 
in response to a CE-marked pure ACK. This could occur when peer X sends 
a volley of data to Y then stops, and the path back from Y to X is 
congested (probably by other flow(s)), so that many of the ACKs are 
CE-marked. The second bullet above under alternative (B) would require 
X to ACK every 'n-th' CE-marked pure ACK. However, if Y immediately 
started sending a volley of data to X, Y could misinterpret those ACKs 
(of ACKs) from X as DupACKs.

There are two ways to deal with this:
A) Some of us prefer to completely prevent ACKs on pure ACKs, on the 
basis that they do not want to risk sometimes generating more ACKs today.
B) Others want to ensure that these rules will cause pure ACKs to be 
ACKed when the amount of CE on the ACKs merits it, but sparingly, and 
with strong damping of any ACK ping-pong.
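
To get a feel for how quickly an 'every n-th CE mark' rule damps ACKs 
of ACKs, here is a toy model. It assumes the worst case, in which every 
pure ACK in both directions is CE-marked and neither peer sends data. 
The name ack_ping_pong is hypothetical, and the model ignores loss and 
delayed-ACK timers:

```python
def ack_ping_pong(initial_acks, n):
    """Return the number of ACKs of ACKs sent in each successive round."""
    if n < 2:
        raise ValueError("n must be at least 2, or the exchange never damps")
    pending = [0, 0]   # CE marks counted since the last ACK, per endpoint
    receiver = 0       # endpoint that receives the current round of ACKs
    arriving = initial_acks
    rounds = []
    while arriving > 0:
        pending[receiver] += arriving
        sent = pending[receiver] // n   # one ACK per n CE marks
        pending[receiver] %= n
        if sent:
            rounds.append(sent)
        arriving = sent
        receiver = 1 - receiver
    return rounds
```

Under these assumptions ack_ping_pong(27, 3) returns [9, 3, 1] and 
ack_ping_pong(16, 2) returns [8, 4, 2, 1], so each round shrinks by a 
factor of n and the exchange dies out.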

There are complexity arguments on both sides.

B more complex:
     extra (non-mandatory) 'if' condition for lack of SACK options on a 
pure ACK (to decide it's not a DupACK).
B less complex:
     consistent handling of CE marking whether on pure ACKs or data 
(which would probably remove an 'if' condition).

A more complex:
     CE markings on a string of pure ACKs can build up without feeding 
them back, until released by a data packet (if ever).
     More code at the other end to deal with the resulting risk of many 
wraps of the ACE field (or ignore?).
A less complex:
     less different from current TCP.
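
The extra 'if' condition listed under "B more complex" could look 
roughly like the sketch below. The Ack fields and the function name are 
hypothetical, and the baseline DupACK test is deliberately simplified 
(a real stack checks more, e.g. window updates):

```python
from dataclasses import dataclass


@dataclass
class Ack:
    ack_no: int            # cumulative acknowledgement number
    payload_len: int       # 0 for a pure ACK
    has_sack_option: bool  # True if a SACK option is present


def counts_as_dupack(ack, snd_una, bytes_in_flight, sack_negotiated):
    """Decide whether an incoming ACK is counted as a duplicate ACK."""
    # Simplified baseline: a pure ACK that does not advance snd_una
    # while data is outstanding.
    if ack.payload_len != 0 or ack.ack_no != snd_una or bytes_in_flight == 0:
        return False
    # Optional extra condition: if SACK was negotiated and the ACK has
    # no SACK option, assume it is an ACK of an ACK, not a DupACK.
    if sack_negotiated and not ack.has_sack_option:
        return False
    return True
```

Without SACK negotiated the function falls back to the baseline test, 
which is why the condition can only be optional.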

Extra guidance text would be necessary in either case.

* Alt A) would need text on handling the risk of many ACE wraps
     (to be written).

* Alt B) would need something like the following changes:

     For the avoidance of doubt, the above change-triggered ACK mechanism
     is deliberately worded to solely apply to data packets, and to ignore
     the arrival of a control packet with no payload, because it is
-  important that TCP does not acknowledge pure ACKs.  The change-
+  important that TCP does not acknowledge pure ACKs which convey no new
+  state information to the sender. The change-
     triggered ACK approach can lead to some additional ACKs but it feeds
     back the timing and the order in which ECN marks are received with
     minimal additional complexity.  If only CE marks are infrequent, or
     there are multiple marks in a row, the additional load will be low.
     Other marking patterns could increase the load significantly.
+
+  Providing feedback on the congestion state of the return channel
+  after a sender has ceased transmitting more data helps inform the
+  client's TCP congestion controller about the state of the return path.
+  Should the role of data sender and receiver subsequently change, the
+  new sender has more up-to-date knowledge of the network state,
+  preventing transmissions of inappropriate size at that moment.

     Even though the first bullet is stated as a "SHOULD", it is important
     for a transition to immediately trigger an ACK if at all possible, so
     that the Data Sender can rely on change-triggered ACKs to detect
     queue growth as soon as possible, e.g. at the start of a flow. This
     requirement can only be relaxed if certain offload hardware needed
     for high performance cannot support change-triggered ACKs (although
     high performance protocols such as DCTCP already successfully use
     change-triggered ACKs).  One possible compromise would be for the
     receiver to heuristically detect whether the sender is in slow-start,
     then to implement change-triggered ACKs while the sender is in slow-
     start, and offload otherwise.

+   The second bullet creates a possible case where an AccECN implementation
+   could sometimes ACK pure ACKs, which in turn might be mistaken for
+   duplicate ACKs (in scenarios where TCP peers take turns to send
+   sets of data packets). To prevent spurious transmissions in such
+   circumstances, if SACK has been negotiated, an implementation could
+   optionally assume that an ACK is not a Duplicate ACK if it has no SACK
+   option, which would indicate it was an ACK of an ACK. Alternatively it
+   could use timestamp options to rule out DupACKs.




Bob

-- 
________________________________________________________________
Bob Briscoe                               http://bobbriscoe.net/