Re: [tcpm] I-D Action: draft-ietf-tcpm-accurate-ecn-13.txt - ACK Thinning

Gorry Fairhurst <gorry@erg.abdn.ac.uk> Fri, 06 November 2020 14:01 UTC

Return-Path: <gorry@erg.abdn.ac.uk>
X-Original-To: tcpm@ietfa.amsl.com
Delivered-To: tcpm@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 67A673A11C0 for <tcpm@ietfa.amsl.com>; Fri, 6 Nov 2020 06:01:14 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -2.145
X-Spam-Level:
X-Spam-Status: No, score=-2.145 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, NICE_REPLY_A=-0.247, SPF_HELO_NONE=0.001, SPF_NONE=0.001] autolearn=ham autolearn_force=no
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id CHD3EOI_cDcc for <tcpm@ietfa.amsl.com>; Fri, 6 Nov 2020 06:01:12 -0800 (PST)
Received: from pegasus.erg.abdn.ac.uk (pegasus.erg.abdn.ac.uk [137.50.19.135]) by ietfa.amsl.com (Postfix) with ESMTP id 558673A11AF for <tcpm@ietf.org>; Fri, 6 Nov 2020 06:01:11 -0800 (PST)
Received: from GF-MacBook-Pro.lan (fgrpf.plus.com [212.159.18.54]) by pegasus.erg.abdn.ac.uk (Postfix) with ESMTPSA id 157A11B002AE; Fri, 6 Nov 2020 14:01:08 +0000 (GMT)
To: Bob Briscoe <ietf@bobbriscoe.net>
References: <160388925181.18695.7892567372446756190@ietfa.amsl.com> <4017c549-ac6d-d633-6432-20a6a8a9a342@bobbriscoe.net> <3c8de57b23994824b6c51cf5d7fba7ec@hs-esslingen.de> <5dd8f210-2fe2-bd9b-5e69-4a87016f5416@bobbriscoe.net> <dae037f1-1b39-2a79-1f78-0ddc13e54507@bobbriscoe.net> <CAAK044SkeTcSshPRxZgxniCDzek+9YuyUYpkNsFPe+=ExoAvJA@mail.gmail.com> <FFA9CFA4-48E2-4432-8DFC-BF9EA08FBAB6@ericsson.com> <CAAK044TM=rB7mVv7=O8pdyZR6xNT642PaCgXFyAbDD_fvtZZ3A@mail.gmail.com> <286e7716-be59-a023-7786-21e8175a3eb3@bobbriscoe.net>
Cc: tcpm IETF list <tcpm@ietf.org>
From: Gorry Fairhurst <gorry@erg.abdn.ac.uk>
Message-ID: <ef350894-09da-a89a-6349-b6528b9210e7@erg.abdn.ac.uk>
Date: Fri, 06 Nov 2020 14:01:07 +0000
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.13; rv:78.0) Gecko/20100101 Thunderbird/78.3.3
MIME-Version: 1.0
In-Reply-To: <286e7716-be59-a023-7786-21e8175a3eb3@bobbriscoe.net>
Content-Type: text/plain; charset="utf-8"; format="flowed"
Content-Transfer-Encoding: 8bit
Content-Language: en-GB
Archived-At: <https://mailarchive.ietf.org/arch/msg/tcpm/Y_6LEVJmLDwU0DJseEOOQT2noLs>
Subject: Re: [tcpm] I-D Action: draft-ietf-tcpm-accurate-ecn-13.txt - ACK Thinning
X-BeenThere: tcpm@ietf.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: TCP Maintenance and Minor Extensions Working Group <tcpm.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/tcpm>, <mailto:tcpm-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/tcpm/>
List-Post: <mailto:tcpm@ietf.org>
List-Help: <mailto:tcpm-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/tcpm>, <mailto:tcpm-request@ietf.org?subject=subscribe>
X-List-Received-Date: Fri, 06 Nov 2020 14:01:14 -0000

I see this rev introduced a section on ACK Filtering, on which I have 
specific comments. We need to consider the cost of sending ACKs - 
especially anything that results in them being large. However, this 
needs to be done by choosing a default at the receiver, not as an 
adaptation. If adaptation is needed, it should be to introduce more 
ACKs when this is beneficial, as in some cases for DAAS, not to try to 
infer something from delayed or “missing” ACKs.

1) So let me say first that I think the suggestion for updating the 
receiver to thin ACKs, sending an ACK every 2-6 packets, seems 
reasonable.

     “MUST immediately send an ACK once 'n' CE marks have arrived since
       the previous ACK, where 'n' SHOULD be 2 and MUST be no greater
       than 6.”

I do agree that endpoints can do much more to reduce the return path 
capacity that they consume - and that the endpoints have the opportunity 
to encode the feedback information more efficiently than can be done by 
a device (or queue) in the network. From our experiments, n=10 would 
likely also often be OK.
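
For concreteness, a minimal sketch of how a receiver might combine the 
quoted rule with an ordinary delayed-ACK policy (Python, purely 
illustrative - the names and the delayed-ACK threshold here are my own 
assumptions, not taken from the draft):

    DELAYED_ACK_MAX = 2      # ordinary delayed-ACK threshold (assumed)
    CE_ACK_THRESHOLD = 2     # 'n': SHOULD be 2, MUST be no greater than 6

    class AckState:
        def __init__(self):
            self.unacked_segments = 0   # data segments since the last ACK
            self.ce_since_last_ack = 0  # CE marks seen since the last ACK

        def on_data_segment(self, ce_marked):
            """Return True if an ACK should be sent immediately."""
            self.unacked_segments += 1
            if ce_marked:
                self.ce_since_last_ack += 1
            # Immediate ACK once 'n' CE marks have arrived since the
            # previous ACK.
            if self.ce_since_last_ack >= CE_ACK_THRESHOLD:
                return self._send_ack()
            # Otherwise the normal delayed-ACK policy applies.
            if self.unacked_segments >= DELAYED_ACK_MAX:
                return self._send_ack()
            return False

        def _send_ack(self):
            self.unacked_segments = 0
            self.ce_since_last_ack = 0
            return True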

2) Although in some cases a receiver can predict a good “n”, I do not 
believe a receiver can typically find out how to choose “n” by 
observing the ACKs on a connection.

Sure, if the path is modelled as a simple bit-congestive path with a 
queue, then you could monitor the timing/loss of packets and determine 
a rate that tries not to fill the queue. I do not know how common such 
a bottleneck link is.

I do know that many bottleneck links are much more complex, and that 
the effect of scheduling the transmission of small ACK packets on 
return-path performance needs to be considered. In many radio 
technologies, the overhead from transmitting ACKs is proportionally 
much higher than for data segments. These layer 2s are often 
constrained in terms of packets (e.g. where lower-layer framing and 
transmission scheduling are an important consideration for the 
efficiency/cost of using the service). This has often been “fixed” 
using ACK Thinning or PEPs. Protocols that send fewer ACKs per data 
packet are not impacted.
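
As a rough back-of-envelope illustration of how the return-path packet 
load scales with the ACK ratio (the segment size, ACK size and forward 
rate below are assumptions picked only for this example):

    DATA_SEG_BYTES = 1500     # forward data segment size (assumed)
    ACK_BYTES = 78            # pure ACK with a large AccECN Option (assumed)
    FORWARD_RATE_MBPS = 100   # forward data rate (assumed)

    data_pkts_per_s = FORWARD_RATE_MBPS * 1e6 / 8 / DATA_SEG_BYTES
    for n in (2, 6, 10):
        ack_pkts_per_s = data_pkts_per_s / n
        ack_kbps = ack_pkts_per_s * ACK_BYTES * 8 / 1e3
        print("n=%2d: %7.0f ACKs/s, %6.0f kb/s on the return path"
              % (n, ack_pkts_per_s, ack_kbps))

On a layer 2 that is constrained in packets rather than bits, it is the 
ACKs/s column that matters, which is why a larger “n” helps well beyond 
the saving in bytes.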


Scheduling (at the IP layer) can also often result in variations in 
timing - something that is hard to predict and that I would suggest 
makes automated selection of “n” very hard - and the need to thin is 
often not directly visible to a user of the path. Traffic multiplexed 
with protocols that use different ACK policies (perhaps QUIC?) also 
makes this sort of detection less reliable.

3) A queue might be competing for scheduling opportunities against other 
IP flows (e.g., WFQ), or competing for transmission opportunities with a 
shared resource (common in radio links).

ACKs can and do take transmission opportunities away from other 
services that share the same resource pool. This can be reduced by 
some form of per-flow scheduling with deeper per-flow queues - but 
that in itself can induce unwanted coupling between flows, limiting 
the peak rate of one flow to achieve a higher ACK ratio for another.

Consider a QUIC flow sharing a return path with a TCP flow: the 
higher rate of larger QUIC ACKs reduces the transmission opportunities 
for TCP ACKs, which in turn suppresses the TCP flow, limiting the 
overall system performance. If the ACKs use the same transmission 
resource, they could take away directly from transmission 
opportunities in the forward direction, but even if they do not, the 
rate of ACKs returned impacts the sending rate end to end.

4) The new section adds some cross-layer requirements on sub-IP layers, 
which seem to imply more complexity. I’m not comfortable with such a 
proposal.

It seems that the text suggests that these systems should not 
implement thinning, and that they should implement logic to detect the 
presence of the option by observing SYNs and then configure the link 
to do something special. I don't think this is the correct approach - 
the result will be throttling of the connection speed. Is it necessary 
that the sub-IP layer does not implement thinning for connections that 
use AccECN feedback?
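
To illustrate the sort of per-flow, cross-layer logic this seems to 
imply, here is a hypothetical sketch (the flag test reflects my reading 
of the AccECN SYN negotiation - AE, CWR and ECE all set - and 
everything else, including the omitted expiry of per-flow state, is 
invented for the example):

    accecn_flows = set()   # per-flow state the link device would need to
                           # keep (expiry/cleanup omitted here)

    def may_thin(flow_id, syn, ack, ae, cwr, ece):
        """Return True if the link's ACK thinner may coalesce or drop
        this segment. flow_id is the 4-tuple; syn/ack/ae/cwr/ece are the
        TCP header flag bits."""
        if syn and not ack and ae and cwr and ece:
            # Looks like an AccECN setup SYN; remember the flow.
            accecn_flows.add(flow_id)
            return False                 # never thin the SYN itself
        # The text would have the link preserve every ACK (and its
        # timing) on flows that attempted AccECN, thinning only the rest.
        return flow_id not in accecn_flows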

When the text says: “SHOULD preserve the timing of each ACK” 
- does that imply that nodes need to discard rather than delay packets 
when link scheduling or transmission scheduling is used? This seems a 
very odd requirement; it might be better to simply strip out the 
AccECN option, or to discard all packets that set it.

I can’t see how queuing the ACKs, or deleting ACKs rather than queuing 
them would improve the performance of AccECN TCP connections.

Gorry