Re: [tcpm] [EXTERNAL] Re: Seeking WG opinions on ACKing ACKs with good cause

Christian Huitema <huitema@huitema.net> Sun, 11 July 2021 15:54 UTC

From: Christian Huitema <huitema@huitema.net>
Mime-Version: 1.0 (1.0)
Date: Sun, 11 Jul 2021 08:54:37 -0700
Message-Id: <474DCED0-622E-413F-A4A0-9539548A6377@huitema.net>
References: <2482da71-5b79-1933-f975-e46cd7661c39@bobbriscoe.net>
Cc: Yoshifumi Nishida <nsd.ietf@gmail.com>, tcpm@ietf.org, Mirja Kuehlewind <ietf@kuehlewind.net>, rs.ietf@gmx.at, nishida@sfc.wide.ad.jp
In-Reply-To: <2482da71-5b79-1933-f975-e46cd7661c39@bobbriscoe.net>
To: Bob Briscoe <ietf@bobbriscoe.net>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tcpm/M29cd1ev3J-cKNZuZvS_KF_qmZA>

 

> On Jul 11, 2021, at 4:42 AM, Bob Briscoe <ietf@bobbriscoe.net> wrote:
> 
> Christian,
> 
>> On 11/07/2021 06:38, Christian Huitema wrote:
>> 
>>> On 7/10/2021 2:13 PM, Bob Briscoe wrote:
>>> Yoshi, Richard, Mirja, tcpm list,
>>> 
>>> We came to a good consensus in late March on the conditions to place around ACKing ACKs in order to keep congestion feedback flowing, but to ensure ping-pong ACKs are rapidly damped as well as avoiding false DupACK detection. I have written that up in a rev of the draft, which I will post separately so as not to distract from the question I have here...
>>> ...While writing up, I realized we had glossed over a possibly more important point, when we proposed to change the number in the basic rule from 2 to 3, as follows:
>>> 
>>>       An AccECN Data Receiver MUST immediately send an ACK once 'n'
>>>       CE marks have arrived since the previous ACK, where 'n' SHOULD
>>>       be 3 and MUST be in the range 3 to 6 inclusive.
>>> 
>>> 
>>> I'd like to suggest an alternative:
>>> 
>>>       An AccECN Data Receiver MUST immediately send an ACK once 'n'
>>>       CE marks have arrived since the previous ACK. If there is new
>>>       data to acknowledge, 'n' SHOULD be 2.  If there is no new data
>>>       to acknowledge, 'n' SHOULD be 3 and MUST be no less than 3.
>>>       In either case, 'n' MUST be no greater than 6.
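
The alternative wording above can be read as a small decision rule on the Data Receiver. The following is only a sketch of that reading, not text from the draft: the class `CeAckTrigger`, its fields, and the method `on_packet` are hypothetical names, and it assumes the caller already knows whether there is new data to acknowledge when each packet arrives.

```python
from dataclasses import dataclass

N_MAX = 6  # "'n' MUST be no greater than 6" in either case


@dataclass
class CeAckTrigger:
    """Hypothetical helper that tracks CE marks since the previous ACK."""
    n_data: int = 2      # SHOULD be 2 when there is new data to acknowledge
    n_no_data: int = 3   # SHOULD be 3, MUST be >= 3 when there is none
    ce_since_ack: int = 0

    def __post_init__(self) -> None:
        # No mandatory lower bound in the data case, so 1 is allowed.
        if not 1 <= self.n_data <= N_MAX:
            raise ValueError("n_data must be in 1..6")
        if not 3 <= self.n_no_data <= N_MAX:
            raise ValueError("n_no_data must be in 3..6")

    def on_packet(self, ce_marked: bool, new_data_to_ack: bool) -> bool:
        """Return True if an ACK must be sent immediately."""
        if ce_marked:
            self.ce_since_ack += 1
        n = self.n_data if new_data_to_ack else self.n_no_data
        if self.ce_since_ack >= n:
            self.ce_since_ack = 0  # the ACK we are about to send resets the count
            return True
        return False


trigger = CeAckTrigger()
# Run of CE-marked data packets: an immediate ACK every 2 marks.
print([trigger.on_packet(True, True) for _ in range(4)])
# [False, True, False, True]
# Run of CE-marked pure ACKs: an immediate ACK only every 3 marks.
print([trigger.on_packet(True, False) for _ in range(6)])
# [False, False, True, False, False, True]
```

The sketch only covers the CE-triggered ACKs; the implementation's ordinary ACK ratio and delayed-ACK timer are outside it.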
>>> 
>>> 
>>> Rationale: The data ACKs case shouldn't be compromised by the ACKs of ACKs case.
>>> 
>>> When we were originally only thinking of the data ACKs case (before we realized this rule might trigger ACKs of ACKs), we recommended n=2  because it ensures congestion information is kept fresh, and during runs of congestion it generates more ACKs, which might make it more likely that at least one ACK survives some types of coalescing before the ACE field wraps. Also 2 is the default for the equivalent repetition of DCTCP feedback.
>>> 
>>> Then, we noticed the ACKs of ACKs case, and we wanted to damp any ACK ping-pong, so we recommended 3. Without really discussing the pros and cons, we extended 3 to all cases (both data ACKs and ACKs of ACKs). I suspect the main reason was that it is just simpler to have a single rule. But these two cases are likely to be controlled from different parts of the code anyway.
>>> 
>>> I've also realized that, in the data ACKs case, it was wrong to set a lower bound on 'n', let alone a /mandatory/ lower bound. There might be scenarios where it makes sense to trigger an ACK every CE mark; we have no reason to stop implementers doing that if they want. A lower bound could even be perversely read to mean that you're not allowed to immediately ACK data until you've received 'n' CE marks.
>>> 
>>> Thoughts? 
>> 
>> [CH] I am looking at this from the QUIC point of view. In general, frequent ACKs provide better control, but there are two points of tension: managing low bandwidth return paths; and, reducing the impact of ACK on performance.
>> 
>> There are a number of scenarios in which the return path used by ACKs has a much lower bandwidth than the data path. If ACKs are too frequent, the return path can become congested. ACKing every 2 or 3 packets is only sustainable if the return path bandwidth is not too small compared to the data path -- 1/10th is generally sustainable, but lower than 1/20th and the ACKs have to be spaced. The alternative to spacing would be increased RTT due to congestion on the return path, and random losses of ACKs. Of course, even with ACK spacing, there are limits to asymmetry: control becomes loose if the transport does not receive multiple ACKs per RTT.
>> 
>> ACK frequency has been found to be a key factor in the performance of QUIC implementations. For example, implementations may achieve 5 Gbps with spaced ACKs, but performance drops to a few hundred Mbps if we insist on an ACK for every 2 packets. High performance requires sending packets in batches, using APIs like UDP GSO, and performance seems best when the frequency of ACKs more or less matches the frequency of batches.
> 
> [BB] I think a brief summary of AccECN might be in order here - it seems like you're starting from scratch, but we're aware of all the above.
> 
> The implementation will have its own ACK ratio - that's out of scope of AccECN, except to set this max of 6 CEs, which is to mitigate wrap of the 3-bit counter of CE-marks (which in the worst case of 100% marking in the data direction could then induce 1 ACK per 6 packets). This shouldn't limit forward performance, because it only increases the reverse ACK rate if there is heavy congestion in the forward path, when the Data Sender should be reducing the forward rate anyway.

1 ACK per 6 packets would cause performance issues on high-speed links. A common setup is "4 to 8 ACKs per RTT", which means intervals much larger than 6 packets.
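
To put rough numbers on "much larger than 6 packets", here is a back-of-the-envelope sketch. The link rates, the 20 ms RTT and the 1500-byte packet size are illustrative assumptions, not figures from this thread, and `packets_per_ack` is a hypothetical helper name.

```python
def packets_per_ack(rate_bps: float, rtt_s: float,
                    packet_bytes: int, acks_per_rtt: int) -> float:
    """Average number of data packets covered by each ACK."""
    packets_per_rtt = rate_bps * rtt_s / (packet_bytes * 8)
    return packets_per_rtt / acks_per_rtt


RTT_S = 0.020          # assumed round-trip time
PACKET_BYTES = 1500    # assumed packet size

for rate_bps in (100e6, 1e9, 10e9):
    for acks_per_rtt in (4, 8):
        interval = packets_per_ack(rate_bps, RTT_S, PACKET_BYTES, acks_per_rtt)
        print(f"{rate_bps / 1e6:>7.0f} Mbps, {acks_per_rtt} ACKs/RTT: "
              f"about {interval:.0f} packets per ACK")
```

Under these assumptions, 8 ACKs per RTT only comes down to one ACK per 6 packets at about 28.8 Mbps; at 1 Gbps the same setting spaces ACKs roughly 200 packets apart.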


> 
> AccECN can also use the AccECN TCP Option that provides a much larger counter (24-bit), but AccECN has to work reasonably well with just the 3-bit counter in the main TCP header in case the option can't traverse the path. However, the Data Receiver doesn't know whether the option is reaching the other end, so it has to assume it might not be.

In QUIC, the ECN counters are 62 bits, so I guess the guidance can be much looser.


> 
> My original question asks what the recommended value of 'n' should be for how much the counter of CE-marked data packets should be allowed to increment between ACKs. We don't have to recommend anything, but it helps implementers if we do. AccECN is TCP only. So when we originally recommended 2, we figured that TCP's default ACK ratio is 2. So, even if ECN marking in the forward direction was 100%, a default increment of 2 CE marked packets ('n') wouldn't cause any more frequent ACKs than the default.

The default ACK ratio in TCP is one thing; the practical ratio is another. On many paths, middleboxes perform ACK thinning, pruning ACKs to improve performance. A three-bit counter is not very robust against such thinning. To be robust, the counter should be wide enough to accumulate an RTT's worth of ECN marks.
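
The robustness concern can be illustrated with a deliberately simplified model of a wrapping counter. This is a sketch under stated assumptions: `sender_view` and `bits_needed` are hypothetical names, the counter starts at zero, and the sender naively takes the modulo-8 difference between the counter values it sees. The initial counter value and the safety heuristics in the AccECN draft are not modelled.

```python
ACE_BITS = 3
ACE_MOD = 1 << ACE_BITS  # a 3-bit counter wraps at 8


def sender_view(ce_per_ack, survives):
    """Compare the true CE count with what a naive sender infers.

    ce_per_ack: CE marks newly covered by each ACK the receiver sends.
    survives:   whether each of those ACKs reaches the sender.
    Returns (true_total, inferred_total).
    """
    counter = 0      # receiver's running count of CE marks
    last_seen = 0    # last 3-bit value observed by the sender
    inferred = 0
    for ce, ok in zip(ce_per_ack, survives):
        counter += ce
        if ok:
            ace = counter % ACE_MOD
            inferred += (ace - last_seen) % ACE_MOD
            last_seen = ace
    return counter, inferred


def bits_needed(max_marks_between_acks: int) -> int:
    """Smallest counter width whose modulus exceeds the largest delta."""
    return max_marks_between_acks.bit_length()


acks = [2] * 12  # receiver ACKs every 2 CE marks, 24 marks in total
print(sender_view(acks, [True] * 12))                            # (24, 24)
print(sender_view(acks, [(i + 1) % 6 == 0 for i in range(12)]))  # (24, 8)
print(bits_needed(7), bits_needed(1667))                         # 3 11
```

In the second run a thinner keeps one ACK in six, so 12 marks accumulate between surviving ACKs and the naive sender sees only 8 of the 24. The last line uses an assumed figure of about 1667 packets per RTT (1 Gbps, 20 ms, 1500-byte packets), for which 11 bits would cover a fully marked RTT.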

> 
> Nonetheless, if the implementer has chosen a longer ACK ratio (or perhaps allowed the ratio to be controlled by the sender e.g. draft-gomez-tcpm-ack-rate-request) they don't have to use the recommended n=2; the proposed wording allows a range.
> 
> Don't worry, we certainly wouldn't "insist on an ACK for 2 packets".
> 
> 
> Also, FYI, AccECN provides feedback about ECN marking on the ACK stream, back to the Data Receiver who is generating the ACKs. It then has a hook on which it can hang feedback control of TCP's ACK stream without waiting for anyone else to deploy anything (it's up to the implementer or the IETF to add this if they want). However, the scope of the AccECN spec is deliberately wire protocol only, so it doesn't say what to do with this feedback.

Yes, using ECN on the ACK stream has potential. There has been very little research on congestion control for ACKs. I wish there were more.

-- Christian Huitema