Re: [tcpm] A longer follow-up on my comment on draft-gomez-tcpm-delack-suppr-reqs

Gorry Fairhurst <gorry@erg.abdn.ac.uk> Thu, 30 April 2020 08:36 UTC

To: Bob Briscoe <in@bobbriscoe.net>
Cc: tcpm IETF list <tcpm@ietf.org>
References: <CAK6E8=dpmjf9MDQQVSkURcxnFrB3ZK_zqUDyoKs=MJgeLCeqvQ@mail.gmail.com> <8c3aca0d-0ea6-69f6-5ad8-fbdc984de77b@erg.abdn.ac.uk> <42a218f6-3d9d-25f1-9e00-476144b671fc@bobbriscoe.net>
From: Gorry Fairhurst <gorry@erg.abdn.ac.uk>
Message-ID: <e3fa357f-84d3-ebcb-ac93-1e83e1c9930e@erg.abdn.ac.uk>
Date: Thu, 30 Apr 2020 09:36:47 +0100
Archived-At: <https://mailarchive.ietf.org/arch/msg/tcpm/9akmWY3qvAy7tgrat9uo8Uj-i5k>

See some more explanation in line.

On 29/04/2020 22:58, Bob Briscoe wrote:
> Gorry,
>
> On 29/04/2020 18:54, Gorry Fairhurst wrote:
>>
>>
>> First of all thank you for bringing this topic to the WG. I do think 
>> it is important that we discuss what ACKs are used for, how we expect 
>> them to be used and what the practical implications are for using 
>> ACKs across Internet paths.
>>
>> About the slides, I didn't have much time at the Mic. so maybe I 
>> sounded rather harsh, but I would suggest we do need to set a 
>> relatively high bar to adopting a piece of work in this space ... and 
>> we do need data and real understanding of what we are trying to fix 
>> and how we can avoid making it worse by accident in other cases.
>>
>> We went through a rather lengthy discussion in tsvwg on a similar 
>> (but different) topic regarding immediate negative acknowledgment for 
>> SCTP (RFC 7053, SACK-IMMEDIATELY Extension for SCTP) and found there 
>> were a few key places where for that protocol this did make sense.
>>
>> However, I actually don't believe your list of issues (i). Here is 
>> why... (I am interested if you think differently or know more).
>>
>> I've tried to roughly follow your points:
>>
>> * Slow start
>> - I agree Delayed ACKs increase slow start, the more you delay the 
>> bigger the impact. However, the impact is for cases where ACKs are 
>> delayed on intervals comparable with or more than the Path RTT.
> I appreciate you taking time to qualify your initial harsh words. I 
> would ask you to still take even more time over this review. People 
> listen to you on this subject, and I think you are glossing over large 
> areas of the issue too quickly.
>
>> - DAASS is a simple fix, that appears to help a lot when there is 
>> data waiting for the arrival of the delayed ACK. 
>
> Any heuristic at the receiver to detect the end of a pattern of 
> start-up such as slow-start, ossifies that pattern into the Internet. 
> Sender control of the DelACK ratio removes that ossification.
>
I initially thought the same: DAASS tried to fix multiple problems. 
However, as regards DAASS reducing the wait for the ACK during the 
initial rounds of cwnd growth, I think the key understanding is the 
sender's use of byte accounting (for stretch ACKs) rather than counting 
individual ACKs. (That might imply a need for some burst mitigation for 
stretch ACKs, because you reduce ACK clocking.)
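
To illustrate the byte-accounting point, here is a rough sketch (not taken 
from any stack) comparing slow-start growth when the sender adds one MSS 
per ACK received with growth when it credits every byte acknowledged. The 
function name, the MSS, the initial window of 10 and the idealised 
loss-free rounds are all assumptions, and the byte counting here is 
uncapped, unlike the per-ACK limit in the ABC RFC.

```python
MSS = 1460  # bytes; assumed segment size


def slow_start_rounds(target_segments, ack_every, byte_counting, iw=10):
    """Count idealised slow-start round trips needed to reach a target cwnd.

    ack_every:     receiver sends one cumulative ACK per this many segments.
    byte_counting: True  -> cwnd grows by the bytes acknowledged (uncapped);
                   False -> cwnd grows by one MSS per ACK received.
    iw:            assumed initial window, in segments.
    """
    cwnd = iw * MSS
    rounds = 0
    while cwnd < target_segments * MSS:
        segments = cwnd // MSS             # segments sent this round
        acks = -(-segments // ack_every)   # ceiling division: ACKs returned
        if byte_counting:
            cwnd += segments * MSS         # credit every acknowledged byte
        else:
            cwnd += acks * MSS             # credit one MSS per ACK
        rounds += 1
    return rounds


if __name__ == "__main__":
    print("per-packet ACKs, ACK counting:", slow_start_rounds(100, 1, False))
    print("ACK every 2, ACK counting:    ", slow_start_rounds(100, 2, False))
    print("ACK every 2, byte counting:   ", slow_start_rounds(100, 2, True))
```

With byte accounting the number of rounds no longer depends on the ACK 
ratio in this model, but each stretch ACK then releases a larger burst, 
which is where the burst mitigation would come in.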

I like what I understand Chrome's QUIC does: in the first 100 received 
packets, send an ACK for each packet (could be each pair of packets). 
This rapidly grows a cwnd of about 140 packets (assuming exponential 
growth). Once the cwnd is >100, an ACK covering 10 packets/segments 
becomes a small proportion of the cwnd (the additional delay from 
per-packet ACKs is small). The receiver can then move to a larger 
default ACK ratio.
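
A minimal sketch of that receiver policy, as I have described it here 
rather than as Chrome actually implements it: the function names and the 
default thresholds (100 packets, then 1 ACK per 10) are assumptions for 
illustration, and the delayed-ACK timer that a real receiver also needs 
is left out.

```python
def should_ack(packets_received, unacked, early_threshold=100, late_ratio=10):
    """Decide whether the receiver sends an ACK for the packet just received.

    packets_received: total packets received so far, including this one.
    unacked:          packets received since the last ACK, including this one.
    """
    if packets_received <= early_threshold:
        return True                 # start-up phase: ACK every packet
    return unacked >= late_ratio    # later: one ACK per late_ratio packets


def count_acks(total_packets, early_threshold=100, late_ratio=10):
    """Count the ACKs sent for a loss-free flow of total_packets packets."""
    acks = 0
    unacked = 0
    for n in range(1, total_packets + 1):
        unacked += 1
        if should_ack(n, unacked, early_threshold, late_ratio):
            acks += 1
            unacked = 0
    return acks


if __name__ == "__main__":
    for size in (50, 100, 200, 1000):
        print(size, "packets ->", count_acks(size), "ACKs")
```

The ACK load is front-loaded onto the start of the flow, so a long 
transfer pays close to one ACK per ten packets overall.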

>
>> ABC is important also (well what I think people understand by ABC - 
>> rather than the RFC on this).
>
> Yes.
>
>> * High bit rate environments and short data segments
>>
>> If you really don’t want delayed ACKs, the protocol using TCP ought 
>> to disable Nagle. Or you would perhaps use DAASS... I’m not sure 
>> which problem you speak about?
>>
>>
>> * Beyond classic ACK transmission behavior
>> I do agree that Delayed ACKs preclude using sender behaviors intended 
>> to quickly and non-intrusively probe for available capacity during 
>> slow start. This is important to me, but I am very wary about the 
>> idea of sending more ACKs to do this, and we need to be careful to 
>> gain sufficient information about what the receiver has seen - not 
>> just to get a bunch of separate (each cumulative) ACKs. This really 
>> doesn’t give much fidelity to the sender. I’d be happy to talk more 
>> about what might be better!
>
> Indeed, ACK frames in QUIC that give the arrival times of data packets 
> are an alternative - as long as they are not stretched beyond the time 
> at which they would have been useful.
>
Interesting, I think.
> But we are more limited for space in TCP. As long as a TCP server is 
> serving bulk and short flows, it can put more effort into receiving 
> more ACKs during flow startup, and save its processing power by using 
> less frequent ACKs for the majority of its data - transmitted in 
> congestion avoidance.
>
>
Also seems true.
>> * IoT scenarios
>> I can see the argument of an ACK energy cost for the IoT device. At 
>> least, I would assume IoT devices can be tuned, and the apps they 
>> talk to can be tuned. Appropriate guidance helps!
>
> Tuning is for a scenario, not for widely differing scenarios. The 
> message of this draft is that a connection sometimes needs to be able 
> to adapt its ACK ratio at run-time. Not every IoT device is always 
> deployed for just one scenario. Neither is its peer. To me this says 
> adaptation mustn't be complex (and doesn't have to be).
>
Heterogeneity of the network segments is tricky. Some optimisations 
could be handled by link-specific methods (header compression is perhaps 
an example); other things need to be adapted end-to-end. However, 
end-to-end transport has to work for all types of path (links) - and 
often does not know the path.
>>
>> * Bursty Apps
>> I think you can add varying workloads as something important. By 
>> which I mean apps that do transactional stuff, or are controlled by 
>> applications where the data transmission need varies. When we looked 
>> at CWV and the ideas that the applications control the traffic 
>> patterns in one group of applications, we also noted that these 
>> applications change the way they use the network. That’s important 
>> for a timely restart to growing the cwnd (etc). Such applications can 
>> also be very sensitive to network delay.  These are probably also of 
>> interest!
>
> Yes.
>
>>
>> So....  the reason for hesitancy overall is I see this as tricky 
>> space to get correct.
>
> But trickiness is the point of starting this requirements document, 
> isn't it?
> Please can you articulate better why you seem negative about this 
> exercise?
>>
>> There are many places where fewer ACKs are good - if you have per ACK 
>> interrupt costs - if the capacity consumed by ACKs is important - the 
>> cost of sending in the link technology is high - etc. This is 
>> complicated for TCP (and much less so for QUIC - because QUIC has a 
>> notion of pacing; QUIC’s loss recovery is different; and QUIC’s ACKs 
>> are not easily thinned, at least at present). Any TCP method has to 
>> live with networks that experience pain from more ACKs, but also may 
>> deploy (various) mitigations.
>
> Yes. But reading between the lines I hear you say "Don't start this 
> exercise".
> Pls explicitly articulate your implicit message.
>
The slide asked about adopting this work - my words were against doing 
that at that time with that revision of this document.

I miss seeing presentations with results and showing the thinking. I  
wonder if we need a specific agenda slot to focus on such topics. I 
expect we'll learn a lot from some data from a variety of 
problem-spaces/use-cases, especially if the timeslot is sufficient to 
include thinking and experience from people outside of TCPM. (That's 
another thought - and probably should be another thread).

>>
>> There are also cases where you’d like more information than a delayed 
>> ACK provides (e.g. chirping and similar probing methods) and there 
>> are application pathologies that would love an ACK at the end of a 
>> packet burst (be that 1 or 100s of packets). Any method has to live 
>> with a wide variety of paths and applications.
>>
>> Happy to discuss more. And I suspect other people may also respond on 
>> list. We have lots of data for QUIC and TCP - but most of our data 
>> focusses on testbeds or on actual broadband satellite services.
>
> Thanks for elaborating.
>
>
>
> Bob
Gorry
>
>
>>
>> Gorry
>>
>>
>>
>