Re: [tcpm] A longer follow-up on my comment on draft-gomez-tcpm-delack-suppr-reqs

Bob Briscoe <in@bobbriscoe.net> Fri, 01 May 2020 16:46 UTC

To: Gorry Fairhurst <gorry@erg.abdn.ac.uk>
Cc: tcpm IETF list <tcpm@ietf.org>
References: <CAK6E8=dpmjf9MDQQVSkURcxnFrB3ZK_zqUDyoKs=MJgeLCeqvQ@mail.gmail.com> <8c3aca0d-0ea6-69f6-5ad8-fbdc984de77b@erg.abdn.ac.uk> <42a218f6-3d9d-25f1-9e00-476144b671fc@bobbriscoe.net> <e3fa357f-84d3-ebcb-ac93-1e83e1c9930e@erg.abdn.ac.uk>
From: Bob Briscoe <in@bobbriscoe.net>
Message-ID: <fb32c080-deee-f92f-2aa5-ef40caaf9a8d@bobbriscoe.net>
Date: Fri, 01 May 2020 17:46:35 +0100
Archived-At: <https://mailarchive.ietf.org/arch/msg/tcpm/JVn8CCFA88_SU9ytaP2kIit8GpU>
Subject: Re: [tcpm] A longer follow-up on my comment on draft-gomez-tcpm-delack-suppr-reqs

Gorry,

On 30/04/2020 09:36, Gorry Fairhurst wrote:
> See some more explanation in line.
>
> On 29/04/2020 22:58, Bob Briscoe wrote:
>> Gorry,
>>
>> On 29/04/2020 18:54, Gorry Fairhurst wrote:
>>
>>> - DAASS is a simple fix, that appears to help a lot when there is 
>>> data waiting for the arrival of the delayed ACK. 
>>
>> Any heuristic at the receiver to detect the end of a pattern of 
>> start-up such as slow-start, ossifies that pattern into the Internet. 
>> Sender control of the DelACK ratio removes that ossification.
>>
> I initially thought the same, DAASS tried to fix multiple problems. 
> However, as regards DAASS reducing the wait for the ACK to a rounds 
> of initial cwnd growth, I think the key understanding is the sender 
> use of byte accounting (for stretch-ACKs) not individual ACKs, (that 
> might imply a need for some burst-mitigation for stretch ACKs, because 
> you reduce ACK clocking).
>
> I like what I understand Chrome's QUIC does:  In the first 100 
> received packets, send an ACK for each packet (could be each pair of 
> packets). This rapidly grows a cwnd of about 140 packets (assuming 
> exponential growth). Once the cwnd is >100 then an ACK covering 10 
> packets/segments becomes a small proportion of the cwnd (the 
> additional delay from per-packet ACKs is small). The receiver can 
> move to a larger default ACK ratio.

[BB] If it's likely that we need to move beyond slow-start behaviour in 
future, building heuristics around slow-start is the worst possible step 
we could take. That's the ossification I'm talking about.

When we implemented paced chirping, it was really hard not to confuse 
the heuristics in Linux receivers. Linux receivers are built to detect 
the end of today's slow-start behaviour. Our sender wasn't using 
anything like slow start behaviour, which triggered their heuristics 
much too early.

Magically increasing the DelAck ratio after 100 packets is another 
behaviour we must not encourage.

Sender control of the DelAck ratio can be really simple. Then we have a 
simple solution that will last for ever.
We must not dig the heuristics and magic numbers hole any deeper.
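
To illustrate how little the receiver would need to do, here is a
minimal sketch in Python. It is not taken from any draft or stack: the
Receiver class, set_ack_ratio() and run() are hypothetical names, the
delayed-ACK timer is not modelled, and how the sender would signal the
ratio on the wire is left open. The numbers 100 and 10 are borrowed from
the QUIC example quoted above, but here they sit in the sender's
schedule, so a different sender can choose different ones.

```python
from dataclasses import dataclass


@dataclass
class Receiver:
    """Toy receiver that sends one ACK per `ack_ratio` data packets."""
    ack_ratio: int = 2      # classic delayed ACK: every 2nd packet
    _unacked: int = 0       # data packets received since the last ACK
    acks_sent: int = 0

    def set_ack_ratio(self, ratio: int) -> None:
        # Hypothetical hook: called when the sender requests a new ratio.
        if ratio < 1:
            raise ValueError("ack ratio must be >= 1")
        self.ack_ratio = ratio

    def on_data(self) -> bool:
        """Process one data packet; return True if an ACK is sent."""
        self._unacked += 1
        if self._unacked >= self.ack_ratio:
            self._unacked = 0
            self.acks_sent += 1
            return True
        return False


def run(schedule: dict, total_packets: int) -> int:
    """Deliver `total_packets` packets and return the number of ACKs sent.

    `schedule` maps a packet index to the ratio the sender requests
    just before that packet arrives (an assumption of this sketch).
    """
    rx = Receiver()
    for i in range(total_packets):
        if i in schedule:
            rx.set_ack_ratio(schedule[i])
        rx.on_data()
    return rx.acks_sent


if __name__ == "__main__":
    # Sender asks for per-packet ACKs at first, then 1 ACK per 10 packets.
    print(run({0: 1, 100: 10}, 200))
    # No requests at all: the receiver stays at its default ratio of 2.
    print(run({}, 200))
```

The receiver holds no knowledge of slow-start or any other start-up
pattern; it only counts packets and applies whatever ratio it was last
asked for.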


>
>>
>>> ABC is important also (well what I think people understand by ABC - 
>>> rather than the RFC on this).
>>
>> Yes.
>>
>>> * High bit rate environments and short data segments
>>>
>>> If you really don’t want delayed ACKs, the protocol using TCP ought 
>>> to disable nagle. Or you would perhaps use DAASS... I’m not sure 
>>> which problem you speak about?
>>>
>>>
>>> * Beyond classic ACK transmission behavior
>>> I do agree that Delayed ACKs preclude using sender behaviors 
>>> intended to quickly and non-intrusively probe for available capacity 
>>> during slow start. This is important to me, but I am very wary about 
>>> the idea of sending more ACKs to do this, and we need to be careful 
>>> to gain sufficient information about what the receiver has seen - 
>>> not just to get a bunch of separate (each cumulative) ACKs. This 
>>> really doesn’t give much fidelity to the sender. I’d be happy to 
>>> talk more about what might be better!
>>
>> Indeed, ACK frames in QUIC that give the arrival times of data 
>> packets are an alternative - as long as they are not stretched beyond 
>> the time at which they would have been useful.
>>
> Interesting, I think.
>> But we are more limited for space in TCP. As long as a TCP server is 
>> serving bulk and short flows, it can put more effort into receiving 
>> more ACKs during flow startup, and save its processing power by using 
>> less frequent ACKs for the majority of its data - transmitted in 
>> congestion avoidance.
>>
>>
> Also seems true.
>>> * IoT scenarios
>>> I can see the argument of an ACK energy cost for the IoT device. At 
>>> least, I would assume IoT devices can be tuned, and the apps they 
>>> talk to can be tuned. Appropriate guidance helps!
>>
>> Tuning is for a scenario, not for widely differing scenarios. The 
>> message of this draft is that a connection sometimes needs to be able 
>> to adapt its ACK ratio at run-time. Not every IoT device is always 
>> deployed for just one scenario. Neither is its peer. To me this says 
>> adaptation mustn't be complex (and doesn't have to be).
>>
> Heterogeneity of the network segments is tricky. Some optimisations 
> could be handled by link-specific methods - header compression perhaps 
> is an example, other things need to be adapted end-to-end. However, 
> end-to-end transport has to work for all types of path (links) - and 
> often does not know the path.

[BB] Sender control of the receiver's DelAck ratio would be always 
better and never worse in these cases. It might not solve the lack of 
path knowledge. But wherever path knowledge can be gathered, sender 
control of the receiver's DelAck ratio would then be useful/necessary.

Then, any sender can implement "link-specific methods" unilaterally (or 
not, if it chooses). Sender-receiver co-ordination is an important piece 
that's missing.
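
As one possible example of a policy a sender might apply unilaterally,
the sketch below picks a ratio so that a single ACK never covers more
than a tenth of the congestion window, capped at 10 packets. The
function name choose_ack_ratio and both parameters are hypothetical, and
the values 10 and 10 are only illustrative (they echo the "cwnd > 100,
ACK covering 10 packets" example quoted earlier). The point is that the
policy belongs to the sender, which can replace it without any change at
the receiver.

```python
def choose_ack_ratio(cwnd_packets: int, divisor: int = 10, cap: int = 10) -> int:
    """Return the DelAck ratio a sender might request from the receiver.

    One ACK covers at most cwnd/divisor packets, bounded to [1, cap].
    Both `divisor` and `cap` are illustrative sender-side choices.
    """
    if cwnd_packets < 1:
        raise ValueError("cwnd must be at least 1 packet")
    return max(1, min(cap, cwnd_packets // divisor))


if __name__ == "__main__":
    for cwnd in (10, 40, 100, 1000):
        print(cwnd, choose_ack_ratio(cwnd))
```

A sender with knowledge of its path (for instance, a costly return link)
could substitute a different function here.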

You seem to be generating confusion. There is still good reason to add a 
generally useful missing capability (sender control of receiver DelACK 
ratio), whether or not other capabilities might sometimes be missing in 
certain odd scenarios.

Do one thing and do it well.


>>>
>>> * Bursty Apps
>>> I think you can add varying workloads as something important. By 
>>> which I mean apps that do transactional stuff, or are controlled by 
>>> applications where the data transmission need varies. When we looked 
>>> at CWV and the ideas that the applications control the traffic 
>>> patterns in one group of applications, we also noted that these 
>>> applications change the way they use the network. That’s important 
>>> for a timely restart to growing the cwnd (etc). Such applications 
>>> can also be very sensitive to network delay.  These are probably 
>>> also of interest!
>>
>> Yes.
>>
>>>
>>> So....  the reason for hesitancy overall is I see this as tricky 
>>> space to get correct.
>>
>> But trickiness is the point of starting this requirements document, 
>> isn't it?
>> Please can you articulate better why you seem negative about this 
>> exercise?
>>>
>>> There are many places where fewer ACKs are good - if you have per 
>>> ACK interrupt costs - if the capacity consumed by ACKs is important 
>>> - the cost of sending in the link technology is high - etc. This is 
>>> complicated for TCP (and much less so for QUIC - because QUIC has a 
>>> notion of pacing; QUIC’s loss recovery is different; and QUIC’s ACKs 
>>> are not easily thinned, at least at present). Any TCP method has to 
>>> live with networks that experience pain from more ACKs, but also may 
>>> deploy (various) mitigations.
>>
>> Yes. But reading between the lines I hear you say "Don't start this 
>> exercise".
>> Pls explicitly articulate your implicit message.
>>
> The slide asked about adopting this work - my words were against doing 
> that at that time with that revision of this document.
>
> I miss seeing presentations with results and showing the thinking. I  
> wonder if we need a specific agenda slot to focus on such topics. I 
> expect we'll learn a lot from some data from a variety of 
> problem-spaces/use-cases, especially if the timeslot is sufficient to 
> include thinking and experience from people outside of TCPM. (That's 
> another thought - and probably should be another thread).

[BB] I certainly agree with the need to base this on experience.
However, all we have at the moment is experience of protocols /without/ 
sender control of the DelAck behaviour of the receiver. That's useful 
experience, but only to demonstrate the /problem/.

It would be good to hear some empirical experience of protocols /with/ 
sender control.


>
>>>
>>> There are also cases where you’d like more information than a 
>>> delayed ACK provides (e.g. chirping and similar probing methods) and 
>>> there are application pathologies that would love an ACK at the end 
>>> of a packet burst (be that 1 or 100s of packets). Any method has to 
>>> live with a wide variety of paths and applications.
>>>
>>> Happy to discuss more. And I suspect other people may also respond 
>>> on list. We have lots of data for QUIC and TCP - but most of our 
>>> data focusses on testbeds or on actual broadband satellite services.
>>

[BB] As above. Data about the existing Internet without the proposed 
capability is useful to characterize the problem. But we need to be aware 
of its limitations for characterizing the solution space proposed here.


Bob


>> Thanks for elaborating.
>>
>>
>>
>> Bob
> Gorry
>>
>>
>>>
>>> Gorry
>>>
>>>
>>>
>>> _______________________________________________
>>> tcpm mailing list
>>> tcpm@ietf.org
>>> https://www.ietf.org/mailman/listinfo/tcpm
>>

-- 
________________________________________________________________
Bob Briscoe                               http://bobbriscoe.net/