Re: [tsvwg] I-D Action: draft-ietf-tsvwg-ecn-l4s-id-13.txt

Bob Briscoe <ietf@bobbriscoe.net> Mon, 08 March 2021 22:08 UTC

From: Bob Briscoe <ietf@bobbriscoe.net>
To: Tom Henderson <tomh@tomh.org>
Cc: tsvwg IETF list <tsvwg@ietf.org>, "De Schepper, Koen (Nokia - BE/Antwerp)" <koen.de_schepper@nokia-bell-labs.com>
References: <161403126355.2878.575062349927307577@ietfa.amsl.com> <08b94ee3-f0f5-086e-2c89-fe6bc48baf12@tomh.org> <2a571150-ae32-8451-6103-9f76bc99bee0@bobbriscoe.net>
Message-ID: <de323032-3fa5-7e18-5ca9-2208df5ad215@bobbriscoe.net>
Date: Mon, 08 Mar 2021 22:07:52 +0000
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/ALX9uX9dsiBpZfkNn60udxf1YxI>

Tom,

I've stitched in your responses to the points I got to last night, then 
continued inline, tagged [BB2]...

On 08/03/2021 01:54, Bob Briscoe wrote:
> Tom,
>
> On 07/03/2021 20:24, Tom Henderson wrote:
>> Koen and Bob,
>>
>> Below are some comments on draft-ietf-tsvwg-ecn-l4s-id-13 for your 
>> and the working group's consideration.
>>
>> Title
>>
>> The title of this draft suggests that the scope is narrowly defining 
>> the identifier of L4S semantics, but the draft covers much more than 
>> this; in fact, perhaps it could more accurately be described as an 
>> L4S protocol specification.  At the end of the abstract, the draft 
>> states "This specification defines the rules that L4S transports and 
>> network elements need to follow...", i.e. a protocol.  It also gets 
>> into operational considerations and open questions for experimentation.
>>
>> Perhaps a broader title such as "Explicit Congestion Notification 
>> (ECN) Protocol for Ultra-Low Queuing Delay (L4S)" would better match 
>> the contents.
>
> [BB] That's a good idea. I like that. Unless anyone shouts, I'll 
> change it to that in my local copy.
> I intend to post it in the morning, when the IETF servers open.
> It contains some queued up changes that resulted from Koen's Prague 
> Requirements survey - at least those that seemed non-controversial.
> And it will contain the changes from your review below.
>
>>
>> Abstract
>>
>> 1) If the title is changed to reflect a broader scope, the first 
>> sentence of the abstract should also be changed accordingly.
>>
>> 2) "L4S uses an Explicit Congestion Notification (ECN)
>>    scheme that is similar to the original (or 'Classic') ECN approach."
>>
>> Would it be clearer to state that it uses an ECN scheme similar to 
>> the scheme described in RFC 8257 DCTCP (which IMO is distinct from 
>> the original ECN approach)?
>
> [BB] This is probably nearly as true, but the purpose of this sentence 
> was to help readers who might already know the RFC3168 scheme. I think 
> that's likely to be a much larger set than those who know DCTCP. 
> (Think of all the training courses and Web pages that describe the IP 
> header, and the function of the fields).
>
> RFC3168 defines the codepoints and the valid transitions of the ECN 
> protocol. The L4S ECN protocol keeps all that, and the semantics of 
> the codepoints. The main difference is that ECT(1) is no longer 
> equivalent to ECT(0), although they still both have the meaning of 
> ECN-capable transport and unmarked.
>
> So I think that warrants the word "similar".

[TH] In that case, perhaps just call out the differences as follows?

s/is similar to the original (or 'Classic') ECN approach/differs from 
the original (or 'Classic') ECN approach in the following ways/

[BB] That loses the notion that it is largely the same scheme, with 
exceptions.

I've done this, both at the start of the abstract, and the Intro:
Old:

    This specification defines the identifier to be used on IP packets
    for a new network service called low latency, low loss and scalable
    throughput (L4S).  L4S uses an Explicit Congestion Notification (ECN)
    scheme that is similar to the original (or 'Classic') ECN approach.


New:

    This specification defines the protocol to be used for a new network
    service called low latency, low loss and scalable throughput (L4S).
    L4S uses an Explicit Congestion Notification (ECN) scheme at the IP
    layer that is similar to the original (or 'Classic') ECN approach,
    except as specified within.



>
>>
>> 1. Introduction
>>
>> 1) See comments 1 and 2 of Abstract; applicable here as well.
>
> [BB] Yup to 1 again, but same comment about 2.
>
>>
>> 2) The goal of less than 1 ms average queuing delay probably should 
>> be qualified; e.g. something like "on networks not subject to 
>> multiple access delays above these thresholds"
>
> [BB] OK. How about "queuing delay... due to e2e congestion control"
> Meaning if no possible alteration to the behaviour of the e2e CC can 
> reduce a certain type of queuing delay (e.g. queuing for medium 
> access) it's not due to e2e CC.

[TH] OK.

>
>>
>> 3) s/transport wire protocol/transport protocol/
>
> [BB] OK

[BB] Actually, I take that back. Re-reading the sentence, it was 
specifically distinguishing behaviour from the wire protocol, and 
concluding the wire protocol could not be used as the identifier.

>
>>
>> 4) s/prevent it from/prevent such queues from/
>
> [BB] Don't agree here.
> There's been no mention yet of any queues that "such queues" could 
> refer back to.
>
> The full clause was:
> "...isolate existing Classic traffic from L4S traffic to prevent it 
> from degrading"
>
> The 'it' here means the existing Classic traffic. If this was unclear 
> I'd happily change it, but I think it's OK as it is, isn't it?

[TH] I made the comment because upon first read, I didn't know which 
traffic 'it' referred to.  So perhaps change 'it' to 'the former'?

[BB] Done.

- Tom

[BB] Continuing to address the remainder...

>
>
>>
>> 1.1 Latency, Loss and Scaling Problem
>>
>> 1) "Latency is becoming the critical performance factor for many (most?)
>>    applications on the public Internet"
>>
>> perhaps soften this to "Latency control is becoming increasingly 
>> important for applications on the public Internet"

[BB] The critical performance factor means the limiting factor. Why do 
you think that is not true for the long list of applications that 
follows this statement (justifying the use of 'many')? The 
subsequent sentence justifies this by putting it in the context of more 
bandwidth offering diminishing returns. Again, is that not true?

>>
>> 2) It seems to me that the paragraph starting with "The DualQ 
>> solution was developed..." could be deleted.  Perhaps instead in the 
>> paragraph before, shorten to state "L4S isolation can be achieved 
>> with a queue per flow (e.g. [RFC8290]) or a DualQ 
>> [I-D.ietf-tsvwg-aqm-dualq-coupled]; both approaches are addressed in 
>> this document."  My rationale for suggesting this is that pros and 
>> cons of different AQMs seem out of scope for this document.

[BB] We decided to delete all the anti FQ stuff in the original draft. 
But this is the critical point that justifies the need for an identifier 
at the IP layer. If FQ did not involve the compromises given here, there 
would be no need for an L4S identifier.

>>
>> 1.2 Terminology
>>
>> 1) The sentence "So it takes longer" is a fragment that could be 
>> appended to the previous.

[BB] Grammar: It's a sentence with subject and predicate, not just a 
fragment.

>>
>> 2) the second paragraph of Classic Congestion Control definition 
>> doesn't seem to be about terminology.  Perhaps consider to move 
>> commentary that isn't about clarifying and distinguishing terms out 
>> of this section.

[BB] I'm happy to leave this where it is. The first para defines Classic 
CC with arm-wavy adjectives. The second para puts example numbers 
against the adjectives. I agree it seems incongruous, but it is really 
just a richer definition of the term. If this had been a research paper, 
I would have defined the maths of the response functions. But I would 
avoid that in an I-D, where each formula halves the readership (I think 
Stephen Hawking said that).


>>
>> 3) For Reno-friendly, rather than defining it as what it is not 
>> ('subset of traffic that excludes...'), define what it is (e.g. 
>> 'consistent with fairness goals, as described in RFC 2914, in the 
>> presence of other flows using congestion control based on RFC 5681').

[BB] RFC2914 was written before the concept of 'TCP-friendly' (or 
Reno-Friendly) was coined to be distinct from 'TCP-compatible'.

Whatever, some of the responses to the Prague Requirements Survey seem 
not to believe that a flow should have to fall back to Reno-Friendly anyway.
So, I'll start a separate thread on this.

>>
>> 2. Consensus Choice of L4S Packet Identifier:  Requirements
>>
>> IMO, this entire section about rationale probably belongs in a 
>> separate appendix or merged with existing Appendix B.  I would 
>> suggest to keep the main sections in this draft about specification 
>> as much as possible.

[BB] I'd like to hear the views of others on this.
I think it's important to highlight that there was a process to find the 
least worst choice.

>>
>> 3. L4S Packet Identification at Run-Time
>>
>> 1) Consider to change the title to 'L4S Packet Identification' (i.e. 
>> drop 'at Run-Time').

[BB] Done

>>
>> 2) Consider to change (for more clarity here):
>>
>>    "A network node that implements the L4S service normally classifies
>>    arriving ECT(1) and CE packets for L4S treatment."
>>
>> to
>>
>>    "A network node that implements the L4S service always classifies
>>    arriving ECT(1) packets as belonging to the L4S service, and, by 
>> default, classifies arriving CE packets as belonging to the L4S 
>> service, unless other heuristics as described below in Section 5.3 
>> are employed."

[BB] Good. Done.
I wasn't sure about "as belonging to the L4S service." So I kept "for 
L4S treatment", which seems more precise.
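
As a rough illustration of that classification rule, here is a minimal 
sketch. It is not text from the draft: the function names are 
hypothetical, and the `ce_looks_classic` flag is just a stand-in for 
whatever heuristic Section 5.3 might supply. The only facts relied on 
are the RFC 3168 codepoint values in the two low-order bits of the IP 
TOS / Traffic Class octet.

```python
from enum import IntEnum


class Ecn(IntEnum):
    """ECN codepoints as defined in RFC 3168 (two low-order bits)."""
    NOT_ECT = 0b00
    ECT_1 = 0b01
    ECT_0 = 0b10
    CE = 0b11


def ecn_field(tos: int) -> Ecn:
    """Extract the ECN field from the TOS / Traffic Class octet."""
    return Ecn(tos & 0b11)


def classify(tos: int, ce_looks_classic: bool = False) -> str:
    """Return "L4S" or "Classic" for an arriving packet.

    ce_looks_classic is a hypothetical input representing the outcome
    of an optional heuristic; it defaults to False, so CE goes to L4S.
    """
    ecn = ecn_field(tos)
    if ecn is Ecn.ECT_1:
        return "L4S"          # ECT(1) is always given L4S treatment
    if ecn is Ecn.CE:
        # CE gets L4S treatment by default, unless a heuristic objects
        return "Classic" if ce_looks_classic else "L4S"
    return "Classic"          # Not-ECT and ECT(0)
```

For example, `classify(0x01)` and `classify(0x03)` both give "L4S", 
while `classify(0x03, ce_looks_classic=True)` gives "Classic".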

>>
>> 4.  Prerequisite Transport Layer Behavior (the "Prague Requirements")
>>
>> I suggest to change most, if not all, uses of 'prerequisite' to 
>> 'requirement' or else avoid the term.  For instance, the title could 
>> be "Transport Layer Requirements".

[BB] Thanks. I've removed 'Prerequisite' from all the headings except 
4.2 & 4.3, which are structured as prerequisites for the sender setting 
ECT(1).

>>
>> 4.3 Prerequisite Congestion Response
>>
>> In general, in the first paragraph, I wonder if the congestion 
>> response could be defined more directly in a prescriptive manner for 
>> endpoints (and also for compliance detection in the network), such as 
>> stating that the endpoint must track its base RTT, and it must reduce 
>> its sending rate when it observes more than two reported CE marks per 
>> RTT, possibly averaged over some time scale.

[BB] Nooo! The last thing anyone wants is to be prescriptive about CC 
design. The 2 marks per RTT is an emergent and average outcome of the 
AIMD. It would be very jumpy and unstable if it actually tried to go 
straight to it. No known scalable CC directly aims for 2 marks per RTT. 
Just as there's nothing in Reno that takes the square root of the loss 
probability then aims for it. Also, I don't know why you're saying the 
endpoint must track its base RTT?
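
To show what "emergent" means here, a back-of-envelope sketch follows. 
It assumes the usual steady-state approximations, which are not stated 
in this thread: W = 1.22/sqrt(p) for Reno, and W = 2/p for a DCTCP-like 
scalable control (W in packets, p the loss or marking probability). The 
function names are made up for illustration. The number of congestion 
signals per RTT is then just p*W; neither control computes or targets 
it.

```python
def reno_signals_per_rtt(window: float) -> float:
    """Signals per RTT for Reno, assuming W = 1.22 / sqrt(p)."""
    p = (1.22 / window) ** 2      # invert the assumed steady-state law
    return p * window             # expected losses/marks per round trip


def scalable_signals_per_rtt(window: float) -> float:
    """Signals per RTT for a scalable CC, assuming W = 2 / p."""
    p = 2.0 / window              # invert the assumed steady-state law
    return p * window             # works out at 2, whatever the window


for w in (10, 100, 1000):
    print(w, reno_signals_per_rtt(w), scalable_signals_per_rtt(w))
```

Under these assumptions the scalable figure stays at 2 as the window 
grows, while the Reno figure shrinks in proportion to 1/W. Either way 
it is a long-run average that falls out of the dynamics.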

The whole idea is to specify only the most important desired outcome, 
then allow freedom in how implementers get there.
The desired outcome is to keep delay very low, utilization very high and 
controllability very high. And to make sure they stay that way into the 
future. No one wants higher delay, lower utilization or less 
controllability.

>>
>> 5.1. Prerequisite Classification and Re-Marking Behavior
>>
>> paragraph 3:  The term 'Classic drop' is undefined earlier and 
>> probably should be defined in the terminology section or clarified here.

[BB] I've reworded so the phrase 'Classic drop' isn't used or needed.

    Under persistent overload an L4S marking treatment MUST begin
    applying drop to L4S traffic until the overload episode has subsided,
    as recommended for all AQM methods in [RFC7567] (Section 4.2.1),
    which follows the similar advice in RFC 3168 (Section 7).  During
    overload, it MUST apply the same drop probability to L4S traffic as
    it would to Classic traffic.

    Where an L4S AQM is transport-aware, this requirement could be
    satisfied by using drop in only the most overloaded individual per-
    flow AQMs.  In a DualQ with flow-aware queue protection (e.g.
    [I-D.briscoe-docsis-q-protection]), this could be achieved by
    redirecting packets in those flows contributing most to the overload
    out of the L4S queue so that they are subjected to drop in the
    Classic queue.
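
A minimal sketch of how the first of those two paragraphs might look in 
an AQM's per-packet decision, for L4S packets only. This is 
illustrative, not from the draft: the function and parameter names are 
hypothetical, how `overloaded` is detected is left out entirely, and it 
is my assumption here that packets surviving the overload drop are 
still subject to the normal marking decision.

```python
import random


def l4s_packet_action(overloaded: bool,
                      classic_drop_prob: float,
                      l4s_mark_prob: float,
                      rng: random.Random) -> str:
    """Return "drop", "mark" or "forward" for one L4S packet."""
    # During overload, apply the same drop probability to L4S traffic
    # as would be applied to Classic traffic.
    if overloaded and rng.random() < classic_drop_prob:
        return "drop"
    # Assumption: ECN marking carries on for packets that are not dropped.
    if rng.random() < l4s_mark_prob:
        return "mark"
    return "forward"
```

With `overloaded=False` the drop branch is never taken, so L4S traffic 
only ever sees marking; once `overloaded` is set, L4S packets are 
dropped at the Classic rate until the flag clears.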

>>
>> 5.2. The Meaning of L4S CE Relative to Drop
>>
>> I wonder whether this section could be clarified by stating first 
>> that in a dual queue structure with FIFO scheduling, the stated 
>> relationship must be observed, but in a flow queue scheduled AQM, 
>> each flow queue operates independently in this regard? 

[BB] Thanks, yes that's better. However, I haven't said FIFO scheduling 
'cos it's not. Instead I have had to define applicability by what it's not.

    Unless an AQM node schedules application flows explicitly, the
    likelihood that the AQM drops ...



>> The title could also perhaps substitute 'Relationship' for 'Meaning'.

[BB] Now you've made me think about this, I'd prefer "The Strength of 
L4S CE Marking Relative to Drop".

>>
>> ----
>>
>> I did not have time to review the appendices today.

[BB] They were rather dated, but I have reviewed them myself (twice) in 
the last few months.

Thanks. This review has helped update quite a bit of the early but 
outdated thought framework that had been embedded in this draft when it 
was first written.




Bob



>>
>> - Tom
>

-- 
________________________________________________________________
Bob Briscoe                               http://bobbriscoe.net/