[tsvwg] Questions and comments on draft-ietf-tsvwg-ecn-l4s-id-06

G Fairhurst <gorry@erg.abdn.ac.uk> Sun, 24 March 2019 18:12 UTC

Message-ID: <5C97C8A2.7020804@erg.abdn.ac.uk>
Date: Sun, 24 Mar 2019 19:12:50 +0100
From: G Fairhurst <gorry@erg.abdn.ac.uk>
To: tsvwg WG <tsvwg@ietf.org>
CC: Bob Briscoe <bob.briscoe@bt.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/9yvdeEOE6hluMOBLwWoJ1aQJ31g>

I have read and reviewed draft-ietf-tsvwg-ecn-l4s-id-06.
I hope this helps to clarify what is needed to complete this draft.

Gorry
(as an individual)

---

Questions:

Section 4.3
(1) “As with all transport behaviours, a detailed specification
will need to be defined for each type of transport or application”
- Does this require an RFC to do this? What type?

Section 4.3
(2) At the end of the para I expected to see a note about what if the 
traffic was not from a scalable congestion control. If rate-limited EF 
traffic is submitted to this queue it won’t be a scalable CC, but it 
could employ a shaper, or Circuit Breaker function that prevents it 
contributing to queueing. How is this going to be described? My point 
is that such traffic may be responsive to ECN, but it doesn’t need to 
scale when it is itself limited in what it is permitted to send. 
Section 5.4.1.1.1 seems to hint at this. A.1.6 speaks of scaling to 
smaller cwnds, though, so how does this fit?

Section 5.1
(3) “but there is no implication that such a mechanism is necessary.”
- I agree with the statement personally, but I am not sure we have 
consensus on this point. If we do, that would be fine; if we do not, then 
this should additionally be a topic to be considered in experimentation, 
perhaps?

Section 5.1
(4) In this context, it is not clear what is meant by: “field as CE for 
an increasing proportion of packets,” - increasing with respect to what 
condition?

(5) “if
the most recent ECT packet in the same flow was ECT(0), the node MAY
classify CE packets for classic ECN [RFC3168] treatment. “
- What happens if this is intentionally manipulated to try to disadvantage 
a flow? It seems that an off-path attacker could introduce rogue 
packets that influence this method. Please consider.
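
To make the concern in (5) concrete, here is a minimal sketch of such a per-flow classifier. The class name, the flow key and the queue labels are hypothetical, and the draft does not prescribe this structure; only the two-bit ECN codepoint values come from RFC 3168. It assumes CE defaults to L4S treatment, and shows that a single forged ECT(0) packet matching the flow is enough to divert later CE packets to classic treatment.

```python
# ECN codepoints from RFC 3168 (two low-order bits of the former ToS byte).
NOT_ECT, ECT1, ECT0, CE = 0b00, 0b01, 0b10, 0b11


class LastEctClassifier:
    """Hypothetical per-flow classifier; not taken from the draft."""

    def __init__(self):
        self.last_ect = {}  # flow key -> most recent ECT codepoint seen

    def classify(self, flow, ecn):
        if ecn in (ECT0, ECT1):
            self.last_ect[flow] = ecn
            return "L4S" if ecn == ECT1 else "classic"
        if ecn == CE:
            # Assumed default: CE gets L4S treatment unless the most
            # recent ECT packet in the same flow was ECT(0).
            return "classic" if self.last_ect.get(flow) == ECT0 else "L4S"
        return "classic"  # Not-ECT


clf = LastEctClassifier()
flow = ("192.0.2.1", 5000, "198.51.100.2", 443)  # example 4-tuple
before = [clf.classify(flow, ECT1), clf.classify(flow, CE)]
clf.classify(flow, ECT0)  # a single spoofed packet matching the flow
after = clf.classify(flow, CE)
print(before, after)
```

The state that decides CE treatment is written by any packet that matches the flow key, which is why a spoofed packet can influence it.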

Section 5.4.1.1
(6) The text in 5.4.1.1. does not start by saying that this implies an 
additional queue, and at least for me, this is still a little hard to 
unravel. I expect it is heading in the correct direction, but it is not 
yet clear what the architecture looks like.

Section 5:
(7) I’ve been trying to find the place where the advice is aimed at operations 
staff, rather than procurers/designers. It seems to me that the break 
is somewhere near 5.4 - am I correct? Is there any chance we can place a 
section header that has a useful heading for someone looking for this?

Section 6:
(8) Section 6 describes experiments but doesn’t give any hint of when 
the IETF would have sufficient experience to know whether this 
experiment is confirmed. What sort of things need 
experience?

(9) I wonder if appendices B and C need to be published. They suggest 
variations that the WG has not decided to take up and it would be unwise 
to further promote these. If we keep these, I suggest we separately 
review them to check their tone correctly conveys the final status of 
the WG consensus.

(10) I would like reassurance that we have consensus that the following 
two reactions are intentional and now form part of experimentation. I’d 
like to suggest that reaction to loss is not optional, and that loss must be 
treated as a congestion loss. I therefore query this text in Appendix A:
“Current DCTCP implementations react differently to this situation.
At least one implementation reacts only to the drop signal (e.g. by
halving the CWND) and at least another DCTCP implementation reacts to
both signals (e.g. by halving the CWND due to the drop and also
further reducing the CWND based on the proportion of marked packet).
We believe that further experimentation is needed to understand what
is the best behaviour for the public Internet, which may or not be
one of these existing approaches.”
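
The two behaviours described in the quoted text can be compared with a small sketch. The function names are hypothetical, and alpha stands for DCTCP's moving-average estimate of the fraction of marked packets. The mark-based step uses the usual DCTCP reduction cwnd * (1 - alpha / 2), which may differ in detail from what either implementation really does.

```python
def react_drop_only(cwnd, alpha):
    """React to the loss alone: halve cwnd and ignore the CE marks."""
    return cwnd / 2


def react_drop_and_marks(cwnd, alpha):
    """Halve cwnd for the loss, then apply the DCTCP mark-based reduction."""
    return (cwnd / 2) * (1 - alpha / 2)


cwnd, alpha = 100.0, 0.5  # illustrative values only
print(react_drop_only(cwnd, alpha))       # 50.0
print(react_drop_and_marks(cwnd, alpha))  # 37.5
```

In both variants the loss itself triggers at least a halving of the window; they differ only in whether the marks cause a further reduction.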

(11) In the same point it states: “Packet loss might (rarely) occur” - 
I’d argue that packet loss can ALWAYS occur in the case of overload, and 
that this is an important case that needs to be considered to avoid 
congestion collapse. The text and my assertion appear to potentially 
conflict.
---

The rest are detailed comments from carefully reading the text (i.e., NiTs):

“and low delay is maintained during high load.”
- I understand, but would it be clearer to say what has a high level of 
load?

“The performance improvement is so
great that it is motivating initial deployment of the separate parts
of this system.”
- this seems like a boast. Could you turn it around into a fact, such as: 
“Initial deployment of the separate parts of the system has been 
motivated by the performance benefits…”
- is there a reference?

Section 1.1 has the title “problem”, could this say “The Latency 
Problem”? - or something similar?

Is “ In the developed world,” acceptable as a phrase?

“Then Diffserv is of little use.” - could be quoted and misinterpreted, 
maybe better to say it can do little to reduce the latency?

“In general, AQMs” - I suggest “In general, AQM methods” is clearer 
than “AQMs”. This appears in several places.

“So, AQM was not widely deployed.” Is it better to say “So, this form of 
AQM was not widely deployed.”

“Flow-queuing” - needs a reference?

“Latency is not our only concern:”
- When published, I don’t think we should be stating an IETF position, 
please rephrase. Perhaps the editors mean L4S addresses more than 
reduced latency?

“The finer sawteeth have low amplitude” - perhaps not completely clear 
when read out of context, please add a few words around this such as 
“sawtooth in the congestion window” … or whatever makes sense.

“A supporting paper [DCttH15]” - please remove “supporting” because it 
does not support THIS specification.

“Low-Latency, Low-Loss and Scalable (L4S) service:”
- Missing a final bracket at the end of the para before the full stop.

“But it is also” - remove “but”?

“(DSCP [RFC2474])” - I think this should be “(DSCP) [RFC2474].”

“This document is intended for experimental status, so it does not
update any standards track RFCs.”
- Please replace by “When published, this document will provide an 
experimental specification. It does not
update any standards track RFCs.”

I can’t parse the following: “Ideally, the identifier for packets using 
the Low Latency, Low Loss,
Scalable throughput (L4S) service ought to meet the following
requirements:”
- I don’t think you can use “ideally” or “ought” in the sentence scoping 
the RFC-2119 keywords. Please rephrase.
- I note that you could choose to use “RECOMMEND” rather than “SHOULD” 
since this is a requirements specification. This use is not consistent 
across RFCs but can help to separate requirements from protocol actions 
in section 4.

“to allow this experiment (amongst others).” - True, but could be 
misinterpreted that other experiments are welcome, rather than all need 
to be specified via RFC process. Is this more neutral: “to allow 
experiments such as the one defined in this specification”.

This seems loose: “As a condition for a host to send packets with the 
L4S identifier
(ECT(1)), it SHOULD implement a congestion control behaviour that
ensures the flow rate is inversely proportional to the proportion of
bytes in packets marked with the CE codepoint. “
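
One way to read the quoted requirement is rate = k / p, where p is the proportion of CE-marked bytes. The sketch below uses that reading. The constant k and the function names are assumptions for illustration, and the Reno curve is the well-known approximation of about 1.22 / sqrt(p) packets per round trip, included only for contrast.

```python
import math


def scalable_window(p, k=2.0):
    """Packets per round trip, inversely proportional to marking proportion p."""
    return k / p


def reno_window(p):
    """Classic Reno approximation: about sqrt(3/2) / sqrt(p) packets per RTT."""
    return math.sqrt(1.5 / p)


for p in (0.01, 0.02, 0.04):
    print(p, scalable_window(p), round(reno_window(p), 1))
```

Doubling p halves the scalable rate, whereas the Reno rate halves only when p is quadrupled. In the scalable case the number of marks per round trip (p times the window) stays constant at k.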

“are examples of a scalable congestion controls.”
- remove /a/.

“A scalable congestion control MUST react…”.
- I agree, although I think you may first wish to point to the AQM BCP 
and say that “even though the congestion-controller is optimised to 
respond to congestion-experienced marks, it also needs to respond to 
packet loss [RFC7567].”
- I also have the same comment for A.1.3. to motivate why loss reaction 
is important.

Should this be with commas?:
“non-L4S but ECN-capable bottleneck”
be “non-L4S, but ECN-capable, bottleneck”

“while it temporarily falls back to coexist with
Reno .”
- remove the additional space. Would “during the time it” be better than “while”?

In Section 5:
“ Of course, a packet that carried both the ECT(1) codepoint and a
relevant non-ECN identifier would also be classified into the L
queue.”
- why “of course”? I can see why this can happen. However, does it HAVE 
to happen? Why can’t a system put all ECT(1) traffic in an L4S queue 
irrespective of the other classification, then return the remaining 
L-compatible traffic to the L-queue? Is that not also a valid approach? 
Or is this stating Diffserv rules here? Please clarify.

“be used by some network operators who believe they
identify non-L4S traffic that would be safe”
- Our ops area colleagues may be upset by “believe they” and would 
likely prefer “decide to”

“(and CE indicates that it could be).”
- perhaps:
- “(a CE-mark indicates the traffic could have been originally marked as 
either ECT(0) or ECT(1)).”

“ at a data rate that exceeds “
- in various places: remove /data/?

“for policy reason”
- suggest “for a policy reason“

“MUST NOT re-mark the end-to-end L4S identifier”
- suggest adding “(ECT(1))” here to avoid any confusion.

Section 8. I think there should be some discussion of what happens if an 
attacker introduces rogue ECT(1) packets. Can this influence the method, 
other than as an attack that seeks to induce congestion?

In Appendix B:

“In such cases, the L4S
service would have to drop rather than mark frames even though
they might contain an ECN-capable packet. “
- Aren’t all L4S packets ECN-capable? This seems to imply you could have an 
L4S packet that was not ECT(1) or CE, which is not true.

In B.1 - check all bullets end with “;”.

In B.2: “* CE would signify that the packet had been marked by an AQM
implementing the L4S service.”
- Really? Why only L4S, rather than an “ECN service”?

… I didn’t re-review the remainder in this pass.

Also:

“ With a RACK-like
approach, allowing longer before a loss is deemed to have occurred
maintains higher throughput in the presence of reordering {ToDo:
Quantify this statement}.”
- missing text.
