Re: [tsvwg] Review of draft-ietf-tsvwg-nqb-21

Bob Briscoe <ietf@bobbriscoe.net> Tue, 19 December 2023 14:16 UTC
Message-ID: <32461a5d-c174-4d43-ae81-fcdfd91c49a9@bobbriscoe.net>
Date: Tue, 19 Dec 2023 14:16:02 +0000
To: Greg White <g.white@CableLabs.com>, Thomas Fossati <Thomas.Fossati@linaro.org>, Ruediger GEIB <Ruediger.Geib@telekom.de>
Cc: tsvwg IETF list <tsvwg@ietf.org>
References: <E8AB6C25-76EB-4F89-B8CC-DE0FB8B6B688@CableLabs.com>
From: Bob Briscoe <ietf@bobbriscoe.net>
In-Reply-To: <E8AB6C25-76EB-4F89-B8CC-DE0FB8B6B688@CableLabs.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/e18PZQjkvHw7TpoTDmG-TLLwz7c>
Subject: Re: [tsvwg] Review of draft-ietf-tsvwg-nqb-21 - Nits
Greg,

On 17/12/2023 01:07, Greg White wrote:
>
> Hi Bob,
>
> I’ve made a series of edits to tackle the “Nits” first.  Also please 
> see a few responses below, marked [GW].  I adopted all suggested 
> changes unless marked otherwise here.
>
> -Greg
>
> *From: *Bob Briscoe <ietf@bobbriscoe.net>
> *Date: *Tuesday, December 12, 2023 at 11:18 AM
> *To: *Greg White <g.white@CableLabs.com>, Thomas Fossati 
> <Thomas.Fossati@linaro.org>, Ruediger GEIB <Ruediger.Geib@telekom.de>
> *Cc: *tsvwg IETF list <tsvwg@ietf.org>
> *Subject: *Review of draft-ietf-tsvwg-nqb-21
>
> Greg, Thomas, Rüdiger,
>
>
> Here's my review of draft-ietf-tsvwg-nqb-21, which I understand is 
> approaching WGLC shortly.
> I haven't read the draft through in one sitting for some time. So, I'm 
> afraid my review is long, but I've tried to suggest text for all the 
> nits, which are by far the largest contributor to the length.
>
> ------------------------------------------------------------------------
>
> [snip]
>
> ------------------------------------------------------------------------
>
>
>     Nits
>
>
>         Throughout:
>
> §1 Intro "...managed by an end-to-end congestion control algorithm. 
> Many of the commonly-deployed congestion control algorithms, such as 
> Reno, Cubic or BBR, are designed to seek the available capacity of the 
> end-to-end path..."
> §3.2 "...it has not been used for these purposes end-to-end across the 
> Internet."
> §3.2 "...meeting the performance requirements of an application in an 
> end-to-end context "
> §3.2 "These mechanisms can be difficult or impossible to implement in 
> an end-to-end context."
> §3.2 "...the NQB PHB ... could conceivably be deployed end-to-end 
> across the Internet."
> §3.2 "...the performance requirements of applications cannot be 
> assured end-to-end,"
> §4.4: "End-to-end usage and DSCP re-marking"
> §4.4 "...this PHB is expected to be used end-to-end across the Internet,"
> §4.4 "...To ensure reliable end-to-end NQB PHB treatment,"
> Appx A. "...it will severely limit the ability to provide NQB 
> treatment end-to-end."
>
> In the IETF transport area, "end-to-end" normally means 'between the 
> end hosts without network involvement'. Perhaps 'whole-path' or other 
> alternatives could be used, except in the very first case above?
>
> [GW] In the context of DSCP PHBs, the term has a defined meaning. Per 
> RFC2475:
> Use of the term "end-to-end" in a PHB definition should be interpreted
> to mean "host-to-host" for consistency.
>
> [GW] That said, rather than explaining that here (or assuming the 
> reader has read and remembered that detail from RFC2475), it seems 
> that I can eliminate the term or replace it with another relatively 
> easily.
>

[BB] Fine.

>
> s/this draft/this document/ (2 occurrences)
>
>
>         Abstract
>
> "properties and characteristics" just one or the other would be 
> sufficient, wouldn't it?
>
>
>         §1. Intro
>
> "microflows (see [RFC2475] for the definition of a microflow)"
> Summarize cross-reference: Surely it would be worth briefly restating 
> this definition here, e.g. "microflows (application-to-application 
> flows [RFC2475])."
> I went to RFC2475 to check whether it was somehow different to the 
> well-understood definition of microflow.
>
> "...managed by a classic congestion control algorithm (as defined in 
> [RFC9330]),"
> Summarize cross-reference: As this is not a term that Diffserv 
> engineers are likely to have come across, surely the referenced 
> definition ought to be summarized here, e.g. "(one that coexists with 
> standard Reno congestion control [RFC5681])"
>
> "...to effectively use the link..."
> I tripped up on this, initially assuming the alternative meaning of 
> effectively (as in "it's effectively worthless"). How about:
>     "...to use the link effectively..."
>     "...to use the link efficiently..."
>
> "Active Queue Management (AQM) mechanisms (such as PIE [RFC8033], 
> DOCSIS-PIE [RFC8034], or CoDel [RFC8289])..."
> It would be better to say specifically that you are talking about 
> single queue AQMs here:
> "Active Queue Management (AQM) mechanisms intended for single queues 
> (such as PIE [RFC8033], DOCSIS-PIE [RFC8034], or PI2 [RFC9332])..."
> and I think it would be preferable to leave mention of CoDel until 
> FQ-CoDel later in the para - it would be controversial to imply that 
> CoDel manages a single queue well (potentially with a large number of 
> flows).
>
> [GW] I added “intended for single queues” and reference to PI2 as 
> suggested, but left CoDel in, since the sentence mentions “can improve 
> QoE, but there are practical limits”, which I think is true for 
> CoDel.  If CoDel performance is controversial, leaving it out of the 
> list relieves it not only of the praise but also the criticism.
>

[BB] I don't think anyone is going to mind if you only refer to FQ-CoDel 
and not CoDel as well.

>
> "If the AQM attempted to control the queue much more tightly, 
> applications using those algorithms would not perform well."
> Unnecessarily vague. How about "...would not fully utilize the link"?
>
> "but these [FQ] are not appropriate for all bottleneck links, due to 
> complexity or other reasons."
> Rather than writing as if the IETF is pronouncing on this, how about
> "but not all operators think they are appropriate for all bottleneck 
> links, due to complexity or other reasons."
>
>
>         §3. "Context"
>
> You wouldn't normally expect an introductory section headed 'Context' 
> to contain normative requirements. However, a couple of 'SHOULD's 
> appear in the last para of §3.3. It might be better to shift that 
> whole last para into the relevant requirements section (e.g. as a new 
> subsection after §§4.2 & 4.3 which are also about mixtures of 
> codepoints). Then state at the beginning of §3 that the whole section 
> is informative only. However, perhaps I'm being too purist.
>
> [GW] Point taken.  Also the first paragraph in Sec. 3.3 discusses 
> requirements on PHB implementations (though doesn’t introduce any new 
> ones directly).  That said, I’m going to leave this one as is for now. 
> I think the discussion of the relationship to L4S fits well in the 
> context section, and I don’t think the 3.3 requirements fit well in 
> section 4 (or section 5 for the first paragraph items).  The first 
> sentence of 3.3 states that NQB is defined independently of L4S (which 
> I think is the right way to handle it), and moving those requirements 
> into 4 (or 5) would start to remove that independence. If others agree 
> with you that this text needs to move, I’ll reconsider.
>

[BB] OK.

>
>         §3.1. Non-Queue-Building Behavior
>
> CURRENT:
> "highly unlikely to exceed the available capacity of the network path 
> between source and sink."
> PROPOSED:
> Add "...even at an inter-packet timescale." or similar wording.
> REASON: people usually think of data rate as an averaged measure.
>
>
>         §3.2. Relationship to the Diffserv Architecture
>
> CURRENT:
> "and given no reserved bandwidth other than the bandwidth that it 
> shares with Default traffic."
> PROPOSED:
> "and given no reserved bandwidth other than any minimum bandwidth that 
> it shares with Default traffic."
> REASON:
> The current wording implies that all operators always give Default 
> some reserved bandwidth.
>
> CURRENT:
> "Instead, the goal of the NQB PHB is to provide statistically better 
> loss, latency, and jitter performance for traffic that is itself only 
> an insignificant contributor to those degradations."
> PROPOSED:
> "Instead, the sole goal of the NQB PHB is to isolate NQB traffic from 
> other traffic that degrades loss, latency and jitter, given that the 
> NQB traffic is itself only an insignificant contributor to those 
> degradations."
> REASON:
> The current wording implies that the PHB provides the better 
> performance, which contradicts the statement in the introduction that 
> it is NQB senders that provide the better performance, not the PHB. It 
> would be worth repeating that here.
> (see also similar comment later about the first para of §5.1 "Primary 
> Requirements")
>
> CURRENT:
> "...relatively low data rates"
> PROPOSED:
> "...relatively low and smooth data rates"
>
> CURRENT:
> "The main distinctions between NQB and EF are discussed in Appendix B."
> Summarize cross-reference: It would be useful to give a summary 
> sentence here {Note 2 was my first attempt, but it's too long}.
>
> [GW] I’m struggling to come up with a useful single sentence summary 
> of Appendix B. Given that in this case the cross reference is in the 
> same document, it doesn’t seem to me to be overly burdensome to ask the 
> reader to jump there to find that material.
>

[BB] How about:
CURRENT:
     The main distinctions between NQB and EF are discussed in Appendix 
B <https://datatracker.ietf.org/doc/html/draft-ietf-tsvwg-nqb-21#EF>.
PROPOSED
     "Unlike EF, NQB has no requirement for a guaranteed minimum rate, 
nor to police incoming traffic to such a rate, and NQB is expected to be 
treated with the same priority as Default (see Appendix B 
<https://datatracker.ietf.org/doc/html/draft-ietf-tsvwg-nqb-21#EF> for 
details)."
RATIONALE:
I felt the appendix took rather a long time getting to the point, 'cos 
the first 3 paras are all about the technicality of whether Default 
traffic has a minimum rate, which isn't really the important difference. 
The last 2 paras of the Appendix are better, so I thought a summary in 
the body would be the quickest way to fix this (rather than reworking 
the entire appendix).


>         §4.1. Non-Queue-Building Sender Requirements
>
> CURRENT:
> "Microflows that align with the description of behavior in the 
> preceding paragraphs in this section SHOULD be identified to the 
> network using a Diffserv Code Point (DSCP) of 45 (decimal) so that 
> their packets can be queued separately from QB microflows."
> PROPOSED:
> "Microflows that mark their packets using a Diffserv Code Point (DSCP) 
> of 45 (decimal) SHOULD  align with the description of behavior in the 
> preceding paragraphs in this section, so that their packets can be 
> queued separately from QB microflows with minimal harm to other NQB 
> traffic."
> REASONING:
> The current wording is the wrong way round. It shouldn't recommend 
> that all traffic that behaves like NQB has to be marked as NQB. 
> Otherwise it would be saying that most EF, CS5, etc traffic SHOULD be 
> marked as NQB instead.
>
> [GW] The intent of this sentence was to say two things: 1) decimal 45 
> SHOULD be used as the NQB DSCP 2) the NQB DSCP should only be used on 
> microflows that align with “the description”.  So, I think breaking 
> those two concepts into separate sentences would achieve the goal and 
> address your issue.  I’ve got:  “Microflows that are marked with the 
> NQB DSCP SHOULD align with the description of behavior in the 
> preceding paragraphs in this section. Applications are RECOMMENDED to 
> use the Diffserv Code Point (DSCP) 45 (decimal)  to mark microflows as 
> NQB.”
>

[BB] Much better.
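
As an illustration only (not text from the draft), here is a minimal sketch of how an application might apply the recommended marking to a UDP socket. The helper names `dscp_to_tos` and `open_nqb_udp_socket` are hypothetical. It assumes an IPv4 socket on a platform that exposes `IP_TOS` (IPv6 would use the traffic class option instead), and it leaves the two ECN bits at zero.

```python
import socket

NQB_DSCP = 45  # decimal value recommended in the draft for NQB microflows


def dscp_to_tos(dscp: int) -> int:
    """Place a 6-bit DSCP in the upper six bits of the IPv4 TOS byte.

    The lower two bits are the ECN field and are left as zero here.
    """
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in the range 0..63")
    return dscp << 2


def open_nqb_udp_socket() -> socket.socket:
    """Hypothetical helper: create a UDP/IPv4 socket marked with the NQB DSCP.

    Assumes the platform supports the IP_TOS socket option.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(NQB_DSCP))
    return sock
```

Marking a socket this way only identifies the microflow as NQB; per the sentence agreed above, the sender still has to behave as described in §4.1 for the marking to be appropriate.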

>
>         §4.2. Aggregation...
>
> I'd prefer to see the last para shifted to after the first. These are 
> the two paras with normative requirements in them, then the others are 
> sort-of mitigations and exceptions. Also, the last para highlights the 
> difference between treating NQB traffic as if it's Default, and 
> re-marking it to be Default, which is the big important point here.
>
> However, I can also see that the 3 paras in the middle at the moment 
> relate more to the first para. So if the authors think the logical 
> flow would be better as it is, I won't fight for this.
>
>
>         Retitle §4.2 & §4.3 (Aggregation of the NQB DSCP into another
>         PHB; and Aggregation of other DSCPs into the NQB PHB)?
>
> The (unspoken) distinction between these two sections seems to be more 
> that:
>
>   * §4.2 is about typically uncongested core networks, where
>     separation from Default (or another similar PHB) might not be
>     necessary,
>   * §4.3 is really about where a PHB isolated from Default has been
>     provided, and could be used for an aggregate of classes that would
>     all benefit from such isolation.
>
> The current titles focus on what *name* a PHB started with before it 
> was aggregated, which is a bit academic, because aggregates don't 
> necessarily bear the name of any of the classes they consist of (e.g. 
> the Elastic aggregate).
>
> [GW] Actually, the distinction is really what the **PHB** is, as 
> opposed to what the **aggregate** is. (Note, PHBs aren’t aggregated, 
> service classes are). So, (e.g.) in a node that supports the Default 
> and EF PHBs (say a default queue and a strict high priority queue) 
> these sections would recommend that NQB be aggregated with Default.  
> On the other hand, in a node that supports the NQB PHB (a 
> non-prioritized queue that shares capacity with the default PHB) it 
> could be OK to classify EF traffic into the NQB queue.  So, I think 
> the current titles are appropriate (though I’ll make the small edit 
> s/in/into/).
>

[BB] Everything you say here is true. But I didn't find it useful to 
separate the space by what queues an operator has deployed, rather than 
by what the operator is trying to achieve. They are not going to deploy 
an NQB queue then turn round and say, "Oh my word, I've just found an 
NQB PHB lying around. Now what can we aggregate into it?"

I can see now that just changing the titles wouldn't suffice - the two 
categories I suggest would require some material to be moved between the 
sections. To avoid such upheaval, perhaps a smaller change can be made:
CURRENT:
     This is particularly useful in cases where specialized PHBs for 
these other service classes are not provided.
PROPOSED:
     This is particularly useful in cases where specialized PHBs for 
these other service classes had not been provided at a potential 
bottleneck, perhaps because it was too complex to manage traffic 
contracts and conditioning.
REASON:
First, the current text doesn't mention whether it is talking about a 
highly aggregated core or potentially congested access. And second, the 
proposed text gives a motivation for why NQB might be added when queues 
for other classes hadn't been added.

> Perhaps:
> §4.2. Aggregation of the NQB DSCP without isolation from Default traffic
> §4.3. Aggregation of the NQB DSCP preserving isolation from Default 
> traffic
> Strictly, §4.2 also discusses aggregation with real-time, instead of 
> Default, but I've assumed that such detail doesn't need to be 
> explained in the section heading.
>
>
>         §4.3. Aggregation of other DSCPs in the NQB PHB
>
> If you prefer not to change the section headings as above, pls consider...
> s/in/into/
> because my parser tripped up on 'in' (and 'into' is consistent with 
> the §4.2 heading).
>
>
>         §4.4.1.
>
> s/occuring/occurring/
>
>
>         §4.5. The NQB DSCP and Tunnels
>
> "reordering-sensitive tunnel protocol"
> Summarize cross-reference: An example of one would be useful, or an 
> explanation of the implications. §4.1 of RFC2983 gives examples, but 
> we shouldn't have to read references to get at least a grasp of what 
> this draft is talking about.
>
>
>         §4.5. The NQB DSCP and Tunnels
>
> "In the case of the pipe model, any DSCP manipulation (re-marking) of 
> the outer header by intermediate nodes would be discarded at tunnel 
> egress, potentially improving the possibility of achieving NQB 
> treatment in subsequent nodes."
> The contrary could equally apply. If the DSCP re-marking of the outer 
> was part of an interconnection contract, it could well have been 
> designed to preserve the NQB treatment in downstream domains.
>
> [GW] Ok, changed it to “…. In some cases, this could improve the 
> possibility of achieving NQB treatment in subsequent nodes, but in 
> other cases it could degrade that possibility (e.g. if the re-marking 
> was designed specifically to preserve NQB treatment in downstream 
> domains).”
>

[BB] Good.
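
To illustrate why the outcome can go either way, here is a minimal sketch (an illustration only, not text from the draft). The `TunnelPacket` type, the `egress_dscp` function and the `LOCAL_DSCP` value are hypothetical. The "uniform" branch assumes the uniform model of RFC 2983, where the outer DSCP is propagated to the inner header at egress; the "pipe" branch discards the outer DSCP.

```python
from dataclasses import dataclass

NQB_DSCP = 45      # NQB codepoint (decimal), as in the draft
DEFAULT_DSCP = 0   # Default codepoint
LOCAL_DSCP = 40    # hypothetical domain-local value, for illustration only


@dataclass
class TunnelPacket:
    outer_dscp: int  # DSCP in the encapsulating header (may be re-marked in transit)
    inner_dscp: int  # DSCP in the encapsulated header (untouched in transit)


def egress_dscp(pkt: TunnelPacket, model: str) -> int:
    """Return the DSCP the forwarded packet carries after decapsulation."""
    if model == "pipe":
        return pkt.inner_dscp   # outer header re-marking is discarded
    if model == "uniform":
        return pkt.outer_dscp   # outer header re-marking is propagated
    raise ValueError(f"unknown tunnel model: {model}")


# Case 1: an intermediate node bleached the outer header to Default.
bleached = TunnelPacket(outer_dscp=DEFAULT_DSCP, inner_dscp=NQB_DSCP)

# Case 2: an interconnection contract re-marked the outer header to the
# NQB DSCP on purpose, while the inner header still carries a local value.
contract = TunnelPacket(outer_dscp=NQB_DSCP, inner_dscp=LOCAL_DSCP)

for name, pkt in (("bleached", bleached), ("contract", contract)):
    print(name, "pipe ->", egress_dscp(pkt, "pipe"),
          "| uniform ->", egress_dscp(pkt, "uniform"))
```

In case 1 the pipe model restores the NQB marking that was lost in transit; in case 2 it throws away a re-marking that was meant to preserve NQB treatment downstream.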

>
>         §5. NQB PHB Requirements
>
> This section is built on the goal of incentive alignment. In the IETF, 
> there ought to be consensus on incentive-alignment as a goal, but I 
> have detected that some IETF participants dismiss incentive alignment 
> if a network service does not *also* protect against malicious attack 
> (or accidents). So perhaps the following would be a useful 
> introductory sentence...
> "Incentive alignment ensures a system is robust to the behaviour of 
> the large majority of individuals and organizations who can be 
> expected to act in their own interests (including application 
> developers and service providers who act in the interests of their 
> users). Malicious behaviour is not necessarily based on rational 
> self-interest, so incentive alignment is not a sufficient defence, but 
> the large majority of users do not act out of malice. Protection 
> against malicious attacks (and accidents) is addressed in Section 5.2 
> on Traffic Protection and Section 11 on Security Considerations 
> summarizes it."
> This could replace the parenthesis later in the first para:
>     "(this is discussed further in this section and Section 11 
> <https://datatracker.ietf.org/doc/html/draft-ietf-tsvwg-nqb-21#Security>)"
> which read oddly to me, because it doesn't explain why the split 
> between this section and section 11 anyway.
>
>
>         §5.1. Primary Requirements
>
> "...the NQB PHB makes no guarantees ..., but instead aims to provide 
> an upper-bound to queuing delay for as many such marked microflows as 
> it can."
> This contradicts the premise that the network does not provide the low 
> delay - the low queueing delay is provided by the collective action of 
> NQB senders, and the PHB only isolates them from QB traffic. Just the 
> mention of an upper-bound gives the wrong impression that a max 
> queuing delay number can be calculated. And 'for as many such marked 
> microflows as it can' is also strange, given that delay variation 
> tends to reduce for links designed for more flows (but admittedly not 
> when more flows are used in a link than it is designed for).
> Indeed, I'm not sure that this is the right place for any of the first 
> para. It does not define a requirement. The first part about making no 
> guarantees could be moved to the introduction. And perhaps the second 
> part after the comma can just be dropped?
>
>
>         §5.1. Primary Requirements
>
> CURRENT:
> "An exception to this recommendation is discussed in Section 4.4.1."
> PROPOSED:
> "An exception to this recommendation for traffic sent towards a 
> non-DS-capable domain is discussed in Section 4.4.1."
> REASON: Summarize cross-reference: A short description helps the reader 
> more than just a bare section number.
>
>
>         §5.1. Primary Requirements
>
> CURRENT:
> "e.g., a deficit round-robin scheduler with equal weights"
> " (e.g. with equal DRR scheduling)"
> I couldn't tell whether DRR was the main focus of these examples, or 
> the fact that the weights are equal.
>
> For equal preference, the weights of a DRR scheduler have to be 
> proportionate to the aggregate rates of each class of traffic. That's 
> hard to determine with capacity-seeking traffic, but NQB is not 
> capacity-seeking:
>
>   * Each NQB application is meant to limit its instantaneous rate to
>     within a small proportion of typical total capacity - the draft
>     suggests 1%. So giving NQB 50% forwarding preference effectively
>     gives it much more preference than it needs. Certainly, we need to
>     allow for the possibility of multiple NQB flows, therefore
>     multiples of 1%. And certainly, it would be reasonable to give NQB
>     somewhere close to the maximum proportion of capacity it might
>     need, or a little more, so that its queuing delay remains low
>     relative to Default traffic.
>
>       o For example, if there's 6Mb/s of unresponsive NQB traffic
>         scheduled by 50:50 DRR into a 100Mb/s link, and the balance is
>         consumed by 10 capacity-seeking QB flows, that is probably
>         fine. However, a 45Mb/s unresponsive but smooth NQB flow could
>         also take advantage of a 50:50 scheduler. Then the 10 flows
>         would share the balance, getting 5.5Mb/s each.
>       o This inequality isn't a problem per se, but it is problematic
>         to hold up 50% scheduler weight as somehow a golden fraction.
>       o It would be more justifiable to say that:
>
>           + the scheduler weight for NQB traffic ought to be at least
>             proportionate to the fraction of capacity that NQB is
>             likely to use, and ideally a little higher than the
>             highest likely fraction (to ensure low worst-case queuing
>             delay).
>           + But then the text should admit that the fraction of
>             capacity for NQB flows is likely to be hard to ascertain
>             in a low-stat-mux environment, so it is instead suggested
>             that a fraction like 20% - 50% would be a reasonable
>             maximum scheduler weight for NQB;
>           + Then it needs to say "The exact value is unimportant as
>             long as it's high enough," because NQB is app-limited, so
>             if its weight is too high, the unused capacity can be
>             borrowed by capacity-seeking traffic.
>           + But it shouldn't be excessive, otherwise it gives more
>             leeway for greedy abuse by NQB traffic.
>
>   * Taking Low Latency DOCSIS as another example, it uses the DualQ
>     [RFC9332] and doesn't comply with "The node SHOULD provide a
>     scheduler ... that treats the two classes equally", because it
>     gives 90% to the L queue.
>
>       o Admittedly:
>
>           + the 90% has to serve L4S as well as NQB traffic.
>           + the coupling between the DualQ AQMs ensures that L4S
>             traffic nearly always uses less than its 90% scheduler
>             weight, so RFC9332 recommends any high fraction, saying
>             "its exact value is unimportant", because it merely
>             ensures low delay for the L queue, not bandwidth shares
>
>       o However,  when there is no L4S traffic, the full 90% is
>         available to NQB.
>       o My point is that 90% is a good figure in this case, and 15%
>         might be a good figure in another case (e.g. with only NQB and
>         no L4S).
>       o But the take-home message is that 50% is not a golden number.
>
>
> Wouldn't it be less distracting to use an example scheduler that 
> doesn't share capacity explicitly, but instead acts on time? For instance:
>
>   * two Wireless Multimedia (WMM) Access Categories (ACs) with the
>     same EDCA parameters.
>
> [GW] Also, Appendix B mentioned “… effectively receives a rate 
> guarantee of 50% ...”, I’ve made that now “… could effectively receive 
> a rate guarantee of (e.g.) 50% …”
>
> [GW] My handling of this one likely warrants a review to see if I’ve 
> captured your thoughts adequately, please see new line 246 in: 
> https://github.com/gwhiteCL/NQBdraft/commit/33dbf848036a295ec6c7886bd30a2d032be560f6
>

[BB] The inserted clarification is good, but I'd suggest that it 
disrupts the logical flow for too long. I think it would be better as a 
subsequent para with a short signpost to it from where the equal weights 
example is first given. For example,
CURRENT (on github):

    The use of equal weights for DRR is given as a reasonable example,
    and is not intended to be proscriptive of the use of other
    scheduling weights. Ideally the DRR weight would be chosen...
    ...Thus, 50% seems a reasonable upper bound on the weight for the
    NQB PHB in these environments.
    A node that provides rate limits or rate guarantees...

PROPOSED:

    The use of equal weights for DRR is given as a reasonable example,
    and is not intended to preclude other scheduling weights (see below
    for details).
    A node that provides rate limits or rate guarantees...

    <new next para> In the DRR example above, equal scheduling weights
    were only an example. Ideally the DRR weight would be chosen...
    ...Thus, 50% seems a reasonable upper bound on the weight for the
    NQB PHB in these environments.

(Note I've also simplified the 'proscriptive' phrase.)
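
For reference, the arithmetic behind the 50:50 DRR example in my original comment can be sketched as below. This is a simplified model, not an implementation of DRR: it assumes a work-conserving two-class scheduler, an unresponsive and smooth NQB aggregate that is capped at its weighted share whenever the QB class is backlogged, and identical capacity-seeking QB flows that split whatever capacity is left. The function name `drr_shares` and its parameters are hypothetical.

```python
def drr_shares(link_mbps: float, nqb_offered_mbps: float,
               nqb_weight: float, n_qb_flows: int) -> tuple[float, float]:
    """Return (NQB rate, per-QB-flow rate) in Mb/s under the assumptions above.

    nqb_weight is the NQB class's fraction of the scheduler weight (0..1).
    """
    if not 0.0 < nqb_weight < 1.0:
        raise ValueError("nqb_weight must be strictly between 0 and 1")
    if n_qb_flows < 1:
        raise ValueError("need at least one QB flow")
    # NQB is app-limited: it takes what it offers, up to its weighted share.
    nqb_rate = min(nqb_offered_mbps, link_mbps * nqb_weight)
    # Capacity-seeking QB flows borrow everything NQB leaves unused.
    qb_per_flow = (link_mbps - nqb_rate) / n_qb_flows
    return nqb_rate, qb_per_flow


for offered in (6.0, 45.0):
    for weight in (0.5, 0.2):
        nqb, qb = drr_shares(100.0, offered, weight, 10)
        print(f"offered={offered} weight={weight}: "
              f"NQB={nqb:.1f} Mb/s, each QB flow={qb:.1f} Mb/s")
```

With the figures from my comment (100Mb/s link, 10 QB flows, 50:50 weights), 6Mb/s of NQB leaves 9.4Mb/s per QB flow, whereas a 45Mb/s NQB flow leaves 5.5Mb/s each. Lowering the NQB weight makes no difference to the 6Mb/s case, which is the sense in which the exact value is unimportant as long as it is high enough.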

There are 5 instances of 'equal priority' throughout the draft (listed 
below), which I suggest are changed to 'equal access', 'equal 
preference' or similar.

§3.2.

    "...the NQB traffic is to be given a separate queue with priority
    equal to Default traffic..."

§7.3.1.

    "...the recommendation to treat NQB traffic with priority equal to
    Default traffic."
    "...The choice of separated queuing rather than equal priority..."
    "...to meet the equal priority recommendation..."
    "...with priority equal to Default traffic..."


>         §5.2. Traffic Protection
>
> CURRENT:
> "It is possible that due to an implementation error or 
> misconfiguration, a QB microflow"
> PROPOSED:
> "It is possible that, due to an implementation error or 
> misconfiguration, a QB microflow"
> (added comma)
>
> CURRENT:
> "This specification does not mandate a particular algorithm for 
> traffic protection. This is intentional, since the specifics of 
> traffic protection could need to be different..."
> PROPOSED:
> "This specification does not mandate a particular algorithm for 
> traffic protection. This is intentional, since this will probably be 
> an area where implementers innovate, and the specifics of traffic 
> protection could need to be different..."
>
>
>         §5.3. Impact on Higher Layer Protocols
>
> I understand that the existence of this section is a requirement from 
> the PHB specification guidelines in RFC2475 (§3; guideline G.14). 
> However, there are a number of problems with where this section sits 
> in the document.
>
>   * It is within §5 'NQB PHB Requirements', but it contains no
>     requirements.
>   * It is actually about the Impact of Traffic Protection on Higher
>     Layer Protocols, but doesn't say so.
>   * It overlaps with the two paras in the previous section on Traffic
>     Protection, starting 'In the case of', and it repeats much of the
>     first of those two.
>
> Suggested remedies:
>
>   * To generalize it from just the impact of traffic protection, it
>     ought to open by saying:
>
>       o "The NQB PHB itself has no impact on higher layer protocols,
>         because it only isolates NQB traffic from non-NQB. However,
>         traffic protection of the PHB can have unintended side-effects
>         on higher layer protocols."
>
>   * Perhaps it could be shifted to an Appendix (as suggested in RFC2475)
>   * I suggest that the two paras starting 'in the case of' in §5.2
>     "Traffic Protection" are given their own subsection of §5.2,
>     perhaps titled "Potential Traffic Protection Penalties" and split
>     into 3 paras for:
>
>       o reclassify
>       o re-mark
>       o discard
>
>   * then perhaps they could refer to the appendix about impact on
>     higher layer protocols
>
> [GW] The last 2 bullets relate to the 6th (last) item in your 
> Technical Comments, so I’ll defer those until that item is addressed.
>
>
>         §5.3. Impact on Higher Layer Protocols
>
> CURRENT:
> "The traffic protection function described here"
> Not clear which function this is referring to. Two were described (and 
> they are separated from here by a couple of long paras).
>
>
>         §6. Configuration and Management
>
> CURRENT:
> "The default for such classifiers is recommended to be the assigned 
> NQB DSCP (to identify NQB traffic) and the Default (0) DSCP (to 
> identify QB traffic)."
> SUGGESTED:
> "The default classifier to distinguish NQB traffic from traffic 
> classified as Default (DSCP 0) is recommended to be the assigned NQB 
> DSCP (45 decimal)."
> REASON:
> This text as it stood recommended that the Default DSCP now only 
> identifies QB traffic. Whereas it ought to still be quite acceptable 
> to identify traffic that doesn't build queues using DSCP 0.
>
>
>         §6.1. Guidance for Lower Rate Links
>
> CURRENT:
> "it is RECOMMENDED that the NQB PHB be disabled and for traffic marked 
> with the NQB DSCP to thus be carried using the Default PHB."
> PROPOSED:
> Add: "However, the NQB DSCP SHOULD NOT {MUST NOT?} be re-marked to the 
> Default DSCP (0)."
> REASON:
> To repeat and reinforce the similar requirement earlier, but for this 
> context.
>
>
>         §7.1. DOCSIS Access Networks
>
> Add a reference to the white paper on Low Latency DOCSIS at the end?
>
>
>         §7.2 Mobile Networks
>
> Perhaps add a remark at the end of the 2nd para about how this relates 
> (or not) to the primary requirements in §5.1. (non-rate limiting and 
> equal preference)?
>
>
>         §7.3.1.  Interoperability with Existing Wi-Fi Networks
>
> CURRENT:
> "...the Wi-Fi link is commonly a bottleneck link.."
> PROPOSED:
> "...the Wi-Fi link can become a bottleneck link.."
> REASON:
> As it stands, this semi-contradicts the first sentence of the DOCSIS 
> section, which says DOCSIS operators commonly configure the access to 
> be the bottleneck. Saying 'can become' hints that it depends how good 
> the Wi-Fi signal path is.
>
> CURRENT:
> "Wi-Fi equipment ... will support either the NQB PHB requirement for 
> separate queuing of NQB traffic, or..."
> PROPOSED:
> "Wi-Fi equipment ... will support either the NQB PHB requirement for 
> separating queuing of NQB traffic from Default, or..."
> REASON:
> If, for instance, the 45 DSCP of NQB puts it into the Video access 
> category, it won't be separate from Video, only from Default.
>
> CURRENT:
> "Wi-Fi gear typically has hardware support (albeit generally not 
> exposed for user control) for adjusting the EDCA parameters in order 
> to meet the equal priority recommendation. This is discussed further 
> below."
> PROPOSED:
> "The arrangement of queues in Wi-Fi gear is typically fixed, whereas 
> most Wi-Fi gear supports adjustment of the EDCA parameters (albeit 
> generally not exposed for user control) as recommended further below 
> in order to meet the equal priority recommendation."
> REASON:
> When I read the text as it stood, it wasn't clear that it was 
> motivating the choice of separate queuing. My sentence is still rather 
> complex - perhaps it can be improved on.
>
> CURRENT:
> "A residential ISP that re-marks the Diffserv field to zero, bleaches 
> all DSCPs and hence would not be impacted by"
> PROPOSED:
> "A residential ISP that re-marks the Diffserv field to zero would not 
> be impacted by"
> REASON:
> Tautology.
>
> CURRENT:
> "* For application traffic that originates outside of the Wi-Fi 
> network, and thus is transmitted by the Access Point, opportunities 
> exist in the network components upstream of the Wi-Fi Access Point to 
> police the usage of the NQB DSCP and potentially re-mark traffic that 
> is considered non-compliant, as is recommended in Section 4.4.1 
> <https://datatracker.ietf.org/doc/html/draft-ietf-tsvwg-nqb-21#unmanaged>.  
> A residential ISP that re-marks the Diffserv field to zero, bleaches 
> all DSCPs and hence would not be impacted by the introduction of 
> traffic marked as NQB. Furthermore, any change to this practice ought 
> to be done alongside the implementation of those recommendations in 
> the current document. "
>
> This bullet is too complex for me to understand. Maybe I need a break 
> from reading this, but I don't understand how the last two sentences 
> relate to the previous sentence in this bullet, or how they 'motivate 
> the choice of separated queuing', which is what all the bullets are 
> meant to be doing. I also don't know which of the two earlier 
> practices 'this practice' is referring to, re-marking or bleaching?
>
> [GW] I reworked it, please take a look and see if you find this more 
> understandable. Line 339 of:
>
> https://github.com/gwhiteCL/NQBdraft/commit/ebb3af93d5f28f4dc9c35740c5af0a0a9fc7b358
>

[BB] Yes, including the two phrases that set down the context before 
talking about it has helped massively. Just one clarifying suggestion:
s/implementation of those recommendations/
  /implementation of those policing recommendations/
or
  /the recommendations in <xref target="unmanaged"/>

Reason: I didn't immediately grok which recommendations you were 
referring back to, probably because the previous occurrence used the 
verb 'as is recommended' rather than the noun 'recommendations'.

> CURRENT:
> "...ought to be done alongside the implementation of those 
> recommendations in the current document."
> PROPOSED:
> "...would be efficient to implement at the same time as the 
> recommendations in the current document."
> (if this wording survives my previous comment)
>
>
>         §8. Acknowledgements
>
> The RFC style guide recommends acknowledgements should appear at the 
> end, before Contributors & Authors:
> https://www.rfc-editor.org/rfc/rfc7322.html#section-4
>
>
>         §11. Security Considerations
>
> The first para, which is about incentive-compatibility, might be better 
> covered fully in one place (at the head of §5) by using the material 
> from here that explains exactly what causes the degradations. Then 
> just explain in the Security Considerations that NQB is based on 
> incentive alignment (§5) which makes it robust to self-interested 
> actors, but traffic protection against malicious actors is also 
> recommended (§5.2).
>
> CURRENT:
> "While the NQB DSCP value could be abused to gain priority on such links,"
> Before making the point that the NQB DSCP would be the least likely to 
> be abused, the blindingly obvious should be pointed out - that 
> existing WMM WiFi APs already allow DSCP 45 and all the other DSCPs in 
> half the space to gain priority today, whether or not 45 is assigned 
> to NQB. This is more relevant for upstream, then the point about least 
> worst is more relevant for downstream.
>
> s/than any of the other 31 DSCP values that are provided priority/
>  /than any of the other 31 DSCP values that are given priority/
> REASON: my parser tripped over this.
>
> CURRENT:
> "The details of any security considerations that relate to deployment 
> and operation of NQB in these network technologies are not discussed 
> here."
> PROPOSED:
> "Any security considerations that relate to deployment and operation 
> of NQB solely in specific network technologies are not discussed here."
> REASON:
> I think that's what you meant, and it justifies itself better.
>
> CURRENT:
> "While re-marking DSCPs is permitted for various reasons..., if done 
> maliciously, this might negatively affect the QoS of the tampered 
> microflow."
> PROPOSED:
> Add: "Nonetheless, an on-path attacker can also alter other mutable 
> fields in the IP header (e.g. the TTL), which can wreak much more 
> havoc than just altering QoS treatment."
>
>
>         Appendix A. DSCP Re-marking Policies
>
> s/the result would be that traffic marked with the NQB DSCP would/
>  /it would/
>
> CURRENT:
> "This could be another motivation to (as discussed in Section 4.3) 
> classify CS5-marked traffic into NQB queue."
> PROPOSED:
> "This could be another motivation to classify CS5-marked traffic into 
> the NQB queue (as discussed in Section 4.3)."
> (note the 'the' omitted from the current text, as well as the shift 
> of the parenthetical).
>
>
>          Appendix B. Comparison to Expedited Forwarding
>
> s/Comparison to/Comparison with/
> Subtle, but my ear for English felt this sounded wrong. I didn't find 
> the language guides on the web very useful, so I'll leave you to pick 
> the one you feel is right.
>
> CURRENT:
> "While EF relies on rate policing and dropping of excess traffic, this 
> is only one option for NQB. NQB alternatively recommends that the 
> implementation re-mark and forward excess traffic using the Default 
> PHB, rather than dropping it."
> PROPOSED:
> "While EF relies on rate policing and dropping of excess traffic at 
> the domain border, this is only one option for NQB. NQB alternatively 
> recommends traffic protection located at each potential bottleneck, 
> where actual queuing can be detected and excess traffic can be 
> reclassified into the Default PHB, rather than dropping it. Local 
> traffic protection is more feasible for NQB, given the focus is on 
> access networks, where one node is typically designed to be the known 
> bottleneck where traffic control functions all reside. In contrast, EF 
> is presumed to follow the Diffserv architecture [RFC2475] for core 
> networks, where traffic conditioning is delegated to border nodes, in 
> order to simplify high capacity interior nodes."
> REASON:
> The comparison seems to have omitted discussion of traffic 
> conditioning topology (see earlier point about placement).
>
> Also see my attempt to summarize how NQB compares with EF {Note 2}
>
>
>          Appendix C. Alternate Diffserv Code Points
>
> CURRENT:
> "In networks where another ... DSCP is designated for NQB traffic, or 
> ... it could be preferred to use another DSCP."
> Tautology.
>
> I think a paragraph break is appropriate after this, and before 'In 
> end systems...' Or is there meant to be somehow a logical flow between 
> them?
> Reason: One part is about 'In networks'. The other is about 'In end 
> systems'.
>
> BTW, to make the section heading an easy read for all English speakers 
> s/alternate/alternative/, because in British English, the adjective 
> 'alternate' solely means 'interchanging', whereas 'alternative' means 
> what you intended in all English variants. Nonetheless, most Brits 
> would work out what you meant.
>
> ------------------------------------------------------------------------
>
>
>     Notes
>
> {Note 1}:
> _The Problem with Conditioning Traffic Remotely for Low Stat-Mux 
> Bottlenecks_
> The following explanation is fairly long, because I have had to spell 
> out assumptions behind different ways of thinking. So apologies if 
> some of it seems patronising...
>
> When there is low statistical multiplexing (stat-mux) at a bottleneck 
> it becomes very inefficient (verging on impossible) to locate traffic 
> conditioning (aka. traffic protection) at multiple ingress points 
> remote from the PHB (the potential bottleneck).
>
> For instance, let's start with a simple toy case in the downstream 
> only, where all NQB flows (e.g. online game sync streams) have the 
> same regular packet bunching, so we can know that (say) 12 of these 
> flows in one buffer would cause too much queuing. Then if NQB traffic 
> is being conditioned remotely at 3 ingress points, how many flows at 
> each ingress would be too much? Not 12. Maybe 5? Or probably 4 to be 
> certain not to cause excessive queuing at the bottleneck.
>
> But it would not be so unusual if the set of clients in a home happen 
> to call for most of their NQB traffic via just one of the ingress 
> points at peak time on one day, and most via another on the next day 
> (it's not unusual for the interests of people living together to shift 
> around together, because some families still even talk to each other 
> sometimes ;). But remote traffic conditioning has to prevent them 
> exceeding 4 flows via any single ingress, even if they aren't pulling 
> any traffic from the other ingress points.
>
> If 5 flows are passed through an aggregate traffic conditioning limit 
> of 4 flows, usually it will ruin them all. So, to ensure that more 
> than 12 flows don't ruin this family's service, their service is often 
> ruined whenever they call for more than 4 flows. Ironic but a 
> consequence of remote traffic conditioning at 3 ingresses for a low 
> stat-mux bottleneck.
>
> With traffic protection based on /actual/ queuing (located at the PHB 
> itself), traffic would only be limited when 12 flows coincided, and 
> then only when their bursts coincided.
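
The toy case above can be sketched in a few lines of Python. This is only an illustration under the note's own simplifications: every flow is identical, the per-ingress limit of 4 and the bottleneck threshold of 12 are the toy numbers from the text, and both function names are hypothetical rather than taken from the draft.

```python
# Toy comparison of remote conditioning with protection at the bottleneck.
# Assumptions (from the toy example only): all flows are identical, each
# ingress conditioner allows at most 4 flows, and 12 flows in one buffer
# would cause too much queuing.

def remote_conditioning_ok(flows_per_ingress, per_ingress_limit=4):
    """True if no single ingress exceeds its fixed share of the budget."""
    return all(n <= per_ingress_limit for n in flows_per_ingress)

def local_protection_ok(flows_per_ingress, queuing_threshold=12):
    """True if the total arriving at the bottleneck stays below the
    number of flows that would cause too much queuing."""
    return sum(flows_per_ingress) < queuing_threshold

# A home pulling 5 flows via one ingress and none via the other two:
skewed = [5, 0, 0]
print(remote_conditioning_ok(skewed))  # the remote conditioner intervenes
print(local_protection_ok(skewed))     # the bottleneck itself is not stressed
```

With the skewed load, the remote conditioner acts on 5 flows even though the bottleneck is well short of the 12 that would cause excessive queuing, which is the inefficiency the note describes.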
>
> Real traffic is not as regular as these toy example flows, so remote 
> traffic conditioning is even less efficient. Short NQB flows would be 
> mixed with medium and long ones, each with different regularity and 
> different smoothness. So each traffic conditioner has to err on the 
> side of caution in case bunches of packets from the other conditioners 
> coincide at the bottleneck. So, 3 conditioners have to limit their 
> /burst/ allowances to 1/3 of the available capacity for NQB at the 
> PHB. And in low stat-mux, even for NQB, average rate will be many 
> times less than the allowance needed for bursts.
>
> In contrast, the RFC2475 Diffserv architecture (with traffic 
> conditioning around a domain border protecting the PHBs on interior 
> routers) is applicable to hi-stat-mux networks, e.g. enterprise 
> networks attached to large core networks. Then, although the amount 
> and balance of traffic from different ingress points (the traffic 
> matrix) varies, the variation is of the same order as the average, not 
> many multiples of the average. For instance, let's multiply up the 
> above toy example by 1,000. If 12,000 flows at a link come from 3 
> ingress points, with on average 4,000 each, it is highly unlikely that 
> at peak time on one day 11,000 will all come from one ingress, and the 
> next day 11,000 will all come from another. Instead, the range of 
> variation at each of the 3 ingresses might be 3,000 to 5,000 flows. 
> Also as stat-mux increases, bursts and troughs tend to cancel out more 
> than they reinforce. So, with hi-stat-mux, remote traffic conditioning 
> can be sufficiently efficient to be worthwhile.
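
A rough way to see the scaling is a back-of-envelope model that is an assumption added here, not something the note states: suppose each flow independently picks one of the 3 ingresses with equal probability. The flow count at one ingress is then binomial, and its standard deviation relative to its mean shrinks as the total grows. Real household traffic is correlated, as the note points out, so actual variation will be wider than this model suggests; the sketch only shows the direction of the effect. The function name is hypothetical.

```python
import math

# Assumed model (not from the note): each flow independently chooses one
# of `ingresses` entry points with equal probability, so the count at a
# given ingress is Binomial(total_flows, 1/ingresses).

def relative_spread(total_flows, ingresses=3):
    """Standard deviation of the per-ingress flow count divided by its mean."""
    p = 1.0 / ingresses
    mean = total_flows * p
    std = math.sqrt(total_flows * p * (1.0 - p))
    return std / mean

for total in (12, 12_000):
    print(total, round(relative_spread(total), 3))
```

Under this model the spread at 12 flows is roughly 40% of the mean, whereas at 12,000 flows it is a little over 1%, consistent with the point that a per-ingress budget wastes far less capacity when stat-mux is high.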
>
> Also, the aim of the RFC2475 architecture is to avoid traffic 
> conditioning on high capacity interior nodes, in part due to the 
> complexity at high speed, and in part because shedding traffic from 
> such a high capacity node would impact a large proportion of the 
> customer base.
>
> The opposite is true with a low stat-mux bottleneck. Adding complexity 
> to detect and handle actual queuing is feasible at lower scale, and 
> shedding traffic by definition only affects a few flows (as few as one 
> - with per-flow mechanisms).
>
>
> {Note 2} The following is my attempt at a comparison of NQB with EF 
> (written as a note-to-self originally, but feel free to lift parts for 
> this draft if you want):
>
> The main distinction between NQB and EF is that an NQB bottleneck is 
> not guaranteed to stay below a certain queue delay, so (in the 'actual 
> queuing' alternative) NQB relies on traffic protection at each 
> potential bottleneck to shed traffic that is causing actual queuing. 
> In contrast, EF can guarantee a maximum queue at interior nodes by 
> using traffic conditioning at border nodes to shed any traffic in 
> excess of the contracted aggregate EF rate - even though accepting the 
> excess traffic might not have caused any actual queuing (see Appendix 
> B for details). The 'actual queuing' approach of NQB is more 
> appropriate where statistical multiplexing is low, e.g. in access 
> networks. With low stat-mux, there is high variation in the total load 
> of the class, so it can be highly inefficient to limit traffic at each 
> border in case correlated bursts cause queuing, compared to only 
> dealing with queuing that actually occurs.
>
>
>
> Bob
>
>
> -- 
> ________________________________________________________________
> Bob Briscoe http://bobbriscoe.net/

-- 
________________________________________________________________
Bob Briscoe http://bobbriscoe.net/