[tsvwg] Comments on draft-white-tsvwg-nqb-02

"Bless, Roland (TM)" <roland.bless@kit.edu> Fri, 13 September 2019 15:49 UTC

From: "Bless, Roland (TM)" <roland.bless@kit.edu>
To: "tsvwg@ietf.org" <tsvwg@ietf.org>, Greg White <g.white@CableLabs.com>
Organization: Institute of Telematics, Karlsruhe Institute of Technology
Message-ID: <9311abbb-7c9d-2f66-e308-b8638c1a81b1@kit.edu>
Date: Fri, 13 Sep 2019 17:49:19 +0200
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/OY6yFWNUGxukDOwvpMP1FuokiyE>
Subject: [tsvwg] Comments on draft-white-tsvwg-nqb-02

Hi,

I'm quite late to the discussion and have only now taken the time to
give the draft a quick read. I think it is rather vague in places, and
quite some work is required either to get rid of implicit assumptions
or to write them down explicitly.

First, some general comments:

  * I find the whole term "non-queue building flows" a bit confusing.
    What the draft actually means are application-limited, mostly low
    data rate flows (sparse flows in RFC8290). According to the draft,
    capacity seeking flows seem to _always_ be queue building. This is
    not true, because it depends on the particular congestion control
    and on how the capacity seeking is done. For instance, TCP LoLa [1]
    flows are not queue building, but are capacity seeking
    nevertheless, i.e., they will saturate the bottleneck link.
    Strictly speaking, they build a _limited_ standing queue.

  * The notion "queue building" is also somewhat fuzzy: do you mean
    flows that build increasing and unlimited standing queues (i.e.,
    until the buffer is completely filled), or also flows that build
    only limited standing queues? LoLa and BBR would fall into the
    latter class, where BBR's standing queue depends on "the BDP".

  * Moreover, even sparse flows can build a queue: assume a sparse flow
    arrives at a link that is already saturated by a single capacity
    seeking flow. Even if the sparse flow sends only 0.1% of the link
    capacity, it will _cause_ a queue to build up (under the assumption
    that the other flow does not back off immediately). The only
    difference is that the large flow has a lot more data in the queue
    than the sparse flow. Similarly, if a lot of sparse flows together
    saturate a bottleneck, they will also create a growing queue (until
    they reduce their sending rate).

  * It's not clear to me how the scheduler divides the service rate
    between the NQB and QB queues. From the description in the draft, I
    have no clue what kind of scheduler would be a good fit for the
    desired behavior.

  * Would it be possible to let an NQB flow (or several) saturate the
    link? I think it would be a perfectly valid scenario to let
    high-data rate, low delay flows use the NQB queue.
    "NQB traffic is not rate limited or rate policed." suggests that.

  * Can a low-latency high data rate flow be in the NQB queue, e.g., a
    TCP LoLa flow? An NQB PHB would be nice for such flows, because
    they would be neither suppressed nor adversely affected in terms of
    delay by the (loss-based) queue filling flows.

  * I also second Dave Black's comments on looking at the PHB definition
    requirements. A PHB definition should define the behavioral
    characteristics and more details, see
    https://tools.ietf.org/html/rfc2474#section-5 and
    https://tools.ietf.org/html/rfc2475#section-3
    While the PHB definition should be more abstract, I find it useful
    to provide at least examples of the mechanisms that can be used for
    an implementation.
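
To make the point about sparse flows and queue build-up concrete, here
is a minimal fluid-model sketch in Python. It is only an illustration
under stated assumptions: constant arrival rates, a single FIFO queue
that starts empty, and a backlog share per flow that is proportional to
its arrival rate. The function name, the link rate, and the flow rates
are hypothetical and are not taken from the draft.

```python
def fifo_backlog(capacity_pps, arrival_pps, seconds):
    """Backlog of a FIFO queue in a simple fluid model.

    capacity_pps: link service rate in packets per second
    arrival_pps:  dict of flow name -> constant arrival rate (packets/s)
    seconds:      how long the arrival rates are sustained

    Returns (total backlog, backlog per flow). The queue starts empty and
    each flow's share of the backlog is taken to be proportional to its
    arrival rate, which is an assumption of this model.
    """
    offered = sum(arrival_pps.values())
    excess = max(0, offered - capacity_pps)  # rate at which the queue grows
    total = excess * seconds
    if offered == 0:
        return 0, {name: 0 for name in arrival_pps}
    shares = {name: total * rate / offered for name, rate in arrival_pps.items()}
    return total, shares


LINK = 10_000  # packets/s, hypothetical bottleneck rate

# One capacity-seeking flow that exactly fills the link: no queue growth.
alone, _ = fifo_backlog(LINK, {"bulk": 10_000}, seconds=2)

# Add a sparse flow sending 0.1% of the link rate: the queue now grows.
both, per_flow = fifo_backlog(LINK, {"bulk": 10_000, "sparse": 10}, seconds=2)

# Many sparse flows can saturate the link on their own.
many = {f"sparse{i}": 10 for i in range(1_200)}
crowd, _ = fifo_backlog(LINK, many, seconds=2)

print(alone, both, crowd)
print(per_flow)
```

In this model the sparse flow is what tips the link into overload, yet
almost all of the queued data belongs to the bulk flow, which matches
the distinction made above between causing a queue and occupying it.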

Some more text related comments:

 1.    "Active Queue Management (AQM) mechanisms (such as PIE [RFC8033],
       DOCSIS-PIE [RFC8034], or CoDel [RFC8289]) can improve the quality of
       experience for latency sensitive applications, but there are
       practical limits to the amount of improvement that can be achieved
       without impacting the throughput of capacity-seeking applications."
    I'm not sure what you are referring to here. My experience tells me
    that this is also highly related to the congestion control variant.
    Let's assume Cubic TCP for now: if only very few flows are present
    at the AQM bottleneck, then the utilization/throughput may be
    adversely affected, but as soon as a small number of flows is
    present, a good AQM achieves good utilization due to its
    desynchronization feature.

 2. "These applications do not make use of
       network buffers, but nonetheless can be subjected to packet delay and
       delay variation as a result of sharing a network buffer with those
       that do make use of them."
    These applications are actually using the buffer too, but their
    share of the queued data is probably small compared to that of
    "queue building" flows.

 3. "Here the data rate is essentially limited by the Application itself."
    That would be good to mention earlier. Is this a requirement for NQB
    flows? As I wrote above: elastic NQB flows that use all available
    capacity are also possible. Furthermore, a lot of app-limited NQB
    flows could also saturate a link. Do you exclude this case?

 4. "but there are also application flows that may be in a gray area in
    between (e.g. they are NQB on higher-speed links, but QB on
    lower-speed links)."
    Is this the case I just described, i.e., that enough sparse flows
    may already saturate lower-speed links? Or are you thinking of
    something else?

 5. "As an answer the last question" => As an answer _to_ the last question

 6. "Thus, a useful property of nodes that support separate queues for
    NQB and QB flows would be..."
    It seems that supporting separate queues is a requirement for the
    PHB. If so, please state that.

 7. " and for QB flows, the QB queue provides better performance
       (considering latency, loss and throughput) than the NQB queue."
    I don't see how they achieve better latency, except in the case that
    they experience less packet loss and thus suffer from fewer
    retransmissions, but this may be quite situation dependent...

 8. Should starvation of QB flows be impossible?

 9. "and reclassify those flows/packets to the QB queue."
    This should probably be explicit about flow classification, because
    otherwise per-microflow reordering may also adversely affect the e2e
    performance. So there is probably a requirement to move all packets
    belonging to a flow consistently to the QB queue.

10. "This queue SHOULD support a latency-based queue protection mechanism"
    As others already pointed out, this seems to be an important
    requirement. You should therefore either describe how it is supposed
    to work or state the corresponding requirements.

11. "NQB traffic is not rate limited or rate policed."
    What happens if the load in the NQB queue increases (e.g., not
    caused by queue building flows)? Will QB flows be starved? I think
    that some scheduler will determine the rate share between the NQB
    and QB queues...

12. "The node supporting the NQB PHB makes no guarantees on latency or
    data rate for NQB marked flows"
    The EF PHB does not give actual guarantees either, other than "low
    latency, low jitter, low loss"...

13. "The choice of
       the 0x2A codepoint (0b101010) for NQB would conveniently allow a node
       to select these two codepoints using a single mask pattern of
       0b101x10."
    Diffserv codepoints are unstructured and should NOT be handled like
    this. In any case, RFC 2474 requires that the "mapping of codepoints
    to PHBs MUST be configurable".

14. Section 6: a discussion is missing of what happens if intermediate
    domains/nodes do not support the proposed NQB PHB: the low latency
    property may get lost completely if the traffic is treated with the
    default PHB etc. RFC 2474 states
    "Packets received with an unrecognized codepoint SHOULD be forwarded
       as if they were marked for the Default behavior (see Sec. 4), and
       their codepoints should not be changed."

15. "cable broadband services MUST be configured
       to provide a separate queue for NQB traffic that shares the service
       rate shaping configuration with the queue for QB traffic."
    I have no clue what that means...

16. Section 9 contains some "negative examples", i.e., why existing
    solutions do not solve the problem. I think there are probably some
    hidden/implicit requirements that should rather be extracted from
    this section and put into the definition of this PHB. For example,
    you seem not to want "that each non-sparse flow gets an equal
    fraction of link bandwidth"... or there seems to be a difference
    between the "sparse flow" definition of FQ and your definition of
    NQB flows (which are probably sparse, but app-limited or
    latency-sensitive at least?). What about latency-sensitive high
    data rate flows?

17. "it places unnecessary restrictions on the scheduling between the
    two queues"
    So what is the requirement for your own approach then?
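
As an illustration of comment 13: the pattern 0b101x10 matches exactly
two 6-bit values, 0b101010 and 0b101110 (the latter being the
recommended EF codepoint from RFC 3246). The Python sketch below
contrasts such a hard-wired mask with a configurable codepoint-to-PHB
table. The function names, the table contents, and the locally chosen
codepoint 0b000101 are hypothetical; this is not how any particular
implementation works.

```python
NQB_PROPOSED = 0b101010   # 0x2A, the value quoted from the draft
EF = 0b101110             # recommended EF codepoint (RFC 3246)


def matches_pattern(dscp):
    """True if dscp fits the bit pattern 0b101x10 (bit 2 is "don't care")."""
    care_bits = 0b111011
    return (dscp & care_bits) == (NQB_PROPOSED & care_bits)


def classify(dscp, table, default="Default"):
    """Look up the PHB in a configurable table; unknown codepoints get Default."""
    return table.get(dscp, default)


# Every 6-bit codepoint that the single mask pattern selects.
selected = [d for d in range(64) if matches_pattern(d)]

# Hypothetical operator configuration using the draft's proposed value.
table_a = {NQB_PROPOSED: "NQB", EF: "EF"}
# Hypothetical domain that maps NQB to a different, locally chosen codepoint.
LOCAL_NQB = 0b000101
table_b = {LOCAL_NQB: "NQB", EF: "EF"}

print([bin(d) for d in selected])
print(classify(NQB_PROPOSED, table_a), classify(LOCAL_NQB, table_b))
print(matches_pattern(LOCAL_NQB), classify(NQB_PROPOSED, table_b))
```

The mask only works as long as every domain uses codepoints that happen
to share those bits. The table lookup keeps working after a remapping,
and it also gives unrecognized codepoints the Default behavior that
RFC 2474 asks for (see comment 14).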

That's all for now, regards

 Roland

[1] Mario Hock, Felix Neumeister, Martina Zitterbart, Roland Bless: TCP
LoLa: Congestion Control for Low Latencies and High Throughput,
IEEE LCN 2017, http://ieeexplore.ieee.org/document/8109356/