Re: [tsvwg] NQB, what rate is permissable for flows to still "be" NQB

Greg White <g.white@CableLabs.com> Thu, 13 January 2022 18:22 UTC

From: Greg White <g.white@CableLabs.com>
To: Sebastian Moeller <moeller0@gmx.de>, TSVWG <tsvwg@ietf.org>
Thread-Topic: [tsvwg] NQB, what rate is permissable for flows to still "be" NQB
Thread-Index: AQHX2LWoyAfyPOPv9U6ygKqG9ihNHaxhLkKA
Date: Thu, 13 Jan 2022 18:22:26 +0000
Message-ID: <896555AE-986F-414B-8976-F5AAB34DEAB3@cablelabs.com>
References: <A492D4CF-F597-4A0C-AA12-08CFB7B1BFE8@gmx.de>
In-Reply-To: <A492D4CF-F597-4A0C-AA12-08CFB7B1BFE8@gmx.de>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/46k1oWi7xt_7OslKt8PIlAYjn8w>
Subject: Re: [tsvwg] NQB, what rate is permissable for flows to still "be" NQB

Hi Sebastian,

Thank you for the constructive post.

I do agree with you that using global average statistics isn't ideal.  If we had a global CDF of access network rates, sampled randomly and updated periodically, that would be better.  And, as you indicated, we could then use the 10th or 25th percentile number as guidance, rather than some fraction of the mean.
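To illustrate why a percentile may be a safer basis than a fraction of the mean, here is a minimal sketch. The rate samples are made up for illustration (they are not Ookla or regulator data), and the nearest-rank `percentile` helper is just one of several common percentile definitions.

```python
import math
import statistics

# Hypothetical access rates in Mbps. Illustrative only, not measured data.
rates_mbps = [2, 5, 10, 16, 25, 50, 50, 50, 100, 100, 100, 250, 500, 1000, 1000]


def percentile(sorted_values, p):
    """Nearest-rank percentile (p in 0..100) of an ascending-sorted list."""
    rank = max(1, math.ceil(p * len(sorted_values) / 100))
    return sorted_values[rank - 1]


ordered = sorted(rates_mbps)
mean_rate = statistics.mean(ordered)
p10 = percentile(ordered, 10)
p25 = percentile(ordered, 25)

print(f"mean = {mean_rate:.1f} Mbps, 10% of mean = {mean_rate / 10:.1f} Mbps")
print(f"10th percentile = {p10} Mbps, 25th percentile = {p25} Mbps")
```

With this positively skewed sample, 10% of the mean comes out well above the 10th percentile, so guidance derived from the mean would be too generous for the slowest links in the sample.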

Given this, perhaps we use language that indicates more clearly that the limits on rate are expected to evolve over time, but keep the recommended limit – labeled somehow as "current" – of a few packets per RTT or 1 Mbps.  This leaves it for future discussion and assessment based on up-to-date knowledge at the time, rather than trying to nail it down precisely today.

On the other topic you mentioned – applications that might send a few packets each "tick" – that is a good point. The applications we're referring to are isochronous (sending state updates on a fixed interval), but each update "message" could exceed the network MTU and thus be broken into multiple packets, usually sent back-to-back. This raises the question of how big a burst should be allowed. The language currently implies that an MTU-sized "burst" is ok (i.e. it doesn't limit the packet size to something smaller than that). But upon what foundation is this based? Surely an application that sends a 1500 B packet once every 30 ms isn't significantly more NQB than an application that sends a pair of 800 B packets every 32 ms. For that matter, an application that sends a pair of 200 B packets back-to-back would likely be more NQB than an application that sends a single 1500 B packet, yet the former application would fail the test in this recommendation.
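The arithmetic behind the comparison above can be sketched as follows. The function name is hypothetical, and the 30 ms interval for the 200 B pair is an assumption, since no interval is given for that case.

```python
def average_rate_kbps(packet_bytes, packets_per_tick, tick_ms):
    """Long-term average rate in kbit/s of a sender that emits
    `packets_per_tick` packets of `packet_bytes` every `tick_ms`."""
    bits_per_tick = packet_bytes * 8 * packets_per_tick
    return bits_per_tick / tick_ms  # bits per millisecond equals kbit/s


senders = {
    "1 x 1500 B every 30 ms": (1500, 1, 30),
    "2 x 800 B every 32 ms": (800, 2, 32),
    "2 x 200 B every 30 ms (assumed interval)": (200, 2, 30),
}

for label, args in senders.items():
    print(f"{label}: {average_rate_kbps(*args):.1f} kbit/s")
```

The first two senders have the same average rate, and the pair of 200 B packets is far below both, even though only the single-packet sender avoids back-to-back packets.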

In my opinion we only need reasonable guidance here, not precision. NQB is not a guaranteed service after all.  Since this is just a recommendation (and we provide some further qualitative description about what NQB is meant to be) perhaps it is ok to leave it to the reader to convince themselves whether or not they are (e.g.) fine marking an online game's traffic as NQB even if it occasionally sends two back-to-back packets, if the data rate on an inter-burst basis is still well less than 1 Mbps.   There will always be borderline cases and exceptions.  Perhaps a particular application indicates to its users that the minimum requirement for internet connectivity is 20 Mbps. Based on this, they might feel as if they are not taking too much of a risk by marking a smooth 3 Mbps stream as NQB. 

At the end of the day, I think the intent is that the application developer understands that by using NQB marking they take on some risk that their traffic will be sanctioned by a Traffic Protection algorithm (which could result in out-of-order delivery or loss) in exchange for a protected, low-latency queue. The lower (and smoother) the data rate, the lower that risk is. We should provide suitable guidance so that an application developer can relatively easily make the decision.   

Another aspect I'm considering is whether the draft should provide some guidance for "very low rate" links.  Guidance could be that very low rate links (e.g. links that are less than the 10th percentile of global access network rates) should implement a Traffic Protection mechanism that is more tolerant of burstiness (i.e. a temporary queue) than higher rate links might be.
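As a rough sketch of why burst tolerance matters more on slow links, the following estimates how long a back-to-back burst occupies the bottleneck. It assumes the burst arrives essentially all at once (i.e. from a much faster upstream link) and ignores framing overhead. The function name and the example link rates are illustrative only.

```python
def burst_drain_time_ms(packets, packet_bytes, link_mbps):
    """Time in ms for a link of `link_mbps` to serialize a burst of
    `packets` packets of `packet_bytes` each that arrives all at once."""
    burst_bits = packets * packet_bytes * 8
    bits_per_ms = link_mbps * 1000  # 1 Mbit/s == 1000 bits per millisecond
    return burst_bits / bits_per_ms


for link_mbps in (1, 10, 100):
    t = burst_drain_time_ms(packets=2, packet_bytes=1500, link_mbps=link_mbps)
    print(f"{link_mbps:>3} Mbps link: 2 x 1500 B burst drains in {t:.2f} ms")
```

The same two-packet burst forms a temporary queue of about 24 ms on a 1 Mbps link but well under a millisecond at 100 Mbps.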

-Greg


On 11/13/21, 10:41 AM, "tsvwg on behalf of Sebastian Moeller" <tsvwg-bounces@ietf.org on behalf of moeller0@gmx.de> wrote:

    Dear list,


    in last Friday's tsvwg meeting Greg asked what recommendation the NQB draft should give senders about maximally permissible rates for NQB flows. (At least I left the session with the feeling that input on this topic was requested).

    	One proposal, already under discussion on the list was to base this on some statistics of access rate, with the proposal being to take 10% of Ookla's idea of global average access rates (see https://www.speedtest.net/global-index "Speedtest Global Index") which reports global average (and per country sub-averages) of upload, download and latency numbers split into mobile and fixed broadband.

    	These are taken from random user measurements against Ookla's speedtest.net, which is itself a collection of mostly volunteered measurement nodes that just need to fulfill a number of criteria (see https://support.ookla.com/hc/en-us/articles/234578628) to enter the speedtest network and are mostly operated by entities other than Ookla. The required minimum internet access speed (1 Gbps) makes it very likely that these measurement nodes are located in data-centers. During a typical measurement the system will try to automatically match clients with "close by" servers (but allows clients to manually select servers as an alternative).

    	The global index (from 2019 on) will include data from all countries that "have at least 300 unique user test results for mobile or fixed broadband in the reported month" (https://www.speedtest.net/global-index/about). Neither the global average nor the per-country numbers come with any information about the sample size (beyond the minimum) or any measure of variance. So this source only gives (somewhat under-qualified) average numbers that are likely not really representative (of the spatial distribution of all users within a given country/the world) and will be biased towards higher speeds (as averages of positively skewed distributions are). If we compare Ookla's current result for Germany (https://www.speedtest.net/global-index/germany#fixed) with the data from the official German speedtest site (lower plot on https://breitbandmessung.de/interaktive-darstellung-jahresvergleich *), it should become clear what a per-country or global average hides: the composition of access speeds in Germany as seen in that CDF clearly is more complicated than an average implies (it is not a smooth curve, as there clearly are speed tiers that ISPs can and do offer with their available technology).
    	For answering the NQB problem, something related to the experience a given number of end users will have seems more appropriate than an average. So, IMHO, something like the lower 10th or 25th percentile of the access speed (taken from a per-country CDF of reliable data sources like, in the EU, the national regulators' reports**) might be a much better number to use for the NQB recommendation than Ookla's averages. Quantiles/medians at least allow estimating which fraction of users can expect rate issues with compliant NQB senders. (This is not to dismiss Ookla's numbers; it is great that they compile and offer those, it is just that they are not terribly well suited for the NQB purpose.)

    	But overall, given that such a recommendation can only ever be approximate and will not in itself guarantee safety for any individual connection but can just be a rough guideline for senders, I would say the current recommendation to not NQB mark flows with instantaneous rates >> 1Mbps seems just as appropriate as any statistics taken from non-official/non-representative numbers like Ookla's.


    	As written, this recommendation will however run into "conflict" with gaming traffic (not cloud gaming with its video streaming, but client-server gaming where mainly control packets are exchanged), as games tend to send/receive packets on a "clock" (often 30 or 60 Hz, but 128 Hz is not unheard of) with one to multiple packets per clock tick (see e.g. https://www.sciencedirect.com/science/article/abs/pii/S1084804511001731). Since back-to-back packets will create bursts at link speed (assuming the burst travels intact to the bottleneck and the bottleneck does not employ a scheduler that splits bursts up), these can exceed the current 1 Mbps recommendation (I am grossly simplifying here): "Other types of QB flows include those that frequently send at a high burst rate (e.g. several consecutive packets sent well in excess of 1 Mbps) even if the long-term average data rate is much lower."

    	Assuming NQB is really intended for gaming control packets, this might be remedied as Greg proposed by giving a more permissive definition of "several consecutive packets" (like say expand several to >= 10 or similar). Whether that is justified on a slow connection within NQB's internal logic is a different question.

    Regards
    	Sebastian


    *) These numbers are not as current as Ookla's and hence a direct comparison seems not advised (https://www.speedtest.net/insights/blog/icymi-ookla-data-and-analysis-from-december-2020/ gives Ookla's fixed broadband global average for December 2020 as 96/52). But these graphs show the development over a number of years, including the respective N for each curve, and allow one to see the country's dominant speed tiers (50 and 100 Mbps) as well as the median of the 2019-2020 period (51.1 Mbps down, ~10.7 up). They also show that there was a small fraction of links with considerably higher speeds (500 and 1000 Mbps), which will make sure that the average (which appears purposefully not reported by the national regulator) would be noticeably larger than the median. BTW the 25th percentile was around 22.7 Mbps down, 5.2 up...


    **) Unlike Ookla's "mostly-for-fun" speedtest*** (no disrespect intended), the official EU speedtests, like the one in Germany that can be used as evidence in court, have considerably more stringent specifications than Ookla's and generally seem to be designed and operated carefully, including approaches like having a user report the contracted speed before starting a measurement to make sure the measurement back-end has sufficient resources to saturate that speed.

    ***) To repeat, I do not want to criticize Ookla here, it is just that their numbers are not suitable for some purposes, especially purposes the test was never designed for.