[tsvwg] Question regarding acceptable sharing behavior between "normal traffic" and L4S traffic

Sebastian Moeller <moeller0@gmx.de> Mon, 18 November 2019 09:18 UTC

From: Sebastian Moeller <moeller0@gmx.de>
Message-Id: <F39C308A-739B-4B87-A72C-224E45C01E70@gmx.de>
Date: Mon, 18 Nov 2019 10:18:29 +0100
To: tsvwg IETF list <tsvwg@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/4xX0fUZMt49BJjFhrZMSDyldD8I>

Dear All,

I have repeatedly made an argument on this list, in different threads, and have received inconsistent feedback. I would therefore like to raise the issue explicitly with the full list, as I believe it is important to address with regard to L4S.

Two independent measurements have shown that the reference dual queue AQM exhibits RTT-dependent unfairness between L4S and standard TCP traffic: the shorter the RTT, the larger the bandwidth advantage L4S gets.
At a nominal RTT of 0 ms (which, as far as I can tell, just means no additional netem delay, so a real RTT in the <1 ms range), TCP Prague gets ~5 times more bandwidth than TCP cubic:
The L4S team's measurement:
https://l4s.cablelabs.com/l4s-testing/key_plots/batch-l4s-s1-2-cubic-vs-prague-50Mbit-0ms_var.png
The SCE-team's measurement:
https://www.heistp.net/downloads/sce-l4s-bakeoff/bakeoff-2019-11-11T090559-r2/l4s-s1-2/batch-l4s-s1-2-cubic-vs-prague-50Mbit-0ms_fixed.png

That leads to my questions:

A) Is/was everybody aware of this unequal sharing over a nominally equal path? 

I ask because the following two statements from the L4S-arch and the dualq-coupled I-Ds seem to me to promise a different, more equal kind of sharing:

https://tools.ietf.org/html/draft-ietf-tsvwg-aqm-dualq-coupled-10 :
"This specification defines `DualQ Coupled Active Queue Management
  (AQM)', which enables these Scalable congestion controls to safely
  co-exist with Classic Internet traffic.
Analytical study and implementation testing of the Coupled AQM have
   shown that Scalable and Classic flows competing under similar
   conditions run at roughly the same rate."

and

https://tools.ietf.org/html/draft-ietf-tsvwg-l4s-arch-04 :
"Further, the network part is simple to
  deploy - incrementally with zero-config.  Both parts, sender and
  network, ensure coexistence with other legacy traffic."

In the limit, the dual queue AQM will share bandwidth between the L4S queue and the normal queue at approximately 15:1. Unless coexistence is considered to mean "does not starve", I fail to understand how the claims and the observed data match up.

B) Does everybody here think that this is what is implied by the above passages, and that this is an acceptable guarantee to merit allowing/endorsing the use of the dual queue AQM over the internet?



Oliver Tilmans helpfully explained that this behavior is draft-compatible but relies on a specific interpretation of "under similar conditions": once the queue under load is taken into consideration, the RTT ratio goes from 1:1 to 1:15, and hence the observed bandwidth sharing is considered acceptable because the conditions are no longer similar enough.

I have two issues with that rationale:

1) This does not seem to be the natural way to read that claim. (Oliver mentioned that he had the same question/observation when he started to look into dualq, so my reading of the text is not completely outlandish.)

2) It is to a large degree a consequence of a design decision in the dual queue AQM that does not seem to have been considered thoroughly enough.
The point is that dual queue simply copies the recommendation to set the target for the "normal queue's" PIE instance to 15 milliseconds (taken from QDELAY_REF in https://tools.ietf.org/html/rfc8033#appendix-B) without considering how this affects bandwidth sharing between the two queues at short RTTs. According to CoDel's theory (https://tools.ietf.org/html/rfc8289#section-4.3), target should be set to ~5-10% of the interval, so instead of 15 ms, 5 ms should probably do to maintain bandwidth utilization while at the same time addressing part of the root cause of unequal sharing at low RTTs. It is quite possible that 5 ms will not work for a PIE-derived AQM like dualQ, but this should at least be explored before designing an undeserved advantage for L4S into the specifications.
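
To make the arithmetic behind the 1:15 RTT ratio explicit, here is a minimal sketch. It only models the effective RTT (base path RTT plus standing queue delay) seen by each class, not throughput. The function name, the 1 ms L4S queue delay, and the example base RTTs are my own assumptions for illustration; only the 15 ms and 5 ms targets come from the text above.

```python
def effective_rtt_ratio(base_rtt_ms, classic_target_ms=15.0, l4s_delay_ms=1.0):
    """Ratio of classic to L4S effective RTT under load.

    Effective RTT = base path RTT + standing queue delay.
    l4s_delay_ms = 1.0 is an assumed value for the L4S queue,
    classic_target_ms is the target of the "normal queue's" AQM.
    """
    classic_rtt = base_rtt_ms + classic_target_ms
    l4s_rtt = base_rtt_ms + l4s_delay_ms
    return classic_rtt / l4s_rtt


if __name__ == "__main__":
    # Hypothetical base RTTs; 0 ms corresponds to "no additional netem delay".
    for base in (0.0, 10.0, 80.0):
        for target in (15.0, 5.0):
            ratio = effective_rtt_ratio(base, classic_target_ms=target)
            print(f"base RTT {base:5.1f} ms, classic target {target:4.1f} ms "
                  f"-> classic:L4S effective RTT = {ratio:.2f}:1")
```

Under these assumptions the imbalance is largest at a base RTT of 0 ms and shrinks as the base RTT grows, and a 5 ms target reduces the worst-case ratio from 15:1 to 5:1. Whether the bandwidth shares follow the RTT ratio this closely is exactly what would need to be measured.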


Best Regards
	Sebastian