Re: [tsvwg] New rev of draft-ietf-tsvwg-ecn-l4s-id-17
Bob Briscoe <ietf@bobbriscoe.net> Mon, 24 May 2021 12:11 UTC
To: Pete Heist <pete@heistp.net>
Cc: tsvwg@ietf.org, Sebastian Moeller <moeller0@gmx.de>
References: <162158815765.22731.15608328324211025925@ietfa.amsl.com> <f8ed1105-d1db-55ce-eb1f-00de8a83b0e8@bobbriscoe.net> <3F147A3D-BD68-4F0A-89FF-9A92284FF0A5@gmx.de> <c80a96a6-d6d4-3773-9048-805a76c6f926@bobbriscoe.net> <13316c291fafc4116d12cb350ac850ef1288fcd7.camel@heistp.net>
From: Bob Briscoe <ietf@bobbriscoe.net>
Message-ID: <fada746f-70a3-ce79-ff4a-69c062955175@bobbriscoe.net>
Date: Mon, 24 May 2021 13:11:48 +0100
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/UsgG9Ae4l4zAn_IzL6Cd296Ze18>
Pete,

On 24/05/2021 08:54, Pete Heist wrote:
> Hi Bob/Sebastian,
>
> On Sun, 2021-05-23 at 21:54 +0100, Bob Briscoe wrote:
>> Sebastian, inline [BB]
>>
>> On 21/05/2021 22:26, Sebastian Moeller wrote:
>>> Bob, chairs,
>>>
>>> Section 6.2, with its "use two SAs, one for ECT(1) and one for the
>>> rest", seems a bit limited, since it ignores that VPNs might
>>> propagate both DSCPs and ECN bits between the layers. So IMHO a
>>> better approach might be to recommend treating the DSCP+ECN bits as
>>> one aggregate byte (let's call it TOS ;) ), as the extra ECT(1) SA
>>> seems to be required for all SAs that already exist to deal with
>>> multiple supported DSCPs. So in a sense the recommendation would be
>>> to double the number of SAs.
>>
>> [BB] Yes, we ought to reword it to say that the VPN ingress should
>> /at least/ use two SAs indexed on the LSB of the ECN field, and, if
>> it is also classifying on DSCPs, it could also consider classifying
>> any low-latency DSCP(s) with the L4S packets. To avoid the
>> anti-replay problem, there would only need to be one SA configured
>> per degree of queuing delay, not one for every ECN x DSCP
>> combination.
>>
>> We'll have to see how common multiple combinations are in practice.
>> As ecn-l4s-id says, L4S with just best efforts...
>>     "is expected to be the most common and useful
>>     arrangement. But, more generally, an operator might choose to
>>     control bandwidth allocation through a hierarchy of Diffserv PHBs"
>>
>> So the ECN field could be the only field that gives a delay delta.
>> However, different networks will have their own view on which
>> technology they want to use for low latency. So VPNs will probably
>> need to cater for both DSCPs and the ECN field being used for low
>> delay in different networks.
>>
>>> Also:
>>>     "and the current draft of DTLS 1.3 says "The receiver
>>>     SHOULD pick a window large enough to handle any plausible reordering,
>>>     which depends on the data rate."
>>>     However, in practice, the size of
>>>     the VPN's anti-replay window is not always scaled appropriately."
>>>
>>> L4S on a 10 ms path under load can introduce re-ordering in the
>>> range of 50 ms (roughly twice the difference between the L- and
>>> C-queue delay targets). A re-ordering tolerance of 5 times the path
>>> RTT seems a bit on the high side to expect, no?
>>
>> [BB] IMO, the above text that I quoted from the DTLS spec. is
>> reasonable, both practically (see below) and in terms of taking
>> responsibility for the problem. Beyond its window, the anti-replay
>> function presumes a packet is guilty of a replay attack with no
>> evidence, purely because it chooses not to hold that amount of
>> evidence. Therefore it's proper that it holds a sufficient window of
>> evidence for any plausible reordering.
>>
>> BTW, the C-queue target has never been 25ms. I noticed JM said that
>> incorrectly as well recently.
>> * A default C-queue delay target of 15ms has always been recommended
>>   in aqm-dualq-coupled. Under a heavy load of short and long flow
>>   arrivals in both the L & C queues, that results in a PI2 Qdelay of
>>   about 25ms at the 99%ile or 30ms at the 99.9%ile. We have been
>>   considering whether to change the default target to 10ms for some
>>   time, but have not done so yet.
>> * Low Latency DOCSIS specifies a default C-queue delay target of
>>   10ms.
>>
>> So a replay window allowing for 30ms of packets at the interface
>> rate would probably be sufficient.
>> At 1Gb/s (say) using 1500B packets, that's a replay window of 2500
>> packets.
>>
>> Quoting Pete Heist's info here
>> https://github.com/heistp/l4s-tests/#dropped-packets-for-tunnels-with-replay-protection-enabled :
>>
>>> "Modern Linux kernels have a default maximum replay window size of
>>> 4096 (XFRMA_REPLAY_ESN_MAX in xfrm.h
>>> <https://elixir.bootlin.com/linux/latest/source/include/uapi/linux/xfrm.h>).
>>> Wireguard uses a hardcoded value of 8192 with no option for runtime
>>> configuration, increased from 2048 in May 2020 by this commit
>>> <https://git.zx2c4.com/wireguard-linux/commit/drivers/net/wireguard?id=c78a0b4a78839d572d8a80f6a62221c0d7843135>."
>
> Just noting that I updated the above text to change "default maximum"
> to "fixed maximum", as that's the hard limit. To allow for 30ms, that
> would place a tunnel bandwidth limit at ~1.6Gbps, beyond which drops
> occur that can't be avoided without increasing this xfrm limit.

[BB] Yes, everyone will have to keep scaling the replay window as link
rates continue to increase. But we all know that older deployments
hang around. That's why, in the latest ecn-l4s-id draft, we
recommended splitting the traffic into two SAs, to side-step the
problem altogether. But it's still important that implementations
scale the anti-replay window for a reasonable degree of reordering as
well.

> Regarding the sojourn times, Flent plots TCP RTT using periodically
> sampled values from running the 'ss' utility, which gets it from the
> tcpi_rtt field of the tcp_info struct. tcpi_rtt is set to
> tcp_sock->srtt_us >> 3 in tcp.c, and srtt_us is the "smoothed round
> trip time << 3 in usecs", according to tcp.h.

[BB] As I have said for 6 years now, by using the *smoothed* RTT,
Flent is already filtering out important queue delay variability. And
I've also said that, if you want the higher percentiles of delay,
*sampling* becomes increasingly inaccurate for concluding anything
about queuing delay percentiles within any reasonable experiment
duration. For certain, you cannot say anything at all about the peak
if you're only taking samples.

Higher-percentile queuing delay is not only important for assessing
anti-replay. It's important for real-time applications, because they
have to choose a buffering time before play-out, given that any packet
arriving after play-out is useless (equivalent to a loss). E.g.
a real-time app that waits for the 99.9%-ile delay will effectively
discard 0.1% of packets itself.

That is why, 6 years ago now, we developed a way to measure higher
percentiles of queuing delay. An AQM measures the sojourn of every
packet anyway, so we encoded that qDelay into the IPID field of the IP
header (v4 only), so that it could be decoded at the receiver. We
could have just logged the measurements at the AQM, but we wanted the
AQM delay data in sync with the e2e measurement data, so we could
verify that they match, e.g. when debugging. The code for this is open
source, available from https://github.com/L4STeam/l4sdemo .

> Are we sure that we have a good handle on what the *peak* differences
> in sojourn times between C and L can be?

It's much more useful to measure high percentiles than the peak, which
might be just a weird outlier. You will have seen the plots we produce
of the higher queuing-delay percentiles. For instance, this one is for
the worst-case test we do, with an extremely demanding traffic pattern
in each queue (L and C): 300 web flows arriving per second as well as
a long-running flow:
https://datatracker.ietf.org/meeting/106/materials/slides-106-tsvwg-sessb-31-tcp-prague-status-of-implementation-and-evaluation-00#page=3

The plot linked above shows 34ms at the 99.9%-ile. I said 30ms based
on the 99.9%-iles of the most recent test runs we've done (we run all
the tests after every code change). Even measuring every packet, you
can see that the plot becomes less regular by the 99.999%-ile, because
we only run the experiments long enough to have about an order of
magnitude more packets than the highest percentile we're looking for
(e.g. 10^6 packets for a 5-nines percentile). If you were sampling,
you would have to run an experiment for days to get reliable
percentiles like this.
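[For reference, the window arithmetic discussed above can be sketched in
a few lines. This is only a back-of-envelope illustration of the
numbers quoted in the thread (30ms at 1Gb/s with 1500B packets, and the
Linux xfrm fixed maximum of 4096); the function names are mine, not
from any draft or implementation:]

```python
# Back-of-envelope sizing of a VPN anti-replay window for a given
# reordering tolerance, per the arithmetic quoted in this thread.

def replay_window_pkts(rate_bps: float, reorder_s: float,
                       pkt_bytes: int = 1500) -> float:
    """Packets that can arrive within the reordering tolerance."""
    return rate_bps * reorder_s / (8 * pkt_bytes)

def max_rate_bps(window_pkts: int, reorder_s: float,
                 pkt_bytes: int = 1500) -> float:
    """Highest rate a fixed window covers without spurious drops."""
    return window_pkts * 8 * pkt_bytes / reorder_s

# 30 ms of 1500 B packets at 1 Gb/s -> the 2500-packet window above.
print(round(replay_window_pkts(1e9, 0.030)))       # 2500

# Linux xfrm's fixed maximum of 4096 covers roughly 1.64 Gb/s for a
# 30 ms tolerance, matching the ~1.6 Gb/s tunnel limit noted above.
print(round(max_rate_bps(4096, 0.030) / 1e9, 2))   # 1.64
```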
Bob

> Regards,
> Pete
>
>> Regards
>>
>> Bob
>>
>>> Regards
>>> Sebastian
>>>
>>>> On May 21, 2021, at 11:21, Bob Briscoe <ietf@bobbriscoe.net> wrote:
>>>> Chairs, list,
>>>>
>>>> We've posted a new rev of draft-ietf-tsvwg-ecn-l4s-id-17,
>>>> attempting to address all the discussion since the last posting
>>>> just before the interim. In particular:
>>>> * review comments on a careful read from Gorry and the chairs
>>>> * the VPN anti-replay problem
>>>> * an added out-of-band test for an RFC3168 ECN AQM in a shared queue.
>>>>
>>>> There are a couple of outstanding discussions, which I'm sure will
>>>> continue on the list, e.g. the role of RFC4774 and whether to
>>>> remove any of Appx C. But it was considered better to get the
>>>> queued-up changes out, to re-base the discussions.
>>>>
>>>> This is quite an extensive set of changes, so pls check and pass
>>>> any comments to the list.
>>>>
>>>> Thanks to everyone who is contributing, and particularly to the
>>>> chairs for continuing to referee this all. We've added appropriate
>>>> thanks in the Acks section.
>>>>
>>>> Bob
>>>>
>>>> On 21/05/2021 10:09, internet-drafts@ietf.org wrote:
>>>>
>>>>> A New Internet-Draft is available from the on-line Internet-Drafts
>>>>> directories. This draft is a work item of the Transport Area
>>>>> Working Group WG of the IETF.
>>>>>
>>>>> Title    : Explicit Congestion Notification (ECN) Protocol for Very Low Queuing Delay (L4S)
>>>>> Authors  : Koen De Schepper
>>>>>            Bob Briscoe
>>>>> Filename : draft-ietf-tsvwg-ecn-l4s-id-17.txt
>>>>> Pages    : 57
>>>>> Date     : 2021-05-21
>>>>>
>>>>> Abstract:
>>>>> This specification defines the protocol to be used for a new network
>>>>> service called low latency, low loss and scalable throughput (L4S).
>>>>> L4S uses an Explicit Congestion Notification (ECN) scheme at the IP
>>>>> layer that is similar to the original (or 'Classic') ECN approach,
>>>>> except as specified within.
>>>>> L4S uses 'scalable' congestion control,
>>>>> which induces much more frequent control signals from the network and
>>>>> it responds to them with much more fine-grained adjustments, so that
>>>>> very low (typically sub-millisecond on average) and consistently low
>>>>> queuing delay becomes possible for L4S traffic without compromising
>>>>> link utilization. Thus even capacity-seeking (TCP-like) traffic can
>>>>> have high bandwidth and very low delay at the same time, even during
>>>>> periods of high traffic load.
>>>>>
>>>>> The L4S identifier defined in this document distinguishes L4S from
>>>>> 'Classic' (e.g. TCP-Reno-friendly) traffic. It gives an incremental
>>>>> migration path so that suitably modified network bottlenecks can
>>>>> distinguish and isolate existing traffic that still follows the
>>>>> Classic behaviour, to prevent it degrading the low queuing delay and
>>>>> low loss of L4S traffic. This specification defines the rules that
>>>>> L4S transports and network elements need to follow with the intention
>>>>> that L4S flows neither harm each other's performance nor that of
>>>>> Classic traffic. Examples of new active queue management (AQM)
>>>>> marking algorithms and examples of new transports (whether TCP-like
>>>>> or real-time) are specified separately.
>>>>> The IETF datatracker status page for this draft is:
>>>>> https://datatracker.ietf.org/doc/draft-ietf-tsvwg-ecn-l4s-id/
>>>>>
>>>>> There is also an htmlized version available at:
>>>>> https://datatracker.ietf.org/doc/html/draft-ietf-tsvwg-ecn-l4s-id-17
>>>>>
>>>>> A diff from the previous version is available at:
>>>>> https://www.ietf.org/rfcdiff?url2=draft-ietf-tsvwg-ecn-l4s-id-17
>>>>>
>>>>> Internet-Drafts are also available by anonymous FTP at:
>>>>> ftp://ftp.ietf.org/internet-drafts/
>>>>
>>>> --
>>>> ________________________________________________________________
>>>> Bob Briscoe                          http://bobbriscoe.net/

--
________________________________________________________________
Bob Briscoe                          http://bobbriscoe.net/