Re: [tsvwg] Status of ECN encapsulation drafts (i.e., stuck)

Sebastian Moeller <> Sun, 15 March 2020 00:03 UTC

From: Sebastian Moeller <>
Date: Sun, 15 Mar 2020 01:03:28 +0100
Cc: Jonathan Morton <>, "" <>
To: Bob Briscoe <>

Hi Bob,

thanks! More questions below ;)

> On Mar 14, 2020, at 19:07, Bob Briscoe <> wrote:
> Sebastian,
> On 13/03/2020 22:54, Sebastian Moeller wrote:
>> Dear All,
>> so in this example we need an operator that operates a tunnel (encapsulation and decapsulation) and an AQM on the same tunnel and is considered to both (mis-)configure the tunnel to allow fragmentation and operate an SCE-AQM on hops along the tunnel's path.
> Tunnels are often constructed over the top of the network operator that runs the physical equipment (incl. any AQM). Overlays are one of the common motivations for tunnelling.

	[SM] Fair enough. My question was not so much about tunnels per se, as I can understand why tunnels are used; it really is about fragmenting/defragmenting tunnels in combination with an ECN-marking AQM. But I did not pose this question clearly enough (for a change I tried to avoid being overly verbose).

>> Is that really considered to be a sensible deployment that is worth catering to? I would guess having a tunnel that fragments/defragments alone would be painful/sub-optimal. Why would an operator that consciously accepted such a tunnel design (as there are probably real-world constraints that can justify such an arrangement) then go and add a non-tunnel-compatible AQM? Are fragmenting tunnels really such a big part of the commercial ISP world?
> Short answer: I don't know.
> Where an operator who sets up a tunnel can control the MTU of the links, I would imagine fragmentation is highly unlikely. But for overlays using tunnelling, most tunnel endpoints use PMTUD between themselves to find a good tunnel MTU, and I would imagine they use fragmentation for IPv4 (because they have to, if their "packet too big" ICMPs are being blocked) for any incoming packets that don't fit once they add tunnel headers. But I don't know how often they actually have to fragment (that depends on what the end-systems' strategies are to keep the MTU below this point).
> The only concrete data I can find is from the year 2000.
> In this case all the packets on the link were fragments, and "A significant portion of the fragmented traffic ... is tunneled traffic." I don't know whether this link was chosen for this study because all the packets were fragments, or whether it would have been easy to find other links with a high proportion of fragments.

	[SM] I found this as well, before asking. IMHO that is a degenerate situation, where the tunnel operator should either have made sure that the tunneled path would carry MTU 1540 packets, or signaled the endpoints to reduce the payload. But I admit that there seems to be no silver bullet that easily avoids fragmentation for IPv4...

> Finding the best max packet size is an area where neither IPv4 nor IPv6 has ever found a good solution.

	[SM] Question: Do you think that is because IPv4 fragmentation always allowed an easy way out here? I once read a proposal that it might have been more robust for each hop, on encountering too large a packet, to simply forward a truncated packet, but I guess one would at least need to adjust the checksum (not the length field) to differentiate this from a partial packet loss on one of the transfers....

> Getting a tunnel to fragment and reassemble is indeed painfully sub-optimal,

	[SM] +1

> but all the other solutions have their own problems. It is possible the sub-optimality is often going on under-the-covers, just because it works. I do know that, for IPv4, the Don't Fragment (DF) flag is often ignored by tunnels, as a preferable alternative to just ditching the traffic.

	[SM] I know the perfect is the enemy of the good, but that seems more like one of those "one does not necessarily want to know/see how the sausage is made" situations.

> In the absence of the IP layer solving this problem, Gorry's group found that about 20% of end-systems are just clamping the IPv4 TCP max segment size lower than necessary. And for IPv6 most are just using the min segment size. See:
> Again only for TCP, middleboxes can edit the MSS advertised in each packet to fool the hosts into using a smaller max segment size.

	[SM] Yes, except that, given the issues with otherwise signaling too-large packets along a path, that might still be the cleanest solution...
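
	To make the MSS editing concrete, here is a minimal sketch of the arithmetic such a middlebox would have to do. The function name and the example numbers are my own assumptions (fixed 20-byte IPv4 and TCP headers without options, and `tunnel_overhead` meaning all bytes the tunnel adds in front of the inner packet); it is not taken from any particular vendor's implementation.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
TCP_HEADER = 20   # bytes, assuming no TCP options


def clamped_mss(advertised_mss, path_mtu, tunnel_overhead):
    """Return the MSS to write into a passing SYN so that full-sized
    segments still fit into path_mtu after encapsulation.

    tunnel_overhead is an assumption of this sketch: all bytes the tunnel
    adds in front of the inner packet (e.g. 20 for plain IPv4-in-IPv4).
    """
    largest_fit = path_mtu - tunnel_overhead - IPV4_HEADER - TCP_HEADER
    if largest_fit <= 0:
        raise ValueError("path MTU too small for any TCP payload")
    # Only ever lower the advertised value, never raise it.
    return min(advertised_mss, largest_fit)
```

	With a 1500-byte path MTU and 20 bytes of IPv4-in-IPv4 overhead, `clamped_mss(1460, 1500, 20)` lowers the advertised 1460 to 1440, while an already conservative value such as 1200 passes through unchanged.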

> You can configure a Cisco box (and I'm sure others) to do this when you're using it for tunnelling. The above study from Gorry's group didn't find much evidence of this though.

	[SM] Thanks for the link!

> PLPMTUD solves this problem end-to-end (see [RFC4821] for TCP etc. and [draft-ietf-tsvwg-datagram-plpmtud] not yet published for UDP). However, where a tunnel already solves the problem (sub-optimally) for IPv4 using fragmentation, I can't imagine that anyone would disable it, because it is still necessary and it still works.

	[SM] Well, transient fragmentation (where the tunnel endpoints create the appearance of no fragmentation) comes at a considerable processing cost, and carrying fragments (especially in degenerate cases like your CAIDA example above) skews the payload/overhead ratio badly (by artificially blowing up the packet rate). So I am confident tunnel operators would be happy to drop that crutch in a heartbeat, IF they could be sure it would not be required anymore (or, in belt-and-suspenders fashion, they would keep the mechanism operational but use PLPMTUD to make it unlikely that fragmentation is actually triggered).
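
	As a rough illustration of the packet-rate blow-up, here is a small sketch that computes the on-wire sizes of the outer IPv4 fragments for one encapsulated packet. The function name and the parameters are hypothetical, chosen for this example only; it assumes a 20-byte outer IPv4 header without options and uses the IPv4 rule that every fragment except the last carries a multiple of 8 payload bytes.

```python
IPV4_HEADER = 20  # bytes, outer header, assuming no IP options


def outer_fragments(inner_len, encap_overhead, link_mtu):
    """Return the on-wire sizes (bytes) of the outer IPv4 packet(s)
    needed to carry one inner packet of inner_len bytes.

    encap_overhead is an assumption of this sketch: bytes between the
    outer IPv4 header and the inner packet (0 for IPv4-in-IPv4, 4 for a
    basic GRE header).
    """
    outer_payload = inner_len + encap_overhead
    if outer_payload <= link_mtu - IPV4_HEADER:
        return [IPV4_HEADER + outer_payload]  # fits, no fragmentation
    # Non-final fragments must carry a multiple of 8 payload bytes.
    max_chunk = (link_mtu - IPV4_HEADER) // 8 * 8
    if max_chunk <= 0:
        raise ValueError("link MTU too small to carry any fragment payload")
    sizes = []
    remaining = outer_payload
    while remaining > max_chunk:
        sizes.append(IPV4_HEADER + max_chunk)
        remaining -= max_chunk
    sizes.append(IPV4_HEADER + remaining)
    return sizes
```

	For a full-sized 1500-byte inner packet over a 1500-byte link with IPv4-in-IPv4, `outer_fragments(1500, 0, 1500)` gives `[1500, 40]`: twice the packet count, with the second packet being half header. An inner packet of 1480 bytes or less would have gone through as a single packet.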

> Further reading.
> I just compiled a list, then realized everything is already cited here, in the context of minimizing latency:
> You could also try .

	[SM] Again, thanks for the references.


> Bob
>> Best Regards
>> 	Sebastian
>>> On Mar 13, 2020, at 19:50, Bob Briscoe<>  wrote:
>>> Jonathan,
>>> On 13/03/2020 18:15, Jonathan Morton wrote:
>>>>> On 13 Mar, 2020, at 7:58 pm, Bob Briscoe<>  wrote:
>>>>> Are you reading my recent emails, or the main email?
>>>> The recent ones, because I'm trying to establish this fundamental point first.  Until I can figure out what you *do* mean by CE marks being doubled, there's no point in discussing anything more subtle.
>>>> So again, I ask: what context am I missing?
>>> I said "No." I.e. the AQM is not marking before the tunnel ingress. It is marking between tunnel ingress and egress.
>>> How can I know what context you're missing if you haven't read the email? Presumably that means you're missing all the context given in the email.
>>> Pls can you take these sort of emails off the IETF lists - I'm sure no-one cares for this sort of content-free conversation.
>>> Bob
>>>>  - Jonathan Morton
>>> -- 
>>> ________________________________________________________________
>>> Bob Briscoe
> -- 
> ________________________________________________________________
> Bob Briscoe