Re: [tsvwg] Status of ECN encapsulation drafts (i.e., stuck)

Sebastian Moeller <> Mon, 16 March 2020 09:12 UTC

From: Sebastian Moeller <>
Date: Mon, 16 Mar 2020 10:12:44 +0100
Cc: Jonathan Morton <>, "" <>
To: Bob Briscoe <>

Hi Bob,

> On Mar 15, 2020, at 12:12, Bob Briscoe <> wrote:
> Sebastian,
> Snipped all the comments you're agreeing with. Two responses below...
> On 15/03/2020 00:03, Sebastian Moeller wrote:
>>> [BB] In the absence of the IP layer solving this problem, Gorry's group found that about 20% of end-systems are just clamping the IPv4 TCP max segment size lower than necessary. And for IPv6 most are just using the min segment size. See:
>>> Again only for TCP, middleboxes can edit the MSS advertised in each packet to fool the hosts into using a smaller max segment size.
>> 	[SM] Yes, except that, given the issues with otherwise signaling too-large packets along a path, that might still be the cleanest solution...
> [BB] I should add that PLPMTUD and clamping the max packet size complement each other. 
> PLPMTUD without clamping would kill latency.
> Clamping avoids triggering fragmentation, but smaller than max packet sizes are inefficient. 
> Adding PLPMTUD eventually bumps up the MTU for each path to its max, removing the inefficiency.

	[SM] Well, realistically the ONLY viable solution is to design all tunnels to allow the de facto real-world internet MTU of 1500 and just make sure the tunnel itself uses jumbo-frame capable links to make sure fragmentation is not required. But honestly, in our current context I still wonder whether an operator failing to implement a non-fragmenting tunnel will ever deploy an ECN-enabled AQM...
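To put rough numbers on the sizing argument above, here is a minimal Python sketch. It assumes GRE over IPv4 purely as an example encapsulation (the thread does not name one), uses fixed header sizes without options, and the function names are made up for illustration.

```python
import math

# Fixed header sizes in bytes (no options or extension headers).
IPV4_HEADER = 20
GRE_HEADER = 4  # base GRE header without key, sequence or checksum fields


def required_underlay_mtu(inner_mtu, encap_overhead):
    """Smallest underlay MTU that carries inner_mtu-sized packets unfragmented."""
    return inner_mtu + encap_overhead


def outer_fragment_count(inner_len, encap_overhead, underlay_mtu,
                         outer_ip_header=IPV4_HEADER):
    """Number of outer IPv4 packets needed to carry one encapsulated packet.

    encap_overhead is assumed to include the outer IPv4 header.
    """
    outer_len = inner_len + encap_overhead
    if outer_len <= underlay_mtu:
        return 1
    # Every fragment repeats the outer IP header, and all fragments except
    # the last must carry a multiple of 8 payload bytes.
    payload = outer_len - outer_ip_header
    per_fragment = (underlay_mtu - outer_ip_header) // 8 * 8
    return math.ceil(payload / per_fragment)


overhead = IPV4_HEADER + GRE_HEADER                # example: GRE over IPv4, 24 bytes
print(required_underlay_mtu(1500, overhead))       # 1524
print(outer_fragment_count(1500, overhead, 1500))  # 2
print(outer_fragment_count(1500, overhead, 1524))  # 1
```

Under these assumptions, a 1500-byte underlay turns every full-size inner packet into two outer packets, while an underlay MTU of 1524 or more (trivially met by jumbo-frame links) avoids tunnel fragmentation altogether.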

>>> PLPMTUD solves this problem end-to-end (see [RFC4821] for TCP etc. and [draft-ietf-tsvwg-datagram-plpmtud] not yet published for UDP). However, where a tunnel already solves the problem (sub-optimally) for IPv4 using fragmentation, I can't imagine that anyone would disable it, because it is still necessary and it still works.
>> 	[SM] Well, transient fragmentation (where the tunnel endpoints create the appearance of no fragmentation) comes at a considerable processing cost, and carrying fragments (especially in degenerate cases like your CAIDA example above) skews the payload/overhead ratio badly (by artificially inflating the packet rate), so I am confident tunnel operators would be happy to drop that crutch in a heartbeat, IF they could be sure it would not be required anymore (or, in belt-and-suspenders fashion, would keep the mechanism operational but use PLPMTUD to make it unlikely that fragmentation is actually triggered).
> [BB] Just need to point out that the /operator/ cannot use PLPMTUD to make triggering fragmentation unlikely.

	[SM] The operator can, however, make sure its tunnels can carry inner packets equivalent to a ~1500-byte Ethernet payload, at which point the tunnel will in all likelihood not cause additional fragmentation, and the whole issue (if an ECN AQM is operating on the tunneled packets) becomes someone else's problem.

> That has to be done at the origin sender. That's precisely why operators don't drop this crutch - because they can't be sure the sender has their own crutch.

	[SM] The operator could just drop packets if they would require fragmentation... or functionally make a fragmentation requirement exceedingly unlikely... But I accept that this is a long-standing issue that probably has no solution that is all three of simple/robust, cheap and fully backward compatible...

But I do not see this as a big problem for either L4S or SCE: as none of these tunnels currently uses an ECN AQM, there is next to zero legacy, and hence tunnel operators can be taught to deal properly with ECN bits at the defragmentation and decapsulation stages. Sure, L4S seems covered by the existing rules (but assumes that defragmenters/decapsulators actually follow these rules!), but the rule changes for SCE seem rather straightforward to me.

BTW, I am puzzled about your claim that L4S was carefully designed to avoid this issue, for two reasons:
a) while according to the current RFCs CE-marks from a fragmented outer tunnel should propagate correctly into the inner packet's ECN field on decapsulation, there is little proof that all deployed tunnel defragmenting decapsulators actually do so correctly, and

b) this seems less of a conscious design decision and more a fortunate side effect of choosing to change the meaning of CE; as far as I can see, the strongest rationale for doing that was to be able to just hoist the DCTCP work into the common internet. Or do you claim that you opted to overload the CE codepoint specifically to deal with fragmenting tunnels with ECN-using AQMs?
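For reference, the behaviour that point a) relies on can be sketched in a few lines of Python. This is only an illustration of the reassembly rule in RFC 3168 (section 5.3) and the decapsulation table in RFC 6040, as far as they are commonly read; it is not taken from any deployed tunnel code, and the function names are hypothetical. Where fragments disagree without any CE mark, RFC 3168 does not prescribe a result, so the sketch simply keeps the first fragment's value as a labelled placeholder.

```python
# ECN field codepoints (two-bit values from the IP header).
NOT_ECT, ECT1, ECT0, CE = 0b00, 0b01, 0b10, 0b11
DROP = None  # sentinel: the packet is discarded


def reassemble_ecn(fragment_fields):
    """ECN field of the reassembled outer packet, or DROP."""
    fields = list(fragment_fields)
    if CE in fields:
        # A CE mark on any fragment must not be lost, but CE must not be
        # set if another fragment carried Not-ECT; dropping is the fallback.
        return DROP if NOT_ECT in fields else CE
    # All equal: unchanged. Mixed without CE: unspecified by RFC 3168;
    # keeping the first fragment's value is an assumption of this sketch.
    return fields[0]


# (inner, outer) -> outgoing inner field on decapsulation.
DECAP = {
    (NOT_ECT, NOT_ECT): NOT_ECT, (NOT_ECT, ECT0): NOT_ECT,
    (NOT_ECT, ECT1): NOT_ECT,    (NOT_ECT, CE): DROP,
    (ECT0, NOT_ECT): ECT0,       (ECT0, ECT0): ECT0,
    (ECT0, ECT1): ECT1,          (ECT0, CE): CE,
    (ECT1, NOT_ECT): ECT1,       (ECT1, ECT0): ECT1,
    (ECT1, ECT1): ECT1,          (ECT1, CE): CE,
    (CE, NOT_ECT): CE,           (CE, ECT0): CE,
    (CE, ECT1): CE,              (CE, CE): CE,
}


def decapsulate(inner_field, outer_fragment_fields):
    """Reassemble the outer packet, then propagate ECN into the inner header."""
    outer_field = reassemble_ecn(outer_fragment_fields)
    if outer_field is DROP:
        return DROP
    return DECAP[(inner_field, outer_field)]


# One CE-marked fragment out of two is enough to mark the inner packet.
print(decapsulate(ECT0, [ECT0, CE]) == CE)  # True
```

A decapsulator that instead copies only the first fragment's ECN field, or ignores the outer header entirely, would silently lose the mark in the last line, which is exactly the kind of deployed behaviour that is hard to rule out.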

Best Regards

> Bob
> -- 
> ________________________________________________________________
> Bob Briscoe