Re: [tsvwg] SCE / L4S and fragmentation

Jonathan Morton <> Sun, 15 March 2020 16:15 UTC

I had hoped that a thread with this title represented an opportunity to discuss the effects of fragmentation on L4S and SCE, *without* immediately descending into tribalism and ad-hominem attacks.  Accordingly, I will decline to quote *any* of Bob Briscoe's latest post, in favour of summarising the technical content.

Firstly, we have apparently agreed that fragmentation has no deleterious effects on ECN codepoints applied prior to fragmentation.  I do not understand why I am being attacked for expressing that agreement.

On the subject of IPv6 fragmentation, Bob previously asserted that RFC-3168 has nothing to say about IPv6.  A simple text search over RFC-3168 shows this to be false: the IPv6 Traffic Class Octet is repeatedly equated with the IPv4 TOS Byte.  In §5.3, which deals with fragmentation, neither IPv4 nor IPv6 is explicitly mentioned, implying that the language therein applies equally to both.  I have not personally verified that any given IPv6 receiver correctly implements these rules, but I would expect that they generally do.

So the only material difference is that IPv6 behaves as if the DF (Don't Fragment) bit is always set, which limits the scope of the problem at hand (for most transports) to the case of an on-path tunnel which performs outer fragmentation.  The only exception is when a jumbo datagram needs to be sent, but most protocols that send such datagrams are not ECTs (ECN Capable Transports) and are therefore out of scope for this discussion.  We can of course still discuss any specific counterexamples that might be identified.

Most ECTs are expected to set DF (on IPv4) and implement PMTUD.  Indeed, most do.  If certain tunnel implementations interact badly with PMTUD, I think that is primarily a problem for the tunnel, not the transport.  It is still desirable that the transport functions reasonably well in difficult circumstances, and that is part of what I hope to discuss in this thread.

Bob stated that the language I referred to in the quote below was present in an earlier draft of rfc6040update-shim:

> ISTR, at some point in the past, interim language was suggested which would require taking the ECN codepoint from one of the fragments constituting the packet, with the behaviour being otherwise unspecified except by the existing rules.  This would be a worthwhile improvement from SCE's point of view, and is likely to match at least some existing implementations.

That is not true, however.  What *is* present there is language which strongly advocates propagating marking on a byte-preserving basis, both for CE and ECT(1) codepoints, and that is what I have objected to, both recently and less recently.  What I referred to above is much simpler and less prescriptive, and can be summarised in the following proposed language updating RFC-3168 §5.3:

> …if any fragment of an IP packet to be reassembled has the Not-ECT codepoint set, then one of two actions MUST be taken:
>  * Set the Not-ECT codepoint on the reassembled packet.  However, this MUST NOT occur if any of the other fragments contributing to this reassembly carries any ECN codepoint other than Not-ECT.
>  * The packet is dropped.
> If both actions are applicable, either MAY be chosen.
> Reassembly of a fragmented packet MUST NOT change the ECN codepoint when all of the fragments carry the same codepoint.
> Notwithstanding the above, if any fragment of an IP packet to be reassembled has the CE codepoint set, then one of two actions MUST be taken:
>  * Set the CE codepoint on the reassembled packet.
>  * The packet is dropped.
> If neither the Not-ECT nor the CE codepoints appear on any of the fragments contributing to this reassembly, then the ECN codepoint set on the reassembled packet SHOULD be one of the ECN codepoints present on one of the fragments.  Any contributing fragment MAY be chosen as the source.

I'm sure some wordsmithing is in order on the above, but I hope this makes my position clearer.
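
To make the proposed language above concrete, here is a minimal Python sketch of the reassembly decision.  It is an illustration only, not taken from any existing implementation; the names `ECN` and `reassembled_ecn` are hypothetical.  Where the text permits a choice, the sketch assumes one: it keeps the packet when all fragments agree, it drops whenever Not-ECT is mixed with any other codepoint (including CE, which is the only action both rules permit in that case), and it takes the first fragment's codepoint when only ECT(0) and ECT(1) are mixed.  The two-bit values are those defined for the ECN field in RFC-3168.

```python
from enum import IntEnum
from typing import Optional, Sequence


class ECN(IntEnum):
    """Two-bit ECN field values as defined in RFC-3168."""
    NOT_ECT = 0b00
    ECT_1 = 0b01
    ECT_0 = 0b10
    CE = 0b11


def reassembled_ecn(fragments: Sequence[ECN]) -> Optional[ECN]:
    """Return the ECN codepoint for the reassembled packet, or None to drop it.

    `fragments` holds the ECN codepoint of each contributing fragment.
    """
    if not fragments:
        raise ValueError("at least one fragment is required")

    present = set(fragments)

    # All fragments agree: reassembly MUST NOT change the codepoint.
    if len(present) == 1:
        return fragments[0]

    # Not-ECT mixed with anything else: setting Not-ECT is forbidden,
    # so the only remaining permitted action is to drop.
    if ECN.NOT_ECT in present:
        return None

    # CE on any fragment (and no Not-ECT): propagate CE.
    # Dropping would also be permitted; this sketch assumes propagation.
    if ECN.CE in present:
        return ECN.CE

    # Only ECT(0) and ECT(1) are mixed: any contributing fragment MAY be
    # the source.  Taking the first one is an arbitrary choice.
    return fragments[0]
```

For example, under these assumptions a packet whose fragments carry ECT(0) and CE is reassembled as CE, while one whose fragments carry Not-ECT and ECT(0) is dropped.  Note that the last branch never inspects ECT(1) specially, which is the sense in which the language is approximately neutral between L4S and SCE.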

In connection with this, Bob claims I'm asking for requirements to support SCE in a standards-track document.  If that is what I was actually asking for, then the final sentence in the above proposed language would be markedly different.  I would be asking that the existing requirement in RFC-3168 §5.3, that congestion information MUST NOT be lost during reassembly, be interpreted to treat ECT(1) codepoints as congestion information.

Mindful of SCE's current status in the WG, I am deliberately *not* asking for that at this time, even though it would improve the quality of SCE signalling over tunnelled paths.  Instead, I am asking for simple, easily-implemented language which is approximately neutral between L4S and SCE, but allows for the possibility of using ECT(1) as an output from the network at some point in the future.

Finally, I thank Bob for the link to a two-year-old slide deck containing a survey of existing tunnel protocols.  I infer from this that shimmed tunnels have typically considered RFC-6040 as out of scope, as the latter only explicitly covers tunnels without a shim layer.  But the rules in RFC-6040 for handling ECN are generally applicable, I think, and have nothing whatsoever to do with fragmentation.  This latter fact informs my support for the suggestion to separate the concerns of reassembly and decapsulation.

 - Jonathan Morton