Re: Non-Last Small IPv6 Fragments

Erik Kline <ek@loon.co> Mon, 14 January 2019 20:40 UTC

References: <CAOSSMjV0Vazum5OKztWhAhJrjLjXc5w5YGxdzHgbzi7YVSk7rg@mail.gmail.com> <CALx6S35KNhV2gFp9OdU+M1zy5WUuEAEvXkDXNDWWxi7uQ4e_cw@mail.gmail.com> <CAN-Dau0rTdiiF2SjByxcMG6nhPCEjUH2pYBCOeK_FSGJ_ucDQw@mail.gmail.com> <CALx6S34AyV9OpvnjQhQc56n5vfeVgU5Zd3kheP0g+XvsMbBV9g@mail.gmail.com> <1b2e318e-1a9f-bb5d-75a5-04444c42ef20@si6networks.com> <CALx6S37TJr++fC=pVoeS=mrO1fHc4gL_Wtu-XkVTswzs2XxXCA@mail.gmail.com> <CALx6S36V7vrVyoTP0G6+S5XeFNB3KWS5UaNnVi20xogRERdCfg@mail.gmail.com> <973A1649-55F6-4D97-A97F-CEF555A4D397@employees.org> <CALx6S34YbBe8xBod3VsWVO33TpZcdxh2uV1vaO8Z_NKnVXp66g@mail.gmail.com> <A3C3F9C0-0A07-41AF-9671-B9E486CB8246@employees.org> <AEA47E27-C0CB-4ABE-8ADE-51E9D599EF8F@gmail.com> <6aae7888-46a4-342d-1d76-10f8b50cebc4@gmail.com> <EC9CC5FE-5215-4105-8A34-B3F123D574B9@employees.org> <4c56f504-7cd7-6323-b14a-d34050d13f4e@foobar.org> <9E6D4A6E-8ABA-4BAB-BEC5-969078323C96@employees.org> <CAAedzxpdF+yhBXfnwUcaQb-HkgdaqXRU3L+S7v8sS1F0OkwM9A@mail.gmail.com> <78a8a0e0-8808-364c-41f7-f81f90362432@gont.com.ar>
In-Reply-To: <78a8a0e0-8808-364c-41f7-f81f90362432@gont.com.ar>
Reply-To: ek@loon.co
From: Erik Kline <ek@loon.co>
Date: Mon, 14 Jan 2019 12:40:43 -0800
Message-ID: <CAAedzxpjxhP0nOZVU0CTwA1u3fsPFthrJASjDEfnLcRNvr2gBQ@mail.gmail.com>
Subject: Re: Non-Last Small IPv6 Fragments
To: Fernando Gont <fernando@gont.com.ar>
Cc: Ole Troan <otroan@employees.org>, IPv6 List <ipv6@ietf.org>, Bob Hinden <bob.hinden@gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/ipv6/hsMRtB4souQKYO09CKhofhMZ-GA>

On Sun, 13 Jan 2019 at 22:24, Fernando Gont <fernando@gont.com.ar> wrote:
>
> Hi, Erik,
>
> On 14/1/19 02:19, Erik Kline wrote:
> >
> >     > Ole Troan wrote on 13/01/2019 20:03:
> >     >> Let’s fix path MTU discovery. There’s no fix for fragmentation.
> >     >
> >     > pmtu discovery is hard because it needs a way for an intermediate
> >     node to be able to signal a transmitting node to dynamically drop
> >     the MTU if network conditions change.  The only way to reliably work
> >     around this is to transmit at 1280 and claim implementation /
> >     operational breakage if this cannot get through.  This, however,
> >     robs the host devices of the ability to use higher MTUs.
> >
> >     “Fix” as in something different than rfc8201.
> >
> >
> > How about a version of the fragment header that:
> >
> >     (a) *always* has the L4 header (or a configurable number of bytes)
> > in every non-first fragment
> >
> > This could be documented on a per-L4+ basis how much needs to be
> > included, i.e. we could have one doc for UDP and (say) a separate doc
> > for information QUIC would need on a per-fragment basis.
>
> If you are requiring that the first frag always contains the L4 header,
> that's already achieved by RFC7112.

Not what I suggested.  I said every fragment header would include the
L4 header (or other specifiable bytes).

> If you really mean that the FH contains the L4 header, then:
>
> * There's the issue of EHs in the fragmentable part

That would be addressed in an associated L4 doc, and it would probably
say to ignore non-transport headers.

> * It may be hard to enforce in the presence of tunnels, unless the
> tunnel reassembles and refragments where necessary
>
>
> Still, that doesn't solve the biggest issues with fragmentation:
>
> * a stateful operation for an otherwise stateless protocol
>
> * I'd argue that for the IPv6 case, one of the base reasons for which
> fragments are dropped is because they employ EHs
>
>
>
> > This increases the overhead in a given fragment, but also helps to
> > ensure that (eventually) intermediate systems can examine this field and
> > preemptively make a drop/no-drop decision.
>
> Then we're back on draft-gont-v6ops-ipv6-ehs-packet-drops

Understood, but any L3 solution is going to run into this.  UDP
trailers are certainly cute (and +1 from me), but every transport
will need a similar fix (or applications will have to follow
rfc8085#section-3.2 type guidelines).  If there's to be an L3
solution to this L3 problem it will face some deployment issues, but
if it is useful enough it could find reasonably widespread support in
due time (in the absence of ipng-ng, IPv6 is it for the long haul).

[Silly Idea #1]

(/me thinks back to spud bof)

Here's another random suggestion that maybe belongs in panrg or (more
likely) the dumpster:

A new hop-by-hop option whose type starts with the bits "00 1", i.e.

    00 - skip over this option and continue processing the header

    1 - Option Data may change en route

of the form, for example, Option Type 00 1 00001, Opt Data Len 2,
Option Data == MTU.  Here, the observation is that the main thing we
have experience with "working" (for the most generous definition of
"works") in this problem space on the internet today is TCP MSS
clamping.  So let's have an MTU (or MSS) HbH option that can only be
revised downwards along the path, but never lower than 1280.  If it
helps pad to 8 byte alignment we can have a 2nd 2-byte field that is
left untouched and carries the sender's original MTU/MSS.
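
To make the clamping rule concrete, here is a rough sketch of what
each hop would do to the value.  The function names and the idea of
clamping against the egress link MTU are my assumptions for
illustration; nothing here is specified anywhere.

```python
IPV6_MIN_MTU = 1280  # IPv6 minimum link MTU; the option never goes below this


def clamp_mtu(advertised: int, egress_link_mtu: int) -> int:
    """Return the value a hop would write back into the option.

    The value only ever moves downwards, and never below 1280.
    (Hypothetical helper, not part of any specification.)
    """
    return max(IPV6_MIN_MTU, min(advertised, egress_link_mtu))


def mtu_along_path(sender_mtu: int, link_mtus: list) -> int:
    """Apply the per-hop clamp for every link on a path, in order."""
    value = sender_mtu
    for link_mtu in link_mtus:
        value = clamp_mtu(value, link_mtu)
    return value
```

For example, mtu_along_path(1500, [9000, 1480, 1500]) leaves the
receiver looking at 1480, which is the same effect MSS clamping has
on a TCP SYN today.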

In this way, a sender could include an HbH with either:

    [HbH | 0 | 0x21 | 2 | (1500=0x5dc) | Pad1 | Pad1]

or

    [HbH | 0 | 0x21 | 4 | (1500=0x5dc) | (1500=0x5dc) ]

with one of the MTU/MSS fields to be munged along the path.
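
As a sanity check on the byte counts, the two layouts can be packed
like this.  This is only a sketch: the option type byte 0x21 is just
the bit pattern 00 1 00001 from above read as one octet (no such
option is assigned), and build_hbh_mtu is a made-up helper name.

```python
import struct
from typing import Optional

# 00 (skip if unknown) | 1 (data may change en route) | 00001 (hypothetical number)
OPT_TYPE_MTU = 0b00100001  # == 0x21, not an assigned option type
PAD1 = 0x00                # Pad1 option: a single zero octet
IPV6_MIN_MTU = 1280


def build_hbh_mtu(next_header: int, mtu: int,
                  original_mtu: Optional[int] = None) -> bytes:
    """Build an 8-byte Hop-by-Hop Options header carrying the MTU option."""
    if not IPV6_MIN_MTU <= mtu <= 0xFFFF:
        raise ValueError("MTU must be in 1280..65535")
    if original_mtu is None:
        # Next Header, Hdr Ext Len = 0, Opt Type, Opt Data Len = 2,
        # MTU, then two Pad1 octets to reach 8 bytes.
        return struct.pack("!BBBBHBB", next_header, 0,
                           OPT_TYPE_MTU, 2, mtu, PAD1, PAD1)
    # Variant with a second, untouched field for the sender's original value.
    return struct.pack("!BBBBHH", next_header, 0,
                       OPT_TYPE_MTU, 4, mtu, original_mtu)
```

Both variants come out at exactly 8 bytes, i.e. Hdr Ext Len 0.  With
Next Header 17 (UDP) and MTU 1500, the first form is the octets
11 00 21 02 05 dc 00 00.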

If this were the only HbH option included it adds only 8 bytes, fits
(along with most L4 headers) within a 128 byte hardware classifier,
and end systems could explore its usefulness via ping.  (Separately,
there could of course be sockopt/cmsg things to set/get the values.)

[Silly Idea #2]
Separate, even sillier idea: everyone converges on using 16 bits of
the 20-bit flow label to encode the lowest MTU encountered along the
path.  Here again, TCP MSS clamping style behaviour would be applied
to this field.
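
A sketch of what that might look like, assuming (arbitrarily) that
the low-order 16 bits of the flow label carry the MTU and the top 4
bits are left alone.  Which 16 bits to use, and all of the helper
names, are my own placeholders.

```python
FLOW_LABEL_MASK = 0xFFFFF  # the flow label is 20 bits wide
MTU_MASK = 0xFFFF          # assumption: MTU lives in the low-order 16 bits
IPV6_MIN_MTU = 1280


def get_path_mtu(flow_label: int) -> int:
    """Read the MTU carried in the low 16 bits of the flow label."""
    return flow_label & MTU_MASK


def set_path_mtu(flow_label: int, mtu: int) -> int:
    """Write the MTU into the low 16 bits, keeping the top 4 bits."""
    return (flow_label & FLOW_LABEL_MASK & ~MTU_MASK) | (mtu & MTU_MASK)


def clamp_flow_label(flow_label: int, egress_link_mtu: int) -> int:
    """Per-hop rewrite: lower the carried MTU if needed, never below 1280."""
    current = get_path_mtu(flow_label)
    lowered = max(IPV6_MIN_MTU, min(current, egress_link_mtu))
    return set_path_mtu(flow_label, lowered)
```

So a sender would start with set_path_mtu(label, 1500) and a hop with
a 1400 byte egress link would rewrite the field to 1400 via
clamp_flow_label(), leaving the remaining 4 bits for whatever else
the flow label was being used for.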