Re: [tsvwg] UDP options and header-data split (zero copy)

Tom Herbert <tom@herbertland.com> Sun, 01 August 2021 19:04 UTC

From: Tom Herbert <tom@herbertland.com>
Date: Sun, 01 Aug 2021 12:03:41 -0700
Message-ID: <CALx6S35h3H-mvkHKFcpp3-k-Sq48NAMVRe-LEhfHxEA=hP49qQ@mail.gmail.com>
To: Joseph Touch <touch@strayalpha.com>
Cc: tsvwg <tsvwg@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/hyjYXgYFaM-zDTZabbCYPQMKGZM>
Subject: Re: [tsvwg] UDP options and header-data split (zero copy)

On Sun, Aug 1, 2021 at 10:48 AM Joseph Touch <touch@strayalpha.com> wrote:
>
>
>
> > On Aug 1, 2021, at 10:03 AM, Tom Herbert <tom@herbertland.com> wrote:
> >
> > On Sat, Jul 31, 2021 at 2:17 PM Joseph Touch <touch@strayalpha.com> wrote:
> >>
> >> FWIW, there’s another option here, which could be considered unnecessarily complex if there’s no need for per-reassembled datagram options for most UDP tunnels….
> >>
> >> ----
> >>
> >> Right now, the terminal fragment has a header as follows:
> >>
> >>                   +--------+--------+--------+--------+
> >>                   | Kind=4 | Len=12 |   Frag. Start   |
> >>                   +--------+--------+--------+--------+
> >>                   |           Identification          |
> >>                   +--------+--------+--------+--------+
> >>                   |  Frag. Offset   |    Frag. End    |
> >>                   +--------+--------+--------+--------+
> >>
> >>                Figure 12   UDP terminal FRAG option format
> >>
> >> As to whether to put the per-reassembled options before or after the terminal fragment, we could let the user decide (again, consistent with the idea of UDP), as follows.
> >
> > Joe,
> >
> > I suggest that the per-reassembled options be in headers in the first
> > fragment. This would work by specifying that options preceding the
> > fragment option are per-packet, options following the fragment option
> > in the first fragment apply to reassembled packets, and options
> > following a fragment option in non-first fragment would be ignored.
>
> Again, that means that the FRAG option itself would float far inside the TLV list, which complicates DMA.
>
I don't understand your concerns. The method I am describing is
functionally equivalent to how fragmentation is done in IPv6. There is
a fragment extension header that "floats" inside the IPv6 header
chain; options that are before the fragment extension header are
applied per packet, and those after the extension header are applied to
the reassembled packet. There is no issue with DMA since there are no
trailers and we can use header-data split with fragments.
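
To make the ordering rule concrete, here is a minimal sketch in Python.
It assumes the options have already been parsed into (kind, data)
tuples in wire order, and it takes Kind=4 for the FRAG option from the
Figure 12 quoted above. The function name and the is_first_fragment
flag are hypothetical and not taken from the draft.

```python
FRAG_KIND = 4  # assumed from the quoted Figure 12; not normative


def classify_options(options, is_first_fragment):
    """Split parsed options into per-packet and per-reassembled lists.

    options: list of (kind, data) tuples in wire order (hypothetical
    representation). Options before FRAG are per-packet. Options after
    FRAG apply to the reassembled datagram only in the first fragment
    and are ignored in every other fragment.
    """
    per_packet, per_reassembled = [], []
    seen_frag = False
    for kind, data in options:
        if kind == FRAG_KIND and not seen_frag:
            seen_frag = True
            continue
        if not seen_frag:
            per_packet.append((kind, data))
        elif is_first_fragment:
            per_reassembled.append((kind, data))
        # otherwise: option after FRAG in a non-first fragment, ignored
    return per_packet, per_reassembled
```

Nothing here depends on a trailer, so the parser only ever walks the
bytes in front of the payload.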

> Also, the trailing variant allows per-reassembled options to be arbitrarily long (limited by the reassembled length), rather than requiring them to fit inside a single fragment.
>
> That’s a key reason for using trailing options - no limit on length, even with fragmentation. That’s why we need a version that allows that, otherwise the options would be limited to what will fit in exactly one option.
>
Deployment experience has shown that arbitrarily long lists of TLVs in
a datapath protocol are more of a problem than a benefit.
Implementations have a lot of trouble with these, and to date the only
use case for them has been DoS attacks.
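
As an illustration of the kind of defense implementations end up
needing, here is a rough sketch of a TLV parser that refuses oversized
lists. It assumes a TCP-style encoding (kind 0 ends the list, kind 1 is
a one-byte NOP, and the length byte counts the whole option, as Len=12
does in the figure quoted above). The two limits are made-up values for
illustration only.

```python
MAX_OPTIONS = 8        # hypothetical limits, chosen only for illustration
MAX_OPTION_BYTES = 64


def parse_tlvs(buf, max_options=MAX_OPTIONS, max_bytes=MAX_OPTION_BYTES):
    """Parse TCP-style kind/length options, rejecting oversized lists.

    Assumes kind 0 ends the list, kind 1 is a one-byte NOP, and every
    other option carries a length byte that counts the whole option.
    """
    if len(buf) > max_bytes:
        raise ValueError("option area too long")
    options, i = [], 0
    while i < len(buf):
        kind = buf[i]
        if kind == 0:          # end of list
            break
        if kind == 1:          # NOP padding
            i += 1
            continue
        if i + 1 >= len(buf):
            raise ValueError("truncated option")
        length = buf[i + 1]
        if length < 2 or i + length > len(buf):
            raise ValueError("bad option length")
        if len(options) >= max_options:
            raise ValueError("too many options")
        options.append((kind, bytes(buf[i + 2:i + length])))
        i += length
    return options
```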

> > So, in this manner fragments don't have any protocol trailers and
> > therefore Frag. End isn't needed and can be removed from the fragment
> > option saving two bytes (the end of the fragment would coincide with
> > the end of the packet).
>
> But then we need at least one byte back for the one bit to indicate last-fragment.
>
> > As I mentioned previously, there are other
> > advantages to placing the options that apply to the reassembled packet
> > in the first fragment instead of the last.
>
> We need to keep a way to allow arbitrary option lengths;

Why? Can you provide a *specific* example of an option or set of
options that needs hundreds of bytes?

> every fixed header approach to that ends up hitting a limit that later needs a workaround (see the work on extending the TCP header).
>
> > In the tunnel case we'd also have cases where UDP options might be sent
> > that are not fragments, however we still need to know where the
> > payload begins. In the fragment option this is the Frag. Start field,
> > but sending a whole fragment option just to convey these two bytes of
> > information would be way overkill. IMO a better approach is to
> > indicate the length of the options in a fixed header preceding the
> > options since the information is always required.
>
> We’ve been down the path of fixed headers before.

And we've been down the path of protocols that allow arbitrarily long
lists of TLVs. See IPv6 Hop-by-Hop options and Destination options.
There is almost no deployment of these, and a major reason is that
implementations cannot efficiently implement the necessary support.

> It wastes space for some uses to conserve it for others. E.g., in tunnel cases where UDP CS==0, it would waste space for OCS when not needed.

Somehow the whole purpose of having a checksum in a protocol is lost in
this draft. The point of a checksum is that it's an integrity check of
the protocol. Frankly, the requirements described in the draft seem
more concerned with making sure the checksum is set so that packets
get through some misbehaving routers on the Internet. The argument
that UDP options don't need a checksum is disingenuous: UDP doesn't
have options in the first place, so nothing has ever required the UDP
checksum to cover options, whereas the other major IETF transport
protocol, namely TCP, has a checksum that always covers TCP options.
Similarly, the argument that IPv6 UDP tunnels can have a zero checksum
is disingenuous. If you read RFC 6935 and RFC 6936 you'll find that the
conditions under which a zero checksum can safely be set are quite
narrow. One of those conditions is that the encapsulated protocol has
an integrity check of its own.

IMO, a checksum should always be required over the surplus area. The
value we get from an integrity check far exceeds the overhead, which is
a whole two bytes. It's pointless to spend time quibbling over whether
a fixed overhead of two bytes is better or worse than a protocol that
saves those two bytes in some cases but increases the overhead to
three bytes in others.
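
For scale, this is roughly all the work a two-byte checksum asks for. A
minimal sketch in Python of the generic 16-bit ones'-complement
Internet checksum in the style of RFC 1071, applied to whatever bytes
make up the surplus area. It is an assumption that the check would take
this form; the sketch does not reproduce the draft's exact definition
of what the option checksum covers.

```python
def internet_checksum(data):
    """16-bit ones'-complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data = data + b"\x00"      # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:             # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF
```

For 16-bit-aligned data, a receiver can run the same function over the
area with the checksum field included and expect a result of zero.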

Tom

