Re: [tsvwg] UDP options and header-data split (zero copy)

Joseph Touch <touch@strayalpha.com> Sun, 01 August 2021 21:33 UTC

From: Joseph Touch <touch@strayalpha.com>
In-Reply-To: <CACL_3VFO2=J2jYdzcX9o5bzUYtsDunpKWD4f_g2ypGNWcbqGAA@mail.gmail.com>
Date: Sun, 01 Aug 2021 14:33:01 -0700
Cc: Tom Herbert <tom@herbertland.com>, tsvwg <tsvwg@ietf.org>
Message-Id: <A0932E7C-183B-41EF-B2AA-838FC45A087E@strayalpha.com>
References: <CALx6S37zVVXnCH+Dv7_QXgwOoqcL4h0SThh+LnmAWn-5enprZQ@mail.gmail.com> <FA155FD9-2319-405C-B082-C023DEC2BF28@strayalpha.com> <CALx6S3435ZjAz8ECgbFbH=Hxm-cXAGRQjTbxgtGb9U-CTXMw=A@mail.gmail.com> <C8CE3912-55B2-4DC0-AB39-2D6EA6953500@strayalpha.com> <1178DE92-175A-4293-8A97-9B6FEBAF7B02@strayalpha.com> <CALx6S35tB=j5y3-xr5S22y0p+WJxKX_hqk8rm30oCruFxZp5Dw@mail.gmail.com> <87662B22-F63B-4EA4-94B3-DF4B2439A4E1@strayalpha.com> <CACL_3VFO2=J2jYdzcX9o5bzUYtsDunpKWD4f_g2ypGNWcbqGAA@mail.gmail.com>
To: "C. M. Heard" <heard@pobox.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/iKTAMHNcVff3gqqCQNhVtK5RlZE>


> On Aug 1, 2021, at 1:32 PM, C. M. Heard <heard@pobox.com> wrote:
> 
> On Sun, Aug 1, 2021 at 10:48 AM Joseph Touch wrote:
>> Also, the trailing variant allows per-reassembled options to be arbitrarily long (limited by the reassembled length), rather than requiring them to fit inside a single fragment.
> 
> 
> If that is the intent, then draft-ietf-tsvwg-udp-options-13#section-5.5 needs significant clarification.

Yes; this is pending.

>> We’ve been down the path of fixed headers before. It wastes space for some uses to conserve it for others. E.g., in tunnel cases where UDP CS==0, it would waste space for OCS when not needed.
> 
> Indeed we have, but AFAICT a definitive consensus has not been reached. 

Agreed; what we have so far is three of us largely debating the issues.

> As you know, recently I have been advocating a fragment format along the following lines:
> 
>                    +--------+--------+--------+--------+
>                    |  Source Port    |   Dest. Port    |
>                    +--------+--------+--------+--------+
>                    |   UDP Len=8     |  UDP Checksum   |
>                    +--------+--------+--------+--------+
>                    | OCS=2  | LEN=4  | Option Checksum |
>                    +--------+--------+--------+--------+
>                    |       ... Other Options ...       |
>                    +--------+--------+--------+--------+
>                    | FRAG=X | LEN=8  |  Frag. Offset   |
>                    +--------+--------+--------+--------+
>                    |          Identification           |
>                    +--------+--------+--------+--------+
>                    |       ... Fragment Data ...       |
>                    +--------+--------+--------+--------+
> 
> where X is one of two codepoints, depending on whether the fragment is a terminal or non-terminal fragment.

With that version, the DMA can’t happen until the whole TLV is parsed.
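
To make the parsing cost concrete, here is a minimal Python sketch of the walk a receiver would need for the quoted format before it knows where the fragment data starts. It is only an illustration under stated assumptions: the FRAG codepoints (250 and 251) and the function name are hypothetical, since the diagram gives the kind only as "X"; options are assumed to begin at the offset given by UDP Length; and EOL=0 and NOP=1 are assumed to be single-byte options.

```python
import struct

# OCS=2 comes from the diagram; the FRAG values are placeholders, not
# assigned codepoints (the diagram only says "X").
KIND_EOL, KIND_NOP, KIND_OCS = 0, 1, 2
KIND_FRAG, KIND_FRAG_LAST = 250, 251  # hypothetical


def find_fragment_data(datagram: bytes):
    """Walk the option TLVs and locate the FRAG option.

    Returns (frag_offset, identification, is_last, data_start),
    or None if no FRAG option is present.
    """
    _src, _dst, udp_len, _csum = struct.unpack_from("!HHHH", datagram, 0)
    pos = udp_len  # assumed: options start where UDP Length ends
    while pos < len(datagram):
        kind = datagram[pos]
        if kind == KIND_EOL:
            return None
        if kind == KIND_NOP:
            pos += 1
            continue
        if pos + 2 > len(datagram):
            raise ValueError("truncated option")
        length = datagram[pos + 1]
        if length < 2 or pos + length > len(datagram):
            raise ValueError("bad option length")
        if kind in (KIND_FRAG, KIND_FRAG_LAST):
            if length != 8:
                raise ValueError("FRAG option must have LEN=8")
            frag_offset, ident = struct.unpack_from("!HI", datagram, pos + 2)
            # Fragment data follows the FRAG option immediately.
            return frag_offset, ident, kind == KIND_FRAG_LAST, pos + length
        pos += length  # skip any other option
    return None
```

The data offset is only known once the loop reaches the FRAG option, so every option placed ahead of it has to be stepped over first.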

> This would achieve the goal of pushing all user data in self-contained fragments to the end of the packet and would thereby allow for checksum offload of encapsulated packets on commonly available hardware. However, there's one thing that Tom's header would give us that a naked stack of TLVs would not: it would enable a NIC to perform header-data split without parsing a long stack of TLVs.

We haven’t proposed a long stack of TLVs; the current text puts the FRAG field right after the optional OCS, specifically for this reason.

HOWEVER, I keep hearing that IPv6 “got it right”, yet a long stack of TLVs is exactly what it uses for fragmentation and reassembly too - not a fixed header.
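
For comparison, here is a rough Python sketch of how a receiver finds the IPv6 Fragment header by walking the extension-header chain. It is simplified on purpose: only Hop-by-Hop (0), Routing (43), and Destination Options (60) are skipped, other header types such as AH are not handled, and the function name is made up for this example.

```python
import struct

HOP_BY_HOP, ROUTING, FRAGMENT, DEST_OPTS = 0, 43, 44, 60
IPV6_FIXED_HEADER_LEN = 40


def find_ipv6_fragment_header(packet: bytes):
    """Walk IPv6 extension headers until the Fragment header.

    Returns (offset_bytes, more_fragments, identification, payload_start),
    or None if the chain reaches some other header first.
    """
    if len(packet) < IPV6_FIXED_HEADER_LEN:
        raise ValueError("truncated IPv6 header")
    next_header = packet[6]  # Next Header field of the fixed header
    pos = IPV6_FIXED_HEADER_LEN
    while next_header in (HOP_BY_HOP, ROUTING, DEST_OPTS):
        if pos + 8 > len(packet):
            raise ValueError("truncated extension header")
        # Hdr Ext Len is in 8-octet units, not counting the first 8 octets.
        next_header, ext_len = packet[pos], packet[pos + 1]
        pos += (ext_len + 1) * 8
    if next_header != FRAGMENT:
        return None
    if pos + 8 > len(packet):
        raise ValueError("truncated Fragment header")
    _nh, _rsvd, off_flags, ident = struct.unpack_from("!BBHI", packet, pos)
    # Upper 13 bits: offset in 8-octet units; lowest bit: M flag.
    return (off_flags >> 3) * 8, bool(off_flags & 1), ident, pos + 8
```

As with the UDP option walk, the position of the fragment payload is not known until every preceding header has been stepped over.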

> We still don't get around the need to loop through all those TLVs at some point, but that does not have to be handled in hardware that is not well-suited for it.
> 
> In https://mailarchive.ietf.org/arch/msg/tsvwg/XZxL29UA-95ReA72mxv5-kEytK0/ I floated the following variant of Tom's proposal:
> 
>     0                   1                   2                   3
>     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
>    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+\
>    |        Source port            |      Destination port         | |
>    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -Base Hdr
>    |        UDP Length = 8         |   UDP Header Checksum         | |
>    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+/
>    |        Payload Offset         |  Option+Payload Checksum      |\
>    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
>    |                                                               | |
>    ~                           UDP Options                         ~ -Ext Hdr
>    |                                                               | |
>    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+/
>    |                                                               |
>    ~                         Payload Data                          ~
>    |                                                               |
>    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> In this case the fixed header just replaces the currently-defined OCS, occupying the exact same amount of space while using the same algorithm. And it is neutral as to whether FRAG is defined as in draft-ietf-tsvwg-udp-options-13, or as in my counterproposal, or in some other way, as long as FRAG data is contained in the option space.

But the format above doesn’t make sense for non-FRAGs, because there the payload comes earlier, before the options. If it were used anyway, it would waste space on the checksum field whenever OCS isn’t needed, on top of the space spent on the payload offset.
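
For reference, the header-data split that the quoted fixed-header variant is meant to enable could look roughly like the Python sketch below. It assumes Payload Offset is counted in bytes from the start of the UDP header, which the quoted diagram does not specify; the function name is hypothetical, and the checksum is read but not verified.

```python
import struct


def split_header_data(datagram: bytes):
    """Split a datagram in the quoted fixed-header layout into
    (headers_and_options, payload) without parsing any option TLVs."""
    _src, _dst, udp_len, _csum = struct.unpack_from("!HHHH", datagram, 0)
    if len(datagram) < udp_len + 4:
        raise ValueError("no room for the 4-byte extension header")
    # The extension header sits right after the base header (UDP Length = 8).
    payload_offset, _opt_csum = struct.unpack_from("!HH", datagram, udp_len)
    # Assumed: offset is relative to the start of the UDP header.
    if not udp_len + 4 <= payload_offset <= len(datagram):
        raise ValueError("payload offset out of range")
    return datagram[:payload_offset], datagram[payload_offset:]
```

The split point comes from one fixed-position read, at the cost of four bytes in every datagram that uses the layout.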

We’ve discussed using fixed headers before, but the issue is that UDP, like IPv6 (the most modern example of options), has a fixed header followed by a set of TLVs. There’s no pointer, like the one you assume above, to where the options end and the data starts.

So if hardware supports fragmentation and reassembly for IPv6, why wouldn’t it work for UDP?

Joe