Re: [tsvwg] A review of draft-ietf-tsvwg-udp-options-12

Joe Touch <touch@strayalpha.com> Mon, 14 June 2021 17:15 UTC

From: Joe Touch <touch@strayalpha.com>
In-Reply-To: <CALx6S36PMx4HK-+5w=WDQCkjAmkPTsMGPYVi_=s41OvRn6t=sw@mail.gmail.com>
Date: Mon, 14 Jun 2021 10:15:44 -0700
Cc: Gorry Fairhurst <gorry@erg.abdn.ac.uk>, TSVWG <tsvwg@ietf.org>
Message-Id: <D9B2E315-5C7A-4BE9-97A9-AF627F6FD6FF@strayalpha.com>
References: <CALx6S36PMx4HK-+5w=WDQCkjAmkPTsMGPYVi_=s41OvRn6t=sw@mail.gmail.com>
To: Tom Herbert <tom@herbertland.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/ce-4WKDTWDAYM0yDg42aXg-ozhU>

FYI, that’s what fragments look like. We can’t do this for non-fragments.

> On Jun 14, 2021, at 10:03 AM, Tom Herbert <tom@herbertland.com> wrote:
> 
> On Mon, Jun 14, 2021 at 9:31 AM Gorry Fairhurst <gorry@erg.abdn.ac.uk> wrote:
>> 
>>> On 14/06/2021 17:17, Tom Herbert wrote:
>>> On Sun, Jun 13, 2021 at 9:31 PM Joseph Touch <touch@strayalpha.com> wrote:
>>>> 
>>>> 
>>>> On Jun 13, 2021, at 7:20 PM, C. M. Heard <heard@pobox.com> wrote:
>>>> 
>>>>> If we DO support zero-copy and thus want to allow non-terminal fragments to have post-fragoption options that operate on each fragment, then we would add THISFRAGLEN to the nonterminal format and issue different KIND numbers to nonterminal/terminal fragment.
>>>> 
>>>> 
>>>> I for one would appreciate further discussion of these last points. I admit that I have failed to grasp Joe's message on the RDMA thread, and I would appreciate some time to think about it.
>>>> 
>>>> 
>>>> Sure - here’s how it all works. Note that this is relevant mostly for long transfers with persistent UDP fragmentation; if that is assumed to be ‘adjusted’ at the app layer (as QUIC does), then we don’t need zero-copy support...
>>>> 
>>>> - right now, UDP data can be zero-copied when received into user space, starting with the user data
>>> Only if the device supports header/data split, where the headers are in
>>> one buffer and the UDP data is in an aligned buffer.
>>> 
>>>> - if we add options, UDP data can still be zero-copied because it hasn’t moved (it still begins the payload)
>>>> - however, fragments are different because (esp given the merging of frag and lite) they don’t start at the beginning of data
>>>> - they always start after OCS (which I think we should make fit the uniform KIND/LEN/OCS format of 4 bytes)
>>>> - if the FRAG comes next, then we can move the frag content around a little and still support zero-copy
>>>> 
>>>> notably, we move the first 10 bytes of the fragment to the end
>>>> 4 for OCS
>>>> 6 for FRAG (assuming FRAG includes KIND/OPTLEN/FRAGOFFSET/ID/FRAGLEN)
>>>> that way we can zero-copy the frag packet into place, then just copy those last 8 bytes over OCS and the FRAG header
>>>> 
>>> An obvious feature we'd want is NIC hardware to do UDP options
>>> fragmentation and reassembly, analogous to existing UDP Fragmentation
>>> Offload (UFO) which performs IP fragmentation of UDP packets. The
>>> impediment with supporting this is that hardware devices would need to
>>> perform protocol processing on trailers as opposed to headers. Nearly
>>> all hardware devices, including switches and NICs, are optimized to
>>> process protocol headers and in modern devices they are quite
>>> programmable in that regard. However, they typically rely on a parsing
>>> buffer that holds the first N bytes of the packet and assume that all
>>> the protocol headers lie within that. They wouldn't process data after
>>> that header in the fast path at least, and almost certainly would not have
>>> the capability to process protocol headers at the end of a large packet.
>>> I am doubtful we'll ever see hardware support for trailer protocols,
>>> and hence it's unlikely we'd see accelerations for UDP options like we
>>> have for TCP.
>>> 
>>> Tom
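The parsing-buffer argument above can be illustrated with a small offset calculation. This is a sketch only: the window size, the lower-layer header length, and the function names are assumptions for illustration, not properties of any particular device. It compares where the options land when they trail the payload with where they would land if they sat directly after the UDP header.

```python
UDP_HDR_LEN = 8  # fixed size of the UDP header in bytes


def options_offset(layout: str, payload_len: int) -> int:
    """Offset of the options area, measured from the start of the UDP header."""
    if layout == "trailer":   # options follow the user data
        return UDP_HDR_LEN + payload_len
    if layout == "header":    # options directly follow the UDP header
        return UDP_HDR_LEN
    raise ValueError(f"unknown layout: {layout}")


def fits_parse_window(layout: str, payload_len: int, opts_len: int,
                      l4_offset: int, window: int) -> bool:
    """True if the whole options area lies in the first `window` bytes of the packet.

    l4_offset is the number of bytes of lower-layer headers that precede
    the UDP header (for example Ethernet plus IP).
    """
    end = l4_offset + options_offset(layout, payload_len) + opts_len
    return end <= window
```

With an assumed 128-byte window and 34 bytes of Ethernet and IPv4 headers, 12 bytes of options after a 1400-byte payload end at byte 1454 and fall outside the window, while the same options placed after the UDP header end at byte 54.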
>> 
>> OK.... Is there any way that we could design to enable this?
>> 
>> I'm "fishing" for ideas because I know you've talked about the various
>> offload methods.
>> 
> 
> Gorry,
> 
> My suggestion was to place UDP options after the UDP header. Instead
> of just placing fragment header after the UDP header, place all the
> UDP options there and then follow that by the Payload. So packet looks
> like:
> 
> +-------------------+
> |   UDP header  |
> +-------------------+
> |  UDP options  |
> +-------------------+
> |     Payload      |
> +-------------------+
> 
> Now this looks a lot like a TCP packet and other variable length
> headers which we know how to handle. For zero copy we can do
> header/data split by programming emerging smart devices to place the headers
> through UDP options in one buffer and the payload in another, thereby also
> eliminating any need to move headers or data around.
> 
> Tom
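A header/data split over the layout suggested above might look like the sketch below. The message does not say how a receiver would learn the length of the options area, so this example invents one for illustration: a HYPOTHETICAL single byte immediately after the UDP header that gives the total options length in bytes, including itself. That field, and the function name, are assumptions and not part of the draft or of this proposal.

```python
import struct

UDP_HDR_LEN = 8  # fixed size of the UDP header in bytes


def split_header_payload(packet: bytes):
    """Split a UDP datagram into (headers incl. options, payload) without copying."""
    if len(packet) <= UDP_HDR_LEN:
        raise ValueError("packet too short to carry an options area")
    # Standard UDP header fields, parsed only to show where the options begin.
    src_port, dst_port, udp_len, checksum = struct.unpack_from("!HHHH", packet, 0)
    # HYPOTHETICAL field: total length of the options area, including this byte.
    opts_len = packet[UDP_HDR_LEN]
    split = UDP_HDR_LEN + opts_len
    if opts_len < 1 or split > len(packet):
        raise ValueError("bad options length")
    view = memoryview(packet)
    # Two views into the same storage: one "header buffer", one "payload buffer".
    return view[:split], view[split:]
```

Because everything up to `split` is found by walking forward from the start of the packet, the same parse could be done inside a device's first-N-bytes window, and neither headers nor payload need to be moved afterwards.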
> 
>> So for options in the trailer, this is clearly an impediment.
>> 
>> For UDP-Opt fragmentation, I understand there is no standard UDP payload,
>> 
>> .... only an option containing a fragment, so the Fragment information
>> would actually be in the "first N bytes of the packet".
>> 
>> So, what do you think would most likely help to enable fast-path
>> acceleration for the fragments?
>> 
>> Gorry
>> 
>>>> This method assumes that we try to keep FRAG early in the packet - preferably right after OCS. The later it comes, the more additional bytes we need to move to “fix” the copy (beyond the 8 bytes noted above).
>>>> 
>>>> —
>>>> 
>>>> This method is the only reason we would want to allow options after non-terminal fragments - basically to keep the fragment toward the front of the packet, using the rule that post-noninitial frag options still operate on the fragment, rather than waiting for reassembly. The exception is the terminal fragment, where post-terminal fragment options operate on the reassembled packet.
>>>> 
>>>> Joe
>>>> 
>>>> 
>>>> 
>>>> 
>>