Re: [tsvwg] A review of draft-ietf-tsvwg-udp-options-12

Joe Touch <touch@strayalpha.com> Mon, 14 June 2021 17:20 UTC

From: Joe Touch <touch@strayalpha.com>
In-Reply-To: <D9B2E315-5C7A-4BE9-97A9-AF627F6FD6FF@strayalpha.com>
Date: Mon, 14 Jun 2021 10:20:38 -0700
Cc: Gorry Fairhurst <gorry@erg.abdn.ac.uk>, TSVWG <tsvwg@ietf.org>
Message-Id: <DCF3D0D3-83E0-4F84-8C1F-57DF9EE63C59@strayalpha.com>
References: <D9B2E315-5C7A-4BE9-97A9-AF627F6FD6FF@strayalpha.com>
To: Tom Herbert <tom@herbertland.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/Ed1EMNecQwRNUS8LhI5KxDh7V7g>

PS: we need an option length field to make fragments look like TCP. I can put that in. Do we want that in OCS, or independent?

> On Jun 14, 2021, at 10:16 AM, Joe Touch <touch@strayalpha.com> wrote:
> 
> FYI, that’s what fragments look like. We can’t do this for non-fragments.
> 
>>> On Jun 14, 2021, at 10:03 AM, Tom Herbert <tom@herbertland.com> wrote:
>>> 
>>> On Mon, Jun 14, 2021 at 9:31 AM Gorry Fairhurst <gorry@erg.abdn.ac.uk> wrote:
>>> 
>>>> On 14/06/2021 17:17, Tom Herbert wrote:
>>>> On Sun, Jun 13, 2021 at 9:31 PM Joseph Touch <touch@strayalpha.com> wrote:
>>>>> 
>>>>> 
>>>>> On Jun 13, 2021, at 7:20 PM, C. M. Heard <heard@pobox.com> wrote:
>>>>> 
>>>>>> If we DO support zero-copy and thus want to allow non-terminal fragments to have post-fragoption options that operate on each fragment, then we would add THISFRAGLEN to the nonterminal format and issue different KIND numbers to nonterminal/terminal fragment.
>>>>> 
>>>>> 
>>>>> I for one would appreciate further discussion of these last points. I admit that I have failed to grasp Joe's message on the RDMA thread, and I would appreciate some time to think about it.
>>>>> 
>>>>> 
>>>>> Sure - here’s how it all works. Note that this is relevant mostly for long transfers with persistent UDP fragmentation; if that is assumed to be ‘adjusted’ at the app layer (as QUIC does), then we don’t need zero-copy support...
>>>>> 
>>>>> - right now, UDP data can be zero-copied when received into user space, starting with the user data
>>>> Only if the device supports header/data split where the headers are in
>>>> one buffer and UDP data is in an aligned buffer.
>>>> 
>>>>> - if we add options, UDP data can still be zero-copied because it hasn’t moved (it still begins the payload)
>>>>> - however, fragments are different because (esp given the merging of frag and lite) they don’t start at the beginning of data
>>>>> - they always start after OCS (which I think we should make fit the uniform KIND/LEN/OCS format of 4 bytes)
>>>>> - if the FRAG comes next, then we can move the frag content around a little and still support zero-copy
>>>>> 
>>>>> notably, we move the first 10 bytes of the fragment to the end
>>>>> 4 for OCS
>>>>> 6 for FRAG (assuming FRAG includes KIND/OPTLEN/FRAGOFFSET/ID/FRAGLEN)
>>>>> that way we can zero-copy the frag packet into place, then just copy those last 10 bytes over OCS and the FRAG header
>>>>> 
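A minimal sketch of the relocation trick described above, using the 4-byte OCS plus 6-byte FRAG split given in the thread. The function names, the use of a bytearray as a stand-in for zero-copy placement, and the slack at the end of the buffer are all assumptions made for illustration; none of this comes from the draft.

```python
OCS_LEN = 4    # per the thread: OCS in a uniform 4-byte KIND/LEN/OCS format
FRAG_LEN = 6   # per the thread: assumed size of the FRAG option header
HDR_LEN = OCS_LEN + FRAG_LEN


def pack_fragment(ocs: bytes, frag_hdr: bytes, data: bytes) -> bytes:
    """Sender side (hypothetical helper): emit OCS and FRAG first, then the
    fragment data with its first HDR_LEN bytes moved to the end."""
    if len(ocs) != OCS_LEN or len(frag_hdr) != FRAG_LEN:
        raise ValueError("unexpected option sizes")
    if len(data) < HDR_LEN:
        raise ValueError("fragment too short for this layout")
    return ocs + frag_hdr + data[HDR_LEN:] + data[:HDR_LEN]


def place_fragment(buf: bytearray, offset: int, wire: bytes) -> None:
    """Receiver side (hypothetical helper): one bulk placement, then a
    small fix-up copy over the OCS and FRAG bytes."""
    n = len(wire) - HDR_LEN  # length of the fragment data
    if offset + len(wire) > len(buf):
        raise ValueError("buffer needs HDR_LEN bytes of slack past the data")
    # Bulk placement: stands in for the zero-copy step.
    buf[offset:offset + len(wire)] = wire
    # Fix-up: the relocated bytes now sit just past the data; copy them
    # over the OCS and FRAG header at the front.
    buf[offset:offset + HDR_LEN] = buf[offset + n:offset + n + HDR_LEN]
```

Note that each placement overhangs the fragment's data region by HDR_LEN bytes, so this sketch only works if fragments are placed in increasing offset order and the buffer has slack at the end. The thread does not say how that overhang would be handled.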
>>>> An obvious feature we'd want is NIC hardware to do UDP options
>>>> fragmentation and reassembly, analogous to existing UDP Fragmentation
>>>> Offload (UFO) which performs IP fragmentation of UDP packets. The
>>>> impediment with supporting this is that hardware devices would need to
>>>> perform protocol processing on trailers as opposed to headers. Nearly
>>>> all hardware devices, including switches and NICs, are optimized to
>>>> process protocol headers and in modern devices they are quite
>>>> programmable in that regard. However, they typically rely on a parsing
>>>> buffer that holds the first N bytes of the packet and assume that all
>>>> the protocol headers lie within that. They wouldn't process data after
>>>> that header in the fast path at least, and almost certainly would not
>>>> have the capability to process protocol headers at the end of a large packet.
>>>> I am doubtful we'll ever see hardware support for trailer protocols,
>>>> and hence it's unlikely we'd see accelerations for UDP options like we
>>>> have for TCP.
>>>> 
>>>> Tom
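To make the parsing-buffer point concrete, here is a small sketch that checks whether an options region falls within the first N bytes that a device inspects. The window size, payload size, and options length are hypothetical values chosen for illustration, and offsets are counted from the start of an IPv4 header with no IP options.

```python
def in_parse_window(start: int, length: int, window: int) -> bool:
    """True if bytes [start, start + length) lie within the first `window` bytes."""
    return start >= 0 and length >= 0 and start + length <= window


IPV4_HDR = 20       # IPv4 header without IP options
UDP_HDR = 8         # fixed UDP header size
PARSE_WINDOW = 128  # hypothetical parsing-buffer size
OPTS_LEN = 12       # hypothetical total size of the UDP options
PAYLOAD = 1400      # hypothetical UDP payload size

# Options as a trailer: they start after the payload.
trailer_start = IPV4_HDR + UDP_HDR + PAYLOAD
# Options right after the UDP header.
header_start = IPV4_HDR + UDP_HDR

print(in_parse_window(trailer_start, OPTS_LEN, PARSE_WINDOW))  # False
print(in_parse_window(header_start, OPTS_LEN, PARSE_WINDOW))   # True
```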
>>> 
>>> OK.... Is there any way that we could design to enable this?
>>> 
>>> I'm "fishing" for ideas because I know you've talked about the various
>>> offload methods.
>>> 
>> 
>> Gorry,
>> 
>> My suggestion was to place UDP options after the UDP header. Instead
>> of just placing fragment header after the UDP header, place all the
>> UDP options there and then follow that by the Payload. So packet looks
>> like:
>> 
>> +-------------------+
>> |    UDP header     |
>> +-------------------+
>> |    UDP options    |
>> +-------------------+
>> |      Payload      |
>> +-------------------+
>> 
>> Now this looks a lot like a TCP packet and other variable length
>> headers which we know how to handle. For zero copy we can do
>> header/data split by programming emerging smart devices to split through
>> UDP options in one buffer and payload in another thereby also
>> eliminating any need to move headers or data around.
>> 
>> Tom
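A rough sketch of how a receiver could split a datagram laid out as in the diagram above. The thread does not define an encoding, so the 2-byte big-endian options-length field placed right after the UDP header is purely an assumption for illustration, as is the function name.

```python
import struct

UDP_HDR_LEN = 8   # fixed UDP header size
OPT_LEN_FIELD = 2  # hypothetical options-length field, not from the draft


def split_datagram(dgram: bytes) -> tuple[bytes, bytes, bytes]:
    """Return (udp_header, options, payload) for the hypothetical layout:
    UDP header, 2-byte options length, options, payload."""
    if len(dgram) < UDP_HDR_LEN + OPT_LEN_FIELD:
        raise ValueError("datagram too short")
    # Read the assumed options-length field that follows the UDP header.
    (opt_len,) = struct.unpack_from("!H", dgram, UDP_HDR_LEN)
    opt_start = UDP_HDR_LEN + OPT_LEN_FIELD
    opt_end = opt_start + opt_len
    if opt_end > len(dgram):
        raise ValueError("options length exceeds datagram")
    return dgram[:UDP_HDR_LEN], dgram[opt_start:opt_end], dgram[opt_end:]
```

With a length field like this, the boundary between options and payload is known from the front of the packet, which is what lets the headers and options go in one buffer and the payload in another.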
>> 
>>> So for options in the trailer, this is clearly an impediment.
>>> 
>>> For UDP-Opt fragmentation, I understand there is no standard UDP payload,
>>> 
>>> .... only an option containing a fragment, so the Fragment information
>>> would actually be in the "first N bytes of the packet".
>>> 
>>> So, what do you think could be most likely helpful to enable fastpath
>>> acceleration for the fragments?
>>> 
>>> Gorry
>>> 
>>>>> This method assumes that we try to keep FRAG early in the packet - preferably right after OCS. The later it comes, the more additional bytes we need to move to “fix” the copy (beyond the 10 bytes noted above).
>>>>> 
>>>>> —
>>>>> 
>>>>> This method is the only reason we would want to allow options after non-terminal fragments - basically to keep the fragment toward the front of the packet, using the rule that post-noninitial frag options still operate on the fragment, rather than waiting for reassembly. The exception is the terminal fragment, where post-terminal fragment options operate on the reassembled packet.
>>>>> 
>>>>> Joe
>>>>> 