Re: [tsvwg] A review of draft-ietf-tsvwg-udp-options-12

Gorry Fairhurst <gorry@erg.abdn.ac.uk> Mon, 14 June 2021 16:32 UTC

To: Tom Herbert <tom@herbertland.com>, Joseph Touch <touch@strayalpha.com>
Cc: TSVWG <tsvwg@ietf.org>
References: <CACL_3VGb_9P5SfPGRJtf1ZBvEhgywc2ZEGr-qbgNOMXV20rFeA@mail.gmail.com> <CACL_3VHyoRr5ju8203DiLTUo-658DCj7ud+1dQE2o0hUPVhF0A@mail.gmail.com> <7D766992-AEEB-434F-BB1D-3817EE07DE61@strayalpha.com> <1BBDBD80-3A53-4700-A79F-9A3AE4876F2B@strayalpha.com> <CACL_3VEXCT-sSNhtncVK26DPQefDLJhqEijgDke4Q7DmhRrpTQ@mail.gmail.com> <67E79ED1-14DE-4127-83AF-D17E8C72F362@strayalpha.com> <CALx6S37faGXPaC-4qZ0e_3CM5hSFQhrDOQydVvdjxzY5zKf5SA@mail.gmail.com>
From: Gorry Fairhurst <gorry@erg.abdn.ac.uk>
Message-ID: <4564201c-a3ca-7e17-a03f-ee9626852169@erg.abdn.ac.uk>
Date: Mon, 14 Jun 2021 17:31:49 +0100
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.13; rv:78.0) Gecko/20100101 Thunderbird/78.10.0
MIME-Version: 1.0
In-Reply-To: <CALx6S37faGXPaC-4qZ0e_3CM5hSFQhrDOQydVvdjxzY5zKf5SA@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"; format="flowed"
Content-Transfer-Encoding: 8bit
Content-Language: en-GB
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/LL85Itwv3W2NSlD5ZZzatbjFEk0>
Subject: Re: [tsvwg] A review of draft-ietf-tsvwg-udp-options-12

On 14/06/2021 17:17, Tom Herbert wrote:
> On Sun, Jun 13, 2021 at 9:31 PM Joseph Touch <touch@strayalpha.com> wrote:
>>
>>
>> On Jun 13, 2021, at 7:20 PM, C. M. Heard <heard@pobox.com> wrote:
>>
>>> If we DO support zero-copy and thus want to allow non-terminal fragments to have post-fragoption options that operate on each fragment, then we would add THISFRAGLEN to the nonterminal format and issue different KIND numbers to nonterminal/terminal fragment.
>>
>>
>> I for one would appreciate further discussion of these last points. I admit that I have failed to grasp Joe's message on the RDMA thread, and I would appreciate some time to think about it.
>>
>>
>> Sure - here’s how it all works. Note that this is relevant mostly for long transfers with persistent UDP fragmentation; if that is assumed to be ‘adjusted’ at the app layer (as QUIC does), then we don’t need zero-copy support...
>>
>> - right now, UDP data can be zero-copied when received into user space, starting with the user data
> Only if the device supports header/data split, where the headers are in
> one buffer and the UDP data is in an aligned buffer.
>
>> - if we add options, UDP data can still be zero-copied because it hasn’t moved (it still begins the payload)
>> - however, fragments are different because (esp given the merging of frag and lite) they don’t start at the beginning of data
>> - they always start after OCS (which I think we should make fit the uniform KIND/LEN/OCS format of 4 bytes)
>> - if the FRAG comes next, then we can move the frag content around a little and still support zero-copy
>>
>> notably, we move the first 10 bytes of the fragment to the end
>> 4 for OCS
>> 6 for FRAG (assuming FRAG includes KIND/OPTLEN/FRAGOFFSET/ID/FRAGLEN)
>> that way we can zero-copy the frag packet into place, then just copy those last 8 bytes over OCS and the FRAG header
>>
> An obvious feature we'd want is NIC hardware to do UDP options
> fragmentation and reassembly, analogous to existing UDP Fragmentation
> Offload (UFO) which performs IP fragmentation of UDP packets. The
> impediment with supporting this is that hardware devices would need to
> perform protocol processing on trailers as opposed to headers. Nearly
> all hardware devices, including switches and NICs, are optimized to
> process protocol headers and in modern devices they are quite
> programmable in that regard. However, they typically rely on a parsing
> buffer that holds the first N bytes of the packet and assume that all
> the protocol headers lie within that. They wouldn't process data after
> that header, in the fast path at least, and almost certainly would not have
> the capability to process protocol headers at the end of a large packet.
> I am doubtful we'll ever see hardware support for trailer protocols,
> and hence it's unlikely we'd see accelerations for UDP options like we
> have for TCP.
>
> Tom

OK.... Is there any way that we could design to enable this?

I'm "fishing" for ideas because I know you've talked about the various 
offload methods.

So for options in the trailer, this is clearly an impediment.

For UDP-Opt fragmentation, I understand there is no standard UDP payload, 
only an option containing a fragment, so the Fragment information 
would actually be in the "first N bytes of the packet".
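
To make the "first N bytes" point concrete, here is a minimal sketch, not taken from the draft or from any device. It assumes the options area starts where the UDP Length field says the payload ends, and that a fragment carries no ordinary payload (UDP Length = 8), as I read it above. The function names, the 14-byte Ethernet and 20-byte IPv4 header sizes, the 10 bytes of OCS plus FRAG, and the 128-byte parsing window are all illustrative assumptions.

```python
UDP_HEADER_LEN = 8  # fixed UDP header size (RFC 768)


def options_offset(l2_len: int, ip_hdr_len: int, udp_length: int) -> int:
    """Byte offset, from the start of the frame, where the options area begins.

    Assumes the options follow the UDP payload, i.e. they begin udp_length
    bytes after the start of the UDP header.
    """
    if udp_length < UDP_HEADER_LEN:
        raise ValueError("UDP Length cannot be smaller than the UDP header")
    return l2_len + ip_hdr_len + udp_length


def in_parse_window(l2_len: int, ip_hdr_len: int, udp_length: int,
                    needed: int, window: int) -> bool:
    """True if the first `needed` option bytes lie within the first `window` bytes."""
    return options_offset(l2_len, ip_hdr_len, udp_length) + needed <= window


# Hypothetical numbers: Ethernet (14) + IPv4 (20), 10 bytes of OCS + FRAG,
# and a device that parses the first 128 bytes of each packet.
fragment_case = in_parse_window(14, 20, UDP_HEADER_LEN, needed=10, window=128)
trailer_case = in_parse_window(14, 20, UDP_HEADER_LEN + 1400, needed=10, window=128)
```

With these numbers the fragment case fits inside the window, while options that trail a 1400-byte payload do not, which is the distinction I am asking about.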

So, what do you think would be most likely to help enable fast-path 
acceleration for the fragments?

Gorry

>> This method assumes that we try to keep FRAG early in the packet - preferably right after OCS. The later it comes, the more additional bytes we need to move to “fix” the copy (beyond the 8 bytes noted above).
>>
>> —
>>
>> This method is the only reason we would want to allow options after non-terminal fragments - basically to keep the fragment toward the front of the packet, using the rule that post-nonterminal frag options still operate on the fragment, rather than waiting for reassembly. The exception is the terminal fragment, where post-terminal fragment options operate on the reassembled packet.
>>
>> Joe
>>
>>
>>
>>
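
The rearrangement Joe describes (moving the leading bytes of the fragment to the end, placing the packet in bulk, then copying the tail back over OCS and the FRAG header) can be sketched as below. This is only an illustration under stated assumptions: the function names are hypothetical, a plain slice copy stands in for the zero-copy placement, and the header length is a parameter because the quoted text mentions both 10 and 8 bytes.

```python
def build_wire_fragment(header: bytes, data: bytes) -> bytes:
    """Sender side: emit the header, then the fragment data with its first
    len(header) bytes rotated to the end."""
    h = len(header)
    if len(data) < h:
        raise ValueError("fragment data shorter than header; nothing to rotate")
    return header + data[h:] + data[:h]


def place_fragment(dest: bytearray, offset: int, wire: bytes, hdr_len: int) -> None:
    """Receiver side: bulk placement followed by a small fix-up copy."""
    n = len(wire) - hdr_len  # length of the fragment data
    if n < hdr_len or offset + n > len(dest):
        raise ValueError("fragment does not fit the expected layout")
    # Bulk placement (stand-in for zero-copy): header plus data[hdr_len:]
    # land at the fragment's final position in the reassembly buffer.
    dest[offset:offset + n] = wire[:n]
    # Fix-up: the rotated bytes at the tail overwrite the header bytes,
    # restoring the first hdr_len bytes of the fragment data.
    dest[offset:offset + hdr_len] = wire[n:]


# Hypothetical example: a 10-byte placeholder for OCS + FRAG, 40 bytes of data.
header = b"\xaa" * 10
data = bytes(range(40))
reassembly = bytearray(100)
place_fragment(reassembly, 20, build_wire_fragment(header, data), len(header))
```

After the call, bytes 20 to 59 of the reassembly buffer hold the original fragment data. Only `hdr_len` bytes are copied by the CPU, and that count grows with every option placed ahead of FRAG.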