Re: [tsvwg] A review of draft-ietf-tsvwg-udp-options-12

"C. M. Heard" <heard@pobox.com> Tue, 15 June 2021 05:10 UTC

Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/rmMSPDv1wJae4qgA7lqVCjIfALM>

On Sun, Jun 13, 2021 at 9:31 PM Joseph Touch wrote:

> On Jun 13, 2021, at 7:20 PM, C. M. Heard wrote:
>
>> I for one would appreciate further discussion of these last points. I
>> admit that I have failed to grasp Joe's message on the RDMA thread, and I
>> would appreciate some time to think about it.
>>
>
> Sure - here’s how it all works. Note that this is relevant mostly for long
> transfers with persistent UDP fragmentation; if that is assumed to be
> ‘adjusted’ at the app layer (as QUIC does), then we don’t need zero-copy
> support...
>
> - right now, UDP data can be zero-copied when received into user space,
> starting with the user data
> - if we add options, UDP data can still be zero-copied because it hasn’t
> moved (it still begins the payload)
> - however, fragments are different because (esp given the merging of frag
> and lite) they don’t start at the beginning of data
> - they always start after OCS (which I think we should make fit the
> uniform KIND/LEN/OCS format of 4 bytes)
> - if the FRAG comes next, then we can move the frag content around a
> little and still support zero-copy
>
> notably, we move the first 10 bytes of the fragment to the end
> 4 for OCS
> 6 for FRAG (assuming FRAG includes KIND/OPTLEN/FRAGOFFSET/ID/FRAGLEN)
> that way we can zero-copy the frag packet into place, then just copy those
> last 8 bytes over OCS and the FRAG header
>
> This method assumes that we try to keep FRAG early in the packet -
> preferably right after OCS. The later it comes, the more additional bytes
> we need to move to “fix” the copy (beyond the 8 bytes noted above).
>
> —
>
> This method is the only reason we would want to allow options after
> non-terminal fragments - basically to keep the fragment toward the front of
> the packet, using the rule that post-noninitial frag options still operate
> on the fragment, rather than waiting for reassembly. The exception is the
> terminal fragment, where post-terminal fragment options operate on the
> reassembled packet.
>
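
For reference, one possible reading of the scheme quoted above can be
sketched in Python. This is only an illustration under stated assumptions:
the 10-byte header length (4 for OCS plus 6 for FRAG) is taken from the
quoted message, the function names are hypothetical, and a slice assignment
stands in for the zero-copy placement. The sender moves the first bytes of
the fragment data to the end of the packet; the receiver places the packet
at the fragment's offset and then copies the trailing bytes over the option
headers.

```python
HDR_LEN = 10  # assumed: 4 bytes OCS + 6 bytes FRAG, as in the quoted message


def build_fragment_payload(header: bytes, data: bytes) -> bytes:
    """Sender side (hypothetical helper): emit the option header, then the
    fragment data minus its first len(header) bytes, then those displaced
    bytes appended at the end."""
    n = len(header)
    if len(data) < n:
        raise ValueError("fragment data shorter than the option header")
    return header + data[n:] + data[:n]


def place_fragment(buf: bytearray, offset: int, payload: bytes,
                   hdr_len: int = HDR_LEN) -> None:
    """Receiver side (hypothetical helper): bulk placement followed by a
    small fix-up copy."""
    data_len = len(payload) - hdr_len
    # Stand-in for the zero-copy step: the start of the payload lands at the
    # fragment's offset, so data[hdr_len:] is already in its final position.
    buf[offset:offset + data_len] = payload[:data_len]
    # Fix-up: overwrite the option header bytes with the displaced data bytes
    # that the sender moved to the tail of the packet.
    buf[offset:offset + hdr_len] = payload[data_len:]


# Usage: reassemble one 32-byte fragment at offset 16 of a 64-byte buffer.
data = bytes(range(32))
payload = build_fragment_payload(b"\x00" * HDR_LEN, data)
buf = bytearray(64)
place_fragment(buf, 16, payload)
```

The sketch ignores where the trailing bytes land during a real zero-copy
placement, and it says nothing about whether the payload can actually be
mapped without a copy.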

I'm not understanding this AT ALL, and I apologize if there is well-known
stuff of which I am embarrassingly ignorant. That being said:

EVERY description of zero-copy receive that I have found relies on a
specific MTU and tightly constrained header lengths, so that the user data
in a TCP segment or UDP packet can be mapped to one or more kernel pages.
Here is one example:

PATH to TCP 4K MTU and RX zerocopy
<https://netdevconf.info/0x14/pub/slides/62/Implementing%20TCP%20RX%20zero%20copy.pdf>

In every case that I have found, the solutions apply only to a highly
constrained environment, such as a data center, and not over the Internet
writ large. Some even involve requiring the application to process the
transport headers, which is surely not an outcome that we wish in general.
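
To make the constraint concrete, here is a rough Python sketch of the
arithmetic. It assumes that page mapping requires the user data to fill
whole pages, a 4096-byte page, and that the receiver can split the headers
from the payload; the function name and the example header sizes (40 bytes
of IPv6 plus 8 bytes of UDP, and 10 bytes of options) are illustrative
assumptions, not taken from any particular implementation.

```python
PAGE_SIZE = 4096  # assumed page size


def payload_is_page_mappable(mtu: int, header_len: int,
                             page_size: int = PAGE_SIZE) -> bool:
    """Return True if the user data in a full-sized packet fills a whole
    number of pages (hypothetical helper for illustration only)."""
    payload_len = mtu - header_len
    return payload_len > 0 and payload_len % page_size == 0


# An MTU chosen so that the payload is exactly one page: mappable.
tuned = payload_is_page_mappable(mtu=4144, header_len=48)
# A common Internet MTU with the same headers: not mappable.
internet = payload_is_page_mappable(mtu=1500, header_len=48)
# The tuned MTU again, but with 10 more bytes of headers: not mappable.
with_options = payload_is_page_mappable(mtu=4144, header_len=58)
```

Under these assumptions, only the first case qualifies; either a different
MTU or a few extra header bytes is enough to break the page alignment.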

If I am wrong -- and it would most assuredly not be the first time -- I am
eager to be disabused, preferably with a complete and open description of a
zero-copy technology without such shortcomings.

But if my conclusions are substantially correct, I don't think that TSVWG
should expend effort on zero copy for UDP fragment reassembly. Transport
options for UDP need to apply across the general Internet.

NOTE: the unfavorable conclusions that I make about zero-copy do NOT apply
to checksum offload; the advantages and applicability of that technology
(especially with OCS now defined to be an equivalent to the CCO proposal)
are readily apparent, even though they are not realizable in every
implementation.

Thanks

Mike Heard





