Re: [tsvwg] A review of draft-ietf-tsvwg-udp-options-12

Tom Herbert <tom@herbertland.com> Mon, 14 June 2021 16:18 UTC

MIME-Version: 1.0
References: <CACL_3VGb_9P5SfPGRJtf1ZBvEhgywc2ZEGr-qbgNOMXV20rFeA@mail.gmail.com> <CACL_3VHyoRr5ju8203DiLTUo-658DCj7ud+1dQE2o0hUPVhF0A@mail.gmail.com> <7D766992-AEEB-434F-BB1D-3817EE07DE61@strayalpha.com> <1BBDBD80-3A53-4700-A79F-9A3AE4876F2B@strayalpha.com> <CACL_3VEXCT-sSNhtncVK26DPQefDLJhqEijgDke4Q7DmhRrpTQ@mail.gmail.com> <67E79ED1-14DE-4127-83AF-D17E8C72F362@strayalpha.com>
In-Reply-To: <67E79ED1-14DE-4127-83AF-D17E8C72F362@strayalpha.com>
From: Tom Herbert <tom@herbertland.com>
Date: Mon, 14 Jun 2021 09:17:50 -0700
Message-ID: <CALx6S37faGXPaC-4qZ0e_3CM5hSFQhrDOQydVvdjxzY5zKf5SA@mail.gmail.com>
To: Joseph Touch <touch@strayalpha.com>
Cc: "C. M. Heard" <heard@pobox.com>, TSVWG <tsvwg@ietf.org>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/VNCSnBpXNPro_z4SJ99wJ_t0q34>
Subject: Re: [tsvwg] A review of draft-ietf-tsvwg-udp-options-12

On Sun, Jun 13, 2021 at 9:31 PM Joseph Touch <touch@strayalpha.com> wrote:
>
>
>
> On Jun 13, 2021, at 7:20 PM, C. M. Heard <heard@pobox.com> wrote:
>
>> If we DO support zero-copy and thus want to allow non-terminal fragments to have post-fragoption options that operate on each fragment, then we would add THISFRAGLEN to the nonterminal format and issue different KIND numbers to nonterminal/terminal fragment.
>
>
>
> I for one would appreciate further discussion of these last points. I admit that I have failed to grasp Joe's message on the RDMA thread, and I would appreciate some time to think about it.
>
>
> Sure - here’s how it all works. Note that this is relevant mostly for long transfers with persistent UDP fragmentation; if that is assumed to be ‘adjusted’ at the app layer (as QUIC does), then we don’t need zero-copy support...
>
> - right now, UDP data can be zero-copied when received into user space, starting with the user data

Only if the device supports header/data split, where the headers land
in one buffer and the UDP data lands in a separate, aligned buffer.

> - if we add options, UDP data can still be zero-copied because it hasn’t moved (it still begins the payload)
> - however, fragments are different because (esp given the merging of frag and lite) they don’t start at the beginning of data
> - they always start after OCS (which I think we should make fit the uniform KIND/LEN/OCS format of 4 bytes)
> - if the FRAG comes next, then we can move the frag content around a little and still support zero-copy
>
> notably, we move the first 10 bytes of the fragment to the end
> 4 for OCS
> 6 for FRAG (assuming FRAG includes KIND/OPTLEN/FRAGOFFSET/ID/FRAGLEN)
> that way we can zero-copy the frag packet into place, then just copy those last 8 bytes over OCS and the FRAG header
>
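
For anyone else trying to follow the fix-up described above, here is a
rough sketch of how I read it. It is only an illustration under stated
assumptions: HDR_LEN, build_fragment and receive_fragment are
hypothetical names, the header is a zero-filled placeholder rather
than a real OCS/FRAG encoding, and the 10-byte figure is taken from
the "4 for OCS, 6 for FRAG" count in the quoted text (the same text
also says 8, so treat it as a parameter). The slice assignment merely
stands in for a device placing the packet directly into the reassembly
buffer.

```python
# Assumed size of the leading options: 4 bytes OCS + 6 bytes FRAG,
# following the count given in the quoted text.
HDR_LEN = 10


def build_fragment(data: bytes) -> bytes:
    """Sender side: header first, with the first HDR_LEN data bytes
    displaced to the end of the packet."""
    if len(data) < HDR_LEN:
        raise ValueError("fragment shorter than the displaced header")
    header = bytes(HDR_LEN)  # placeholder, not a real OCS/FRAG encoding
    return header + data[HDR_LEN:] + data[:HDR_LEN]


def receive_fragment(buf: bytearray, offset: int, wire: bytes) -> None:
    """Receiver side: place the packet at its final offset, then patch
    the bytes the header landed on with the displaced data."""
    n = len(wire) - HDR_LEN  # length of the fragment data
    if n < HDR_LEN or offset < 0 or offset + n > len(buf):
        raise ValueError("fragment does not fit the reassembly buffer")
    # Bulk placement: header bytes land where the first HDR_LEN data
    # bytes belong; the rest of the data is already in its final place.
    buf[offset:offset + n] = wire[:n]
    # Fix-up: copy the displaced bytes from the tail over the header.
    buf[offset:offset + HDR_LEN] = wire[n:]


data = bytes(range(40))
reassembly = bytearray(100)
receive_fragment(reassembly, 20, build_fragment(data))
```

The only per-fragment copy in the sketch is the HDR_LEN-byte fix-up;
everything else stays where the bulk placement put it.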

An obvious feature we'd want is for NIC hardware to do UDP options
fragmentation and reassembly, analogous to the existing UDP
Fragmentation Offload (UFO), which performs IP fragmentation of UDP
packets. The impediment to supporting this is that hardware devices
would need to perform protocol processing on trailers as opposed to
headers. Nearly all hardware devices, including switches and NICs, are
optimized to process protocol headers, and modern devices are quite
programmable in that regard. However, they typically rely on a parsing
buffer that holds the first N bytes of the packet and assume that all
the protocol headers lie within it. They wouldn't process data beyond
that buffer, in the fast path at least, and almost certainly would not
have the capability to process protocol headers at the end of a large
packet. I am doubtful we'll ever see hardware support for trailer
protocols, and hence it's unlikely we'd see accelerations for UDP
options like we have for TCP.
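
To put rough numbers on the parsing-buffer point, here is a minimal
sketch. The function names and the 256-byte buffer size are
assumptions picked for illustration, not figures from any particular
device, and link-layer headers are ignored. It relies only on the fact
that UDP options sit in the surplus area, i.e. after the end of the
UDP datagram as given by UDP Length.

```python
def options_offset(ip_header_len: int, udp_length: int) -> int:
    """Byte offset, from the start of the IP header, at which the
    surplus area (and hence any UDP options) begins."""
    return ip_header_len + udp_length


def visible_to_parser(ip_header_len: int, udp_length: int,
                      parse_buffer_len: int) -> bool:
    """True if the options start within the first parse_buffer_len
    bytes, which is all a header-oriented parser is assumed to see."""
    return options_offset(ip_header_len, udp_length) < parse_buffer_len


PARSE_BUFFER_LEN = 256  # assumed value, for illustration only

# Small datagram: 20-byte IPv4 header, 8-byte UDP header, 32 bytes of data.
small = visible_to_parser(20, 8 + 32, PARSE_BUFFER_LEN)

# Near-MTU datagram: 20-byte IPv4 header, 8-byte UDP header, 1400 bytes of data.
large = visible_to_parser(20, 8 + 1400, PARSE_BUFFER_LEN)
```

Under these assumptions the options of the small datagram start at
offset 60, inside the buffer, while those of the near-MTU datagram
start at offset 1428, far beyond it.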

Tom

> This method assumes that we try to keep FRAG early in the packet - preferably right after OCS. The later it comes, the more additional bytes we need to move to “fix” the copy (beyond the 8 bytes noted above).
>
> —
>
> This method is the only reason we would want to allow options after non-terminal fragments - basically to keep the fragment toward the front of the packet, using the rule that post-noninitial frag options still operate on the fragment, rather than waiting for reassembly. The exception is the terminal fragment, where post-terminal fragment options operate on the reassembled packet.
>
> Joe
>
>
>
>