Re: [tsvwg] RDMA Support by UDP FRAG Option
Tom Herbert <tom@herbertland.com> Sat, 19 June 2021 20:40 UTC
From: Tom Herbert <tom@herbertland.com>
Date: Sat, 19 Jun 2021 13:40:19 -0700
Message-ID: <CALx6S37VN_GyyQ7E_rnNCOG2tPS5wVR9jdGMjgy0aaAFYT7anQ@mail.gmail.com>
To: Joseph Touch <touch@strayalpha.com>
Cc: "C. M. Heard" <heard@pobox.com>, Gorry Fairhurst <gorry@erg.abdn.ac.uk>, TSVWG <tsvwg@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/ccVIgu7KgQnidb6EuFWsX28slns>

On Sat, Jun 19, 2021 at 12:56 PM Joseph Touch <touch@strayalpha.com> wrote:
>
> Tom,
>
> > On Jun 19, 2021, at 10:11 AM, Tom Herbert <tom@herbertland.com> wrote:
> >
> > ...
> > There is another serious problem with transport checksums and use of
> > the UDP surplus area. Most of this discussion has presumed that the
> > UDP checksum is the one in the packet being offloaded, but that may
> > not be the case. Consider the case where a sender sends a TCP packet
> > that is encapsulated GRE/UDP or VXLAN (a very common use case in
> > virtual networks where VMs send packets on their virtual networks).
> > The stack will attempt to offload the innermost checksum which is the
> > TCP checksum.
>
> I’m assuming this means TCP in GRE/UDP (it seems ambiguous).
>
> > The TCP checksum is the one offloaded regardless of
> > whether or not the outer UDP checksum is zero (if it's non-zero then
> > the stack would set it using local checksum offload (LCO)).
>
> Can you explain this? Packets are both defined and processed from the outside in; to do anything else might not yield a meaningful result.
>

It is described in
https://www.kernel.org/doc/html/latest/networking/checksum-offloads.html.

> > The
> > offloaded TCP checksum computation would start the computation at the
> > first byte of the TCP header through the end of the whole packet.
>
> If TCP interprets “end” to mean anything beyond what the UDP header indicates, it is that TCP stack that is broken.
>

The assumption on transmit is that there are no bits in the packet
beyond the transport layer. This is a valid assumption, since there are
currently no use cases in which bits follow the transport layer payload;
UDP options would be the first instance of supporting trailers. On the
receive side, stacks will properly handle checksum offload with data in
the surplus space.

> (Presumably we’re talking about non-UDP fragmented packets; with UDP fragmentation, the TCP operation should never happen until the UDP fragments are reassembled).
>
> > So
> > unless the surplus area is properly checksummed then the computed TCP
> > checksum will be invalid and the packet will be dropped at the
> > receiver.
>
> See above.
>
> > This is not just a problem for offload for offload, I
> > believe that this wouldn't work properly in existing software stacks
> > without some major changes.
>
> Same problem, same answer.
>
> > So to make all uses of transport checksum computation and offload
> > reasonably robust, when the UDP surplus area is being used both the
> > UDP checksum and checksum over the surplus area MUST always be set.
>
> If protocol stacks that try to peek ahead in layers don’t follow the rules, there’s not much we can do, ever.
>
> > FYI, here is some nice background checksum offload is
> > https://www.kernel.org/doc/html/latest/networking/checksum-offloads.html
>
> There is an error in draft-herbert-remotecsumoffload in Section 2.1; it states that the UDP checksum is over the upper layer packet length (it is not). RFC768 defines the UDP checksum over the pseudoheader, UDP header, and data (not referring to the IP packet except for pseudo header info). In fact, it also refers to the pseudoheader as using the UDP Length, not the IP length (adjusted or not).
>
> The same error appears in the draft-herbert-vxlan-rco in the definition of packet_csum.
>
> If either doc reflects how offloading is implemented, then those are bugs that should be fixed.
>
> In TCPM we’ve identified a number of additional issues with TCP offload, notably regarding how they incorrectly coalesce packets with different TCP headers.
>
> If we’re not calling out these behaviors as the bugs they are, there’s little point in doing much of anything in the IETF.
>

You're welcome to call these behaviors bugs if you want, but the fact is
that they are correct and robust behavior for all currently defined IETF
protocols, as evidenced by the fact that the Internet runs on billions
of devices with these behaviors.
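
To make the coverage question concrete, here is a minimal Python sketch
of the failure mode discussed above. It is an illustration under stated
assumptions, not code from any real stack: the segment is a toy 20-byte
TCP header plus payload, the pseudo-header is omitted because it
contributes equally to both computations, and every function name is
hypothetical. It compares a checksum computed to the end of the packet
with one computed only over the transport segment, and shows the effect
of a compensating checksum that makes the trailing bytes sum to zero.

```python
import struct

TCP_CSUM_OFFSET = 16  # offset of the checksum field within a TCP header


def ones_complement_sum(data: bytes) -> int:
    """16-bit ones-complement sum (RFC 1071 style), folded to 16 bits."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def fill_checksum(packet: bytearray, start: int, end: int) -> None:
    """Store a checksum covering packet[start:end] in the TCP checksum field."""
    off = start + TCP_CSUM_OFFSET
    packet[off:off + 2] = b"\x00\x00"
    csum = ~ones_complement_sum(bytes(packet[start:end])) & 0xFFFF
    packet[off:off + 2] = struct.pack("!H", csum)


def verify(packet: bytes, start: int, end: int) -> bool:
    """A correctly checksummed range sums to 0xFFFF (ones-complement zero)."""
    return ones_complement_sum(packet[start:end]) == 0xFFFF


def build(surplus: bytes, cover_surplus: bool):
    """Return (packet, segment_length) for a toy segment plus trailing bytes."""
    header = struct.pack("!HHIIBBHHH", 12345, 80, 1, 0, 5 << 4, 0x18, 65535, 0, 0)
    segment = header + b"hello world!"
    packet = bytearray(segment + surplus)
    # cover_surplus=True models a sender that checksums to the end of the packet.
    end = len(packet) if cover_surplus else len(segment)
    fill_checksum(packet, 0, end)
    return bytes(packet), len(segment)


raw_surplus = b"\x01\x02\x03\x04"
# Append a 16-bit value that makes the trailing bytes sum to ones-complement zero.
compensated = raw_surplus + struct.pack(
    "!H", ~ones_complement_sum(raw_surplus) & 0xFFFF
)

for label, surplus, cover in [
    ("to end of packet, raw trailing bytes", raw_surplus, True),
    ("to end of packet, compensated trailing bytes", compensated, True),
    ("segment length only, raw trailing bytes", raw_surplus, False),
]:
    pkt, seg_len = build(surplus, cover)
    # The receiver verifies only the transport segment, not the trailing bytes.
    print(label, "->", verify(pkt, 0, seg_len))
```

In this sketch the receiver's check fails only in the first case, where
the sender summed through trailing bytes that do not cancel out. Either
bounding the computation by the transport length or making the trailing
bytes sum to zero lets the check pass.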

The problem you are hitting is that we have over thirty years of
deployment and implementation experience with Internet protocols that
follow some basic principles and conventions, such as using protocol
headers rather than trailers. So while a protocol that diverges from
those principles and conventions might be academically correct on paper,
in deployment it may be replete with issues, which is what we see when
delving into the details of how UDP options interact with real stacks
and devices. If the goal is to produce a deployable and performant
protocol, which I believe is the purpose of the IETF, then we need to
take the realities of deployment and implementation into account.

Tom

> Joe