Re: [tsvwg] RDMA Support by UDP FRAG Option
Tom Herbert <tom@herbertland.com> Sat, 19 June 2021 22:07 UTC
From: Tom Herbert <tom@herbertland.com>
Date: Sat, 19 Jun 2021 15:07:30 -0700
Message-ID: <CALx6S36FK7NVzMTdh+aUSpBdXrfT5C=KsAwoVBR8gU06E0TW5g@mail.gmail.com>
To: Joseph Touch <touch@strayalpha.com>
Cc: "C. M. Heard" <heard@pobox.com>, Gorry Fairhurst <gorry@erg.abdn.ac.uk>, TSVWG <tsvwg@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/Tza4WX9Q3_Bq0RVCEoa6z7mn4bU>
Subject: Re: [tsvwg] RDMA Support by UDP FRAG Option

On Sat, Jun 19, 2021, 2:58 PM Joseph Touch <touch@strayalpha.com> wrote:
>
> > On Jun 19, 2021, at 1:40 PM, Tom Herbert <tom@herbertland.com> wrote:
> >
> > On Sat, Jun 19, 2021 at 12:56 PM Joseph Touch <touch@strayalpha.com> wrote:
> >>
> >> Tom,
> >>
> >>> On Jun 19, 2021, at 10:11 AM, Tom Herbert <tom@herbertland.com> wrote:
> >>>
> >>> ...
> >>> There is another serious problem with transport checksums and use of
> >>> the UDP surplus area. Most of this discussion has presumed that the
> >>> UDP checksum is the one in the packet being offloaded, but that may
> >>> not be the case. Consider the case where a sender sends a TCP packet
> >>> that is encapsulated in GRE/UDP or VXLAN (a very common use case in
> >>> virtual networks where VMs send packets on their virtual networks).
> >>> The stack will attempt to offload the innermost checksum, which is the
> >>> TCP checksum.
> >>
> >> I’m assuming this means TCP in GRE/UDP (it seems ambiguous).
> >>
> >>> The TCP checksum is the one offloaded regardless of
> >>> whether or not the outer UDP checksum is zero (if it's non-zero then
> >>> the stack would set it using local checksum offload (LCO)).
> >>
> >> Can you explain this? Packets are both defined and processed from the
> >> outside in; to do anything else might not yield a meaningful result.
> >>
> > It is described in
> > https://www.kernel.org/doc/html/latest/networking/checksum-offloads.html
> >
> >>> The
> >>> offloaded TCP checksum computation would run from the
> >>> first byte of the TCP header through the end of the whole packet.
> >>
> >> If TCP interprets “end” to mean anything beyond what the UDP header
> >> indicates, it is that TCP stack that is broken.
> >>
> > The assumption is that there are no bits in the packet beyond the
> > transport layer on transmit. This is a valid assumption since there
> > are currently no use cases where such bits exist;
>
> That is the fundamental error; this is not a valid assumption.
>
> > UDP options would be
> > the first instance of supporting trailers.
>
> TCP has had trailers too, at one point.
>
> > On the receive side, stacks
> > will properly handle checksum offload with data in the surplus space.
> >
> >> (Presumably we’re talking about non-UDP-fragmented packets; with UDP
> >> fragmentation, the TCP operation should never happen until the UDP
> >> fragments are reassembled).
> >>
> >>> So
> >>> unless the surplus area is properly checksummed, the computed TCP
> >>> checksum will be invalid and the packet will be dropped at the
> >>> receiver.
> >>
> >> See above.
> >>
> >>> This is not just a problem for offload; I
> >>> believe that this wouldn't work properly in existing software stacks
> >>> without some major changes.
> >>
> >> Same problem, same answer.
> >>
> >>> So to make all uses of transport checksum computation and offload
> >>> reasonably robust, when the UDP surplus area is being used both the
> >>> UDP checksum and the checksum over the surplus area MUST always be set.
> >>
> >> If protocol stacks that try to peek ahead in layers don’t follow the
> >> rules, there’s not much we can do, ever.
> >>
> >>> FYI, some nice background on checksum offload is at
> >>> https://www.kernel.org/doc/html/latest/networking/checksum-offloads.html
> >>
> >> There is an error in draft-herbert-remotecsumoffload in Section 2.1; it
> >> states that the UDP checksum is over the upper layer packet length (it is
> >> not). RFC768 defines the UDP checksum over the pseudoheader, UDP header,
> >> and data (not referring to the IP packet except for pseudoheader info). In
> >> fact, it also refers to the pseudoheader as using the UDP Length, not the
> >> IP length (adjusted or not).
> >>
> >> The same error appears in draft-herbert-vxlan-rco in the definition
> >> of packet_csum.
> >>
> >> If either doc reflects how offloading is implemented, then those are
> >> bugs that should be fixed.
> >>
> >> In TCPM we’ve identified a number of additional issues with TCP
> >> offload, notably regarding how they incorrectly coalesce packets with
> >> different TCP headers.
> >>
> >> If we’re not calling out these behaviors as the bugs they are, there’s
> >> little point in doing much of anything in the IETF.
> >>
> > You're welcome to call these behaviors bugs if you want, but the fact
> > is they are correct and robust behavior for all currently
> > defined IETF protocols, as evidenced by the fact that the Internet runs
> > on billions of devices with these behaviors.
>
> “Currently works” is not the same as “correct”.
>
> Correct follows the spec; the above notes do not. They were never in WG
> docs, or they would presumably have been corrected.
>
> > The problem you are
> > hitting is that we have over thirty years of deployment and
> > implementation experience with Internet protocols that follow some
> > basic principles and conventions, like using protocol headers and not
> > trailers.
>
> It’s not about headers vs. trailers. It’s about whether you follow the
> specs or not.
>
> > So while a protocol that diverges from those principles and
> > conventions might be academically correct on paper, in deployment it
> > may be replete with a myriad of issues, which is what we see when
> > delving into the details of how UDP options interact with real stacks
> > and devices. If the goal is to produce a deployable and performant
> > protocol, which I believe is the purpose of IETF, then we need to take
> > realities of deployment and implementation into account.
>
> We need to start by not continuing to refer to false claims in unpublished
> drafts.
>
> When we find errors, we should fix them, not propagate them.
>

Joe,

A nice thing about an open source project like Linux is that anyone who
thinks there's a bug can submit a patch, and it will be accepted *if* they
can justify the patch to the maintainers. Good luck, if you want to take
that on!

Tom

> Joe
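
Editorial illustration (not part of the original message): the checksum disagreement in the quoted exchange can be sketched with a toy model. Everything here is an assumption made for illustration: the "segment" is just a 16-bit checksum field followed by an even-length payload, there is no pseudoheader, and the helper names (build_segment, zero_sum_surplus, verifies) are hypothetical. It is not how any real stack or the UDP options draft encodes things. It only shows that a sender-side sum running to the end of the packet disagrees with a receiver that stops at the length the headers indicate, unless the trailing bytes sum to zero in one's-complement arithmetic.

```python
import struct


def ones_complement_sum(data: bytes) -> int:
    """16-bit one's-complement sum (RFC 1071 style); odd input is zero-padded."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return total


def checksum(data: bytes) -> int:
    """Internet checksum: complement of the one's-complement sum."""
    return ~ones_complement_sum(data) & 0xFFFF


def verifies(segment: bytes) -> bool:
    """Receiver check over the segment only: sum including the field is 0xFFFF."""
    return ones_complement_sum(segment) == 0xFFFF


def build_segment(payload: bytes, trailer_seen_by_sender: bytes = b"") -> bytes:
    """Hypothetical toy segment: checksum field + payload.

    The sender's sum also covers trailer_seen_by_sender, modelling a
    computation that runs from the transport header to the end of the packet.
    """
    csum = checksum(b"\x00\x00" + payload + trailer_seen_by_sender)
    return struct.pack("!H", csum) + payload


def zero_sum_surplus(options: bytes) -> bytes:
    """Append a 16-bit compensation field so the trailer sums to 0xFFFF (-0).

    Assumes even-length options; an all-zero payload edge case is ignored.
    """
    return options + struct.pack("!H", checksum(options))


payload = b"hello, tsvwg"          # even length, non-zero sum
surplus = b"\xde\xad\xbe\xef"      # arbitrary trailing bytes

plain = build_segment(payload)                                  # no trailer
broken = build_segment(payload, surplus)                        # trailer summed in
compensated = build_segment(payload, zero_sum_surplus(surplus)) # zero-sum trailer

print(verifies(plain), verifies(broken), verifies(compensated))
```

In this toy model the receiver accepts the plain and compensated segments and rejects the one whose checksum silently absorbed the uncompensated trailer, which is the failure mode described in the 10:11 AM message. Whether the sender-side behavior or the trailer is "at fault" is exactly what the two authors dispute; the sketch takes no side.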