Re: [tsvwg] RDMA Support by UDP FRAG Option
Joseph Touch <touch@strayalpha.com> Sat, 19 June 2021 21:58 UTC
From: Joseph Touch <touch@strayalpha.com>
In-Reply-To: <CALx6S37VN_GyyQ7E_rnNCOG2tPS5wVR9jdGMjgy0aaAFYT7anQ@mail.gmail.com>
Date: Sat, 19 Jun 2021 14:58:13 -0700
Cc: "C. M. Heard" <heard@pobox.com>, Gorry Fairhurst <gorry@erg.abdn.ac.uk>, TSVWG <tsvwg@ietf.org>
Message-Id: <C9BB95CC-1A12-48B6-9E90-8ED56EF40F27@strayalpha.com>
References: <CACL_3VEyLdQZ-3hvzXxyA8ehtWs2hXESZ2OqyAx+BeSg85+-cA@mail.gmail.com> <CACL_3VFE4TjKvmkfZjvNpWo6vVfKjz5w85=Q+yqnYZKcwbYLmQ@mail.gmail.com> <63FFC34B-2179-47F1-B325-21CAC3D1543A@strayalpha.com> <CACL_3VHTfxWaBj7TFEmBXBqovrrAj7XuFEZFUag_iBHr3Hx09g@mail.gmail.com> <0EBFC9B0-591A-4860-B327-6E617B83F4D1@strayalpha.com> <CALx6S34pT81TbfQDk2vKF8wBrXL312As79K=rEzUQ3Lmg7UvpA@mail.gmail.com> <7C51D926-9DBB-41F5-93B2-10F716F672B1@strayalpha.com> <CALx6S37uN8TsXQZ3cv5jmxwxSyBRjK=-GQ_MsWxPWSs21XoGHw@mail.gmail.com> <CACL_3VEx7+VnLz7OLdXyhZU41e+-oBz3dc8JdMV_7pLMfic6=w@mail.gmail.com> <fcc8762f-c042-7999-d2e4-f28384950a19@erg.abdn.ac.uk> <CALx6S36sWGcZmFpAhF4DfOMyf6Z0w5F9bemNfeM1yWV-r0M+BA@mail.gmail.com> <8af3abf9-943f-13c1-e239-5efca27cf68c@erg.abdn.ac.uk> <CACL_3VHdyLAmzMbWsTVfJD+4tTzsMvcTzKS1B1CAdZ3k5U957g@mail.gmail.com> <CALx6S34DUrUBYd94LPPg4Hgh0FnZYZjZ4eKEYuaxb-7zbzb=pQ@mail.gmail.com> <F2C7D790-4037-4D41-B30D-0F66AF084635@strayalpha.com> <CALx6S37VN_GyyQ7E_rnNCOG2tPS5wVR9jdGMjgy0aaAFYT7anQ@mail.gmail.com>
To: Tom Herbert <tom@herbertland.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/LEtBkxEa2jW0ZqI1APO9OPxWex8>
> On Jun 19, 2021, at 1:40 PM, Tom Herbert <tom@herbertland.com> wrote:
>
> On Sat, Jun 19, 2021 at 12:56 PM Joseph Touch <touch@strayalpha.com> wrote:
>>
>> Tom,
>>
>>> On Jun 19, 2021, at 10:11 AM, Tom Herbert <tom@herbertland.com> wrote:
>>>
>>> ...
>>> There is another serious problem with transport checksums and use of
>>> the UDP surplus area. Most of this discussion has presumed that the
>>> UDP checksum is the one in the packet being offloaded, but that may
>>> not be the case. Consider the case where a sender sends a TCP packet
>>> that is encapsulated GRE/UDP or VXLAN (a very common use case in
>>> virtual networks where VMs send packets on their virtual networks).
>>> The stack will attempt to offload the innermost checksum which is the
>>> TCP checksum.
>>
>> I’m assuming this means TCP in GRE/UDP (it seems ambiguous).
>>
>>> The TCP checksum is the one offloaded regardless of
>>> whether or not the outer UDP checksum is zero (if it's non-zero then
>>> the stack would set it using local checksum offload (LCO)).
>>
>> Can you explain this? Packets are both defined and processed from the outside in; to do anything else might not yield a meaningful result.
>>
> It is described in
> https://www.kernel.org/doc/html/latest/networking/checksum-offloads.html.
>
>>> The
>>> offloaded TCP checksum computation would start the computation at the
>>> first byte of the TCP header through the end of the whole packet.
>>
>> If TCP interprets “end” to mean anything beyond what the UDP header indicates, it is that TCP stack that is broken.
>>
> The assumption is that there are no bits in the packet beyond the
> transport layer on transmit. This is a valid assumption since there
> are currently no use cases where this is true;

That is the fundamental error; this is not a valid assumption.

> UDP options would be
> the first instance of supporting trailers.

TCP has had trailers too, at one point.

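The disagreement above comes down to where the inner checksum's coverage ends. The following is a minimal sketch in Python, assuming a simplified layout: no pseudo-header, made-up byte values, and hypothetical names (`internet_checksum`, `inner_segment`, `surplus`) that do not come from any stack or draft discussed in this thread. It only illustrates that a sum run to the end of the buffer differs from one bounded by the UDP Length once surplus bytes follow the UDP payload.

```python
def internet_checksum(data: bytes) -> int:
    """Return the 16-bit ones'-complement checksum of data (RFC 1071 style)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with one zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # add big-endian 16-bit words
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


UDP_HEADER_LEN = 8

# Assumed, illustrative contents: 40 bytes standing in for the encapsulated
# packet, then 4 bytes standing in for options in the UDP surplus area.
inner_segment = bytes(range(40))
surplus = b"\x01\x02\x03\x04"

# UDP Length covers the UDP header plus its data, and nothing after it.
udp_length = UDP_HEADER_LEN + len(inner_segment)
after_udp_header = inner_segment + surplus

# Coverage bounded by what the UDP header indicates.
bounded = internet_checksum(after_udp_header[: udp_length - UDP_HEADER_LEN])
# Coverage that runs to the end of the whole packet.
to_end = internet_checksum(after_udp_header)

print(f"bounded by UDP Length: {bounded:#06x}")
print(f"to end of packet:      {to_end:#06x}")
print("same result:", bounded == to_end)
```

With nonzero surplus bytes the two values differ, so a sender that sums to the end of the packet and a receiver that stops at the UDP Length would disagree unless the surplus area is arranged to sum to zero.
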
> On the receive side, stacks
> will properly handle checksum offload with data in the surplus space.
>
>> (Presumably we’re talking about non-UDP fragmented packets; with UDP fragmentation, the TCP operation should never happen until the UDP fragments are reassembled).
>>
>>> So
>>> unless the surplus area is properly checksummed then the computed TCP
>>> checksum will be invalid and the packet will be dropped at the
>>> receiver.
>>
>> See above.
>>
>>> This is not just a problem for offload, I
>>> believe that this wouldn't work properly in existing software stacks
>>> without some major changes.
>>
>> Same problem, same answer.
>>
>>> So to make all uses of transport checksum computation and offload
>>> reasonably robust, when the UDP surplus area is being used both the
>>> UDP checksum and checksum over the surplus area MUST always be set.
>>
>> If protocol stacks that try to peek ahead in layers don’t follow the rules, there’s not much we can do, ever.
>>
>>> FYI, here is some nice background on checksum offload:
>>> https://www.kernel.org/doc/html/latest/networking/checksum-offloads.html
>>
>> There is an error in draft-herbert-remotecsumoffload in Section 2.1; it states that the UDP checksum is over the upper layer packet length (it is not). RFC 768 defines the UDP checksum over the pseudoheader, UDP header, and data (not referring to the IP packet except for pseudoheader info). In fact, it also refers to the pseudoheader as using the UDP Length, not the IP length (adjusted or not).
>>
>> The same error appears in draft-herbert-vxlan-rco in the definition of packet_csum.
>>
>> If either doc reflects how offloading is implemented, then those are bugs that should be fixed.
>>
>> In TCPM we’ve identified a number of additional issues with TCP offload, notably regarding how they incorrectly coalesce packets with different TCP headers.
>>
>> If we’re not calling out these behaviors as the bugs they are, there’s little point in doing much of anything in the IETF.
>>
> You're welcome to call these behaviors bugs if you want, but the fact
> is they are correct behavior and robust behavior for all currently
> defined IETF protocols as evidenced by the fact that the Internet runs
> on billions of devices with these behaviors.

“Currently works” is not the same as “correct”. Correct follows the spec; the above notes do not. They were never in WG docs, or that would presumably have been corrected.

> The problem you are
> hitting is that we have over thirty years of deployment and
> implementation experience with Internet protocols that follow some
> basic principles and conventions like use protocol headers and not
> trailers.

It’s not about headers vs. trailers. It’s about whether you follow the specs or not.

> So while a protocol that diverges from those principles and
> conventions might be academically correct on paper, in deployment it
> may be replete with a myriad of issues which is what we see when
> delving into the details of how UDP options interact with real stacks
> and devices. If the goal is to produce a deployable and performant
> protocol, which I believe is the purpose of IETF, then we need to take
> realities of deployment and implementation into account.

We need to start by not continuing to refer to false claims in unpublished drafts. When we find errors, we should fix them, not propagate them.

Joe
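
The RFC 768 point made above (the checksum covers the pseudoheader, UDP header, and data, with the pseudoheader carrying the UDP Length rather than the IP length) can be sketched as follows. This is an illustration under stated assumptions, not code from any stack or draft in this thread: it handles IPv4 only, the function names `internet_checksum` and `udp_checksum_ipv4` are hypothetical, and the addresses, ports, and payload are made-up example values.

```python
import socket
import struct


def internet_checksum(data: bytes) -> int:
    """Return the 16-bit ones'-complement checksum of data (RFC 1071 style)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with one zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # add big-endian 16-bit words
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


def udp_checksum_ipv4(src_ip: str, dst_ip: str, ip_payload: bytes) -> int:
    """Compute the UDP checksum for the datagram at the start of ip_payload.

    Coverage is taken from the UDP Length field, so any bytes that follow
    the datagram inside the IP payload are not summed.
    """
    udp_length = struct.unpack("!H", ip_payload[4:6])[0]
    datagram = bytearray(ip_payload[:udp_length])  # ignore bytes past UDP Length
    datagram[6:8] = b"\x00\x00"  # checksum field counts as zero while summing
    pseudo_header = (
        socket.inet_aton(src_ip)
        + socket.inet_aton(dst_ip)
        + struct.pack("!BBH", 0, 17, udp_length)  # zero, protocol 17, UDP Length
    )
    checksum = internet_checksum(pseudo_header + bytes(datagram))
    return checksum or 0xFFFF  # a computed zero is transmitted as all ones


# Example values (assumed): documentation addresses and arbitrary ports.
payload = b"hello"
header = struct.pack("!HHHH", 5000, 6000, 8 + len(payload), 0)
datagram = header + payload
surplus = b"\xde\xad\xbe\xef"  # stand-in for bytes in the surplus area

plain = udp_checksum_ipv4("192.0.2.1", "192.0.2.2", datagram)
with_surplus = udp_checksum_ipv4("192.0.2.1", "192.0.2.2", datagram + surplus)
print(hex(plain), "unchanged by surplus:", plain == with_surplus)
```

Because the coverage is sliced at the UDP Length, appending surplus bytes leaves the result unchanged. A computation keyed to the IP payload length instead would pull those bytes into the sum.
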