Re: [tsvwg] RDMA Support by UDP FRAG Option
Tom Herbert <tom@herbertland.com> Mon, 21 June 2021 01:24 UTC
From: Tom Herbert <tom@herbertland.com>
Date: Sun, 20 Jun 2021 18:24:31 -0700
Message-ID: <CALx6S34Yrph523yd0vx9EsCscwrjJY2ek6VrEj+7zCDGTLyuPA@mail.gmail.com>
To: "C. M. Heard" <heard@pobox.com>
Cc: Gorry Fairhurst <gorry@erg.abdn.ac.uk>, Joseph Touch <touch@strayalpha.com>, TSVWG <tsvwg@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/hHsubo1ViByca_AGKbAZ7iYVkco>
Subject: Re: [tsvwg] RDMA Support by UDP FRAG Option

On Sun, Jun 20, 2021 at 4:36 PM C. M. Heard <heard@pobox.com> wrote:
>
> On Sat, Jun 19, 2021 at 6:01 PM Tom Herbert wrote:
>>
>> On Sat, Jun 19, 2021 at 4:39 PM C. M. Heard wrote:
>>>
>>> 2.) Assuming that the trailer checksum (OCS) is present, it is not
>>> clear how offload for the inner packet checksum could be expected to
>>> work properly without change. Remember, OCS does not cause the data
>>> in the trailer to sum to zero. It causes it to sum to the
>>> ones-complement of the trailer length.
>>>
>>> The modifications needed to handle point #2 are small -- specifically,
>>> adding the trailer length to whatever would normally be preloaded into
>>> the inner packet. The FRAG proposal in
>>> draft-ietf-tsvwg-udp-options-13#section-5.5, which proposes to
>>> sandwich the payload in between option headers, would make it a whole
>>> lot harder.
>>
>> Protocol trailers are a fundamental problem. At this point it's not
>> clear to me that there is a way to make them work correctly in all use
>> cases with deployed HW. Protocol headers, like those used per FRAG,
>> shouldn't be a problem,
>
> I believe that fear is unjustified. In every case analyzed in detail so
> far -- including this one -- trailers involve only minor rework at most.
> I believe that I proved my case conclusively for the usual case (UDP
> used as a straight transport, not as an encapsulation) in
> https://mailarchive.ietf.org/arch/msg/tsvwg/RZULHOKRgrSYIvsI-5Hg7sNtSS8/.

Mike,

Please consider the simple case where the UDP checksum is set. There are
two common methods to offload the checksum: protocol agnostic and
protocol specific.

In the protocol agnostic case the stack provides the starting offset of
the checksum area (offset of the UDP header) and the offset of the
checksum field to write the result (UDP checksum); these values are
placed in the TX descriptor for a packet. The checksum field is primed
with the pseudo header.

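The protocol agnostic transmit path described above can be sketched as
follows. This is a minimal simulation, not any particular driver or NIC
interface: the names csum_start and csum_offset, the function
device_tx_checksum, and the addresses and ports are assumptions made for
illustration, and the 20 zero bytes merely stand in for an IPv4 header.

```python
import struct

def ones_sum(data: bytes, acc: int = 0) -> int:
    """Add the big-endian 16-bit words of data to acc, ones-complement style."""
    if len(data) % 2:
        data += b"\x00"              # pad odd-length input with a zero byte
    for i in range(0, len(data), 2):
        acc += (data[i] << 8) | data[i + 1]
    while acc >> 16:                 # end-around carry
        acc = (acc & 0xFFFF) + (acc >> 16)
    return acc

def device_tx_checksum(packet: bytearray, csum_start: int, csum_offset: int) -> None:
    """Hypothetical device: sum from csum_start to the END of the packet and
    store the complement at csum_start + csum_offset. No length is given."""
    total = ones_sum(bytes(packet[csum_start:]))
    field = csum_start + csum_offset
    packet[field:field + 2] = struct.pack("!H", ~total & 0xFFFF)

# Illustrative values only.
SRC, DST = bytes([192, 0, 2, 1]), bytes([192, 0, 2, 2])
payload = b"hello!"
udp_len = 8 + len(payload)

# Stack side: prime the checksum field with the pseudo-header sum.
pseudo = ones_sum(SRC + DST + struct.pack("!HH", 17, udp_len))
udp = struct.pack("!HHHH", 1234, 5678, udp_len, pseudo) + payload
packet = bytearray(bytes(20) + udp)  # placeholder IPv4 header + UDP datagram

# Descriptor side: UDP header starts at byte 20, checksum field is 6 bytes in.
device_tx_checksum(packet, csum_start=20, csum_offset=6)

# Receiver-style check: pseudo header plus datagram should sum to 0xFFFF.
verify = ones_sum(bytes(packet[20:]), pseudo)
```

Because the device is told only where to start and where to write, it
never needs to parse UDP, which is why the same mechanism also covers
encapsulated checksums.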
The end of the checksum area is the end of the packet (trying to add a
length of the area to the hardware interface is a non-starter). If there
is a surplus area and it sums to zero, then the algorithm works
unchanged. If the surplus area sums to non-zero, then the host has to
offset it by adding the negative (ones-complement) sum of the surplus
area into the checksum field. The good news is that this works with any
case of checksum offload in a UDP packet, including encapsulated
checksums. In any case, a surplus area that always sums to zero would
not require any rework, whereas a surplus area with a non-zero sum is
going to require work in the OS (I have not looked at what would be
required, but it might be viewed as significant, especially if we need
to add a new field to mbuf, skbuff, etc.).

Protocol specific transmit checksum offload is when the device parses
the protocol and does all the work to send a checksum. It is not the
preferred method; however, it is still quite prevalent. We simply have
no idea how devices will do checksum offload in the presence of a
surplus area; I suspect implementations are all over the place. Some
might only checksum over the UDP length, some might checksum over the
surplus area also, and some might use the wrong length in the pseudo
header, as described in the draft. If I had to take a guess at what
gives the best chance for this to work correctly on the most devices, it
would be to always ensure the surplus area sums to zero.

It's a similar story for receive checksum offload: there is a protocol
agnostic and a protocol specific method. The protocol agnostic method
works in all cases, but does require that the host compute the checksum
over the surplus area regardless of whether a checksum in the area was
explicitly set. For protocol specific checksum offload, again we really
have no idea how various devices will deal with a surplus area.

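The host-side compensation for a non-zero surplus area, on both transmit
and receive, can be sketched as follows. This is an illustration under
stated assumptions rather than real stack code: device_tx_checksum and
the surplus bytes are hypothetical, the IP header is omitted so the UDP
header starts at offset 0, and the UDP payload has even length so the
surplus area starts on a 16-bit boundary.

```python
import struct

def fold(x: int) -> int:
    """Fold carries back into the low 16 bits (end-around carry)."""
    while x >> 16:
        x = (x & 0xFFFF) + (x >> 16)
    return x

def ones_sum(data: bytes, acc: int = 0) -> int:
    """Add the big-endian 16-bit words of data to acc, ones-complement style."""
    if len(data) % 2:
        data += b"\x00"
    for i in range(0, len(data), 2):
        acc += (data[i] << 8) | data[i + 1]
    return fold(acc)

def ones_neg(x: int) -> int:
    """Ones-complement negation of a 16-bit value."""
    return ~x & 0xFFFF

def device_tx_checksum(packet: bytearray, csum_start: int, csum_offset: int) -> None:
    """Hypothetical device: sums to the end of the packet, surplus included."""
    total = ones_sum(bytes(packet[csum_start:]))
    field = csum_start + csum_offset
    packet[field:field + 2] = struct.pack("!H", ones_neg(total))

SRC, DST = bytes([192, 0, 2, 1]), bytes([192, 0, 2, 2])
payload = b"data"                            # even length (assumption)
surplus = bytes([1, 2, 3, 4, 5, 6])          # made-up bytes with a non-zero sum
udp_len = 8 + len(payload)
pseudo = ones_sum(SRC + DST + struct.pack("!HH", 17, udp_len))

# Transmit: prime with the pseudo-header sum MINUS the surplus sum.
primed = fold(pseudo + ones_neg(ones_sum(surplus)))
packet = bytearray(struct.pack("!HHHH", 1234, 5678, udp_len, primed)
                   + payload + surplus)
device_tx_checksum(packet, csum_start=0, csum_offset=6)

# A receiver that checks only the first udp_len bytes sees a valid checksum.
tx_check = ones_sum(bytes(packet[:udp_len]), pseudo)

# Receive: the device reports a sum over everything from the UDP header to
# the end of the packet; the host subtracts the surplus sum before validating.
device_sum = ones_sum(bytes(packet))
udp_only = fold(device_sum + ones_neg(ones_sum(bytes(packet[udp_len:]))))
rx_check = fold(udp_only + pseudo)
```

If the surplus area summed to zero, the ones_neg(ones_sum(...)) terms
would contribute nothing and neither adjustment would be needed.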
Like the send case, I believe that the greatest chance for compatibility
would be to ensure the surplus area sums to zero.

Tom

> Additionally, while an encapsulator that uses UDP options may need to be
> modified in order to use hardware checksum offload, I think it's clear
> that the modifications needed are bounded and small. Actually for the
> case of a TCP inner packet the pseudo-header will be OK to use with a
> trailer that has OCS included if instead of the actual TCP length the
> pseudo-header includes TCP + trailer length. The driver still has to
> remember to copy only UDP length - 8 bytes. Arguably, if it were written
> correctly in the first place, it would be ready now.
>
> The proposal in the -13 draft does have problems in this regard. I'll
> reply to that thread in the next day or so with an explanation.
>
> Thanks,
>
> Mike Heard
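
The quoted claim about a TCP inner packet can be checked numerically with
a small sketch. It relies only on the property stated in the thread, that
the trailer sums to the ones-complement of its own length. The trailer
layout produced by make_trailer (a 2-byte check field followed by
arbitrary bytes) is invented for this illustration and is not the wire
format from the draft; the addresses, ports, and data are also made up.

```python
import struct

def fold(x: int) -> int:
    """Fold carries back into the low 16 bits (end-around carry)."""
    while x >> 16:
        x = (x & 0xFFFF) + (x >> 16)
    return x

def ones_sum(data: bytes, acc: int = 0) -> int:
    """Add the big-endian 16-bit words of data to acc, ones-complement style."""
    if len(data) % 2:
        data += b"\x00"
    for i in range(0, len(data), 2):
        acc += (data[i] << 8) | data[i + 1]
    return fold(acc)

def ones_neg(x: int) -> int:
    return ~x & 0xFFFF

def make_trailer(body: bytes) -> bytes:
    """Hypothetical trailer whose total sum is the complement of its length."""
    length = 2 + len(body)
    check = ones_neg(fold(length + ones_sum(body)))
    return struct.pack("!H", check) + body

def tcp_segment(check: int, data: bytes) -> bytes:
    """Minimal 20-byte TCP header (illustrative field values) plus data."""
    hdr = struct.pack("!HHIIBBHHH", 40000, 80, 1, 0, 5 << 4, 0x18, 65535, check, 0)
    return hdr + data

SRC, DST = bytes([10, 0, 0, 1]), bytes([10, 0, 0, 2])

def pseudo(length: int) -> int:
    return ones_sum(SRC + DST + struct.pack("!HH", 6, length))

data = b"payload!"                            # even length (assumption)
tcp_len = 20 + len(data)
trailer = make_trailer(bytes([1, 2, 3, 4, 5, 6]))

# Reference: TCP checksum computed in software over the TCP segment only.
reference = ones_neg(ones_sum(tcp_segment(0, data), pseudo(tcp_len)))

# Offload: the device sums TCP segment + trailer; prime with TCP + trailer length.
primed = pseudo(tcp_len + len(trailer))
offloaded = ones_neg(ones_sum(tcp_segment(primed, data) + trailer))

# Priming with the true TCP length instead gives the wrong answer.
naive = ones_neg(ones_sum(tcp_segment(pseudo(tcp_len), data) + trailer))
```

The extra trailer length in the pseudo-header and the trailer's own sum
cancel in ones-complement arithmetic, so offloaded equals reference while
naive does not.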