Re: [tsvwg] RDMA Support by UDP FRAG Option

Tom Herbert <tom@herbertland.com> Mon, 21 June 2021 01:24 UTC

From: Tom Herbert <tom@herbertland.com>
Date: Sun, 20 Jun 2021 18:24:31 -0700
Message-ID: <CALx6S34Yrph523yd0vx9EsCscwrjJY2ek6VrEj+7zCDGTLyuPA@mail.gmail.com>
To: "C. M. Heard" <heard@pobox.com>
Cc: Gorry Fairhurst <gorry@erg.abdn.ac.uk>, Joseph Touch <touch@strayalpha.com>, TSVWG <tsvwg@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/hHsubo1ViByca_AGKbAZ7iYVkco>
Subject: Re: [tsvwg] RDMA Support by UDP FRAG Option

On Sun, Jun 20, 2021 at 4:36 PM C. M. Heard <heard@pobox.com> wrote:
>
> On Sat, Jun 19, 2021 at 6:01 PM Tom Herbert wrote:
>>
>> On Sat, Jun 19, 2021 at 4:39 PM C. M. Heard wrote:
>>>
>>> 2.) Assuming that the trailer checksum (OCS) is present, it is not clear how offload for the inner packet checksum could be expected to work properly without change. Remember, OCS does not cause the data in the trailer to sum to zero. It causes it to sum to the ones-complement of the trailer length.
>>>
>>> The modifications needed to handle point #2 are small -- specifically, adding the trailer length to whatever would normally be preloaded into the inner packet. The FRAG proposal in draft-ietf-tsvwg-udp-options-13#section-5.5, which proposes to sandwich the payload in between option headers, would make it a whole lot harder.
>>
>>
>> Protocol trailers are a fundamental problem. At this point it's not
>> clear to me that there is a way to make them work correctly in all use
>> cases with deployed HW.  Protocol headers, like those used per FRAG,
>> shouldn't be a problem,
>
>
> I believe that fear is unjustified. In every case analyzed in detail so far -- including this one -- trailers involve only minor rework at most.  I believe that I proved my case conclusively for the usual case (UDP used as a straight transport, not as an encapsulation) in https://mailarchive.ietf.org/arch/msg/tsvwg/RZULHOKRgrSYIvsI-5Hg7sNtSS8/.

Mike,

Please consider the simple case where the UDP checksum is set. There
are two common methods to offload the checksum: protocol agnostic and
protocol specific. In the protocol agnostic case the stack provides
the starting offset of the checksum area (the offset of the UDP
header) and the offset of the checksum field in which to write the
result (the UDP checksum); these values are placed in the TX
descriptor for a packet. The checksum field is primed with the pseudo
header checksum. The end of the checksum area is the end of the packet
(trying to add a length of the area in the hardware interface is a
non-starter). If there is a surplus area and it sums to zero then the
algorithm works unchanged. If the surplus area sums to non-zero then
the host must offset it by adding the negative of the surplus area's
sum into the checksum field. The good news is that this works with any
case of checksum offload in a UDP packet, including encapsulated
checksums. In any case, a surplus area that always sums to zero would
not require any rework, whereas a non-zero sum surplus area is going
to require work in the OS (I have not looked at what would be
required, but it might be viewed as significant, especially if we need
to add a new field to mbuf, skbuff, etc.)
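
To make the compensation step concrete, here is a minimal sketch of the protocol agnostic transmit path described above. It is a toy model, not any real driver or NIC interface: the names `host_build`, `device_tx_csum` and `reference_checksum`, the 20-byte placeholder IP header, and the IPv4 pseudo header layout are all assumptions made for illustration.

```python
import struct

def fold(total: int) -> int:
    """Fold a running total into a 16-bit ones-complement sum."""
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total

def ones_sum(data: bytes) -> int:
    """16-bit ones-complement sum of data, zero-padded to an even length."""
    if len(data) % 2:
        data += b"\x00"
    return fold(sum(struct.unpack(f"!{len(data) // 2}H", data)))

def pseudo_header(src: bytes, dst: bytes, udp_len: int) -> bytes:
    """IPv4 pseudo header for UDP (protocol 17)."""
    return src + dst + struct.pack("!BBH", 0, 17, udp_len)

def host_build(src, dst, sport, dport, payload, surplus, compensate=True):
    """Hypothetical host side: prime the checksum field, return (frame, start, offset)."""
    udp_len = 8 + len(payload)            # UDP Length excludes the surplus area
    primed = ones_sum(pseudo_header(src, dst, udp_len))
    if compensate:
        # Add the ones-complement negative of the surplus sum. The pad byte
        # keeps 16-bit word alignment when the surplus starts at an odd offset.
        pad = b"\x00" * (udp_len % 2)
        primed = fold(primed + (~ones_sum(pad + surplus) & 0xFFFF))
    udp = struct.pack("!HHHH", sport, dport, udp_len, primed) + payload
    return bytearray(bytes(20) + udp + surplus), 20, 6   # placeholder IP header

def device_tx_csum(frame: bytearray, csum_start: int, csum_offset: int) -> None:
    """Hypothetical device side: sum from csum_start to the END of the packet
    and write the complement into the checksum field."""
    total = ones_sum(bytes(frame[csum_start:]))
    pos = csum_start + csum_offset
    frame[pos:pos + 2] = struct.pack("!H", ~total & 0xFFFF)

def reference_checksum(src, dst, sport, dport, payload) -> int:
    """What the UDP checksum should be: pseudo header + UDP header + payload only.
    (A real stack also sends a computed 0x0000 as 0xFFFF; omitted here.)"""
    udp_len = 8 + len(payload)
    udp = struct.pack("!HHHH", sport, dport, udp_len, 0) + payload
    return ~fold(ones_sum(pseudo_header(src, dst, udp_len)) + ones_sum(udp)) & 0xFFFF
```

In this model the device never learns where the UDP datagram ends; the only thing the host changes for a non-zero-sum surplus area is the value primed into the checksum field, which is the extra per-packet state a real stack would have to carry.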

Protocol specific transmit checksum offload is when the device parses
the protocol and does all the work to send a checksum. It is not the
preferred method; however, it is still quite prevalent. We simply have
no idea how devices will do checksum offload in the presence of a
surplus area; I suspect implementations are all over the place. Some
might only checksum over the UDP length, some might checksum over the
surplus area also, and some might use the wrong length in the pseudo
header as described in the draft. If I had to guess what gives the
best chance of this working correctly on the most devices, it would be
to always ensure the surplus area sums to zero.
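
As a rough illustration of why a zero-sum surplus area helps here, the sketch below models two of the device behaviours mentioned above (checksum over the UDP length only, and checksum to the end of the packet) and shows that they agree once the surplus area sums to zero. The 2-byte compensation field and the function names are hypothetical, not taken from the draft, and the sketch assumes an even UDP length and an even field offset. It does not address devices that put the wrong length in the pseudo header.

```python
import struct

def fold(total: int) -> int:
    """Fold a running total into a 16-bit ones-complement sum."""
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total

def ones_sum(data: bytes) -> int:
    """16-bit ones-complement sum of data, zero-padded to an even length."""
    if len(data) % 2:
        data += b"\x00"
    return fold(sum(struct.unpack(f"!{len(data) // 2}H", data)))

def zero_sum_surplus(surplus: bytes, field_off: int) -> bytes:
    """Fill a hypothetical 2-byte field so the surplus sums to ones-complement
    zero (0xFFFF). Assumes the surplus starts on a 16-bit boundary relative to
    the UDP header and that field_off is even."""
    buf = bytearray(surplus)
    buf[field_off:field_off + 2] = b"\x00\x00"
    buf[field_off:field_off + 2] = struct.pack("!H", ~ones_sum(bytes(buf)) & 0xFFFF)
    return bytes(buf)

def csum_over_udp_length(pseudo: bytes, udp: bytes) -> int:
    """Device model A: checksum covers only the bytes inside the UDP Length."""
    return ~fold(ones_sum(pseudo) + ones_sum(udp)) & 0xFFFF

def csum_to_end_of_packet(pseudo: bytes, udp: bytes, surplus: bytes) -> int:
    """Device model B: checksum also covers the surplus area."""
    return ~fold(ones_sum(pseudo) + ones_sum(udp + surplus)) & 0xFFFF
```

With a surplus area produced by `zero_sum_surplus`, both models write the same value into the UDP checksum field; with an arbitrary surplus area they generally do not.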

It's a similar story for receive checksum offload: there are protocol
agnostic and protocol specific methods. The protocol agnostic method
works in all cases, but does require that the host compute the
checksum over the surplus area regardless of whether a checksum in the
area was explicitly set. For protocol specific checksum offload, again
we really have no idea how various devices will deal with a surplus
area. As in the send case, I believe that the greatest chance for
compatibility would be to ensure the surplus area sums to zero.
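
The receive side of the protocol agnostic method can be sketched the same way. This is an assumed toy model, with hypothetical names `device_rx_sum`, `host_verify` and `make_datagram`: the device reports the raw sum from the UDP header to the end of the packet, and the host has to sum the surplus area itself and back it out before validating the UDP checksum.

```python
import struct

def fold(total: int) -> int:
    """Fold a running total into a 16-bit ones-complement sum."""
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total

def ones_sum(data: bytes) -> int:
    """16-bit ones-complement sum of data, zero-padded to an even length."""
    if len(data) % 2:
        data += b"\x00"
    return fold(sum(struct.unpack(f"!{len(data) // 2}H", data)))

def make_datagram(pseudo: bytes, sport: int, dport: int, payload: bytes) -> bytes:
    """Build a UDP datagram with a correct checksum (test helper only)."""
    udp_len = 8 + len(payload)
    hdr0 = struct.pack("!HHHH", sport, dport, udp_len, 0)
    csum = ~fold(ones_sum(pseudo) + ones_sum(hdr0 + payload)) & 0xFFFF
    return struct.pack("!HHHH", sport, dport, udp_len, csum) + payload

def device_rx_sum(l4: bytes) -> int:
    """Hypothetical device: raw sum from the UDP header to the end of the packet."""
    return ones_sum(l4)

def host_verify(pseudo: bytes, udp_len: int, l4: bytes, dev_sum: int) -> bool:
    """Hypothetical host: subtract the surplus sum, add the pseudo header,
    and accept if the result is ones-complement zero (0xFFFF)."""
    surplus = l4[udp_len:]
    pad = b"\x00" * (udp_len % 2)          # keep word alignment for odd lengths
    surplus_sum = ones_sum(pad + surplus)  # host must walk the surplus area
    total = fold(dev_sum + (~surplus_sum & 0xFFFF) + ones_sum(pseudo))
    return total == 0xFFFF
```

Note that `host_verify` walks the surplus area whether or not it carries a checksum of its own. If the surplus area were known to sum to zero, the subtraction would be a no-op.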

Tom

> Additionally, while an encapsulator that uses UDP options may need to be modified in order to use hardware checksum offload, I think it's clear that the modifications needed are bounded and small. Actually for the case of a TCP inner packet the pseudo-header will be OK to use with a trailer that has OCS included if instead of the actual TCP length the pseudo-header includes TCP + trailer length. The driver still has to remember to copy only UDP length - 8 bytes. Arguably, if it were written correctly in the first place, it would be ready now.
>
> The proposal in the -13 draft does have problems in this regard. I'll reply to that thread in the next day or so with an explanation.
>
> Thanks,
>
> Mike Heard
>