Re: [nfsv4] Tsvart early review of draft-ietf-nfsv4-rpcrdma-version-two-05
David Noveck <davenoveck@gmail.com> Tue, 02 November 2021 15:50 UTC
Return-Path: <davenoveck@gmail.com>
X-Original-To: nfsv4@ietfa.amsl.com
Delivered-To: nfsv4@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id D95F03A03EA; Tue, 2 Nov 2021 08:50:28 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -2.097
X-Spam-Level:
X-Spam-Status: No, score=-2.097 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, FREEMAIL_FROM=0.001, HTML_MESSAGE=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001] autolearn=ham autolearn_force=no
Authentication-Results: ietfa.amsl.com (amavisd-new); dkim=pass (2048-bit key) header.d=gmail.com
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id w8qZIIZkLUqM; Tue, 2 Nov 2021 08:50:24 -0700 (PDT)
Received: from mail-ed1-x52a.google.com (mail-ed1-x52a.google.com [IPv6:2a00:1450:4864:20::52a]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id D3D553A0789; Tue, 2 Nov 2021 08:50:23 -0700 (PDT)
Received: by mail-ed1-x52a.google.com with SMTP id 5so76451830edw.7; Tue, 02 Nov 2021 08:50:23 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=pDlxexx3k3dle6XVXz0c0Tn1ed61MuYsTK7DQxo4UUQ=; b=b+N0+U1649jm0TZzYSQLKHlCfIqqd7UAxYfkhT7pz8eNbg9cSmIsHWsi15ki+dojhJ btYB7UgdpYoqlOZTEwP+llEZn+GzRnkX9rhlraF+rqy43BUveX2/PxxtFdu6Lfs4zZUw +PcJcyhCj6kjyOuYoaIM0LPUhiw/ym0zrlvDH3PeBe5II4pM8QuAIFWOFiANWvNYjpeS ZvI42laW9axiBW28b6r3H2Eauq7QgplwxMq4RVIiwTP5J4ffDmrCeWfz79/OnUsiKtQ4 siKi84iq+6W6lBA0+2ylsjOjS7iXrsgP9mNZV5grQTrmYLzAxJno4CSFzti1WItoF/cX 5zVQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=pDlxexx3k3dle6XVXz0c0Tn1ed61MuYsTK7DQxo4UUQ=; b=Qj6iTXcFyMG/g+5kwRP5BGJSdNN4hHpiiQwXSO30btvry/tpdFnRi9g0pckN8Tnj+B kL6+aEXHEp5RFbQJ7wo8u7xR9qKXtVRgbb9BJWJr6hC0x2hq/XKCrhpeoCne+CVc5e8h LyCOrN48PWeKDl6ERxCb6zHd2fIevmt/23lF49cSHquCgqDIAxn1MBgeskFqLgL1r31c RBo/HXp7LrCiqk3qkE6EqBWJlVjN3PFYAYbndysmJfsuALZhQYBKBkGwGQQ/t+XE2hwB Fe1l9eZQxF85MS1inFpT7rfyDtR8M6PN4r9noLDeMPSjcn2+lzHwAPfLKo7/+liqI7J7 Ia+A==
X-Gm-Message-State: AOAM533T3fTlU8AVgg8PekipvyTpCR7oHTgRxAVtKmDd1WIbDi25c7Zh 5WMyoRhrVJM+LjtpzT3VftTdgIdTmfgIr3LZcgm9BQT5
X-Google-Smtp-Source: ABdhPJw/Zt8dYnsn0YngOBQU+jYc21ZZncvUPEy8HgV2vp+gC5VELRWVWu6OrASB/MP7KI59+b95X9vbHY25RjK26hQ=
X-Received: by 2002:a05:6402:5190:: with SMTP id q16mr14180411edd.123.1635868220721; Tue, 02 Nov 2021 08:50:20 -0700 (PDT)
MIME-Version: 1.0
References: <163584491835.18636.4240783876111764396@ietfa.amsl.com> <C3890CF3-86F3-4E72-BB1A-522BF0AA4CA3@oracle.com>
In-Reply-To: <C3890CF3-86F3-4E72-BB1A-522BF0AA4CA3@oracle.com>
From: David Noveck <davenoveck@gmail.com>
Date: Tue, 02 Nov 2021 11:50:08 -0400
Message-ID: <CADaq8jc9Z2QUfTy-+7BVGb8t+opdejctaWyuBJz_5GMsuJNDvQ@mail.gmail.com>
To: Chuck Lever III <chuck.lever@oracle.com>
Cc: Jana Iyengar <jri.ietf@gmail.com>, "tsv-art@ietf.org" <tsv-art@ietf.org>, "draft-ietf-nfsv4-rpcrdma-version-two.all@ietf.org" <draft-ietf-nfsv4-rpcrdma-version-two.all@ietf.org>, "nfsv4@ietf.org" <nfsv4@ietf.org>
Content-Type: multipart/alternative; boundary="000000000000320e2305cfd04068"
Archived-At: <https://mailarchive.ietf.org/arch/msg/nfsv4/J1_ZgrSduDXl2ER94Yf_u0Xzek4>
Subject: Re: [nfsv4] Tsvart early review of draft-ietf-nfsv4-rpcrdma-version-two-05
X-BeenThere: nfsv4@ietf.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: NFSv4 Working Group <nfsv4.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/nfsv4>, <mailto:nfsv4-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/nfsv4/>
List-Post: <mailto:nfsv4@ietf.org>
List-Help: <mailto:nfsv4-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/nfsv4>, <mailto:nfsv4-request@ietf.org?subject=subscribe>
X-List-Received-Date: Tue, 02 Nov 2021 15:50:29 -0000
.> This is not a > request to change anything, so perhaps all I'm looking for is that the ADs are > ok with this. I'm sure they would be since RFC4506 is an internet standard. Thanks for the review. On Tue, Nov 2, 2021 at 10:59 AM Chuck Lever III <chuck.lever@oracle.com> wrote: > Hi Jana, thanks for your expert comments. A few responses > are inline below. > > > > On Nov 2, 2021, at 5:21 AM, Jana Iyengar via Datatracker < > noreply@ietf.org> wrote: > > > > Reviewer: Jana Iyengar > > Review result: Not Ready > > > > I've read up through Section 4, specifically to understand and comment > on the > > Flow Control aspects of this draft, as requested by Chuck Lever. > > > > As I understand it, FC works as follows: > > > > - A receiver advertises the number of unacknowledged messages it is > willing to > > receive (`credit window`), and also indicates the number of messages > received so far > > (`received`). The underlying transport is assumed to provide ordered and > > reliable delivery. > > > > - A sender is required to not send more than receiver-advertized > `received + > > credit window` messages. > > > > - These advertisements are typically piggy-backed on payload-bearing > > messages. Optionally, a receiver can send a message with a `Grants > Credit` > > header type to advertise its window. > > > > - The receiver can increase or decrease the advertised window. > > > > - Each message has a max message size (`Inline Threshold`) which is > indicated > > via `Transport Properties`, and defaults to 4096 bytes. > > > > > > I have a few comments specific to Flow Control: > > > > - The document allows for advertised window to increase or decrease, > meaning > > that a receiver can renege on credit it advertises. Specifically, this > allows > > a receiver to say it will accept a particular message and then reduces > its > > window to cause that message to be now out of its window. > > It was not my intention for a credit decrease to force messages > out of the window. I hope this oversight can be addressed by > simply adding a sentence or two restricting the Receiver's > behavior. > > > > This choice is rife with pain. I would urge, in the strongest manner > possible, > > to avoid allowing reneging. That is, do not allow a receiver to > arbitrarily > > reduce advertised credit so as to allow reneging. If you really really > need > > it, prepare for some blood and sweat, and eventually, inevitable tears. > > > > To clarify, I do not mean that the window size must remain the same. A > > receiver can reduce its window size, but do it so that if a message was > > in-window before, it remains in-window after. > > > > For example, here is a sequence of FC messages that reduces the window > and > > reneges on a previously allowed message (this is bad): > > > > received = 10, window = 5 (allowed = 11-15) > > received = 11, window = 2 (allowed = 12-13, reneged on 14, 15) > > > > And here is a sequence of FC messages that reduces the window without > > reneging (this is good): > > > > received = 10, window = 5 (allowed = 11-15) > > received = 11, window = 4 (allowed = 12-15) > > received = 12, window = 3 (allowed = 13-15) > > > > - The current mechanism has a receiver advertising both `received` and > `credit > > window` and the sender computes the limit. It is more direct for a > receiver to > > simply advertise the limit -- basically the maximum message that the > sender > > can send. This makes for fewer fields in the exchange, is more > intuitive, and > > importantly, it is easier to describe a mechanism that avoids reneging > > (advertise the total message limit, and never advertise a lower number). > > A single field is preferable to retain better compatibility > with RPC/RDMA v1. I wasn't able to think of a way to combine > the fields. We can discuss this offline. > > > > - Are the expected limits at a receiver in bytes or in operations? I > imagine > > that the answer is bytes. > > The second paragraph of Section 4.2.1 explains that the > connection's credit limits are expressed in messages, not > bytes. > > An RPC-over-RDMA version 2 credit represents the capability to convey > exactly one RPC-over-RDMA version 2 message, regardless of its size, > via an RDMA Send/Receive pair. > > > > If so, the FC mechanisms should be tied to this > > resource. At the moment, a receiver has to figure out what to advertise > based > > on available memory and the maximum message size it advertised. Note > that > > there will be inefficiency because messages are likely to be smaller > than the > > max limit. It is likely to be more direct and efficient if the unit is > all > > bytes. > > True, messages are typically smaller than the connection's > inline thresholds. Unfortunately message bytes are not > fungible among RDMA Receive buffers. > > A Receiver with 32 4096-byte Receive buffers will use exactly > the same resources to receive 32 400-byte messages as it will > to receive 32 3500-byte messages. Sending small RDMA messages > does not enable a Receiver to then receive additional messages. > > At one point the document text was clear that the inline > threshold is actually the RDMA Receive buffer size, but that > over time has been removed in favor of generality. > > > Regarding efficiency: We had considered building a capability > in RPC/RDMA of conveying multiple short RPC messages in a > single RDMA Send/Receive, but holding up those RPC messages to > coalesce them usually adds unwanted latency. > > > > - When are transport properties exchanged? FC credit needs to be > available at > > the beginning of a connection for any progress to be made at all, so > either > > both endpoints need to advertise it right at the start, or some default > needs > > to be assumed. I don't see either in the document. > > The second paragraph of Section 4.3 explains that both peers > must send one-at-a-time until credit limits are established. > > > > - Finally, `credit window` is a strange term. What the document > currently uses > > is arguably a window, and the term `window` here applies to messages > that can > > be received, not to flow control credit. > > > > General comments (I skimmed the rest of the document): > > > > - I understand the value of the XDR descriptions for the protocol, but I > haven't > > encountered this in the past. Is this adequate for an IETF RFC? This is > not a > > request to change anything, so perhaps all I'm looking for is that the > ADs are > > ok with this. > > Fwiw, XDR describes RPC protocols throughout the RFCs pub- > lished by the nfsv4 WG, starting with RFC 2623. Previous to > the creation of the nfsv4 WG, XDR was originally defined in > RFC 1014 and used to describe NFSv2 in RFC 1094 in 1989. > > > > - It would be useful to have forward references from the earlier > sections to the > > later wire format and detailed error sections. For example, when a > connection > > might be closed due to FC violations. > > > > - I might recommend a little text in the intro section laying out the > structure > > of the document. > > > > - I see lowercase and uppercase MUSTs and SHOULDs. For clarity, consider > > rewriting the text to avoid the lowercase keywords. > > Issues regarding readability, terminology, and document > structure will be addressed in subsequent revisions. Thanks > for pointing them out! > > > -- > Chuck Lever > > > >
- [nfsv4] Tsvart early review of draft-ietf-nfsv4-r… Jana Iyengar via Datatracker
- Re: [nfsv4] Tsvart early review of draft-ietf-nfs… Chuck Lever III
- Re: [nfsv4] Tsvart early review of draft-ietf-nfs… David Noveck