Re: [nfsv4] Tsvart early review of draft-ietf-nfsv4-rpcrdma-version-two-05

David Noveck <davenoveck@gmail.com> Tue, 02 November 2021 15:50 UTC

References: <163584491835.18636.4240783876111764396@ietfa.amsl.com> <C3890CF3-86F3-4E72-BB1A-522BF0AA4CA3@oracle.com>
In-Reply-To: <C3890CF3-86F3-4E72-BB1A-522BF0AA4CA3@oracle.com>
From: David Noveck <davenoveck@gmail.com>
Date: Tue, 02 Nov 2021 11:50:08 -0400
Message-ID: <CADaq8jc9Z2QUfTy-+7BVGb8t+opdejctaWyuBJz_5GMsuJNDvQ@mail.gmail.com>
To: Chuck Lever III <chuck.lever@oracle.com>
Cc: Jana Iyengar <jri.ietf@gmail.com>, "tsv-art@ietf.org" <tsv-art@ietf.org>, "draft-ietf-nfsv4-rpcrdma-version-two.all@ietf.org" <draft-ietf-nfsv4-rpcrdma-version-two.all@ietf.org>, "nfsv4@ietf.org" <nfsv4@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/nfsv4/J1_ZgrSduDXl2ER94Yf_u0Xzek4>

> This is not a request to change anything, so perhaps all I'm looking for
> is that the ADs are ok with this.

I'm sure they would be, since RFC 4506 is an Internet Standard.

Thanks for the review.



On Tue, Nov 2, 2021 at 10:59 AM Chuck Lever III <chuck.lever@oracle.com>
wrote:

> Hi Jana, thanks for your expert comments. A few responses
> are inline below.
>
>
> > On Nov 2, 2021, at 5:21 AM, Jana Iyengar via Datatracker <noreply@ietf.org> wrote:
> >
> > Reviewer: Jana Iyengar
> > Review result: Not Ready
> >
> > I've read up through Section 4, specifically to understand and comment
> > on the Flow Control aspects of this draft, as requested by Chuck Lever.
> >
> > As I understand it, FC works as follows:
> >
> > - A receiver advertises the number of unacknowledged messages it is
> >  willing to receive (`credit window`), and also indicates the number of
> >  messages received so far (`received`). The underlying transport is
> >  assumed to provide ordered and reliable delivery.
> >
> > - A sender is required to not send more than receiver-advertised
> >  `received + credit window` messages.
> >
> > - These advertisements are typically piggy-backed on payload-bearing
> >  messages. Optionally, a receiver can send a message with a
> >  `Grants Credit` header type to advertise its window.
> >
> > - The receiver can increase or decrease the advertised window.
> >
> > - Each message has a max message size (`Inline Threshold`) which is
> >  indicated via `Transport Properties`, and defaults to 4096 bytes.
> >
> >
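The sender-side rule summarized above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not text or code from the draft: the function names `send_limit` and `may_send` are hypothetical, and it assumes messages are counted from 1 in the order they are sent.

```python
def send_limit(received: int, credit_window: int) -> int:
    """Highest message count the sender may reach, per the latest advertisement."""
    return received + credit_window


def may_send(next_msg: int, received: int, credit_window: int) -> bool:
    """True if sending message number `next_msg` stays within the advertised limit."""
    return next_msg <= send_limit(received, credit_window)
```

With `received = 10` and `credit window = 5`, the sender may send messages up to and including number 15, and must hold message 16 until a newer advertisement arrives.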
> > I have a few comments specific to Flow Control:
> >
> > - The document allows for the advertised window to increase or decrease,
> >  meaning that a receiver can renege on credit it advertises.
> >  Specifically, this allows a receiver to say it will accept a particular
> >  message and then reduce its window to cause that message to be now
> >  out of its window.
>
> It was not my intention for a credit decrease to force messages
> out of the window. I hope this oversight can be addressed by
> simply adding a sentence or two restricting the Receiver's
> behavior.
>
>
> >  This choice is rife with pain. I would urge, in the strongest manner
> >  possible, to avoid allowing reneging. That is, do not allow a receiver
> >  to arbitrarily reduce advertised credit so as to allow reneging. If you
> >  really really need it, prepare for some blood and sweat, and
> >  eventually, inevitable tears.
> >
> >  To clarify, I do not mean that the window size must remain the same. A
> >  receiver can reduce its window size, but do it so that if a message was
> >  in-window before, it remains in-window after.
> >
> >  For example, here is a sequence of FC messages that reduces the window
> >  and reneges on a previously allowed message (this is bad):
> >
> >  received = 10, window = 5  (allowed = 11-15)
> >  received = 11, window = 2  (allowed = 12-13, reneged on 14, 15)
> >
> >  And here is a sequence of FC messages that reduces the window without
> >  reneging (this is good):
> >
> >  received = 10, window = 5  (allowed = 11-15)
> >  received = 11, window = 4  (allowed = 12-15)
> >  received = 12, window = 3  (allowed = 13-15)
> >
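The two sequences above can be checked mechanically. The following Python sketch is an illustration only and is not part of the draft: it assumes each advertisement is a `(received, window)` pair, treats `received + window` as the highest allowed message number, and flags any advertisement whose limit falls below the highest limit seen so far. The name `find_reneges` is hypothetical.

```python
from typing import List, Sequence, Tuple


def find_reneges(adverts: Sequence[Tuple[int, int]]) -> List[Tuple[int, int, int]]:
    """Return (index, previous_limit, new_limit) for each advertisement that reneges."""
    reneges = []
    highest_limit = None  # highest allowed message number advertised so far
    for i, (received, window) in enumerate(adverts):
        limit = received + window
        if highest_limit is not None and limit < highest_limit:
            # Messages limit+1 .. highest_limit were allowed before and are not now.
            reneges.append((i, highest_limit, limit))
        else:
            highest_limit = limit
    return reneges


bad = [(10, 5), (11, 2)]             # second advertisement drops the limit from 15 to 13
good = [(10, 5), (11, 4), (12, 3)]   # window shrinks, limit stays at 15

print(find_reneges(bad))
print(find_reneges(good))
```

The "good" sequence shrinks the window by exactly the number of newly received messages, so the limit never moves backward and nothing is reported.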
> > - The current mechanism has a receiver advertising both `received` and
> >  `credit window` and the sender computes the limit. It is more direct
> >  for a receiver to simply advertise the limit -- basically the maximum
> >  message that the sender can send. This makes for fewer fields in the
> >  exchange, is more intuitive, and importantly, it is easier to describe
> >  a mechanism that avoids reneging (advertise the total message limit,
> >  and never advertise a lower number).
>
> A single field is preferable to retain better compatibility
> with RPC/RDMA v1. I wasn't able to think of a way to combine
> the fields. We can discuss this offline.
>
>
> > - Are the expected limits at a receiver in bytes or in operations? I
> >  imagine that the answer is bytes.
>
> The second paragraph of Section 4.2.1 explains that the
> connection's credit limits are expressed in messages, not
> bytes.
>
>    An RPC-over-RDMA version 2 credit represents the capability to convey
>    exactly one RPC-over-RDMA version 2 message, regardless of its size,
>    via an RDMA Send/Receive pair.
>
>
> >  If so, the FC mechanisms should be tied to this resource. At the
> >  moment, a receiver has to figure out what to advertise based on
> >  available memory and the maximum message size it advertised. Note that
> >  there will be inefficiency because messages are likely to be smaller
> >  than the max limit. It is likely to be more direct and efficient if the
> >  unit is all bytes.
>
> True, messages are typically smaller than the connection's
> inline thresholds. Unfortunately message bytes are not
> fungible among RDMA Receive buffers.
>
> A Receiver with 32 4096-byte Receive buffers will use exactly
> the same resources to receive 32 400-byte messages as it will
> to receive 32 3500-byte messages. Sending small RDMA messages
> does not enable a Receiver to then receive additional messages.
>
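The point about Receive buffers can be made concrete with a small Python sketch. It is an illustration under stated assumptions, not a description of any implementation: it assumes one posted Receive buffer is consumed per message regardless of message size, and the names `buffers_consumed` and `byte_based_estimate` are hypothetical.

```python
import math
from typing import Sequence


def buffers_consumed(message_sizes: Sequence[int], buffer_size: int = 4096) -> int:
    """Receive buffers used when each message occupies exactly one buffer."""
    for size in message_sizes:
        if size > buffer_size:
            raise ValueError(f"{size}-byte message exceeds {buffer_size}-byte buffer")
    return len(message_sizes)


def byte_based_estimate(message_sizes: Sequence[int], buffer_size: int = 4096) -> int:
    """What a byte-counting scheme would suggest if bytes were fungible (they are not)."""
    return math.ceil(sum(message_sizes) / buffer_size)


small = [400] * 32
large = [3500] * 32

print(buffers_consumed(small), byte_based_estimate(small))
print(buffers_consumed(large), byte_based_estimate(large))
```

Both workloads consume all 32 buffers, even though the byte totals differ by almost a factor of nine. That gap is why the credit is counted in messages here.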
> At one point the document text was clear that the inline
> threshold is actually the RDMA Receive buffer size, but that
> over time has been removed in favor of generality.
>
>
> Regarding efficiency: We had considered building a capability
> in RPC/RDMA of conveying multiple short RPC messages in a
> single RDMA Send/Receive, but holding up those RPC messages to
> coalesce them usually adds unwanted latency.
>
>
> > - When are transport properties exchanged? FC credit needs to be
> >  available at the beginning of a connection for any progress to be made
> >  at all, so either both endpoints need to advertise it right at the
> >  start, or some default needs to be assumed. I don't see either in the
> >  document.
>
> The second paragraph of Section 4.3 explains that both peers
> must send one-at-a-time until credit limits are established.
>
>
> > - Finally, `credit window` is a strange term. What the document
> >  currently uses is arguably a window, and the term `window` here applies
> >  to messages that can be received, not to flow control credit.
> >
> > General comments (I skimmed the rest of the document):
> >
> > - I understand the value of the XDR descriptions for the protocol, but I
> >  haven't encountered this in the past. Is this adequate for an IETF RFC?
> >  This is not a request to change anything, so perhaps all I'm looking
> >  for is that the ADs are ok with this.
>
> Fwiw, XDR describes RPC protocols throughout the RFCs published
> by the nfsv4 WG, starting with RFC 2623. Previous to the creation
> of the nfsv4 WG, XDR was originally defined in RFC 1014 and used
> to describe NFSv2 in RFC 1094 in 1989.
>
>
> > - It would be useful to have forward references from the earlier
> >  sections to the later wire format and detailed error sections. For
> >  example, when a connection might be closed due to FC violations.
> >
> > - I might recommend a little text in the intro section laying out the
> >  structure of the document.
> >
> > - I see lowercase and uppercase MUSTs and SHOULDs. For clarity, consider
> >  rewriting the text to avoid the lowercase keywords.
>
> Issues regarding readability, terminology, and document
> structure will be addressed in subsequent revisions. Thanks
> for pointing them out!
>
>
> --
> Chuck Lever
>
>
>
>
