Re: [nfsv4] New Version Notification for draft-dnoveck-nfsv4-rpcrdma-xcharext-02.txt

David Noveck <davenoveck@gmail.com> Tue, 23 August 2016 15:10 UTC

In-Reply-To: <2E2207C2-EC6D-4C44-9024-56D103563617@oracle.com>
References: <147155138353.27840.7944779905916585881.idtracker@ietfa.amsl.com> <CADaq8jeb6P_1mG9++T=Ff31WBeJi-bD7NqUBQM4GOscxquiDxQ@mail.gmail.com> <2E2207C2-EC6D-4C44-9024-56D103563617@oracle.com>
From: David Noveck <davenoveck@gmail.com>
Date: Tue, 23 Aug 2016 10:45:47 -0400
Message-ID: <CADaq8jd6PT7y=x1a5Tynxdzed2DivEuCG_6UK1eA=Vxod3PJxQ@mail.gmail.com>
To: Chuck Lever <chuck.lever@oracle.com>
Content-Type: multipart/alternative; boundary="001a113d13347007d7053abe3873"
Archived-At: <https://mailarchive.ietf.org/arch/msg/nfsv4/DL6q1ZQ9lPn7SJzOwZrYgWYDBcs>
Cc: "nfsv4@ietf.org" <nfsv4@ietf.org>
Subject: Re: [nfsv4] New Version Notification for draft-dnoveck-nfsv4-rpcrdma-xcharext-02.txt

Thanks. I'll address these as part of
draft-dnoveck-nfsv4-rpcrdma-xcharext-03, which will be out in the first
half of September.

Unless there are objections, that draft will adopt Karen's suggestion to
change "characteristics" to "properties".  The title of
the document will change but the file name will stay the same.

> There's no discussion of RPC-over-RDMA credit accounting in this
> document. There needs to be some discussion of credit consumption.

Right.

> In particular:

> Requesters will have to have posted extra receive buffers (over
> and above credits for forward channel replies and backchannel
> requests) to deal with XCHAR messages.

I think you mean that clients will.

> Likewise, responders will
> need to post similar extra receives for this purpose.

Similarly, I think servers are being referred to.

> Perhaps both peers should reserve one credit, and the specification
> should insist that these operations are always single-threaded on a
> connection.

I think single-threading might be an implementation choice but I'm
reluctant to make it part of the protocol.

> Alternately, when xcharext is merged into rpcrdma-version-two,
> there might be a generic discussion of non-RPC-payload-bearing
> messages that could cover this issue.

I think there will need to be generic text regarding the issues, but I
think it should be consistent with the following approach regarding
messages related to transport.

In the case of messages that do not serve effectively as the response
to a previous message (i.e. ROPT_FIRSTPROP, ROPT_REQPROP,
ROPT_UPDPROP), it is the responsibility of the sender to ensure that
there is a credit available to enable sending the message, just as would
be the case if it were sending an RPC request.

In the case of ROPT_RESPPROP messages, it is the responsibility of
the sender of the original ROPT_REQPROP to post a receive to
receive the response, which is to be sent by the receiver of the
ROPT_REQPROP without consuming a credit.  This is similar to the case
of sending an RPC response.
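
A rough C sketch of the credit rule described above may help make it
concrete.  The numeric message-type values, the sender_state structure and
the function names are illustrative assumptions, not anything taken from
the draft; only the rule itself (everything except ROPT_RESPPROP needs a
credit) comes from the text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Message types named in this thread.  The numeric values are
 * placeholders, not taken from the draft. */
enum ropt_msg {
    ROPT_FIRSTPROP = 1,
    ROPT_REQPROP   = 2,
    ROPT_RESPPROP  = 3,
    ROPT_UPDPROP   = 4
};

/* Hypothetical per-connection sender state. */
struct sender_state {
    unsigned credits;   /* credits currently available to this sender */
};

/* ROPT_RESPPROP acts as a response, so it does not consume a credit;
 * every other message type is treated like an RPC request. */
static bool ropt_consumes_credit(enum ropt_msg type)
{
    return type != ROPT_RESPPROP;
}

/* Returns false if a credit is required but none is available.
 * Actual transmission is omitted from this sketch. */
static bool ropt_try_send(struct sender_state *s, enum ropt_msg type)
{
    assert(s != NULL);
    if (ropt_consumes_credit(type)) {
        if (s->credits == 0)
            return false;
        s->credits--;
    }
    return true;
}
```

With a single credit available, one ROPT_REQPROP can be sent and a
following ROPT_UPDPROP has to wait, while a ROPT_RESPPROP can still go out.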


> Overloading the term "initial"

You're right.  It is overloaded.

> Section 3 introduces an "initial set of transport characteristics"
> and Section 4.1 defines "ROPT_INITXCH: Specify Initial
> Characteristics". I think the use of "initial" means different
> things in these two cases?


It does.

> Instead:

> Section 3 could propose "initially-defined" characteristics.


OK.

> Section 4.1 could define ROPT_STARTXCH (like STARTTLS). I'm not
> attached to that name, but it probably shouldn't be ROPT_INITXCH
> if Section 3 defines "initial characteristics".

> ROPT_CONNXCH

> ROPT_FIRSTXCH

> ROPT_EARLYXCH

My current choice is ROPT_FIRSTPROP.

> Section 4.4

> The text in this section has been clarified to address previous
> reviewer comments; thanks! There are a number of syntax and
> grammatical errors that still need to be addressed (most often,
> a few words are repeated in some sentences).

I'll address those in -03.

> The "argument structure" of ROPT_UPDXCH is:

> struct optinfo_updxch {
>    xcharval        optupdxch_now;
>    bool            optupdxch_pendclr;
> };

> I prefer optupdxch_new instead of optupdxch_now; "_now" suggests this
> field records a time and/or date stamp.

OK.

> It would help me to understand why a receiver needs to distinguish
> between these types of notification.

When I specified the possible situations in which these messages could
arise, I did not mean to imply that the receiver necessarily would need to
distinguish these notifications.  My focus was on the sender.

> For instance, if pendclr is false, this could be either a rejection
> of a pending change request,

If it is a *rejection* of a pending change request, pendclr would be true.

> or it could be an unsolicited change
> notification.

> How does the receiver make use of the difference?

> Instead of a boolean, an enumeration of update event types would be
> a little friendlier, and could be expressed in the same amount of
> space (since an XDR boolean consumes 4 octets on the wire).

I'm OK in principle, but it seems we are both uncertain whether this is
indeed "friendlier".

> Based
> on the discussion in Section 4.4, we have:

> enum optupdxch_event_type {
>    OPTUPD_UNSOL  = 1,

pendclr = false as there is nothing pending to clear.

>   OPTUPD_MORE   = 2,

pendclr = false but there is still a pending request.

>   OPTUPD_DONE   = 3,

pendclr = true and the request has completed successfully.

>   OPTUPD_REJECT = 4,

This sounds to me like pendclr = true, but the word "REJECT" suggests no
change at all was made.  In this case you might also have a partial change.

> };
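
The correspondence between the proposed event types and the existing
pendclr boolean, as given in the inline comments above, can be sketched in
C as follows.  The helper name optupd_pendclr is made up for illustration;
the enum values are the ones from the quoted proposal.

```c
#include <assert.h>
#include <stdbool.h>

/* Event types as proposed in the quoted review. */
enum optupdxch_event_type {
    OPTUPD_UNSOL  = 1,
    OPTUPD_MORE   = 2,
    OPTUPD_DONE   = 3,
    OPTUPD_REJECT = 4
};

/* Hypothetical helper: the pendclr value each event type would
 * correspond to under the current boolean encoding. */
static bool optupd_pendclr(enum optupdxch_event_type ev)
{
    assert(ev >= OPTUPD_UNSOL && ev <= OPTUPD_REJECT);
    /* UNSOL: nothing pending to clear.  MORE: request still pending.
     * DONE and REJECT: the pending request is cleared either way. */
    return ev == OPTUPD_DONE || ev == OPTUPD_REJECT;
}
```

Note that the mapping is many-to-one in both directions, which is the
information the boolean alone cannot carry.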

> But since the rdma_xid field is not used to tie change requests to
> these change update notifications, I'm not sure why the receiver
> needs to know that a pending request has been completed.

If the receiver keeps track of pending requests, it needs to know when one
is no longer pending.

The receiver is not required to do this and some might not choose to do so,
but the protocol should provide an implementation that does so with the
means to keep track.

This does not require the xid; the peer that requested the change can keep
track of the properties for which it has a pending change request.
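
As a minimal sketch of what such tracking could look like without the xid,
assuming properties are identified by small integers (the MAX_PROPS limit,
the structure and the function names are all hypothetical):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_PROPS 32u   /* hypothetical limit on property numbers */

/* One bit per property for which a change request is outstanding. */
struct pending_tracker {
    uint32_t pending;
};

/* Record that a change request for this property has been sent. */
static void note_request_sent(struct pending_tracker *t, unsigned prop)
{
    assert(prop < MAX_PROPS);
    t->pending |= UINT32_C(1) << prop;
}

static bool has_pending(const struct pending_tracker *t, unsigned prop)
{
    assert(prop < MAX_PROPS);
    return (t->pending >> prop) & 1u;
}

/* Process an update notification for a property.  Returns true if the
 * update cleared a request that this peer had outstanding. */
static bool note_update(struct pending_tracker *t, unsigned prop,
                        bool pendclr)
{
    assert(prop < MAX_PROPS);
    bool was_pending = has_pending(t, prop);
    if (pendclr)
        t->pending &= ~(UINT32_C(1) << prop);
    return pendclr && was_pending;
}
```

The tracker is keyed by property rather than by request, which is why no
xid is needed.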

> I think
> REJECT might be more interesting than the difference between UNSOL,
> DONE, and MORE.

How about an enum that distinguishes:

   - Unsolicited
   - Clear pending
   - Still Pending

The additional distinction among degrees of request satisfaction in the
last two cases could be the responsibility of the receiver to determine.
Since he made the request and has access to the current value, he could
determine this himself.  Alternatively, we could have an int with two
distinct bit fields:

   - One distinguishing unsolicited/clear-pending/still-pending
   - Another regarding degrees of request satisfaction.
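
The two-bit-field alternative might be packed roughly as follows.  This is
only a sketch: the field widths, the shift, and in particular the set of
"degree" values (none/partial/full) are assumptions made for illustration
and are not proposed anywhere in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* First field: the three-way distinction listed above. */
enum upd_kind {
    UPD_UNSOLICITED   = 0,
    UPD_CLEAR_PENDING = 1,
    UPD_STILL_PENDING = 2
};

/* Second field: degree of request satisfaction (assumed values). */
enum upd_degree {
    UPD_DEG_NONE    = 0,
    UPD_DEG_PARTIAL = 1,
    UPD_DEG_FULL    = 2
};

#define UPD_KIND_MASK    0x3u
#define UPD_DEGREE_SHIFT 2
#define UPD_DEGREE_MASK  0x3u

/* Pack both fields into one 32-bit word, as an XDR int would carry. */
static uint32_t upd_pack(enum upd_kind kind, enum upd_degree degree)
{
    assert((unsigned)kind <= UPD_STILL_PENDING);
    assert((unsigned)degree <= UPD_DEG_FULL);
    return (uint32_t)kind | ((uint32_t)degree << UPD_DEGREE_SHIFT);
}

static enum upd_kind upd_kind_of(uint32_t word)
{
    return (enum upd_kind)(word & UPD_KIND_MASK);
}

static enum upd_degree upd_degree_of(uint32_t word)
{
    return (enum upd_degree)((word >> UPD_DEGREE_SHIFT) & UPD_DEGREE_MASK);
}
```

A receiver that only cares about pending state can look at the first field
and ignore the second.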



> There is a bit of a race here: a sender could send an unsolicited
> update notification at the same time the receiver requests a change
> of the same xchar. Could that result in a non-deterministic outcome?

It shouldn't.  The point is that the property changes happen in a sequence
which is the same for all observers (no weird relativistic effects!) and
that the updates to the peer should happen in that same sequence.  The
important point is that a RESPXCH indicating a successful change request
be in the proper place in that sequence.

> Would it ever be reasonable to send two or more updates
> simultaneously for the same XCHAR?

Since this is a single connection with sequenced delivery, there is
no way to send updates simultaneously.  You can queue them for
sending at the same time, but the delivery will reflect the order in
which they were queued.

> (Requiring single-threading here would prevent that from occurring).

Given that there is no response to UPDXCH, not sure how you
could specify single-threading.  There is no way to define when
it would be OK to send the next.

> What if the sender emits two optinfo_updxch messages: both with
> pendclr set to false, but one with an intermediate value, and one
> with the original value. The result on the receiver could depend on
> the order in which these messages arrive.

It would.

> Possibly some text
> regarding the ordering of these messages is needed.

I can add something.

> What happens if the receiver of ROPT_REQXCH drops the request?

> Is there a timeout after which ROPT_REQXCH may be sent again?

There is no timeout in the protocol.

An implementation may choose to implement one, but since this is sent
on a reliable connection, it is hard to imagine it being worth doing.

> What happens if an ROPT_RESPXCH is dropped? If ROPT_REQXCH is sent
> again, the reply is :

>  ROPT_RESPXCH with the requested value marked done ?

This should result.

>   ROPT_RESPXCH with a rejection (no change was done) ?

It is true no change was done, but the requested value was achieved.

>  ROPT_UPDXCH with the requested value and pendclr set to true ?

ROPT_UPDXCH is not an alternative to ROPT_RESPXCH.  It is
a possible additional message.

> I don't see language that disallows any of these responses. Which
> one means "I already set this value" ? Sorry if I missed that.

I can add some clarification.

> Assuming that both sides support ROPT_UPDXCH, can an implementation
> use ROPT_UPDXCH exclusively instead of ROPT_INITXCH?

Yes, but it is kind of bogus.  You would be relying on the default initial
values and then changing them, which would be good in a test but, in real
life, it is asking for trouble.

BTW, I've always wished there would be an RFC2119bis defining "BOGUS" and
"BRAIN-DEAD" :-)

> Assuming that both sides support ROPT_UPDXCH, may a peer change an
> XCHAR and not send an unsolicited ROPT_UPDXCH?
It may, but in most cases it would be BOGUS (or BRAIN-DEAD).

Suppose you raise the receive buffer size and don't tell your peer that it
is raised.  In that case, raising it is pretty pointless since the peer
can't take advantage of the bigger buffer.

If you lower the receive buffer size and don't tell the peer it has been
lowered, then he is going to continue to assume a larger size and break
things.
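
To illustrate why the notification matters, here is a small C sketch of a
sender that sizes messages against its cached view of the peer's receive
buffer.  The peer_view structure and both function names are hypothetical;
the only thing taken from the discussion is that the cached value changes
solely when the peer says so.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* What this endpoint currently believes about its peer. */
struct peer_view {
    size_t recv_buf_size;   /* last receive buffer size the peer reported */
};

/* A message is only safe to send inline if it fits the size the peer
 * last told us about. */
static bool fits_inline(const struct peer_view *v, size_t msg_len)
{
    assert(v != NULL);
    return msg_len <= v->recv_buf_size;
}

/* Called when an update notification for the receive buffer size
 * arrives.  Without this call the cached value never changes. */
static void on_recv_size_update(struct peer_view *v, size_t new_size)
{
    assert(v != NULL);
    v->recv_buf_size = new_size;
}
```

If the peer raises its buffer silently, fits_inline keeps rejecting larger
messages that would now fit; if it lowers the buffer silently, fits_inline
keeps accepting messages that no longer fit.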

On Mon, Aug 22, 2016 at 2:10 PM, Chuck Lever <chuck.lever@oracle.com> wrote:

> Remarks on rpcrdma-xcharext-02.
>
>
> - Credit accounting
>
> There's no discussion of RPC-over-RDMA credit accounting in this
> document. There needs to be some discussion of credit consumption.
> In particular:
>
> Requesters will have to have posted extra receive buffers (over
> and above credits for forward channel replies and backchannel
> requests) to deal with XCHAR messages. Likewise, responders will
> need to post similar extra receives for this purpose.
>
> Perhaps both peers should reserve one credit, and the specification
> should insist that these operations are always single-threaded on a
> connection.
>
> Alternately, when xcharext is merged into rpcrdma-version-two,
> there might be a generic discussion of non-RPC-payload-bearing
> messages that could cover this issue.
>
>
> - Overloading the term "initial"
>
> Section 3 introduces an "initial set of transport characteristics"
> and Section 4.1 defines "ROPT_INITXCH: Specify Initial
> Characteristics". I think the use of "initial" means different
> things in these two cases?
>
> Instead:
>
> Section 3 could propose "initially-defined" characteristics.
>
> Section 4.1 could define ROPT_STARTXCH (like STARTTLS). I'm not
> attached to that name, but it probably shouldn't be ROPT_INITXCH
> if Section 3 defines "initial characteristics".
>
> ROPT_CONNXCH
>
> ROPT_FIRSTXCH
>
> ROPT_EARLYXCH
>
>
> - Section 4.4
>
> The text in this section has been clarified to address previous
> reviewer comments; thanks! There are a number of syntax and
> grammatical errors that still need to be addressed (most often,
> a few words are repeated in some sentences).
>
> The "argument structure" of ROPT_UPDXCH is:
>
> struct optinfo_updxch {
>     xcharval        optupdxch_now;
>     bool            optupdxch_pendclr;
> };
>
> I prefer optupdxch_new instead of optupdxch_now; "_now" suggests this
> field records a time and/or date stamp.
>
> It would help me to understand why a receiver needs to distinguish
> between these types of notification.
>
> For instance, if pendclr is false, this could be either a rejection
> of a pending change request, or it could be an unsolicited change
> notification. How does the receiver make use of the difference?
>
> Instead of a boolean, an enumeration of update event types would be
> a little friendlier, and could be expressed in the same amount of
> space (since an XDR boolean consumes 4 octets on the wire). Based
> on the discussion in Section 4.4, we have:
>
> enum optupdxch_event_type {
>    OPTUPD_UNSOL  = 1,
>    OPTUPD_MORE   = 2,
>    OPTUPD_DONE   = 3,
>    OPTUPD_REJECT = 4,
> };
>
> But since the rdma_xid field is not used to tie change requests to
> these change update notifications, I'm not sure why the receiver
> needs to know that a pending request has been completed. I think
> REJECT might be more interesting than the difference between UNSOL,
> DONE, and MORE.
>
> There is a bit of a race here: a sender could send an unsolicited
> update notification at the same time the receiver requests a change
> of the same xchar. Could that result in a non-deterministic outcome?
>
> Would it ever be reasonable to send two or more updates
> simultaneously for the same XCHAR? (Requiring single-threading here
> would prevent that from occurring).
>
> What if the sender emits two optinfo_updxch messages: both with
> pendclr set to false, but one with an intermediate value, and one
> with the original value. The result on the receiver could depend on
> the order in which these messages arrive. Possibly some text
> regarding the ordering of these messages is needed.
>
> What happens if the receiver of ROPT_REQXCH drops the request? Is
> there a timeout after which ROPT_REQXCH may be sent again?
>
> What happens if an ROPT_RESPXCH is dropped? If ROPT_REQXCH is sent
> again, the reply is :
>
>   ROPT_RESPXCH with the requested value marked done ?
>   ROPT_RESPXCH with a rejection (no change was done) ?
>   ROPT_UPDXCH with the requested value and pendclr set to true ?
>
> I don't see language that disallows any of these responses. Which
> one means "I already set this value" ? Sorry if I missed that.
>
> Assuming that both sides support ROPT_UPDXCH, can an implementation
> use ROPT_UPDXCH exclusively instead of ROPT_INITXCH?
>
> Assuming that both sides support ROPT_UPDXCH, may a peer change an
> XCHAR and not send an unsolicited ROPT_UPDXCH?
>
>
> > On Aug 18, 2016, at 4:23 PM, David Noveck <davenoveck@gmail.com> wrote:
> >
> > This is updated and it add some vowels (and consonants too) the field
> and type names.  In particular "rq" --> "req".
> >
> > I'm aware that some people find "XCHAR" confusing.  If someone has an
> idea for a replacement, please propose it on the list.  If the working
> group is OK with it, I'll produce a -03 incorporating it.
> >
> >
> > ---------- Forwarded message ----------
> > From: <internet-drafts@ietf.org>
> > Date: Thu, Aug 18, 2016 at 4:16 PM
> > Subject: New Version Notification for draft-dnoveck-nfsv4-rpcrdma-
> xcharext-02.txt
> > To: David Noveck <davenoveck@gmail.com>
> >
> >
> >
> > A new version of I-D, draft-dnoveck-nfsv4-rpcrdma-xcharext-02.txt
> > has been successfully submitted by David Noveck and posted to the
> > IETF repository.
> >
> > Name:           draft-dnoveck-nfsv4-rpcrdma-xcharext
> > Revision:       02
> > Title:          RPC-over-RDMA Extension to Manage Transport
> Characteristics
> > Document date:  2016-08-18
> > Group:          Individual Submission
> > Pages:          23
> > URL:            https://www.ietf.org/internet-
> drafts/draft-dnoveck-nfsv4-rpcrdma-xcharext-02.txt
> > Status:         https://datatracker.ietf.org/doc/draft-dnoveck-nfsv4-
> rpcrdma-xcharext/
> > Htmlized:       https://tools.ietf.org/html/draft-dnoveck-nfsv4-rpcrdma-
> xcharext-02
> > Diff:           https://www.ietf.org/rfcdiff?url2=draft-dnoveck-nfsv4-
> rpcrdma-xcharext-02
> >
> > Abstract:
> >    This document specifies an extension to RPC-over-RDMA Version Two.
> >    The extension enables endpoints of an RPC-over-RDMA connection to
> >    exchange information which can be used to optimize message transfer.
> >
> >
> >
> >
> > Please note that it may take a couple of minutes from the time of
> submission
> > until the htmlized version and diff are available at tools.ietf.org.
> >
> > The IETF Secretariat
> >
> >
> > _______________________________________________
> > nfsv4 mailing list
> > nfsv4@ietf.org
> > https://www.ietf.org/mailman/listinfo/nfsv4
>
> --
> Chuck Lever
>
>
>
>