Re: [nfsv4] Credit management and one-way messages

David Noveck <davenoveck@gmail.com> Tue, 01 August 2017 17:51 UTC

In-Reply-To: <56B97007-87A7-4FBC-9DA0-530EA585AD57@oracle.com>
References: <CADaq8jf8s+qpgC25d75Jm4=Mk9mcb5=TEYpKR9AzkczgV7dwag@mail.gmail.com> <56B97007-87A7-4FBC-9DA0-530EA585AD57@oracle.com>
From: David Noveck <davenoveck@gmail.com>
Date: Tue, 01 Aug 2017 13:51:14 -0400
Message-ID: <CADaq8jc5D99YBM9d_78L0_CGyF-81xB-oE6RsXdnCof8Lhn_qg@mail.gmail.com>
To: Chuck Lever <chuck.lever@oracle.com>
Cc: "nfsv4@ietf.org" <nfsv4@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/nfsv4/6ubfBHBD9cEZD7jMterwyVWn9Hk>

*Ouch!!*

If I was addressing an unduly narrow problem, it was because it was framed
around the difficulties that might be added by adding one-way messages such
as those added by Version 2.  If you consider, as I guess makes sense, RPCs
that are not replied to as effectively one-way messages, that puts an
entirely different spin (of possibly high angular momentum) on the problem.

Let me try to start addressing this larger problem.  I hope it doesn't make
anyone (else) dizzy.

One reason that this is hard to address is that we have different views
regarding the role of the credit management logic with respect to replies.  In
my view, there is no point in a requester trying to give a credit to the
responder allowing the responder to send the response to the requester.  I
thought that the right way for the requester to ensure that the
responder's response could be read was for the requester to post a receive
(or ensure that one was present) before sending the request.  I thought that
this happened outside the ambit of credit management because there is no
real need for it to be within the scope of credit management.  The
requester can be responsible for providing receives corresponding to
in-flight requests without involving the responder in the decision;
involving the responder adds complexity, especially when multiple RPC
directions are involved.  I never found anything in the spec that told me I
was wrong in that, but I may just be missing it.

I think we discussed this a while ago but no change in the spec was made to
either make it clear I was wrong, or indicate that I was right.  My
recollection is that you treated my observation as correct/valid
implementation advice and we never really addressed our different
interpretations of the spec.  It seems to me that the spec was ambiguous on
this point.   Of course, I might not be seeing something that is there and
you might be simply assuming that something is implied because it seems
natural, even if it is not explicitly there.  If the spec still is not clear
on this point, it might need to be fixed, if not to mandate my approach then
at least to indicate how the requester can safeguard itself against the
possibility that a response sent by the responder will cause a disconnect.

With this approach, RPCs not replied to would still be a problem, but the
problem could not result in a disconnect.  For each not-replied-to request,
there would be a leftover receive that was not included in the credits, which
would be devoted in this model to receives for incoming requests (and things
intended to be one-way messages).  Of course, with 1K or 4K buffers that
would not be such a big deal and you could defer any connection reset until
a time when it was easy to do.
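
The accounting described above can be sketched as follows. This is only a
toy model, not anything taken from RFC 8166 or rpcrdma-version-two: the
Endpoint class, its field names, and the 4096-byte buffer size are all
assumptions made for illustration. It shows reply receives being tracked
separately from the credit-covered receives, and a not-replied-to request
leaving one receive stranded.

```python
from dataclasses import dataclass


@dataclass
class Endpoint:
    """Hypothetical model of one peer's receive accounting."""
    granted_credits: int       # receives advertised via credits (requests, one-way messages)
    buffer_size: int = 4096    # assumed size of each reply receive buffer
    reply_receives: int = 0    # receives posted for expected replies, outside the credits
    awaiting_reply: int = 0    # requests sent whose reply is still expected

    def send_request(self) -> None:
        # Post the reply receive first, then send the request.
        self.reply_receives += 1
        self.awaiting_reply += 1

    def on_reply(self) -> None:
        # The reply consumed the receive that was posted for it.
        self.reply_receives -= 1
        self.awaiting_reply -= 1

    def give_up_on_reply(self) -> None:
        # No reply will be waited for, but the receive stays posted,
        # so a late reply still finds a buffer.
        self.awaiting_reply -= 1

    def stranded_receives(self) -> int:
        return self.reply_receives - self.awaiting_reply

    def stranded_bytes(self) -> int:
        return self.stranded_receives() * self.buffer_size


ep = Endpoint(granted_credits=8)
for _ in range(3):
    ep.send_request()
ep.on_reply()            # one request is answered normally
ep.give_up_on_reply()    # one request is never answered
print(ep.stranded_receives(), ep.stranded_bytes())
```

In this sketch the credit grant is never touched by reply handling, which
is the point of the model: the cost of an unanswered request is one idle
buffer rather than a credit miscount.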

On Tue, Aug 1, 2017 at 11:49 AM, Chuck Lever <chuck.lever@oracle.com> wrote:

>
> > On Jul 31, 2017, at 2:47 PM, David Noveck <davenoveck@gmail.com> wrote:
> >
> > One issue that has been mentioned as a possible impediment to advancing
> RPC-over-RDMA to be a working group document, in its current form, concerns
> its use of one-way messages to convey transport properties.  These concerns
> seem to arise from the presentation of credit management as a "request-grant
> protocol".  If the credit logic in fact depended on that pairing, then
> there would be no room for one-way messages.
> >
> > In fact, it doesn't. If you look at the actual operation of credit
> management, one sees a different picture from what is stated in the
> introductory paragraph of Section 3.1.1 of RFC 8166:
> >
> > Flow control for RDMA Send operations directed to the Responder is
> implemented as a simple request/grant protocol in the RPC-over-RDMA header
> associated with each RPC message.
> >
> > What follows focuses on the role of the grant, giving extensive
> information about how it is computed and why it is valid for the requester
> to use it.
> >
> > While the text says "Practically speaking, the critical value is the
> granted value", it is not clear what the exact role of the request is.  It
> appears to be a hint which the receiver is under no obligation to take any
> notice of.  So it appears that the presentation of the credit management
> approach as a request-grant protocol is a reflection of the fact that,
> when it is used to transport RPC Requests and Replies, this pairing always
> exists.  When grants are sent with one-way messages, there is no problem
> that arises from the fact that there is no corresponding credit request.
> The sender simply informs the receiver about his own receive resources and
> thus his ability to accept further sends.
>
> The request-grant protocol sets an upper bound for the number of
> messages that can be in flight at once. You are addressing the
> narrow issue of how a receiver should interpret the value in the
> rdma_credits field to set the negotiated credit limit.
>
> The purpose of this limit is to prevent a sender from transmitting
> more messages than the receiver has available receive buffers.
>
> For example, a requester cannot send an RPC Call without first ensuring
> there is a receive buffer available to catch the RPC Reply. A
> responder cannot send an RPC Reply without first ensuring there
> are an appropriate number of receive buffers ready for the granted
> number of credits, minus the number of RPC transactions it is
> currently processing. A requester computes the number of remote
> receive buffers that are available, and thus how many more RPC
> Calls it can send before waiting, by observing the number of
> Replies it has received.
>
> In other words, the credit limit is _enforced_ via the two-way
> interchange. The underlying and more significant issue is therefore
> that, when unidirectional messages are introduced, neither side can
> properly compute the number of new messages that may be sent
> relative to the negotiated upper bound.
>
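
The loss of synchronization described above can be illustrated with a
small simulation. This is a minimal sketch under stated assumptions: the
CreditedSender class and the receiver_free_buffers counter are
hypothetical, and the sender is assumed to learn about freed receive
buffers only by counting replies.

```python
class CreditedSender:
    """Hypothetical sender that tracks credits purely by counting replies."""

    def __init__(self, credit_limit: int) -> None:
        self.credit_limit = credit_limit
        self.outstanding = 0   # messages sent and not yet matched by a reply

    def can_send(self) -> bool:
        return self.outstanding < self.credit_limit

    def send(self) -> None:
        if not self.can_send():
            raise RuntimeError("credit limit reached")
        self.outstanding += 1

    def on_reply(self, granted: int) -> None:
        # A reply both frees a slot and carries the latest grant.
        self.outstanding -= 1
        self.credit_limit = granted


sender = CreditedSender(credit_limit=2)
receiver_free_buffers = 2

# An ordinary call and reply: both sides stay in step.
sender.send()
receiver_free_buffers -= 1   # the call consumes a receive
receiver_free_buffers += 1   # the receiver reposts it
sender.on_reply(granted=2)

# Two unidirectional messages: the receiver reposts, the sender never hears.
for _ in range(2):
    sender.send()
    receiver_free_buffers -= 1
    receiver_free_buffers += 1

print(sender.can_send(), receiver_free_buffers)
```

At the end the receiver has both buffers posted again, yet the sender
still counts two messages outstanding and considers itself blocked.
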
> Another header field could be introduced that contains the exact
> number of receive buffers available on the sender when that message
> was sent. Or such a field could replace rdma_credits. Either would
> be another step away from general compatibility with RPC-over-RDMA
> version 1.
>
>
> > There is one potential issue connected with use of one-way messages to
> be addressed.  If the receiver has N credits and then N one-way messages are
> sent without any traffic in the opposite direction, then it is possible for
> a deadlock to result, since there would be no way for the sender to find
> out about the receive resources.  For most one-way messages, this is not a
> problem, since many one-way messages naturally give rise to messages in the
> opposite direction even if the relationship is not formalized within an RPC
> paradigm.  For example:
> >       • The RDMA2_CONNPROP sent by the client to the server is paired
> with an RDMA2_CONNPROP in the opposite direction.
> >       • An RDMA2_REQPROP results in an RDMA2_RESPROP sent in response.
> > RDMA2_UPDPROP is an exception.  In the unlikely event that there is a
> large series of such messages sent in one direction while there are no RPCs
> being sent to the receiver, it is possible for a deadlock to arise.  This
> would result in a situation in which the receiver would have to send some
> message back in the other direction to provide a credit grant.  The most
> likely way to do that is for it to choose to send an RDMA2_UPDPROP with an
> empty property set in the reverse direction.
> >
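
The deadlock and the suggested remedy can be sketched as below. This is a
toy model, not protocol text: the Sender and Receiver classes are
hypothetical, and the policy of sending an empty RDMA2_UPDPROP once N
one-way messages have arrived since the last grant is an assumption chosen
to keep the example short.

```python
class Sender:
    """Hypothetical sender of one-way messages."""

    def __init__(self, credits: int) -> None:
        self.credits = credits

    def send_one_way(self) -> bool:
        if self.credits == 0:
            return False        # stalled until a grant arrives
        self.credits -= 1
        return True

    def on_grant(self, grant: int) -> None:
        self.credits = grant


class Receiver:
    """Hypothetical receiver that reposts each buffer after use."""

    def __init__(self, receives: int) -> None:
        self.receives = receives
        self.since_last_grant = 0

    def on_one_way(self) -> None:
        self.since_last_grant += 1

    def should_send_grant(self) -> bool:
        return self.since_last_grant >= self.receives

    def empty_updprop(self) -> int:
        # Stand-in for an RDMA2_UPDPROP with an empty property set,
        # sent only to carry a credit grant in the reverse direction.
        self.since_last_grant = 0
        return self.receives


N = 4

# Without reverse traffic the sender stalls after N messages.
lonely = Sender(N)
results = [lonely.send_one_way() for _ in range(N + 1)]

# With the empty RDMA2_UPDPROP the stream keeps flowing.
sender, receiver = Sender(N), Receiver(N)
delivered = 0
grants_sent = 0
for _ in range(10):
    if sender.send_one_way():
        receiver.on_one_way()
        delivered += 1
    if receiver.should_send_grant():
        sender.on_grant(receiver.empty_updprop())
        grants_sent += 1

print(results, delivered, grants_sent)
```

A real receiver would more likely send the grant before the sender's
credits reach zero; waiting for exactly N messages is used here only to
make the stall point visible.
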
> > One other problem with one-way messages as they stand in
> rpcrdma-version-two concerns section 7.2.2 in which the following item in
> the bullet list is problematic:
> >       • When the rdma_proc field has the value RDMA2_OPTIONAL and no RPC
> message payload is present, a Requester MUST set the value of the
> rdma_optdir field to CALL, and a Responder MUST set the value of the
> rdma_optdir field to REPLY.  The Requester chooses a value for the rdma_xid
> field from the XID space that matches the message's direction.  Requesters
> and Responders set the rdma_credit field in a similar fashion: a value is
> set that is appropriate for the direction of the message.
> > This cannot be acted on, because, in the context of a one-way message,
> it is not clear which party is the requester and which is the responder.
> While the roles of client and server are fixed and clear, the roles of
> requester and responder vary from RPC to RPC.  If you are not in an RPC
> context, then any decision as to who is the requester or responder is
> arbitrary.
> > I think the best way to address these issues would be for
> rpcrdma-version-two to:
> >       • Provide an explanation of credit management not so tied to the
> RPC paradigm.
>
> That is appropriate to explore, but is much easier said than
> done.
>
> But first, we need to define what a unidirectional message is,
> and even that's harder than it looks. Here are some examples.
>
>  + An RPC Call that does not expect a Reply is unidirectional.
>
>  + A one-way control message is unidirectional.
>
>  + An RPC Call that is dropped or ignored by a responder is
>    unidirectional.
>
>  + An RPC retransmission without a connection break makes
>    either the original or the retransmitted message
>    unidirectional.
>
>  + An RPC Call becomes unidirectional when the matching RPC
>    Reply is lost for some reason.
>
>  + RDMA_DONE would be unidirectional. (RDMA_DONE is proposed
>    as a mechanism to manage Read chunks in RPC Replies in
>    draft-cel-nfsv4-rpcrdma-reliable-reply-00).
>
>  + An extension might introduce some new form of unidirectional
>    message if the problematic text in section 7.2.2 is modified
>    or removed.
>
> These are some of the ways a requester or responder can lose
> synchronization of outstanding credits.
>
>
> >       • Add a new value for the direction field for one-way messages
>
> In some of the above cases, the sender does not know that a
> unidirectional message is being sent, and thus cannot properly
> set the direction field.
>
>
> >       • Provide that one-way messages always contain a credit grant
> rather than a credit request
>
> It doesn't make sense to me to send a message from a requester to
> a responder with a "grant" credit value. What value would the
> requester put in the rdma_credits field in the case where the
> responder and requester have a different number of receive
> buffers available?
>
> It might be stronger to have a special value for either the
> direction or the rdma_credits field which means "the value in the
> rdma_credits field should be ignored".
>
> But see above: sometimes a sender cannot know in advance whether
> a message is unidirectional.
>
>
> >       • Explain how the potential deadlock with RDMA2_UPDPROP can be
> avoided.
>
> In addition, to handle cases where a message becomes unidirectional
> after it is sent, RPC-over-RDMA needs a reliable and non-destructive
> mechanism for resynchronizing the number of outstanding credits at
> the sender and receiver.
>
> Today that mechanism is to drop and re-establish a connection (which
> is 100% reliable but is not non-destructive).
>
> We are introducing mechanisms that can build up state associated
> with a connection (using unidirectional messages, possibly). That
> state is lost and must be re-established if the connection is
> dropped. Not to mention all the outstanding RPCs that have to be
> retransmitted.
>
>
> --
> Chuck Lever
>
>
>
>