Re: [nfsv4] Credit management and one-way messages

David Noveck <davenoveck@gmail.com> Tue, 08 August 2017 11:31 UTC

> Another interpretation is that the spec describes only protocol,
> and that after some trial-and-error, implementers will conclude
> that there are only one or two ways to implement that protocol.

Generally, if you describe the protocol, you should not have all that
much trial and error.  Although you don't tell the implementer exactly
how to do things, if you expect a lot of trial and error, particularly
error, there is something in the spec that needs to be clarified.

Also, if the two sides pick different implementations, everything should
work OK.

> A requester is responsible for posting enough receive buffers to
> catch Replies for all outstanding RPC Calls.

Agree with that.

> So either:

> A. It can post a receive buffer as part of preparing to send a Call.
> If a Reply is missed, that receive is still posted. The requester
> has to accommodate for that.

Yes.

> B. It can batch-post receive buffers (say, as part of receive
> completion handling) in order to keep more than enough available.

With that formulation, receives are still posted in the event of a missed
reply, but they are not considered a problem.

What we seem to agree on, and what A and B have in common, is that the
responder is free to respond to requests without obtaining credits to
validate his right to do so.

> The problem, as I see it, occurs when a responder wants to send a
> one-way message to a requester.

I don't think you can send a one-way message to a "requester" or
"responder".  A party is either a requester or a responder only in the
context of an RPC.  If you are not in the process of sending or processing
an RPC, you are not a requester or a responder, although you are either a
client or a server.

> In scenario A, that knocks down a
> receive buffer that was intended to catch a Reply.

It knocks down a receive buffer, clearly.

Specific receives are not dedicated to specific purposes, but I think that
one-way messages, unlike replies, should require a credit for the sender to
send them, and should not deplete the pool of receives for which there are
no associated credits (i.e. the ones posted to accommodate replies).
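
To make that accounting concrete, here is a minimal sketch of the receiving
side, under stated assumptions.  The class and method names are hypothetical
and are not taken from any RPC-over-RDMA specification or implementation.
It tracks counts only, since specific receives are not dedicated to specific
purposes: one count for receives advertised as credits, and one for receives
posted to catch Replies.

```python
class ReceiveAccounting:
    """Hypothetical count-based model of one side's posted receives."""

    def __init__(self, credits):
        # Receives that were posted and reported to the peer as credits.
        self.credits = credits
        # Receives posted to catch Replies; no credit is associated with these.
        self.reply_receives = 0

    def before_sending_call(self):
        # Post (or account for) a receive for the expected Reply.
        self.reply_receives += 1

    def on_receive(self, kind):
        if kind == "reply":
            if self.reply_receives == 0:
                raise RuntimeError("Reply arrived with no outstanding Call")
            self.reply_receives -= 1
        elif kind in ("call", "one-way"):
            # Calls and one-way messages both consume an advertised credit.
            if self.credits == 0:
                raise RuntimeError("peer sent without holding a credit")
            self.credits -= 1
        else:
            raise ValueError(f"unknown message kind: {kind}")

    def repost(self, count=1):
        # The receiver may replenish at any time; it is not obliged to.
        self.credits += count


acct = ReceiveAccounting(credits=2)
acct.before_sending_call()
acct.on_receive("one-way")
print(acct.credits, acct.reply_receives)
```

In this model, an incoming one-way message reduces the credited count and
leaves the receive reserved for the outstanding Reply untouched.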

> The requester
> is responsible for getting a replacement receive buffer posted

I don't think the receiver is responsible for this, just as it is not
responsible for replenishing receives taken up by requests.  It may choose
to do so but it is not obliged to do so.

> but there's a possibility that it could not do so fast enough
> to catch other incoming messages.

Preventing that is what the credit logic is for, and you can use that
to prevent over-sending of one-way messages, just as it prevents
over-sending of requests.  I accept that one-way messages add some
additional issues (see below) but the same basic logic is applicable.

> A responder is responsible for posting enough receive buffers to
> catch "credit" RPC Calls. It relies on the fact that requesters
> can't keep more than "credit" RPC Calls in flight, to know exactly
> how many receive buffers it needs to keep posted.

I agree.

> So either:

I don't see the point of this typology.

> A. It posts "credit" receive buffers when accepting a connection,

This formulation is confusing.  I would say that it posts some
number of receives and then reports the number of such receives
as "credit".

> and then it posts a receive buffer as part of preparing to send a
> Reply.

It can post an additional receive at any time it wants.  It is not tied
to sending a reply.

> If sending fails, that receive is still posted, and the
> responder has to accommodate for that.

I don't see anything to be accommodated.  The responder is
required to report as "credit" the number of receives that have
been posted.  This situation is unlike the other "A" above.

> B. It can batch-post receive buffers in order to keep more than
> enough available.

There is no definition of "enough" or "more than enough".  The responder
chooses how many receives to post and should report that as the number of
credits.
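
A minimal sketch of that rule, assuming a hypothetical Responder class (the
names are illustrative only): the credit value reported is simply the number
of receives currently posted, whenever and however they were posted.

```python
class Responder:
    """Hypothetical model: reported credit tracks posted receives."""

    def __init__(self):
        self.posted = 0

    def post_receives(self, count):
        # Receives can be posted at any time, singly or in batches.
        self.posted += count

    def on_message_received(self):
        # Each incoming message consumes one posted receive.
        if self.posted == 0:
            raise RuntimeError("message arrived with no receive posted")
        self.posted -= 1

    def credit_to_report(self):
        # Report exactly what is posted; there is no separate "enough".
        return self.posted


r = Responder()
r.post_receives(8)        # e.g. when accepting a connection
r.on_message_received()   # a request consumes one receive
r.post_receives(1)        # replenish, not necessarily tied to a reply
print(r.credit_to_report())
```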

> The problem, as I see it, occurs when a requester sends enough
> one-way messages that it prevents the transmission of more
> RPC Calls.

So any solution has to prevent that from happening.  I guess what I proposed
was:

   - Some items considered one-way messages are used in pairs, even though
   they are not part of the RPCs that it is the job of the transport to
   carry.
   - You can send otherwise "empty" one-way messages if you need to send
   credit information.

> The responder cannot send Replies to those one-way
> messages, so the requester has no way to know when the responder
> is ready to receive another RPC Call.

He can send other one-way messages to get the credit information to the
peer.
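
A minimal sketch of how that could look from the sender's side, under loud
assumptions: the class, the method names, and in particular the "released"
count carried by the credit-only message are hypothetical and appear in no
version of the protocol.  The sketch also ignores whether the credit-only
message itself needs a credit.

```python
class CreditedSender:
    """Hypothetical sender-side view of the peer's credit grant."""

    def __init__(self, granted):
        self.granted = granted   # credit limit most recently reported by the peer
        self.in_flight = 0       # sent messages that still hold a credit

    def can_send(self):
        return self.in_flight < self.granted

    def send(self, kind):
        # Calls and one-way messages both need a credit.
        if not self.can_send():
            raise RuntimeError(f"no credit available to send {kind}")
        self.in_flight += 1

    def on_reply(self, granted):
        # A Reply returns the credit held by its Call and carries a new grant.
        self.in_flight -= 1
        self.granted = granted

    def on_credit_message(self, granted, released):
        # Assumption: an otherwise-empty one-way message from the peer says
        # how many one-way messages it has absorbed, plus its current grant.
        self.in_flight -= released
        self.granted = granted


s = CreditedSender(granted=2)
s.send("one-way")
s.send("one-way")
blocked = not s.can_send()          # no Replies will ever return these credits
s.on_credit_message(granted=2, released=2)
print(blocked, s.can_send())
```

Without the credit-only message, the sender in this model would stay blocked,
which is the situation described in the quoted text.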

> I think that credit management will have to take a different
> form if we believe one-way messages are a necessary part of the
> future of RPC-over-RDMA.

I don't believe one-way messages per se are a necessary part of the future
of RPC-over-RDMA.
It just seems simpler to me to add them than it does to you.

The thing that I think is necessary to the future of RPC-over-RDMA, if it
has one, is that there be some provision for the transfer of control
messages used by the transport itself.  If there is a need to pair them,
I'm OK with that, but I feel Version 1 is fundamentally handicapped by a
structure in which the only messages to be sent are either a request sent
on behalf of an RPC Requester or a reply sent on behalf of an RPC
Responder.  It may be that RPC-over-RDMA will not have a future and that
relying on new pNFS mapping types is adequate, but I'd prefer us to have
multiple paths forward.

This is probably material for a 4/1/2018 I-D, but I think we should think
about the fact that some things do not now have appropriate names:

   - RPC-over-RDMA Version 1 would be more appropriately called
   RPC-over-RDMA Version Zero.
   - NFSv4.1 would be better understood if it was called NFSv5.0.
   - NFSv4.2 would be better understood if it was called NFSv5.1.



On Mon, Aug 7, 2017 at 5:46 PM, Chuck Lever <chuck.lever@oracle.com> wrote:

>
> > On Aug 1, 2017, at 1:51 PM, David Noveck <davenoveck@gmail.com> wrote:
> >
> > Ouch!!
> >
> > If I was addressing an unduly narrow problem, it was because it was
> > framed around the difficulties that might be added by adding one-way
> > messages such as those added by Version 2.  If you consider, as I guess
> > makes sense, RPCs that are not replied to as effectively one-way
> > messages, that puts an entirely different spin (of possibly high angular
> > momentum) on the problem.
> >
> > Let me try to start addressing this larger problem.  I hope it doesn't
> > make anyone (else) dizzy.
> >
> > One reason that this is hard to address is that we have different views
> > regarding the role of the credit management logic regarding replies.  In
> > my view, there is no point in a requester trying to give a credit to the
> > responder allowing the responder to send the response to the requester.
> > I thought that the right way for the requester to ensure that the
> > responder's response could be read was for the requester to post a
> > receive (or ensure that one was present) before sending the request.  I
> > thought that this happened outside the ambit of credit management because
> > there is no real need for it to be within the scope of credit management.
> > The requester can be responsible for providing receives corresponding
> > with in-flight requests without involving the responder in the decision,
> > which involves complexity, especially when multiple RPC directions are
> > involved.  I never found anything in the spec that told me I was wrong in
> > that, but I just may be missing it.
> >
> > I think we discussed this a while ago but no change in the spec was made
> > to either make it clear I was wrong, or indicate that I was right.  My
> > recollection is that you treated my observation as correct/valid
> > implementation advice and we never really addressed our different
> > interpretations of the spec.  It seems to me that the spec was ambiguous
> > on this point.  Of course, I might not be seeing something that is there
> > and you might be simply assuming that something is implied because it
> > seems natural, even if it is not explicitly there.
>
> Another interpretation is that the spec describes only protocol,
> and that after some trial-and-error, implementers will conclude
> that there are only one or two ways to implement that protocol.
>
>
> > If the spec still is not clear on this point, it might need to be fixed,
> > if not to mandate my approach, then at least to indicate how the
> > requester can safeguard itself against the possibility that a response
> > sent by the responder will cause a disconnect.
> >
> > With this approach, RPCs not replied to would still be a problem, but
> > the problem cannot result in a disconnect.  For each not-replied-to
> > request, there would be a receive that was not included in the credits,
> > which would be devoted in this model to receives for receiving requests
> > (and things intended to be one-way messages).  Of course, with 1K or 4K
> > buffers that would not be such a big deal and you could defer any
> > connection reset until a time when it was easy to do.
>
> A requester is responsible for posting enough receive buffers to
> catch Replies for all outstanding RPC Calls. So either:
>
> A. It can post a receive buffer as part of preparing to send a Call.
> If a Reply is missed, that receive is still posted. The requester
> has to accommodate for that.
>
> B. It can batch-post receive buffers (say, as part of receive
> completion handling) in order to keep more than enough available.
>
> The problem, as I see it, occurs when a responder wants to send a
> one-way message to a requester. In scenario A, that knocks down a
> receive buffer that was intended to catch a Reply. The requester
> is responsible for getting a replacement receive buffer posted,
> but there's a possibility that it could not do so fast enough
> to catch other incoming messages.
>
> This is the fundamental issue with one-way messages. A sender
> has no way to know when the receiver is prepared to receive
> additional messages because there is no message acknowledgement
> in RPC-over-RDMA version 1, other than RPC Call and Reply messages.
>
>
> A responder is responsible for posting enough receive buffers to
> catch "credit" RPC Calls. It relies on the fact that requesters
> can't keep more than "credit" RPC Calls in flight, to know exactly
> how many receive buffers it needs to keep posted. So either:
>
> A. It posts "credit" receive buffers when accepting a connection,
> and then it posts a receive buffer as part of preparing to send a
> Reply. If sending fails, that receive is still posted, and the
> responder has to accommodate for that.
>
> B. It can batch-post receive buffers in order to keep more than
> enough available.
>
> The problem, as I see it, occurs when a requester sends enough
> one-way messages that it prevents the transmission of more
> RPC Calls. The responder cannot send Replies to those one-way
> messages, so the requester has no way to know when the responder
> is ready to receive another RPC Call.
>
>
> I think that credit management will have to take a different
> form if we believe one-way messages are a necessary part of the
> future of RPC-over-RDMA.
>
>
> --
> Chuck Lever