Re: [nfsv4] draft-ietf-nfsv4-rpcrdma-bidirection-03 review

David Noveck <davenoveck@gmail.com> Mon, 23 May 2016 16:24 UTC

In-Reply-To: <4E8C421F-1A22-413C-AA2E-833C71AC6F71@oracle.com>
References: <6da90b6b-bb58-d241-0d74-dc421358c97c@oracle.com> <9ECFBBD5-9359-46AB-B1CC-7FCBF06C40A8@oracle.com> <f609eba3-4294-37ae-3bb8-c7df8f648bb0@oracle.com> <0E79D0E4-9D53-4ED4-8D17-2D806C56648F@oracle.com> <567848d1-d854-ef70-8fba-33708e7e0601@oracle.com> <E2B31EDF-74D6-4CC8-8F6C-674C85498B56@oracle.com> <E0C18ECC-7D15-447F-9DA7-654E1EBF6C3B@oracle.com> <CADaq8jcgW316nLwA3LnCAmL7nAY3o6XjeQLCkV-S_Sps9g+LJw@mail.gmail.com> <4E8C421F-1A22-413C-AA2E-833C71AC6F71@oracle.com>
Date: Mon, 23 May 2016 12:23:56 -0400
Message-ID: <CADaq8jdVjt6x0MNgc7g0HCvQ9tR6yC01AdSPCiqHmEn8CMPscw@mail.gmail.com>
From: David Noveck <davenoveck@gmail.com>
To: Chuck Lever <chuck.lever@oracle.com>
Content-Type: multipart/alternative; boundary="001a11409edcff1ce9053384ddab"
Archived-At: <http://mailarchive.ietf.org/arch/msg/nfsv4/y_yU5E2PUDKESjW8i8_2DPc60ig>
Cc: NFSv4 <nfsv4@ietf.org>
Subject: Re: [nfsv4] draft-ietf-nfsv4-rpcrdma-bidirection-03 review

> If the requester sent version 1 but doesn't
> recognize version 1 in replies, something is very wrong.

It certainly would be, but if you want to say that you only
send this when you get a request, it is unclear how one might
determine this.  If you support Version One, and you get a
version field which is not one, you have no idea what the XDR
for that version looks like and thus you have no way to find out
whether the message contains a request or a reply.
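
To make the concern concrete, here is a rough sketch in Python of what
a Version One-only receiver can and cannot conclude from the start of a
transport header. It assumes the Version One layout of four 32-bit
big-endian words (rdma_xid, rdma_vers, rdma_credit, rdma_proc) and that
the version word can at least be read at its Version One offset. The
function name and the returned labels are made up for illustration.

```python
import struct

RPCRDMA_VERSION_ONE = 1
# rdma_proc values used in this thread (Version One)
RDMA_MSG = 0
RDMA_NOMSG = 1
RDMA_ERROR = 4


def classify_header(buf):
    """Say what a Version One-only receiver can learn about direction."""
    if len(buf) < 8:
        return "too-short"
    # Only the first two words are looked at before the version check.
    _xid, vers = struct.unpack_from(">II", buf, 0)
    if vers != RPCRDMA_VERSION_ONE:
        # The rest of the header uses XDR this receiver does not know,
        # so it cannot locate any call/reply indication.
        return "unknown-version"
    if len(buf) < 16:
        return "too-short"
    _credit, proc = struct.unpack_from(">II", buf, 8)
    if proc in (RDMA_MSG, RDMA_NOMSG):
        # Direction is carried by the accompanying RPC message.
        return "direction-in-rpc-message"
    if proc == RDMA_ERROR:
        # No RPC message, hence no direction field to consult.
        return "no-direction-field"
    return "other-proc"
```

For a header whose version word is 2, the sketch stops at
"unknown-version" without ever reaching a call/reply indication, which
is the situation described above.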

In practical terms, if you are a server, then you might well assume that
the first message you receive is likely to be a request, but future
versions might add initialization/setup transactions that you are not
prepared to parse.  Maybe Version Two needs to be modified so
that clients always send a (possibly NULL) request before using
extensions.

> Essentially I'm saying that RDMA_ERROR is always a REPLY.
> That is the way current implementations treat it, thus we
> should document that.

OK, but there are going to be situations in which the receiver cannot
determine whether it has received a request or a reply.  I don't know
exactly how to deal with those situations (you might say the receiver
MAY send RDMA_ERROR), but I have a problem with writing the spec as if
such situations don't exist.
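
One way to picture the three cases is the small Python sketch below. It
is only an illustration of the options under discussion, not text from
any draft: the enum names, the function, and the `error_when_unknown`
switch (standing in for a possible "MAY send RDMA_ERROR") are all
hypothetical.

```python
from enum import Enum


class Direction(Enum):
    REQUEST = "request"
    REPLY = "reply"
    UNKNOWN = "unknown"   # receiver could not tell which it received


class Action(Enum):
    SEND_RDMA_ERROR = "send RDMA_ERROR"
    DROP = "drop and report locally"


def action_for_bad_message(direction, error_when_unknown=False):
    """Pick a response to a message the receiver cannot process."""
    if direction is Direction.REQUEST:
        # A responder reports the failure to the requester.
        return Action.SEND_RDMA_ERROR
    if direction is Direction.REPLY:
        # Requesters drop garbage replies rather than answer them.
        return Action.DROP
    # Direction unknown: the case the spec text would need to cover.
    if error_when_unknown:
        return Action.SEND_RDMA_ERROR
    return Action.DROP
```

The first two branches correspond to the behavior Chuck describes for
current implementations; the last branch is the case I am asking the
document to acknowledge, whichever way it is resolved.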

On Mon, May 23, 2016 at 11:39 AM, Chuck Lever <chuck.lever@oracle.com>
wrote:

>
> > On May 23, 2016, at 11:00 AM, David Noveck <davenoveck@gmail.com> wrote:
> >
> > I think the better alternative is to document the ambiguity and ask
> > people to live with it.  Version One's error handling is broken and it
> > doesn't seem as though it can be fixed without changing the XDR, and
> > we've decided that all such changes are to bump the version number.
>
> I'm not proposing an XDR change, and there would be no
> behavior change required of current implementations. No
> current implementation sends RDMA_ERROR in response to
> a bogus reply message. This is strictly documentation
> of existing implementation behavior.
>
>
> > The idea is to ask the sender not to set the credit value and the
> > receiver not to interpret it.
>
> The sender is going to set the credit value no matter
> what. There's nothing that says otherwise. (You haven't
> mentioned the sender side change before: that _would_ be
> a change to existing implementations, and to rfc5666bis).
>
>
> > The problem with restricting when this is sent is that it assumes that,
> > if you receive a message, you can tell if it is a request or a reply.
> > For a well-formed message that can be determined, but the idea here is
> > we are dealing with a message that is so messed up that you can't even
> > XDR the transport header.
>
> Not necessarily. The RDMA_ERROR message is used most
> typically in two important cases where XDR is parseable:
>
> 1. To report an RPC-over-RDMA protocol version mismatch
>
> 2. When a backward direction request has non-empty chunk
> lists and the responder has no support for chunks
>
> The case where an actual parsing problem occurs is useful
> mostly during early development of an implementation. It
> happens in practice only when there is a catastrophic
> fabric or RNIC failure that corrupts RDMA Send data
> content.
>
>
> > If we now look at section 5.5.2 and at the examples there:
> >
> > - "an invalid value in the rdma_proc field"
> >   Without that, you can't even parse the header, so you don't know
> >   where to look for the call/reply indication.
> > - "an RDMA_NOMSG message that has no chunk lists"
> >   Here, there is no payload to look at.
> > - "or the contents of the rdma_xid field might not match the contents
> >   of the XID field in the accompanying RPC message."
> >   In this case, you can determine which you have.
>
> How should a requester behave when it gets such a
> malformed reply message? Sending an RDMA_ERROR message
> to the responder seems ineffective. The requesters I'm
> familiar with drop such garbage replies and report an
> error, which seems reasonable to me.
>
> Responders are required to copy the rdma_vers field to
> replies. If the requester sent version 1 but doesn't
> recognize version 1 in replies, something is very wrong.
>
> The XID field in an RDMA_ERROR message is valuable: when
> sent from a responder, it tells the requester that the
> responder was not able to process that XID. The requester
> is then free to try again or terminate the transaction.
> I believe making that field unambiguous is a good thing,
> and worth making the proposed change.
>
> Essentially I'm saying that RDMA_ERROR is always a REPLY.
> That is the way current implementations treat it, thus we
> should document that.
>
> This also disambiguates the credit value, but that's
> much less important. I don't have a strong opinion about
> ignoring that field on receipt of an RDMA_ERROR.
>
>
> > On Mon, May 23, 2016 at 10:10 AM, Chuck Lever <chuck.lever@oracle.com> wrote:
> >
> > > On May 20, 2016, at 3:15 PM, Chuck Lever <chuck.lever@oracle.com> wrote:
> > >
> > >
> > >> On May 20, 2016, at 2:51 PM, Karen <karen.deitke@oracle.com> wrote:
> > >>
> > >>
> > >>
> > >> On 5/20/16 12:44 PM, Chuck Lever wrote:
> > >>>> On May 20, 2016, at 2:30 PM, Karen <karen.deitke@oracle.com> wrote:
> > >>>>
> > >>>>
> > >>>>
> > >>>> On 5/20/16 10:41 AM, Chuck Lever wrote:
> > >>>>>>
> > >>>>>> 4.1
> > >>>>>>
> > >>>>>> "When message direction is not fully determined by context"
> > >>>>>>
> > >>>>>> "fully determined by context": that's confusing. What does that
> > >>>>>> really mean? I think this means when the rdma header does not
> > >>>>>> directly indicate if the message is a call or reply, but the
> > >>>>>> wording is confusing.
> > >>>>> That means that in some cases the receiver can guess
> > >>>>> accurately which direction the message was going, based
> > >>>>> on the context of the operation, even without having
> > >>>>> an RPC message payload.
> > >>>>>
> > >>>>> I'm open to suggestions.
> > >>>>>
> > >>>>> I think some prefer that the document simply state that
> > >>>>> direction is always unknown in cases where an RPC
> > >>>>> message payload is not present. That kind of opens a
> > >>>>> can of worms with RDMA_ERROR, which is needed to report
> > >>>>> that the client does not support backward direction
> > >>>>> operation.
> > >>>> I don't think that it can always absolutely be clear which
> > >>>> direction without the RPC header. Or am I misunderstanding RPC
> > >>>> payload? I'm taking that to mean the RPC header, but maybe you are
> > >>>> referring to the NFS data in the rpc payload?
> > >>> "Context" here means that if the receiver can tell
> > >>> by other means (like, there are no other outstanding
> > >>> operations). So no, it's not always going to be clear.
> > >>> In those cases, direction is not known.
> > >> I don't think there is ever a 100% way to know the direction without
> > >> the rpc header's call or reply field.
> > >
> > > I don't think what I wrote contradicts that statement, but
> > > it does allow latitude for innovation to close this hole
> > > in other ways.
> > >
> > >
> > >> Even if there is an outstanding request, and it is for the xid that
> > >> is defined in the rdma header, being that the xid is not unique
> > >> between client and server, there still exists, though extremely
> > >> small, a possibility that this is a new request that just so happens
> > >> to have the same xid and is not actually the reply to an outstanding
> > >> operation.
> > >
> > > The only case in RPC-over-RDMA Version One where this is
> > > a concern is RDMA_ERROR. The other two valid procs are
> > > RDMA_NOMSG and RDMA_MSG, and both of those have RPC
> > > message payloads.
> > >
> > > When a client receives an RDMA_ERROR, that normally means
> > > one of its forward requests had a problem.
> > >
> > > Typical RPC-over-RDMA Version One client implementations
> > > don't send an error to a server due to a problem with a
> > > forward reply. We can probably say the same about a
> > > server sending an RDMA_ERROR to a client in response
> > > to a bad backward reply. (And, rfc5666bis could be
> > > enhanced to suggest or require that requesters don't
> > > send RDMA_ERROR in response to a bogus reply).
> >
> > OK, this idea is growing on me.
> >
> > What do people think of updating rfc5666bis to restrict
> > RDMA_ERROR to be sent only by responders? This would be
> > for RPC-over-RDMA V1 only, of course.
> >
> > The reason to do this would be to eliminate the ambiguity
> > of the meaning of the XID and credit value, due to the
> > absence of an RPC message payload with a direction field
> > in it.
> >
> > I'm not aware of any current requester implementation
> > that can send an RDMA_ERROR message from its reply
> > handler.
> >
> >
> > > So the only instance where a server might receive an
> > > RDMA_ERROR message is when it has sent a backward
> > > request that the client did not like.
> > >
> > > Thus: context indicates what direction that RDMA_ERROR
> > > message is going.
> > >
> > >
> > > --
> > > Chuck Lever
> > >
> > >
> > >
> > > _______________________________________________
> > > nfsv4 mailing list
> > > nfsv4@ietf.org
> > > https://www.ietf.org/mailman/listinfo/nfsv4
> >
> > --
> > Chuck Lever
> >
>
> --
> Chuck Lever
>
>
>
>