Re: [nfsv4] New Version Notification for draft-dnoveck-nfsv4-rpcrdma-xcharext-02.txt

David Noveck <davenoveck@gmail.com> Fri, 26 August 2016 17:57 UTC

From: David Noveck <davenoveck@gmail.com>
Date: Fri, 26 Aug 2016 13:57:16 -0400
Message-ID: <CADaq8jcirk3GFzY6m8q-sggPDXp_iSyufmGRvZpfzx8mR26M8Q@mail.gmail.com>
To: Chuck Lever <chuck.lever@oracle.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/nfsv4/P5DfetdMOpqL7pGauBhc8h13i34>
Cc: "nfsv4@ietf.org" <nfsv4@ietf.org>
Subject: Re: [nfsv4] New Version Notification for draft-dnoveck-nfsv4-rpcrdma-xcharext-02.txt

> No, I'm saying that pendclr adds no value; the receiver knows
> everything it needs to know by looking only at the new value.

That seems to be because you believe that the status of a
previously pending request (completed or not) is not something
that the receiver needs to know.  In some cases it might
not, but given that there is a mechanism to make requests, it
seems to me that the one making the request

> I don't have a problem with UPDXCH as a mechanism for unsolicited
> change notifications, though I'm not sure we have a use case
> where the same kind of notification cannot be done with REQXCH.

REQXCH cannot be used for unsolicited change notifications.  It
is intended for requested/solicited changes.


> It's especially perplexing to me to see a facility for asynchronous
> notification of completion when there is no real call-reply transaction
> relationship between REQXCH and REPXCH, and there is no timeout when
> sending REQXCH messages. A key question for me is "How is [UPDXCH
> with pendclr set] different than a delayed REPXCH response?" If they
> are semantically the same, then why have both?

It is possible to limit UPDXCH to unsolicited changes.  The issue I
have to consider is how to deal with partial satisfaction (e.g.  you ask
for 32K and get 16K).

> If the document moves forward with pendclr, I think implementers need
> more guidance about how to use this field.

Let me think about the text for this.

> Otherwise, someone is going
> to build an implementation that depends on pendclr,

That presumably would be an implementation that requested a property change.

> and some other
> implementation that implemented the protocol blindly (ie without any
> need for the pendclr value)

The implementation it would be operating with would be one that
responded to such requests.  And an implementation that responds to
such requests and doesn't make them has no need for the pendclr,
but it still should set it for those that need it.


> In other words, I think there is useful feedback from an implementation
> that does not need pendclr.

I'm not sure what that feedback would be.  The fact that implementation A
has no use for it does not mean that implementation B has no need for it.

> Perhaps only interop testing can sort it out.

Perhaps.


> > The protocol allows unsolicited changes
> > but that doesn't mean they are going to be happening all that
> > frequently.

> A sane implementation of the currently proposed initial properties
> probably won't use UPDXCH at all.

I don't see much use of UPDXCH (by sane implementations) for a while but
I think there will eventually be a need for it.

> I'm trying to explore the corners
> here and play devil's advocate; if an implementer is allowed to use
> UPDXCH in a crazy way, we should try to anticipate that and either
> prohibit it or make it safe.

I don't think safety is the issue.  Even if it is safe, people using it
the way you anticipate would be disruptive, in that other implementations
would be forced to implement the same function twice, to work
with an implementation


> To interoperate among implementations that depend on it and
> ones that don't need it, though, I think the document needs
> more precise language about how pendclr is to be used.

I'll try to provide that.



> (Much) earlier you suggested that a ULP might want to select its
> own receive buffer size, for example. If two ULPs share the same
> connection, then they could choose to adjust the connection in
> contradictory ways. Perhaps you are taking that off the table
> now, which is fair enough.

Consider it off the table.



> My remark here was not about two ULPs. The two directions of
> transmission on a connection are not co-ordinated in any way. An
> unsolicited UPDXCH can pass a REQXCH going in opposite directions.

Yes.

> IMO the document has to acknowledge that property value changes
> have to be done carefully and with co-ordination with the regular
> operation of the transport to avoid hazardous oscillations of
> values.

Agree.

> The PROP protocol mechanism does not provide any kind of
> serialization on its own.

True.




> > It then requests a smaller receive buffer size.  As part of doing
> > that it decides to not send requests or responses that are larger
> > than this anticipated size limit.  I think the spec already
> > discusses this but the basic point is that by making this request
> > the sender is licensing the peer to reduce the buffer size.  If
> > this request is completed, the sender will make that new size
> > permanent (or at least until it hears about a change).  On the
> > other hand, if the peer rejects the request (either fairly soon via
> > RESPXCH, or after a while via UPDXCH), the sender would then adopt
> > the current value, since the possibility of the peer eventually
> > satisfying the original request is now gone.  On the other hand, an
> > update not including pendclr should not have that effect.  As long
> > as the request is pending, he has to be prepared for its peer to
> > adopt the lower value that it has requested.

> I don't see how the REQuester's behavior would be different if pendclr
> was false. It can't assume the requested setting takes effect until
> it gets a REPXCH or UPDXCH with the new value, whether or not pendclr
> is true, and there's nothing in the spec (that I recall) that says
> otherwise (about pendclr being true is required before a new value is
> trusted -- did I miss something? --

It's not that it is not trusted.  It is that when you have a request
outstanding to reduce the size to X, you need to take that into account.
So, if you get an update that the size is now Y, you accept that the size
was Y when that update was sent, but the possibility exists that it might
be set down to X, since you requested that that be done.
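
A minimal sketch of that requester-side bookkeeping, in Python.  The
class and method names here (SizeTracker, on_update, safe_send_limit)
are hypothetical and do not come from the draft; the sketch assumes at
most one outstanding request per property and models only the receive
buffer size.

```python
class SizeTracker:
    """Hypothetical requester-side view of the peer's receive buffer size."""

    def __init__(self, current):
        self.current = current    # last value reported by the peer
        self.requested = None     # size asked for in a pending request, if any

    def request_reduction(self, new_size):
        # Asking for size X licenses the peer to shrink to X at any time.
        self.requested = new_size

    def on_update(self, value, pendclr):
        # Every update reports the value in effect when it was sent.
        self.current = value
        if pendclr:
            # The pending request is resolved (granted, partly granted,
            # or rejected), so the peer will no longer act on it.
            self.requested = None

    def safe_send_limit(self):
        # While a request is pending, the peer may still drop to the
        # requested size, so stay within the smaller of the two values.
        if self.requested is None:
            return self.current
        return min(self.current, self.requested)
```

With a request for X outstanding, an update reporting Y without pendclr
leaves the safe limit at min(X, Y); the same update with pendclr set
lets the requester adopt Y outright.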

> and if that is the case, then
> what's the value of sending an unsolicited UPDXCH ?)

In general there is value in doing so, but there may be special
circumstances in which the existence of a pending request restricts
the receiver from acting on it.

> There's nothing that prevents the REP side from sending "yes, I took
> the new setting (pendclr true)" and then "oops, no I went back to the
> old one (pendclr false)". That's not something an implementer might
> do on purpose, but it sure can happen by accident (poorly-timed
> administrator action, say).

Yes, but in that case, the requester will find out that his request has
been granted, and then that administrative action has superseded it.  The
implementation should work OK, although people might have a problem with
the administrator's choices.



> You could state that UPDXCH and REQXCH MUST/SHOULD NOT be used before
> FIRSTPROP has been exchanged.

MUST NOT is OK with me.  The next step is to state the same regarding any
particular property.  I'm OK with SHOULD NOT for that.
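
As a rough illustration of what a receiver could depend on under those
two rules, here is a small Python sketch.  Everything in it is an
assumption made for illustration: the PropReceiver class, the string
results ("ok", "warn", "error"), and the reading of the per-property
rule as "a change SHOULD NOT refer to a property that FIRSTPROP never
presented".  The draft does not define any of these.

```python
class PropReceiver:
    """Hypothetical receiver enforcing FIRSTPROP-before-changes ordering."""

    def __init__(self):
        self.first_seen = False   # has FIRSTPROP been received yet?
        self.props = {}           # property name -> last known value

    def receive(self, kind, prop=None, value=None):
        if kind == "FIRSTPROP":
            # value is a dict of the initial property values.
            self.first_seen = True
            self.props.update(value or {})
            return "ok"
        if kind in ("REQXCH", "UPDXCH"):
            if not self.first_seen:
                return "error"    # MUST NOT: change before FIRSTPROP
            if prop not in self.props:
                return "warn"     # SHOULD NOT: property never presented
            if kind == "UPDXCH":
                self.props[prop] = value
            return "ok"
        return "error"            # unknown message kind
```

The point of the MUST NOT is visible in the first check: the receiver
never has to interpret a change without a baseline value to apply it to.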




> > I don't think unsolicited reduction in receive buffer size (as opposed
> > to the requested reduction discussed above) can be made to work.

> It could, perhaps, if ...

I was speaking of the protocol currently described.

> the property had two values: one was the send
> size limit, and the other was the receive buffer size.

I think that would work better as two separate properties.  Defining a
new property is a lot easier than changing the structure of an existing one.

> The send size
> limit is not necessarily associated with a physical buffer size, if
> Sends are done with a gather vector, so it's relatively
> straightforward to change it.

> One peer could send a REQXCH to reduce the sender's send size limit.
> If the other peer says "OK" then it is safe to reduce the receive
> buffer size.

> In fact, IMO both sides do want to know the send size limit of their
> peer so they can trim receive buffers to a size that would be used.
> It might also help a requester determine when a responder needs a
> Reply chunk.


> > Maybe I should add an explicit statement
> > to that effect.

> That would be implementation guidance at best, and might preclude
> some future innovation that allows unsolicited receive buffer size
> changes.

I think I could draft something that would not run into the preclusion
problem.

You just have to point out that if the sender is told that the receiver's
buffers are X long, he is going to be sending messages that long, making a
reduction in receive buffer size problematic in the absence of a mechanism
to ensure that the sender is aware of the new (lower) limit.
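
The hazard can be shown with a toy model in Python.  The Sender and
Receiver classes and the sizes used are hypothetical and only
illustrate the argument; they do not model the actual transport.

```python
class Receiver:
    """Hypothetical receiver that posts fixed-size receive buffers."""

    def __init__(self, buffer_len):
        self.buffer_len = buffer_len

    def accept(self, message_len):
        # A message longer than the posted buffer cannot be received.
        return message_len <= self.buffer_len


class Sender:
    """Hypothetical sender that trusts the last size it was told."""

    def __init__(self, peer_buffer_len):
        self.limit = peer_buffer_len

    def can_send(self, message_len):
        return message_len <= self.limit


def unsolicited_reduction_demo():
    receiver = Receiver(32768)
    sender = Sender(32768)        # sender was told the buffers are X long
    receiver.buffer_len = 16384   # receiver shrinks without the sender knowing
    message_len = 24000           # legal by the sender's (stale) view
    return sender.can_send(message_len), receiver.accept(message_len)
```

The demo returns (True, False): the sender believes the message is
within bounds, while the receiver can no longer take it.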

> Should similar statements be made about the other initial properties
> proposed in rpcrdma-xcharext?

Probably.

On Fri, Aug 26, 2016 at 11:39 AM, Chuck Lever <chuck.lever@oracle.com>
wrote:

> Hiya Dave-
>
> > On Aug 26, 2016, at 5:08 AM, David Noveck <davenoveck@gmail.com> wrote:
> >
> > > So, there is a protocol mechanism for reporting partial or full
> > > completion of a property change request, yet no mechanism for
> > > indicating which change request it is the receiver is completing.
> > > But multiple property change requests are allowed to be in flight
> > > concurrently.
> >
> > The protocol allows it, but particular implementations may well
> > restrict things so that they have only one in flight for each property
> > at any particular time.
> >
> > > Without an XID, the only way this might work is by serializing
> > > updates to each property.
> >
> > It is one way.  I don't think it is the only way.  The requester, if it
> > is interested in knowing when there are no outstanding requests
> > for a property can increment on requests and decrement on messages
> > that indicate completion.
> >
> > > And, if the sender can send an "I'm done" update followed by any
> > > number of "Oops, I changed this again" updates,
> >
> > The updates indicate that there has been an unsolicited change in
> > the property:  no "Oops" and no "again".
> >
> > Whether such changes happen or not, depends on the
> > implementation.  In any case, I would expect them to be
> > rare.
> >
> > The proposed protocol provides a way for the peer to be
> > informed of such changes.   I don't see how that undercuts
> > the facilities that allow the completion of requested changes
> > to be reported asynchronously.
> >
> > > I don't see the
> > > point of a receiver ever remembering that it is waiting for a
> > > pending change, or the protocol distinguishing between unsolicited
> > > and pendclr.
> >
> > You seem to be saying the potential existence of unsolicited changes
> > some time after the completion of a requested change somehow
> > makes the notion of completion of the requested change useless.
>
> No, I'm saying that pendclr adds no value; the receiver knows
> everything it needs to know by looking only at the new value.
>
> I don't have a problem with UPDXCH as a mechanism for unsolicited
> change notifications, though I'm not sure we have a use case
> where the same kind of notification cannot be done with REQXCH.
>
>
> > I'm not sure why you believe that.
>
> I believe that, because I can't for the life of me think of a use
> case where a receiver can make use of the information in pendclr.
> (Yes, I probably lack quite a bit of imagination). Can you please
> provide one?
>
> It's difficult, as a reviewer of proposed protocol, to understand
> why we are reserving these 32 bits in the UPDXCH message without
> having at least a little clue about why they are there and how a
> receiver would make use of them.
>
> It's especially perplexing to me to see a facility for asynchronous
> notification of completion when there is no real call-reply transaction
> relationship between REQXCH and REPXCH, and there is no timeout when
> sending REQXCH messages. A key question for me is "How is [UPDXCH
> with pendclr set] different than a delayed REPXCH response?" If they
> are semantically the same, then why have both?
>
> If the document moves forward with pendclr, I think implementers need
> more guidance about how to use this field. Otherwise, someone is going
> to build an implementation that depends on pendclr, and some other
> implementation that implemented the protocol blindly (ie without any
> need for the pendclr value) is not going to get it right (or will take
> shortcuts, or whatever), and there will be an interoperability problem,
> just add boiling water and fluff with fork.
>
> In other words, I think there is useful feedback from an implementation
> that does not need pendclr. Perhaps only interop testing can sort it
> out.
>
>
> > The protocol allows unsolicited changes
> > but that doesn't mean they are going to be happening all that frequently.
>
> A sane implementation of the currently proposed initial properties
> probably won't use UPDXCH at all. I'm trying to explore the corners
> here and play devil's advocate; if an implementer is allowed to use
> UPDXCH in a crazy way, we should try to anticipate that and either
> prohibit it or make it safe.
>
>
> > > it just needs to know the current value of the property.
> >
> > It certainly needs to know that.  Your implementation might
> > not be interested in anything else but the protocol provides it
> > because it is easy for the sender to provide and useful in some
> > cases.
>
> To interoperate among implementations that depend on it and
> ones that don't need it, though, I think the document needs
> more precise language about how pendclr is to be used.
>
>
> > > If two or more ULPs are using the same connection, they might
> > > send a sequence of possibly contradictory property change
> > > requests.
> >
> > I don't understand why one might assume that ULP's
> > are sending property change requests.  ULPs are RPC
> > protocols.  They send rpc requests and receive replies.
>
> (Much) earlier you suggested that a ULP might want to select its
> own receive buffer size, for example. If two ULPs share the same
> connection, then they could choose to adjust the connection in
> contradictory ways. Perhaps you are taking that off the table
> now, which is fair enough.
>
>
> > > Also, there's no co-ordination between the two senders on the
> > > same connection; or an administrator might adjust these
> > > properties.
> >
> > Any co-ordination requirement that a transport implementation
> > might impose to deal with the case of multiple ULPs would be
> > out of scope in this document.
>
> My remark here was not about two ULPs. The two directions of
> transmission on a connection are not co-ordinated in any way. An
> unsolicited UPDXCH can pass a REQXCH going in opposite directions.
>
> IMO the document has to acknowledge that property value changes
> have to be done carefully and with co-ordination with the regular
> operation of the transport to avoid hazardous oscillations of
> values. The PROP protocol mechanism does not provide any kind of
> serialization on its own.
>
>
> > I don't see how you decide there is "no co-ordination".  If you build
> > an implementation and give multiple ULPs the ability to request and
> > make such changes independently, you are asking for (or begging for)
> > trouble.  As an implementer, the choice regarding co-ordination (or not)
> > is up to you.
> >
> > I think the entities making such changes are most likely to reflect
> > administrative requests or internal transport optimization functions.
> >
> > I can't see why one would allow the ULP to do this itself.
>
> Yes, I'm talking about administrative requests or optimization
> efforts, here.
>
>
> > > I posit that the receiver cares only about the current effective
> > > value of these properties.
> >
> > That's one of the things he cares about, but I believe it is not
> > the only one.
> >
> > pendclr is completely unreliable.
> >
> > If you were to always set it to false, as you suggest below that you
> > might, you can make it unreliable, but there is no good reason to do
> > that.
> >
> > > Can you provide a real world example of how pendclr might be used?
> > > Because it appears to me to convey no actionable information, and
> > > I don't see why I shouldn't always set it to false.
> >
> >
> > Let's consider that a client implementation encounters a server
> > which has a very large receive buffer size and feels that it would
> > prefer more smaller buffers.
>
> More likely, either the client or server is low on resources, and
> wishes to reclaim receive buffer space. But, OK.
>
>
> > It then requests a smaller receive buffer size.  As part of doing
> > that it decides to not send requests or responses that are larger
> > than this anticipated size limit.  I think the spec already
> > discusses this but the basic point is that by making this request
> > the sender is licensing the peer to reduce the buffer size.  If
> > this request is completed, the sender will make that new size
> > permanent (or at least until it hears about a change).  On the
> > other hand, if the peer rejects the request (either fairly soon via
> > RESPXCH, or after a while via UPDXCH), the sender would then adopt
> > the current value, since the possibility of the peer eventually
> > satisfying the original request is now gone.  On the other hand, an
> > update not including pendclr should not have that effect.  As long
> > as the request is pending, he has to be prepared for its peer to
> > adopt the lower value that it has requested.
>
> I don't see how the REQuester's behavior would be different if pendclr
> was false. It can't assume the requested setting takes effect until
> it gets a REPXCH or UPDXCH with the new value, whether or not pendclr
> is true, and there's nothing in the spec (that I recall) that says
> otherwise (about pendclr being true is required before a new value is
> trusted -- did I miss something? -- and if that is the case, then
> what's the value of sending an unsolicited UPDXCH ?).
>
> There's nothing that prevents the REP side from sending "yes, I took
> the new setting (pendclr true)" and then "oops, no I went back to the
> old one (pendclr false)". That's not something an implementer might
> do on purpose, but it sure can happen by accident (poorly-timed
> administrator action, say).
>
>
> > > It's very easy to imagine an implementer deciding that UPDXCH is
> > > better for his or her client or server than FIRSTPROP, and noting
> > > stridently that the document does not prohibit or even warn against
> > > this design.
> >
> > > IMO the document needs to make an explicit requirement not to
> > > behave this way, or state that it is acceptable though not preferred.
> >
> > Are you sure that it is "requirement" and not a "REQUIREMENT" that you
> > are looking for?
> >
> > I'm OK with saying you shouldn't do that, but I have a problem saying it
> > is "acceptable".  I think it is noxious, although there is no "NOXIOUS".
> >
> > Maybe the right thing is to state the obvious, that the function of
> > UPDXCH is to convey changes, making its use to present values at
> > connection time dubious, while the function of ROPT_FIRSTPROP is to
> > present the value at initial connection.
>
> You could state that UPDXCH and REQXCH MUST/SHOULD NOT be used before
> FIRSTPROP has been exchanged.
>
> Or even should not. I would go with MUST NOT: then receivers can depend
> on seeing FIRSTPROP before seeing other changes.
>
>
> > > For a client to reduce its receive buffers in mid-flight, all it
> > > needs to do is send Reply chunks more often. That will work OK
> > > if the server uses the Reply chunk whenever one is provided.
> > > (Not all servers behave nicely in this regard, though).
> >
> > > For a server to reduce its receive buffers in mid-flight, the
> > > server needs to notify the client of the change in receive
> > > buffer size, the client would need to acknowledge the change,
> > > and only then can the server reduce its receive buffer size.
> > > How might that be done with the protocol proposed in
> > > rpcrdma-xcharext ?
> >
> > I don't think unsolicited reduction in receive buffer size (as opposed
> > to the requested reduction discussed above) can be made to work.
>
> It could, perhaps, if the property had two values: one was the send
> size limit, and the other was the receive buffer size. The send size
> limit is not necessarily associated with a physical buffer size, if
> Sends are done with a gather vector, so it's relatively
> straightforward to change it.
>
> One peer could send a REQXCH to reduce the sender's send size limit.
> If the other peer says "OK" then it is safe to reduce the receive
> buffer size.
>
> In fact, IMO both sides do want to know the send size limit of their
> peer so they can trim receive buffers to a size that would be used.
> It might also help a requester determine when a responder needs a
> Reply chunk.
>
>
> > Maybe I should add an explicit statement
> > to that effect.
>
> That would be implementation guidance at best, and might preclude
> some future innovation that allows unsolicited receive buffer size
> changes.
>
> Should similar statements be made about the other initial properties
> proposed in rpcrdma-xcharext?
>
> --
> Chuck Lever