Re: [nfsv4] can a server replace a read deleg with a write deleg?

Rick Macklem <rmacklem@uoguelph.ca> Tue, 30 August 2011 20:54 UTC

Date: Tue, 30 Aug 2011 16:55:54 -0400
From: Rick Macklem <rmacklem@uoguelph.ca>
To: david noveck <david.noveck@emc.com>
Message-ID: <1852560994.570259.1314737754480.JavaMail.root@erie.cs.uoguelph.ca>
In-Reply-To: <5DEA8DB993B81040A21CF3CB332489F68146A637@MX31A.corp.emc.com>
Cc: nfsv4@ietf.org, trond myklebust <trond.myklebust@fys.uio.no>
Subject: Re: [nfsv4] can a server replace a read deleg with a write deleg?

Thanks Dave. Although somewhat overwhelmed by the reply, I think
I understand it. (When I asked about `replace`, I had assumed
that it would be a different StateID.)

If I understood the answer correctly, when a client holds a read
delegation and receives an Open reply with a write delegation for
the same file in it, it can assume the read delegation no longer
exists and can use the write delegation (with the StateID that is
in the Open reply).
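
To make that reading concrete, here is a minimal Python sketch of the
client-side bookkeeping. It is only an illustration: the class, method
and field names are hypothetical and not taken from any real client,
and rejecting any other second delegation is my assumption, not
something the spec says.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

READ, WRITE = "read", "write"


@dataclass
class Delegation:
    kind: str       # READ or WRITE
    stateid: bytes  # delegation stateid taken from the Open reply


@dataclass
class ClientDelegations:
    # Hypothetical table: one delegation per file, keyed by a file handle string.
    by_file: Dict[str, Delegation] = field(default_factory=dict)

    def on_open_reply(self, fh: str, deleg: Optional[Delegation]) -> Optional[Delegation]:
        """Record the delegation from an Open reply.

        Returns the read delegation that the client now treats as gone, if any.
        """
        if deleg is None:
            return None  # no delegation in this reply; keep whatever is held
        held = self.by_file.get(fh)
        if held is not None and not (held.kind == READ and deleg.kind == WRITE):
            # Assumption: any other second delegation for the file is an error.
            raise ValueError("unexpected second delegation for " + fh)
        # Read -> write: assume the read delegation no longer exists and
        # use the write delegation with the stateid from the Open reply.
        self.by_file[fh] = deleg
        return held
```

The caller would discard anything keyed by the returned read
delegation's StateID and use the new one from then on.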

rick

Dave Noveck wrote:
> Although I'm responding to Rick's original message on this subject,
> I'm
> not ignoring what has been said relevant to the topic by Trond after
> that
> and will refer to what he said in his mails as appropriate.
> 
> I believe that the spec needs to be clarified (to agree with what I'm
> saying :-) or else clearly say it is wrong :-(
> 
> This is apart from my belief that the restriction that Trond notes is
> mistaken. For now, I'll just assume it as currently written. I will
> not address how its possible mistakenness relates to the issue until
> later
> on in the message after a big "================================="
> 
> ------------------------------------------
> 
> So my answers to Rick are
> 1) it does not need to CBRecall the read delegation. However, if it
> does
> not recall it, it needs to revoke it.
> 
> As Trond points out, the spec says you "may not" change the file while
> holding
> the read delegation. It does not say "MUST NOT" or "SHOULD NOT"
> because that
> language implies that the other party can depend (at least normally)
> on the other
> party not doing the action. That is normally said about what the
> responder does and
> the requester is relying on that. Here it is about what the requester
> would do and
> the server is not relying on the client not doing these things. If the
> client does
> them, the server has to do something, like return an error and I don't
> see any
> statement about an error being returned.
> 
> How I interpret this is that the client can do any writes
> he wants to try but that they are incompatible with retaining the
> delegation, so that the server has to somehow get rid of it.
> 
> Rick points out that the WRITE and not the OPEN is the important
> thing, but
> I think Trond is right in saying the server need not wait for the
> WRITE, i.e.
> that he MAY and should take the OPEN for write as evidence that a
> WRITE will
> most likely follow, although I am not quite as judgmental as Trond in
> making
> that assessment.
> 
> So if the read delegation should go, is a CBRecall needed? I don't
> think so,
> since the client knows he is opening for write. I think the server is
> justified
> in skipping recall in this case and going directly to revocation of
> the read delegation.
> 
> Once the read delegation is revoked, the server is free to grant a
> write delegation.
> 
> So the answer to the other question is:
> 2) yes, it can return a write delegation in the open reply but only if
> it has already
> revoked the read delegation. The client might consider that a
> replacement but the
> two are independent since they have different stateids. The client
> could interpret
> the write delegation as incompatible with the read delegation and he
> could simply
> consider the read delegation as gone without actually verifying that
> the revoke
> happened.
> 
> Although this is not a replacement in the sense of an upgrade of an
> existing object, the
> client might well consider it as essentially the same in effect,
> although there is
> no guarantee that it is done atomically.
> 
> One way in which the server might well abide by a replacement paradigm
> (not "SHOULD" or
> "should" but more like "would be well-advised to") is in terms of
> resource allocation.
> If a server is limited to having N outstanding delegations, it could
> try to make sure
> that the revoked delegation was not snapped up by an unrelated
> request, making it
> likely that he will be able to grant the write delegation if nothing
> else (conflicting
> opens or delegations) prevents it.
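
A rough Python sketch of the slot-trading idea in the quoted paragraph
above, under heavy assumptions: the table class, its method, and the
per-server limit are hypothetical names; recalls, conflicting opens by
other clients, and stateids are not modelled at all.

```python
from typing import Dict, Optional, Tuple


class DelegationTable:
    """Hypothetical server-side table limited to `limit` outstanding delegations."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        # (client, file) -> "read" or "write"
        self.held: Dict[Tuple[str, str], str] = {}

    def open_for_write(self, client: str, fh: str) -> Optional[str]:
        """Return "write" if a write delegation is granted with this open, else None."""
        key = (client, fh)
        if any(f == fh and c != client for (c, f) in self.held):
            # Another client holds a delegation on this file; a real server
            # would have to recall it first (not modelled here).
            return None
        if self.held.get(key) == "read":
            # Revoke the read delegation and grant the write delegation in one
            # step, so the freed slot cannot go to an unrelated request.
            self.held[key] = "write"
            return "write"
        if key not in self.held and len(self.held) < self.limit:
            self.held[key] = "write"
            return "write"
        return None  # table full, or this client already holds a write delegation
```

With a limit of 1 and client A holding a read delegation on "foo", A's
open for write gets the write delegation, while an unrelated open by B
on "bar" gets none because A's slot was never released.
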
> 
> -----------------------------------------------------------------------------------
> 
> Having answered Rick's questions about what the server might do,
> a related question
> is whether a client should (by which I mean "may, if determined to act
> prudently") depend
> on this behavior. While this is allowed behavior, and seems to me
> sensible, it is
> not mandated by the spec and clients should not expect it to always
> happen.
> 
> It would be far better for the client in this situation to put a
> delegation return in
> front of the COMPOUND and do the OPEN after that. The server might
> well take the
> same resource trading precautions as it does in the revoke case.
> 
> In this case, if there are delegations that are incompatible with the
> OPEN, the read
> delegation could be gone when you received the DELAY error, so you
> might have resource
> issues when you re-issued that prevented you from getting the
> delegation. Without
> the delegation return, the server could be coded so that in that case,
> the revoke
> of the read delegation would not happen in the DELAY case. But I don't
> think that is
> a big enough deal to justify not putting the delegation return in the
> COMPOUND.
> 
> ==================================================================================
> 
> I've previously explained why I think this "may not" is a mistake as
> follows and so
> far I've not heard any argument that it isn't:
> 
> > When I get a read delegation, I'm assured that nobody else is
> > changing the file.
> 
> > I don't need an assurance that I'm not changing the file. I know
> > whether I'm changing the file.
> 
> > It could be that this restriction helps the others who have a read
> > delegation but it doesn't. If I had no delegation, then when I
> > opened
> > for write, all of the clients holding read delegations would have their
> > delegations recalled. That makes sense.
> 
> > Now if I also hold a read delegation, then the spec indicates, as
> > Trond
> > points out, that I will lose mine as well. The question is "why?" It
> > can't be to inform me that I'm writing. I know that.
> 
> > The problem here is that the protocol as now specified, cannot give
> > you a delegation-based assurance that you are the only one writing.
> > You
> > can get an open-based non-revocable assurance that you are the only
> > writer
> > by opening for write with deny-write, but it is anomalous that you
> > can't
> > get a delegation-style assurance of this.
> 
> Now as to my concern as to the mistakenness of the "may not" and why I
> join Rick in
> exploring the ragged edges of the protocol here, let me explain why I
> think this is
> important now.
> 
> We now have cheaper fast caching media in the form of flash memory and
> soon MRAM may
> continue in this vein. Delegations are v4's prime means of providing
> support for more
> abundant and long-lived caching.
> 
> So you want to have read delegations to make sure that your flash
> copies of files remain
> up-to-date. When you don't have a read-delegation, you can just
> interrogate the change
> attribute, as long as you aren't writing. When you are writing and
> don't have any assurance
> that nobody else is, the data that you have cached is not reliable.
> The problem is
> compounded (not intended as a pun when written, honest :-) by the
> absence of atomic
> pre-and-post-attributes for write.
> 
> So the problem is that what you need is an assurance that nobody else
> is writing the file,
> to assure validity of your cache. A write delegation might allow you
> to do write-back caching.
> You might not be that aggressive (e.g. out of concern about disasters
> like asteroids, hurricanes
> and earthquakes), but you really want to be able to at least do
> write-through caching.
> 
> With the ability to trade (in some sense) a read delegation for a
> write delegation, you have
> the assurance you need to do write-through caching. Non-atomicity is
> not a problem as long
> as you can check the change attribute before you start writing. So if
> the client could
> return the read delegation and get the write delegation, he would be
> OK, even if there were
> read delegations that others held to be recalled.
> 
> The fly in this particular ointment is that if other clients had the
> file opened for read,
> the write delegation could not be granted and the writing client would
> have to flush his copy
> of the file which might be fairly large in the case of flash caching.
> The problem here is
> that the client needs an assurance that nobody else is writing and
> can't get one, all
> because of "the gratuitous 'may not'".
> 
> So how about getting rid of "the gratuitous 'may not'"?
> 
> Would that hurt the protocol in any way?
> 
> If we deleted this "may not", you could keep your read delegation
> across the OPEN,
> retaining the assurance that nobody else is writing. The
> open-for-write would cause
> read delegations held by others to be recalled but opens for read
> would not interfere
> with the read delegation, as they should not.
> 
> If nobody has a reason in hand and this is just a time-to-bis issue, I
> can see how the
> need to "ship product" might get in the way. If so, might we
> reconsider the issue in
> the context of v4.1? If the deferral of this is motivated by time
> considerations, as
> opposed to deciding that the restriction is correct, I ask that we not
> consider that
> deferral as a precedent for the case of v4.1, where the same schedule
> considerations
> do not apply.
> 
> -----Original Message-----
> From: nfsv4-bounces@ietf.org [mailto:nfsv4-bounces@ietf.org] On Behalf
> Of Rick Macklem
> Sent: Sunday, July 04, 2010 12:28 PM
> To: nfsv4@ietf.org
> Subject: [nfsv4] can a server replace a read deleg with a write deleg?
> 
> Somewhat tangential to the recent thread, what about the following
> simple case:
> - client opens foo for reading and gets read_deleg_stateid_foo
> - client opens foo for writing against the server (which it must do
> when it holds a read delegation)
> 
> Now, if the server sees that this client is the only one with a read
> delegation for (and opens on) "foo", it could issue a write delegation
> to the client for "foo".
> - To do this does it first need to CBRecall the read delegation? (If
> so,
> it probably cannot issue the write delegation for this open, since it
> would take too long to reply to the Open unless it returns
> NFS4ERR_DELAY
> for the Open for a while. Not an ideal situation.)
> OR
> - Can it return a write delegation for "foo" in the write open reply?
> - If it is allowed to do this, does this delegation replace
> read_deleg_stateid_foo or are there now multiple delegations for
> "foo" issued to the same client?
> (I don't like the concept of having multiple delegations issued to
> the same client for the same file concurrently and I'm pretty sure
> my client isn't implemented to handle this case. Without looking at
> the code to be sure, I think it logs an error and throws away the
> new second delegation.)
> 
> I don't think this is clarified in RFC3530, but please correct me if
> I'm
> incorrect w.r.t. this.
> 
> rick
> 
> _______________________________________________
> nfsv4 mailing list
> nfsv4@ietf.org
> https://www.ietf.org/mailman/listinfo/nfsv4
> 