Re: [nfsv4] LAYOUTCOMMTI clarifications (was Notes from Bakeathon (re-sending))

Benny Halevy <bhalevy@panasas.com> Wed, 27 October 2010 22:00 UTC

Message-ID: <4CC8A163.1070307@panasas.com>
Date: Thu, 28 Oct 2010 00:02:11 +0200
From: Benny Halevy <bhalevy@panasas.com>
User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.9) Gecko/20100921 Fedora/3.1.4-1.fc13 Thunderbird/3.1.4
MIME-Version: 1.0
To: david.noveck@emc.com
References: <E043D9D8EE3B5743B8B174A814FD584F0D3F656C@TK5EX14MBXC124.redmond.corp.microsoft.com><1287680041.9144.2.camel@heimdal.trondhjem.org> <4CC08145.7090404@panasas.com> <BF3BB6D12298F54B89C8DCC1E4073D80028C716D@CORPUSMX50A.corp.emc.com>
In-Reply-To: <BF3BB6D12298F54B89C8DCC1E4073D80028C716D@CORPUSMX50A.corp.emc.com>
Cc: nfsv4@ietf.org, trond.myklebust@fys.uio.no
Subject: Re: [nfsv4] LAYOUTCOMMTI clarifications (was Notes from Bakeathon (re-sending))

On 2010-10-22 04:26, david.noveck@emc.com wrote:
>> Though, to me it was pretty clear that we roughly reached 
>> consensus (lower case, not ietf ROUGH CONSENSUS ;-) that 
>> LAYOUTCOMMIT is always required for maintaining stable 
>> storage semantics.
> 
> Having case be a discriminator of otherwise synonymous words seems like a loser.
> 
> Which of the 16,384 possible variants (14 letters) would belong to Spencer and Beepy and which to the hoi polloi?  The mind reels.
> 
> I'm not going to try to characterize opinion at the bakeathon.
> 
> I just would like to ask a question.
> 
>   1) Suppose I do either a synchronous write or I send an
>   asynchronous write and get back an indication that it
>   was done synchronously.
> 
>   2) I will note that the aforementioned operation is semantically 
>   identical to doing an asynchronous write followed by COMMIT.
> 
>   3) So let us suppose that by doing 1), I have incurred the
>   obligation to do a LAYOUTCOMMIT.  So what is the deadline
>   on completing this obligation?  It doesn't seem sensible
>   that it turns on when the block leaves my cache.  The fact that
>   it was written synchronously means that it will never be
>   resent, so that doesn't seem to be valid as a condition.

That's true when writing to the MDS.  The obligation does make sense
when writing to the DS, once you take loosely coupled data servers
into consideration.

In particular, in the block layout type, the client may be writing
into provisionally allocated blocks on storage.  Although the
writes are synchronous in nature, a subsequent failure to commit
the layout to the MDS may require the client to get a new layout,
re-write the data onto newly allocated blocks and attempt
LAYOUTCOMMIT again.
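
As a rough illustration of that retry path, here is a minimal Python
sketch.  Everything in it is hypothetical: FakeMDS, LayoutCommitError
and write_and_commit are toy stand-ins invented for this example, not
real client or server code, and the "storage" is just a dict.  The
only point is that the client has to keep the data cached until
LAYOUTCOMMIT succeeds, because it may need to write it again.

```python
class LayoutCommitError(Exception):
    """Raised by the toy MDS when it rejects a LAYOUTCOMMIT (hypothetical)."""


class FakeMDS:
    """Toy metadata server: hands out provisional extents, may reject commits."""

    def __init__(self, commits_to_fail=0):
        self.commits_to_fail = commits_to_fail
        self.next_extent = 0
        self.committed = set()

    def layoutget(self):
        # Each call provisionally allocates a fresh extent.
        self.next_extent += 1
        return self.next_extent

    def layoutcommit(self, extent):
        if self.commits_to_fail > 0:
            self.commits_to_fail -= 1
            raise LayoutCommitError(extent)
        self.committed.add(extent)


def write_and_commit(mds, storage, data, max_attempts=3):
    """Write data to provisional blocks, then commit the layout.

    On a failed LAYOUTCOMMIT the data is re-written onto newly
    allocated blocks, so the caller's copy must stay cached until
    this function returns.
    """
    for _ in range(max_attempts):
        extent = mds.layoutget()
        storage[extent] = data  # synchronous write to provisional blocks
        try:
            mds.layoutcommit(extent)
        except LayoutCommitError:
            continue  # get a new layout and write the data again
        return extent
    raise RuntimeError("LAYOUTCOMMIT kept failing; data is only in the client cache")
```

With a toy MDS that rejects the first commit, the data ends up
committed on the second extent, even though the first write to storage
was itself synchronous.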

Benny

> 
>   4) Given that, it is hard to find any deadline for my
>   LAYOUTCOMMIT other than CLOSE (or maybe some long time if the
>   CLOSE is delayed indefinitely).
> 
>   5) But now note that in the case of 2), we have a similar
>   situation.  Once the COMMIT returns, it doesn't matter when
>   the block gets flushed from my cache.  So if doing the COMMIT
>   incurs an obligation to do a LAYOUTCOMMIT, the deadline for
>   completion of this is the CLOSE.
> 
>   6) But if that is the case, why not have the WRITE, rather
>   than the COMMIT, incur the obligation?
> 
>   So I just don't see the COMMIT/LAYOUTCOMMIT connection.  Can
>   someone explain to me what I'm missing?  I've heard lots of
>   assertions, but I still haven't heard an explanation.
>   
> 
> -----Original Message-----
> From: nfsv4-bounces@ietf.org [mailto:nfsv4-bounces@ietf.org] On Behalf Of Benny Halevy
> Sent: Thursday, October 21, 2010 2:07 PM
> To: Trond Myklebust
> Cc: nfsv4 list
> Subject: Re: [nfsv4] LAYOUTCOMMTI clarifications (was Notes from Bakeathon (re-sending))
> 
> On 2010-10-21 18:54, Trond Myklebust wrote:
>> On Thu, 2010-10-21 at 16:40 +0000, Spencer Shepler wrote:
>>> For the LAYOUTCOMMIT issues, at this point it seems some text
>>> is needed to further discussion towards closure.  This doesn't 
>>> need to be in the form of an I-D at this point given that,
>>> as Dave points out, it may be errata for the RFCs.
>>
>> I am planning to contribute some errata text for this. I'll notify the
>> list as soon as I have a more or less final draft (hopefully in the next
>> few days).
> 
> I agree that the existing text needs to be clarified, given
> the level of discussions we had about it. Thanks!
> 
> Though, to me it was pretty clear that we roughly reached consensus (lower
> case, not ietf ROUGH CONSENSUS ;-) that LAYOUTCOMMIT is always required for
> maintaining stable storage semantics. So, for example, the client should not
> drop dirty data out of its cache until that data is stable at the DS
> and the corresponding LAYOUTCOMMIT was successfully processed by the server.
> 
> It is true that for some server implementations of the files layout,
> LAYOUTCOMMIT might be superfluous as they have a state coherent back-end
> protocol allowing the MDS and DS to implicitly keep files' metadata in sync
> but there could be files-based servers implementing loosely coupled DSs
> that will still require LAYOUTCOMMIT from the client for efficient
> operation.
> 
> I also mentioned Dave Noveck's unofficial proposal from long ago to
> extend the WRITE operation response stable_how4 with a LAYOUT_SYNC4 value:
> 
>    enum stable_how4 {
>            UNSTABLE4       = 0,
>            DATA_SYNC4      = 1,
> -          FILE_SYNC4      = 2
> +          FILE_SYNC4      = 2,
> +          LAYOUT_SYNC4    = 3
>    };
> 
> LAYOUT_SYNC4 on the DS means that the DS data, local metadata, and layout
> are on stable storage and so the client does not need to send COMMIT to the DS
> nor LAYOUTCOMMIT to the MDS.
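
To make the intended client behaviour concrete, here is a small Python
sketch of which follow-up operations a client would still owe after a
WRITE to the DS, under the reading argued in this thread (LAYOUTCOMMIT
is required unless the DS answers with the proposed LAYOUT_SYNC4).
The StableHow class and follow_up_ops function are made up for this
example, and LAYOUT_SYNC4 is only a proposal, not part of the
published protocol.

```python
from enum import IntEnum


class StableHow(IntEnum):
    UNSTABLE4 = 0
    DATA_SYNC4 = 1
    FILE_SYNC4 = 2
    LAYOUT_SYNC4 = 3  # proposed value only, not in the published protocol


def follow_up_ops(committed):
    """Return the operations the client still owes after a DS WRITE.

    Assumes the position taken in this thread: LAYOUTCOMMIT is needed
    unless the DS reports LAYOUT_SYNC4.
    """
    ops = []
    if committed == StableHow.UNSTABLE4:
        ops.append("COMMIT")  # data is not yet stable on the DS
    if committed != StableHow.LAYOUT_SYNC4:
        ops.append("LAYOUTCOMMIT")  # MDS has not been told about the change
    return ops
```

So an UNSTABLE4 reply leaves both COMMIT and LAYOUTCOMMIT outstanding,
DATA_SYNC4 or FILE_SYNC4 leaves only LAYOUTCOMMIT, and LAYOUT_SYNC4
leaves nothing.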
> 
> Benny
> 
>>
>> Cheers
>>   Trond
>>
>>> Spencer
>>>
>>>
>>>> -----Original Message-----
>>>> From: nfsv4-bounces@ietf.org [mailto:nfsv4-bounces@ietf.org] On Behalf Of
>>>> sfaibish
>>>> Sent: Thursday, October 21, 2010 8:56 AM
>>>> To: nfsv4 list
>>>> Subject: [nfsv4] Fwd: Notes from Bakeathon (re-sending)
>>>>
>>>>
>>>>
>>>> ------- Forwarded message -------
>>>> From: sfaibish <sfaibish@emc.com>
>>>> To: "nfsv4 list" <nfsv4@ietf.org>
>>>> Cc:
>>>> Subject: Notes from Bakeathon (re-sending)
>>>> Date: Thu, 21 Oct 2010 11:46:17 -0400
>>>>
>>>>
>>>>
>>>> ------- Forwarded message -------
>>>> From: sfaibish <sfaibish@emc.com>
>>>> To: "nfsv4 list" <nfsv4@ietf.org>
>>>> Cc:
>>>> Subject: Notes from Bakeathon
>>>> Date: Thu, 21 Oct 2010 10:11:13 -0400
>>>>
>>>> As I promised, I worked on the notes from the discussions at BAT.
>>>> Attached please find some notes and presentations I want to post on the
>>>> BAT web site. Please take a look and see if they are appropriate for
>>>> posting. Also feel free to comment on and discuss the notes and the
>>>> discussions on the list. Thank you all.
>>>>
>>>> /Sorin
>>>>
>>>>
>>>>
>>>> --
>>>> Best Regards
>>>>
>>>> Sorin Faibish
>>>> Corporate Distinguished Engineer
>>>> Unified Storage Division
>>>>          EMC²
>>>> where information lives
>>>>
>>>> Phone: 508-249-5745
>>>> Cellphone: 617-510-0422
>>>> Email : sfaibish@emc.com
>>> _______________________________________________
>>> nfsv4 mailing list
>>> nfsv4@ietf.org
>>> https://www.ietf.org/mailman/listinfo/nfsv4