Re: [nfsv4] LAYOUTCOMMTI clarifications (was Notes from Bakeathon (re-sending))

Benny Halevy <bhalevy@panasas.com> Thu, 28 October 2010 14:27 UTC

On 2010-10-28 15:46, david.noveck@emc.com wrote:
> OK, but if we are in the block mapping type we are not doing any COMMITs to the DS, which is where this started.  My head hurts.

True. In this model, SCSI writes can be considered similar to NFS WRITEs with committed == FILE_SYNC4,
or to NFS WRITEs with committed != FILE_SYNC4 followed by a successful COMMIT to the DS.
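
As a rough illustration of the equivalence above, the following Python sketch models when written data counts as stable at the DS. The function names (stable_at_ds, scsi_write_stable) are hypothetical and not taken from any implementation; only the stable_how4 values come from the protocol.

```python
from enum import IntEnum


class StableHow(IntEnum):
    """stable_how4 values returned in the 'committed' field of a WRITE reply."""
    UNSTABLE4 = 0
    DATA_SYNC4 = 1
    FILE_SYNC4 = 2


def stable_at_ds(committed, commit_succeeded=False):
    """Return True if the written data is considered stable at the DS.

    Hypothetical helper: data is stable either because the WRITE reply
    said FILE_SYNC4, or because a later COMMIT to the DS succeeded.
    """
    if committed == StableHow.FILE_SYNC4:
        return True
    return commit_succeeded


def scsi_write_stable():
    """Assumption from the text: a completed SCSI write is treated like
    an NFS WRITE that came back with committed == FILE_SYNC4."""
    return stable_at_ds(StableHow.FILE_SYNC4)
```

In this sketch the block layout case never needs a COMMIT to the DS, yet the data is still in the same "stable at the DS" state from which a LAYOUTCOMMIT obligation would follow.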

> 
> The first step to effective pain relief in this area is to divide up the question into
> what are the requirements on LAYOUTCOMMIT for the protocol as a whole
> (i.e. would apply to all mapping types existing and yet-to-be invented)
> and what other per-mapping-type requirements are there, for those mapping types.
>  I think dividing things up this way will help us reduce mutual incomprehension as one person
> talks about the way the world is (his mapping type) and it doesn't make sense to the other guy
> who lives in a different mapping-type world.

Agreed.

Benny

> 
> 
> -----Original Message-----
> From: Benny Halevy [mailto:bhalevy@panasas.com] 
> Sent: Wednesday, October 27, 2010 6:02 PM
> To: Noveck, David
> Cc: trond.myklebust@fys.uio.no; nfsv4@ietf.org
> Subject: Re: [nfsv4] LAYOUTCOMMTI clarifications (was Notes from Bakeathon (re-sending))
> 
> On 2010-10-22 04:26, david.noveck@emc.com wrote:
>>> Though, to me it was pretty clear that we roughly reached 
>>> consensus (lower case, not ietf ROUGH CONSENSUS ;-) that 
>>> LAYOUTCOMMIT is always required for maintaining stable 
>>> storage semantics.
>>
>> Having case be a discriminator of otherwise synonymous words seems like a loser.
>>
>> Which of the 16,384 possible variants (14 letters) would belong to Spencer and Beepy and which to the hoi polloi?  The mind reels.
>>
>> I'm not going to try to characterize opinion at the bakeathon.
>>
>> I just would like to ask a question.
>>
>>   1) Suppose I do either a synchronous write or I send an
>>   asynchronous write and get back an indication that it
>>   was done synchronously.
>>
>>   2) I will note that the aforementioned operation is semantically 
>>   identical to doing an asynchronous write followed by COMMIT.
>>
>>   3) So let us suppose that by doing 1), I have incurred the
>>   obligation to do a LAYOUTCOMMIT.  So what is the deadline
>>   on completing this obligation?  It doesn't seem sensible 
>>   that it turn on when the block leaves my cache.  The fact that
>>   it was written synchronously, means that it will never be 
>>   resent, so that doesn't seem to be valid as a condition.
> 
> That's true when writing to the MDS.  It makes sense when
> writing to the DS, when you take loosely coupled data servers
> into consideration.
> 
> In particular, in the block layout type, the client may be writing
> into provisionally allocated blocks on storage.  Although the
> writes are synchronous in nature, further failure to commit
> the layout to the MDS may require the client to get a new layout,
> re-write the data onto newly allocated blocks and attempt
> layoutcommit again.
> 
> Benny
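
The retry behaviour described above for the block layout type could be sketched roughly as follows. This is an illustration only: get_layout, write_blocks and layoutcommit are hypothetical callables standing in for the client's LAYOUTGET, storage write and LAYOUTCOMMIT paths, and max_attempts is an arbitrary assumption.

```python
def write_with_layoutcommit(data, get_layout, write_blocks, layoutcommit,
                            max_attempts=3):
    """Write data through a layout, retrying if LAYOUTCOMMIT fails.

    All three callables are hypothetical stand-ins for client operations.
    Returns the number of attempts used, or raises RuntimeError.
    """
    for attempt in range(1, max_attempts + 1):
        layout = get_layout()        # may cover provisionally allocated blocks
        write_blocks(layout, data)   # synchronous write to storage
        if layoutcommit(layout):     # MDS accepted the layout update
            return attempt
        # LAYOUTCOMMIT failed: the client must still hold the data so it can
        # be re-written onto newly allocated blocks under a fresh layout.
    raise RuntimeError("LAYOUTCOMMIT did not succeed")
```

The point of the sketch is that the caller keeps `data` until layoutcommit returns success, even though each individual write was synchronous.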
> 
>>
>>   4) Given that, it is hard to find any deadline for my
>>   LAYOUTCOMMIT other than CLOSE (or maybe some long time if the
>>   CLOSE is delayed indefinitely).
>>
>>   5) But now note that in the case of 2), we have a similar
>>   situation.  Once the COMMIT returns, it doesn't matter when
>>   the block gets flushed from my cache.  So if doing the COMMIT
>>   incurs an obligation to do a LAYOUTCOMMIT, the deadline for
>>   completion of this is the CLOSE.
>>
>>   6) But if that is the case, why not have the WRITE, rather
>>   than the COMMIT incur the obligation.
>>
>>   So I just don't see the COMMIT/LAYOUTCOMMIT connection.  Can
>>   someone explain to me what I'm missing?  I've heard lots of
>>   assertions, but I still haven't heard an explanation.
>>   
>>
>> -----Original Message-----
>> From: nfsv4-bounces@ietf.org [mailto:nfsv4-bounces@ietf.org] On Behalf Of Benny Halevy
>> Sent: Thursday, October 21, 2010 2:07 PM
>> To: Trond Myklebust
>> Cc: nfsv4 list
>> Subject: Re: [nfsv4] LAYOUTCOMMTI clarifications (was Notes from Bakeathon (re-sending))
>>
>> On 2010-10-21 18:54, Trond Myklebust wrote:
>>> On Thu, 2010-10-21 at 16:40 +0000, Spencer Shepler wrote:
>>>> For the LAYOUTCOMMIT issues, at this point it seems some text
>>>> is needed to further discussion towards closure.  This doesn't 
>>>> need to be in the form of an I-D at this point given that,
>>>> as Dave points out, it may be an erratum for the RFCs.
>>>
>>> I am planning to contribute some errata text for this. I'll notify the
>>> list as soon as I have a more or less final draft (hopefully in the next
>>> few days).
>>
>> I agree that the existing text needs to be clarified, given
>> the level of discussions we had about it. Thanks!
>>
>> Though, to me it was pretty clear that we roughly reached consensus (lower
>> case, not ietf ROUGH CONSENSUS ;-) that LAYOUTCOMMIT is always required for
>> maintaining stable storage semantics. So, for example the client should not
>> drop dirty data out of its cache until that data is stable at the DS
>> and a respective LAYOUTCOMMIT was successfully processed by the server.
>>
>> It is true that for some server implementations of the files layout,
>> LAYOUTCOMMIT might be superfluous as they have a state coherent back-end
>> protocol allowing the MDS and DS to implicitly keep files' metadata in sync
>> but there could be files-based servers implementing loosely coupled DSs
>> that will still require LAYOUTCOMMIT from the client for efficient
>> operation.
>>
>> I also mentioned Dave Noveck's unofficial proposal from long ago to
>> extend the WRITE operation response stable_how4 with a LAYOUT_SYNC4 value:
>>
>>    enum stable_how4 {
>>            UNSTABLE4       = 0,
>>            DATA_SYNC4      = 1,
>>            FILE_SYNC4      = 2,
>> +          LAYOUT_SYNC4    = 3
>>    };
>>
>> LAYOUT_SYNC4 on the DS means that the DS data, local metadata, and layout
>> are on stable storage and so the client does not need to send COMMIT to the DS
>> nor LAYOUTCOMMIT to the MDS.
>>
>> Benny
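
To make the proposed semantics concrete, here is a small Python sketch of which follow-up operations a client would still owe after a WRITE reply from the DS. It assumes the position argued in this message, that LAYOUTCOMMIT is otherwise always required; LAYOUT_SYNC4 is only the unofficial proposal quoted above, and pending_operations is a hypothetical helper name.

```python
from enum import IntEnum


class StableHow(IntEnum):
    UNSTABLE4 = 0
    DATA_SYNC4 = 1
    FILE_SYNC4 = 2
    LAYOUT_SYNC4 = 3  # proposed value only, not part of the published protocol


def pending_operations(committed):
    """List the operations the client still owes after a DS WRITE reply.

    Hypothetical helper, under the assumption that LAYOUTCOMMIT is
    required unless the DS answered with the proposed LAYOUT_SYNC4.
    """
    if committed == StableHow.LAYOUT_SYNC4:
        return []                          # data, metadata and layout stable
    if committed == StableHow.FILE_SYNC4:
        return ["LAYOUTCOMMIT"]            # data stable at the DS only
    return ["COMMIT", "LAYOUTCOMMIT"]      # data not yet stable at the DS


def may_drop_from_cache(committed):
    """Dirty data may leave the client cache only when nothing is pending."""
    return not pending_operations(committed)
```

Under these assumptions only a LAYOUT_SYNC4 reply lets the client drop the data immediately; every other reply leaves at least a LAYOUTCOMMIT outstanding.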
>>
>>>
>>> Cheers
>>>   Trond
>>>
>>>> Spencer
>>>>
>>>>
>>>>> -----Original Message-----
>>>>> From: nfsv4-bounces@ietf.org [mailto:nfsv4-bounces@ietf.org] On Behalf Of
>>>>> sfaibish
>>>>> Sent: Thursday, October 21, 2010 8:56 AM
>>>>> To: nfsv4 list
>>>>> Subject: [nfsv4] Fwd: Notes from Bakeathon (re-sending)
>>>>>
>>>>>
>>>>>
>>>>> ------- Forwarded message -------
>>>>> From: sfaibish <sfaibish@emc.com>
>>>>> To: "nfsv4 list" <nfsv4@ietf.org>
>>>>> Cc:
>>>>> Subject: Notes from Bakeathon (re-sending)
>>>>> Date: Thu, 21 Oct 2010 11:46:17 -0400
>>>>>
>>>>>
>>>>>
>>>>> ------- Forwarded message -------
>>>>> From: sfaibish <sfaibish@emc.com>
>>>>> To: "nfsv4 list" <nfsv4@ietf.org>
>>>>> Cc:
>>>>> Subject: Notes from Bakeathon
>>>>> Date: Thu, 21 Oct 2010 10:11:13 -0400
>>>>>
>>>>> As I promised, I worked on the notes from the discussions at BAT.
>>>>> Attached please find some notes and presentations I want to post on the
>>>>> BAT web site. Please take a look and see if they are appropriate for
>>>>> posting. Also feel free to comment on and discuss the notes and the
>>>>> discussions on the list. Thank you all
>>>>>
>>>>> /Sorin
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Best Regards
>>>>>
>>>>> Sorin Faibish
>>>>> Corporate Distinguished Engineer
>>>>> Unified Storage Division
>>>>>          EMC²
>>>>> where information lives
>>>>>
>>>>> Phone: 508-249-5745
>>>>> Cellphone: 617-510-0422
>>>>> Email : sfaibish@emc.com
>>>> _______________________________________________
>>>> nfsv4 mailing list
>>>> nfsv4@ietf.org
>>>> https://www.ietf.org/mailman/listinfo/nfsv4