Re: [nfsv4] Server-side copy question

"J. Bruce Fields" <bfields@fieldses.org> Tue, 24 October 2017 01:36 UTC

Date: Mon, 23 Oct 2017 21:36:46 -0400
From: "J. Bruce Fields" <bfields@fieldses.org>
To: Olga Kornievskaia <aglo@citi.umich.edu>
Cc: "Adamson, Andy" <William.Adamson@netapp.com>, "nfsv4@ietf.org" <nfsv4@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/nfsv4/XVwPhKekBlgj5pbal9-SzDiF9Bw>

On Mon, Oct 23, 2017 at 05:43:42PM -0400, Olga Kornievskaia wrote:
> On Fri, Oct 20, 2017 at 3:33 PM, J. Bruce Fields <bfields@fieldses.org> wrote:
> > On Wed, Oct 18, 2017 at 05:22:08PM -0400, Olga Kornievskaia wrote:
> >> On Tue, Feb 28, 2017 at 12:32 PM, J. Bruce Fields <bfields@fieldses.org> wrote:
> >> > On Tue, Feb 28, 2017 at 05:04:27PM +0000, Adamson, Andy wrote:
> >> >> I also thought that bytes copied is only included in the success case – but as Olga pointed out, not true. The coa_status of NFS4_OK has the coa_resok4->wr_count showing the bytes copied, and the default (e.g. failure cases) has the coa_bytes_copied.
> >> >>
> >> >> So the server can send a CB_OFFLOAD with an error (such as the -EINVAL in the example) and send the bytes copied in coa_bytes_copied.
> >> >>
> >> >>    union offload_info4 switch (nfsstat4 coa_status) {
> >> >>    case NFS4_OK:
> >> >>            write_response4 coa_resok4;
> >> >>    default:
> >> >>            length4         coa_bytes_copied;
> >> >>    };
> >> >
> >> > Gah, I'm sorry, Olga said CB_OFFLOAD but for some reason I was looking
> >> > at COPY.
> >> >
> >> > I have no idea what to think of that.  I don't see why it should be
> >> > different from COPY.
> >>
> >> I'd like to resurrect this thread...
> >>
> >> This discussion was initiated to allow for the server to do a partial
> >> copy instead of failing the copy with EINVAL when copy is requested
> >> beyond the end of the file.
> >>
> >> However, the problem I'm running into is that CB_OFFLOAD only
> >> returns "coa_bytes_copied" and does not include stable_how and
> >> verifier. So even if the client were to then send a COMMIT for the
> >> bytes that were copied, it has no verifier to compare against the
> >> one COMMIT returns.
> >>
> >> Should the client then ignore the lack of the matching verifier and
> >> just assume the data was committed for the partial bytes?
> >
> > Yes, sounds like in the asynchronous case the server has to wait for the
> > writes to reach disk before calling CB_OFFLOAD.  (Do we currently do
> > that?)
> 
> Why? No, the spec does not require the server to do any such thing.
> The server reports in CB_OFFLOAD whether the bytes were written as
> UNSTABLE or FILE_SYNC.

Look back at what you said before....  So the problem you were
explaining (missing stable_how and verifier) was only for the short
write case?

So in the short write case it sounds like it *is* the server's
responsibility to commit.  That's weird, but as long as the short write
case is rare, maybe it's not a big deal?

Hm: "The coa_bytes_copied value indicates the number of bytes copied but
not which specific bytes have been copied."  That makes coa_bytes_copied
pretty much worthless.

I guess I'd be in favor of using the NFS4_OK branch of CB_OFFLOAD
whenever a nonzero number of bytes is copied.  If fewer than the
requested number of bytes were copied, the client can always find out
the error (if any) by attempting to copy the rest.
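
Roughly what I mean, as a sketch only.  The struct layouts are
simplified stand-ins for the XDR (wr_callback_id is left out), and the
helper names fill_offload_info() and remaining_after() are made up for
illustration; this is not code from any existing server or client.

```c
#include <stdint.h>
#include <string.h>

/* Simplified, hypothetical stand-ins for the RFC 7862 XDR types.
 * wr_callback_id is omitted to keep the sketch short. */
#define NFS4_OK        0u
#define NFS4ERR_INVAL  22u
#define VERF_SIZE      8

enum stable_how4 { UNSTABLE4 = 0, DATA_SYNC4 = 1, FILE_SYNC4 = 2 };

struct write_response4 {
    uint64_t         wr_count;
    enum stable_how4 wr_committed;
    unsigned char    wr_writeverf[VERF_SIZE];
};

struct offload_info4 {
    uint32_t coa_status;
    union {
        struct write_response4 coa_resok4;       /* NFS4_OK arm */
        uint64_t               coa_bytes_copied; /* default (error) arm */
    } u;
};

/* Server side: use the NFS4_OK arm whenever any bytes were copied, so
 * the client always gets stable_how and a verifier for those bytes.
 * The error arm is used only when nothing was copied. */
static void fill_offload_info(struct offload_info4 *res, uint64_t copied,
                              uint32_t err, enum stable_how4 how,
                              const unsigned char verf[VERF_SIZE])
{
    memset(res, 0, sizeof(*res));
    if (copied > 0 || err == NFS4_OK) {
        res->coa_status = NFS4_OK;
        res->u.coa_resok4.wr_count = copied;
        res->u.coa_resok4.wr_committed = how;
        memcpy(res->u.coa_resok4.wr_writeverf, verf, VERF_SIZE);
    } else {
        res->coa_status = err;
        res->u.coa_bytes_copied = 0;
    }
}

/* Client side: after a result shorter than requested, compute the
 * range for a follow-up COPY.  Returns the number of bytes still to
 * copy; the follow-up is what surfaces the real error, if any. */
static uint64_t remaining_after(uint64_t req_offset, uint64_t req_count,
                                const struct offload_info4 *res,
                                uint64_t *next_offset)
{
    uint64_t done = 0;

    if (res->coa_status == NFS4_OK)
        done = res->u.coa_resok4.wr_count;
    if (done > req_count)
        done = req_count;
    *next_offset = req_offset + done;
    return req_count - done;
}
```

With that, a short copy looks to the client exactly like a short WRITE:
it has a verifier to check against COMMIT, and it learns the error only
if retrying the remainder copies nothing.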

--b.

> 
> > In theory that shouldn't be too expensive because asynchronous COPY
> > should normally be done with a very large range.
> >
> > (I wonder if the server should just do a synchronous copy if the copy
> > would be less than a few megs?)
> 
> Yes, the server could decide to convert an asynchronous copy into a
> synchronous one based on size.
> 
> I guess I'm still confused how to handle the implementation so that it
> doesn't kill interoperability with other possible (future)
> implementations...
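
For the size-based choice quoted above, a minimal sketch of what the
server-side check might look like.  The 4 MiB cutoff and the name
copy_should_be_sync() are assumptions picked for illustration ("a few
megs" is all that was suggested), and treating a zero count as "copy to
end of file" reflects my reading of the COPY description rather than
anything settled in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed cutoff; the thread only says "a few megs". */
#define SYNC_COPY_THRESHOLD (4ULL * 1024 * 1024)

/* Decide whether the server should perform a COPY synchronously.
 * client_requires_sync mirrors the client's request for a synchronous
 * copy; count is the requested length in bytes. */
static bool copy_should_be_sync(uint64_t count, bool client_requires_sync)
{
    if (client_requires_sync)
        return true;
    /* A zero count is taken to mean "to end of file", so the size is
     * not known from the request alone; leave it asynchronous. */
    if (count == 0)
        return false;
    return count < SYNC_COPY_THRESHOLD;
}
```

A synchronous reply carries stable_how and the verifier directly, so
small copies handled this way never hit the CB_OFFLOAD question at all.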