Re: [nfsv4] Server-side copy question

David Noveck <davenoveck@gmail.com> Tue, 24 October 2017 11:43 UTC

In-Reply-To: <20171024013646.GA22943@fieldses.org>
References: <CADaq8jc6vWdXonTs3795QWO8SWJTudo=x9=gR4uKN+tnfBk+dw@mail.gmail.com> <20170224222135.GJ26378@fieldses.org> <CAN-5tyHJb=ercWh_W2uzPfKR86j=dGWx=zYH1y+TjbNTzYh38A@mail.gmail.com> <20170228164420.GA28845@fieldses.org> <CADaq8jdivTUVNx2LXCiecAKY-nfoZP_XoAoNLwc3V1TUD6X8vg@mail.gmail.com> <CC84C4DB-F00F-45FC-8307-BDA5424DA480@netapp.com> <20170228173215.GA29700@fieldses.org> <CAN-5tyF8EZjYNX3cE43BPM9CbxVLygRdPuTJf_RgC7ntDCf9Kg@mail.gmail.com> <20171020193306.GF15211@fieldses.org> <CAN-5tyHaJeis=_f9h9u1St76Og3A_TtBUxZx_Z=ZouG7sa2=6w@mail.gmail.com> <20171024013646.GA22943@fieldses.org>
From: David Noveck <davenoveck@gmail.com>
Date: Tue, 24 Oct 2017 07:43:10 -0400
Message-ID: <CADaq8jdWkTfQB-Y55ow1A23AGCVM12LtwjXYZ6E7zU7M=Aw89g@mail.gmail.com>
To: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Olga Kornievskaia <aglo@citi.umich.edu>, "Adamson, Andy" <William.Adamson@netapp.com>, "nfsv4@ietf.org" <nfsv4@ietf.org>
Content-Type: multipart/alternative; boundary="001a11c009c08a9fa8055c497174"
Archived-At: <https://mailarchive.ietf.org/arch/msg/nfsv4/QF96Dpnx-kAV7cinBrINtITF7ik>
Subject: Re: [nfsv4] Server-side copy question

> So in the short write case it sounds like it *is* the server's
> responsibility to commit.

Perhaps it should be the server's responsibility, but it is not,
according to RFC7862.

> That's weird,

I agree that it is weird.  I feel it is well-intentioned but not viable:
if clients depend on this behavior, you are essentially creating your own
protocol, and if clients do not, there is no point in doing it.

> but as long as the short write
> case is rare, maybe it's not a big deal?

From a performance point of view, it is not a big deal.  However, it is
a big deal that the spec is broken. I hate when that happens :-(

> Hm: "The coa_bytes_copied value indicates the number of bytes copied but
> not which specific bytes have been copied."  That makes coa_bytes_copied
> pretty much worthless.

Any non-zero value of coa_bytes_copied is worthless because the client does
not know:

   - Which particular bytes were copied
   - Whether those bytes were copied to the destination stably
   - How to commit any non-stably-written bytes
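
To make the gap concrete, here is a minimal Python sketch of the two arms
of the offload_info4 union quoted further down this thread. The class and
function names are hypothetical, and write_response4 is reduced to the three
fields that matter for this discussion (count, stability, verifier). It is an
illustration under those assumptions, not client code.

```python
from dataclasses import dataclass

# stable_how4 values
UNSTABLE, DATA_SYNC, FILE_SYNC = 0, 1, 2

NFS4_OK = 0
NFS4ERR_INVAL = 22


@dataclass
class WriteResponse:
    """Simplified stand-in for write_response4 (hypothetical field names)."""
    wr_count: int
    committed: int
    verifier: bytes


@dataclass
class OffloadOk:
    """The NFS4_OK arm: carries a full write response."""
    resok: WriteResponse


@dataclass
class OffloadErr:
    """The default arm: carries only a status and a byte count."""
    status: int
    bytes_copied: int


def client_knowledge(info):
    """Return what a client can learn from a CB_OFFLOAD result."""
    if isinstance(info, OffloadOk):
        r = info.resok
        return {"count": r.wr_count,
                "committed": r.committed,
                "verifier": r.verifier}
    # Error arm: no stability level and no verifier to match against a
    # later COMMIT, and nothing that says which bytes the count refers to.
    return {"count": info.bytes_copied, "committed": None, "verifier": None}
```

With the error arm, the count is the only thing that comes back, which is
why none of the three questions above can be answered from it.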

> I guess I'd be in favor of using the NFS_OK branch of CB_OFFLOAD
> whenever a nonzero number of bytes is copied.

That's likely to create an interoperability problem, since there are almost
certainly clients that assume that when there is no error, all requested
bytes have been copied.  Basically, the server is pretending there is no
error and then relying on the client to figure out that there was one,
which the client could only do if the spec told it about this.  If RFC7862
doesn't do this, you wind up essentially creating a private protocol.
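
A minimal Python sketch of the mismatch, with hypothetical function names:
one client trusts the status alone, the other also compares the returned
count with what it asked for. Whether any deployed client behaves like the
first one is an assumption here, not something I have checked.

```python
NFS4_OK = 0


def naive_copy_done(status, requested, copied):
    """Treats 'no error' as 'every requested byte was copied'."""
    return status == NFS4_OK


def careful_copy_done(status, requested, copied):
    """Also requires the returned count to match the request."""
    return status == NFS4_OK and copied == requested


# Hypothetical server reply: NFS4_OK, but only 1000 of 4096 bytes copied.
reply = (NFS4_OK, 4096, 1000)
naive_result = naive_copy_done(*reply)      # True: the short copy goes unnoticed
careful_result = careful_copy_done(*reply)  # False: the short copy is detected
```

The second client only exists if something tells implementers to write it,
which is the job of the spec.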

> If less than the
> requested number of bytes were copied, the client can always find out
> the error (if any) by attempting to copy the rest.

Unfortunately, there is no way of determining which bytes were copied, so
there is no way of knowing "the rest", i.e. the bytes not copied.
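
As an illustration of why the count alone is ambiguous, here is a small
Python sketch. The chunked copy and the two processing orders are purely
hypothetical; nothing here describes how any particular server copies.

```python
def copied_ranges(chunk_order, chunk_size, chunks_done):
    """Byte ranges written after `chunks_done` chunks, for a hypothetical
    server that processes fixed-size chunks in `chunk_order`."""
    return sorted((i * chunk_size, (i + 1) * chunk_size)
                  for i in chunk_order[:chunks_done])


def total_bytes(ranges):
    """Sum of the lengths of half-open (start, end) ranges."""
    return sum(end - start for start, end in ranges)


# Two hypothetical servers, each interrupted after two 1024-byte chunks
# of a 4096-byte copy.
front_to_back = copied_ranges([0, 1, 2, 3], 1024, 2)
back_to_front = copied_ranges([3, 2, 1, 0], 1024, 2)

same_count = total_bytes(front_to_back) == total_bytes(back_to_front)
same_bytes = front_to_back == back_to_front
```

Both servers would report the same byte count, yet they copied disjoint
ranges, so a client cannot derive the offset at which to resume.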

On Mon, Oct 23, 2017 at 9:36 PM, J. Bruce Fields <bfields@fieldses.org>
wrote:

> On Mon, Oct 23, 2017 at 05:43:42PM -0400, Olga Kornievskaia wrote:
> > On Fri, Oct 20, 2017 at 3:33 PM, J. Bruce Fields <bfields@fieldses.org> wrote:
> > > On Wed, Oct 18, 2017 at 05:22:08PM -0400, Olga Kornievskaia wrote:
> > >> On Tue, Feb 28, 2017 at 12:32 PM, J. Bruce Fields <bfields@fieldses.org> wrote:
> > >> > On Tue, Feb 28, 2017 at 05:04:27PM +0000, Adamson, Andy wrote:
> > >> >> I also thought that bytes copied is only included in the success
> > >> >> case – but as Olga pointed out, not true. The coa_status of NFS4_OK
> > >> >> has the coa_resok4->wr_count showing the bytes copied, and the
> > >> >> default (e.g. failure cases) has the coa_bytes_copied.
> > >> >>
> > >> >> So the server can send a CB_OFFLOAD with an error (such as the
> > >> >> -EINVAL in the example) and send the bytes copied in coa_bytes_copied.
> > >> >>
> > >> >>    union offload_info4 switch (nfsstat4 coa_status) {
> > >> >>    case NFS4_OK:
> > >> >>            write_response4 coa_resok4;
> > >> >>    default:
> > >> >>            length4         coa_bytes_copied;
> > >> >>    };
> > >> >
> > >> > Gah, I'm sorry, Olga said CB_OFFLOAD but for some reason I was looking
> > >> > at COPY.
> > >> >
> > >> > I have no idea what to think of that.  I don't see why it should be
> > >> > different from COPY.
> > >>
> > >> I'd like to resurrect this thread...
> > >>
> > >> This discussion was initiated to allow for the server to do a partial
> > >> copy instead of failing the copy with EINVAL when copy is requested
> > >> beyond the end of the file.
> > >>
> > >> However, the problem I'm running into is that because CB_OFFLOAD only
> > >> returns "coa_bytes_copied" and does not include stable_how and
> > >> verifier. So even if client were to then send a COMMIT for the bytes
> > >> that were copied, it has no verifier to match from the COMMIT to
> > >> anything.
> > >>
> > >> Should the client then ignore the lack of the matching verifier and
> > >> just assume the data was committed for the partial bytes?
> > >
> > > Yes, sounds like in the asynchronous case the server has to wait for the
> > > writes to reach disk before calling CB_OFFLOAD.  (Do we currently do
> > > that?).
> >
> > Why? No the spec does not require the server to do no such thing. The
> > server sends back in CB_OFFLOAD if the bytes were written as UNSTABLE
> > or FILE_SYNC.
>
> Look back at what you said before....  So the problem you were
> explaining (missing stable_how and verifier) was only for the short
> write case?
>
> So in the short write case it sounds like it *is* the server's
> responsibility to commit.  That's weird, but as long as the short write
> case is rare, maybe it's not a big deal?
>
> Hm: "The coa_bytes_copied value indicates the number of bytes copied but
> not which specific bytes have been copied."  That makes coa_bytes_copied
> pretty much worthless.
>
> I guess I'd be in favor of using the NFS_OK branch of CB_OFFLOAD
> whenever a nonzero number of bytes is copied.  If less than the
> requested number of bytes were copied, the client can always find out
> the error (if any) by attempting to copy the rest.
>
> --b.
>
> >
> > > In theory that shouldn't be too expensive because asynchronous COPY
> > > should normally be done with a very large range.
> > >
> > > (I wonder if the server should just do a synchronous copy if the copy
> > > would be less than a few megs?)
> >
> > Yes the server could determine to convert an asynchronous copy into a
> > synchronous one based on size.
> >
> > I guess I'm still confused how to handle the implementation so that it
> > doesn't kill interoperability with other possible (future)
> > implementations...