Re: [nfsv4] 4.0 trunking

"J. Bruce Fields" <bfields@fieldses.org> Wed, 21 September 2016 02:45 UTC

Date: Tue, 20 Sep 2016 22:45:31 -0400
From: "J. Bruce Fields" <bfields@fieldses.org>
To: David Noveck <davenoveck@gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/nfsv4/hwUkQBJoTcidh2OPpQaD5vHzp_I>
Cc: "Adamson, Andy" <William.Adamson@netapp.com>, "nfsv4@ietf.org" <nfsv4@ietf.org>
Subject: Re: [nfsv4] 4.0 trunking

On Tue, Sep 20, 2016 at 08:55:31PM -0400, David Noveck wrote:
> > The thing is, though, that in that case we already have our answer
> > at step 3: if X1 pointed to the same server as X2, then the
> > SETCLIENTID to X1 would have to return a distinct verifier.
> 
> True, but the converse doesn't hold.

Agreed; I was suggesting stopping here only in the case where the
verifiers match, and continuing otherwise.

> If you get a distinct verifier,
> it could be because X1 and X2 are the same server, but the other
> possibility is that X1 and X2 are different servers and, if they are, they
> may well happen to have different verifiers.

Yes.

> But I think you are on to something.  If the verifiers for X1 and X2
> are V1 and V2 respectively: if the SETCLIENTID_CONFIRM to X2 using V2
> succeeds, you can go back and try another confirm to X1 using V1.  If
> X1 and X2 are distinct, that should still succeed.  On the other hand,
> if they are the same server, the creation of V2 should have replaced
> V1, and it will have become invalid.

I think that would work, though what I was suggesting was a few fewer
steps, roughly:

         1. a SETCLIENTID to IP address X1 returns a clientid.  You
         confirm it and carry on.

         2. Some time later a SETCLIENTID to IP address X2 returns the
         same clientid, and verifier V2.  To check whether X1 and X2
         actually point to the same server:

         3. Send a callback-changing SETCLIENTID to X1.  Get back
         (clientid, V1).  If V1 == V2, then X1 and X2 are different
         servers.  Otherwise:

         4. Send a SETCLIENTID_CONFIRM(clientid, V1) to X2.  If it
         succeeds, they're the same; otherwise, they're different.

--b.

> 
> > We can stop there.
> 
> Or somewhere.
> 
> On Tue, Sep 20, 2016 at 5:39 PM, J. Bruce Fields <bfields@fieldses.org>
> wrote:
> 
> > On Mon, Sep 12, 2016 at 12:59:05PM +0000, Adamson, Andy wrote:
> > > IMHO we should not worry too much about NFSv4.0 trunking, as
> > > NFSv4.1+ solves this issue.  Trunking is simply one more reason to
> > > move from NFSv4.0 to NFSv4.1.
> >
> > I tend to agree, but since it got implemented and was causing a real bug,
> > it was hard to ignore....
> >
> > After a patch (55b9df93ddd6 "NFSv4/v4.1: Verify the client owner id
> > during trunking detection", if anyone cares) the damage seems restricted
> > to the non-default "migration" case, making this less of a concern.
> >
> > I wonder if there's an easy fix, though: if I'm reading rfc 7931 right,
> > it works (slightly simplified) like this:
> >
> >         1. a SETCLIENTID to IP address X1 returns a clientid.  You
> >         confirm it and carry on.
> >
> >         2.  Some time later a SETCLIENTID to IP address X2 returns the
> >         same clientid.  To check whether X1 and X2 actually point to the
> >         same server:
> >
> >         3. Send a callback-changing SETCLIENTID to X1.  Get back
> >         (clientid, verifier).
> >
> >         4. Confirm it with a SETCLIENTID_CONFIRM to X2.
> >
> > If the SETCLIENTID_CONFIRM succeeds then they're the same server.
> > Servers are required to return an error (STALE_CLIENTID, I think?) if
> > the verifier in the SETCLIENTID_CONFIRM arguments doesn't match.
> >
> > But, if the SETCLIENTID at step 2 just happened to return the same
> > verifier as the SETCLIENTID at step 3, then the SETCLIENTID_CONFIRM at
> > step 4 will succeed even if the two servers are different.  Hence the
> > bug.
> >
> > The thing is, though, that in that case we already have our answer at
> > step 3: if X1 pointed to the same server as X2, then the SETCLIENTID to
> > X1 would have to return a distinct verifier.  We can stop there.
> >
> > With that addition, don't we get the right answer every time, without
> > any unjustified assumptions about the randomness of servers' clientid
> > and verifier generation?  Or am I missing something?
> >
> > --b.
> >
> > >
> > > —>Andy
> > >
> > >
> > > > > On Sep 10, 2016, at 4:03 PM, J. Bruce Fields
> > > > > <bfields@fieldses.org> wrote:
> > > >
> > > > On Sat, Sep 10, 2016 at 03:38:29PM -0400, David Noveck wrote:
> > > >> Thanks for pointing this out.
> > > >>
> > > >> Before I address the details of this, let me state my overall
> > > >> position:
> > > >>
> > > >>   - I think it is correct that RFC7931 overstates the degree of
> > > >>   assurance that one might reasonably have regarding the
> > > >>   unlikelihood of spurious collision of clientid4's and verifiers.
> > > >>   - Nevertheless, I don't think the situation is as bleak as you
> > > >>   paint it, with regard to the Linux server approach to these
> > > >>   matters that you describe.  This is basically because I don't
> > > >>   see the issue of the correlation between clientid4's and
> > > >>   verifiers the same way that you do.  See below.
> > > >>   - I think it is possible to address the issue satisfactorily in
> > > >>   the context of an rfc7931 errata and will start working on that.
> > > >>   - Given the weakness of the rfc 3530/7530 requirements in this
> > > >>   area, we may need (yet another) RFC updating 7530 at some point.
> > > >>   - I see that as a longer-term effort, since the practices that
> > > >>   you describe will not result in large numbers of spurious
> > > >>   collisions and clients can adapt the algorithm to require
> > > >>   additional verification.
> > > >>   - If there are servers out there whose practices are
> > > >>   significantly more troublesome than the ones you describe, we
> > > >>   need to find out soon.  Given that many of those responsible for
> > > >>   v4.0 implementations may not be reading this list, I suggest we
> > > >>   discuss this at the October Bakeathon.
> > > >>
> > > >>
> > > >>> The Linux server is generating clientid as a (boot time,
> > > >>> counter) pair.
> > > >>
> > > >> I'm assuming the boot time is in the form of a 32-bit unsigned
> > > >> number of seconds after some fixed date/time.  Correct?
> > > >>
> > > >> If that is the case, duplicates would require that:
> > > >> 1. Two servers be booted during the same second (e.g. when power
> > > >> came back on after a disruption).
> > > >
> > > > Right, or routine maintenance, or whatever.
> > > >
> > > >> 2. Each of the servers has received the same number of client
> > > >> SETCLIENTIDs (mod 4 billion).
> > > >>
> > > >> Clearly this is possible although I would say that it is unlikely.
> > > >> Nevertheless, the "highly unlikely" in RFC7931 is overstating it.
> > > >
> > > > It turns out that if you have a hundred servers that get rebooted
> > > > simultaneously for a kernel update or some similar routine
> > > > maintenance, this happens every time.  (I'm a step removed from
> > > > the original case here, but I believe this is more-or-less
> > > > accurate and not hypothetical.)
> > > >
> > > > Pretty sure you can easily hit this without that big a setup, too.
> > > >
> > > >> The case that would be more worrisome would be one in which a
> > > >> server simply used a 64-bit word of byte-addressable persistent
> > > >> memory (e.g. NVDIMM, 3D XPoint) to maintain a permanent counter,
> > > >> dispensing with the boot time.
> > > >
> > > > That'd be something worth looking into in cases where the users
> > > > have the right hardware (not always).  The workaround I'm going
> > > > with for now is initializing the counter part to something random.
> > > > (Random numbers may not be completely reliable at boot time
> > > > either, but I suspect they'll be good enough here...)
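
A sketch of that (boot time, counter) scheme and the randomized-counter
workaround (invented names, not the actual nfsd code):

```python
import random
import time

class ClientidGen:
    """(boot time, counter) clientid4 generation, with the workaround of
    starting the 32-bit counter at a random value rather than zero, so
    that servers booted in the same second still tend to diverge."""

    def __init__(self, randomize=True):
        self.boot_time = int(time.time()) & 0xFFFFFFFF   # whole seconds
        self.counter = random.getrandbits(32) if randomize else 0

    def next_clientid(self):
        self.counter = (self.counter + 1) & 0xFFFFFFFF
        return (self.boot_time << 32) | self.counter

# Without randomization, two servers booted in the same second that have
# served the same number of SETCLIENTIDs hand out identical clientids:
a, b = ClientidGen(randomize=False), ClientidGen(randomize=False)
a.boot_time = b.boot_time = 1_474_418_731   # force the same boot second
assert a.next_clientid() == b.next_clientid()
```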
> > > >
> > > >> That is allowable as RFC7530 stands.  I don't expect that this
> > > >> is a current problem but, given where persistent memory is going
> > > >> (i.e. cheaper and more common), we could see this in the future.
> > > >>
> > > >> Clearly, spurious clientid4 collisions are possible, which is
> > > >> the point of the whole algorithm.
> > > >>
> > > >>> Ditto for the verifier--it's a (boot time, counter) pair,
> > > >>
> > > >> I presume the same boot time format is used.  Correct?
> > > >>
> > > >>> with the second part coming from a different counter, but
> > > >>> normally I think they'll get incremented at the same time,
> > > >>
> > > >> It is true that they will both get incremented in many common (i.e.
> > > >> "normal") situations.
> > > >>
> > > >> However, this common case is not the only case.
> > > >>
> > > >> In particular, in the case in which one or more clients are
> > > >> following the guidance in RFC7931 and there are trunked server
> > > >> addresses, you will have occasions in which the verifier counter
> > > >> will get incremented and the clientid4 counter will not:
> > > >> 1.  If the SETCLIENTID is used to modify the callback information
> > > >> 2.  If two different SETCLIENTID operations are done by the same
> > > >> client
> > > >
> > > > Sure.
> > > >
> > > >>> so clientid and verifier are highly correlated.
> > > >>
> > > >> I don't think so.  The boot time portions are clearly the same,
> > > >> but given the likely occurrence of cases 1. and 2., these
> > > >> counters will typically have different values.  They may be
> > > >> correlated in the sense that the difference between the two
> > > >> values is likely not to be large, but that does not mean that the
> > > >> probability of them having the same value is necessarily
> > > >> substantial.
> > > >>
> > > >> Once there are any occasions in which the verifiers are
> > > >> incremented and the clientid's are not, these values are
> > > >> essentially uncorrelated.  The only way that they can collide at
> > > >> the same time the clientids collide is when the number of
> > > >> instances of the cases 1. and 2. above is a multiple of 2^32.
> > > >
> > > > I'm not following.  I think you may be confusing clientid-verifier
> > > > collisions with verifier-verifier (from different servers) collisions.
> > > > The latter is what matters.
> > > >
> > > > The confusion may be my fault.  I should have said something like:
> > > > chance of a collision between two clientid's is correlated with chance
> > > > of a collision between two verifiers given out by the same two servers.
> > > > So adding in a verifier comparison doesn't decrease the probability as
> > > > much as you'd expect.
> > > >
> > > >> While I think that RFC7931's "vanishingly small" is not correct, I
> > > >> would argue that this brings us into "highly unlikely" territory.
> > > >
> > > > So, unfortunately, we appear to have an actual real-life case
> > > > where this happens all the time.
> > > >
> > > >> Also, the existing text does suggest that you can repeat the
> > > >> procedure any number of times to reduce the likelihood of a
> > > >> spurious determination that two addresses are trunked.
> > > >
> > > > My intuition is that that would mean a lot of effort for a
> > > > disappointing reduction in the probability of a bug, but I
> > > > haven't done any experiments or thought this through much.
> > > >
> > > > --b.
> > > >
> > > >>> So, we can mitigate this by adding some randomness, OK.
> > > >>
> > > >> One version of that might be to start the counter fields with the
> > > >> nanoseconds portion of the boot time.
> > > >>
> > > >> Alternatively, you might keep the current counters, each
> > > >> starting at zero, but take advantage of the fact that clientid
> > > >> and verifier are used together.  To do that, the timestamp in the
> > > >> clientid4 might be the boot time in seconds, while the one in the
> > > >> verifier would be the nanoseconds within the second during which
> > > >> the boot occurred.  That would reduce the frequency of verifier
> > > >> collisions, but this might not be necessary if, as I expect, the
> > > >> algorithm will usually result in distinct verifiers anyway.
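
That seconds/nanoseconds split might be sketched as follows
(illustrative helper, not proposed spec text):

```python
def clientid_and_verifier(boot_ns, counter):
    """Sketch of the suggestion: the clientid4 timestamp is the boot
    time in whole seconds, while the verifier timestamp is the
    nanoseconds within that boot second, so servers booted in the same
    second still tend to produce distinct verifiers."""
    boot_sec = (boot_ns // 1_000_000_000) & 0xFFFFFFFF
    boot_nsec = boot_ns % 1_000_000_000
    clientid = (boot_sec << 32) | (counter & 0xFFFFFFFF)
    verifier = (boot_nsec << 32) | (counter & 0xFFFFFFFF)
    return clientid, verifier

# Two servers booted within the same second, with equal counters:
c1, v1 = clientid_and_verifier(1_474_418_731_000_000_123, 1)
c2, v2 = clientid_and_verifier(1_474_418_731_000_000_456, 1)
assert c1 == c2   # the clientid4s still collide...
assert v1 != v2   # ...but the verifiers differ
```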
> > > >>
> > > >>> But that's a new requirement not apparent from 3530,
> > > >>
> > > >> I think the 7931 text can be patched up via an errata, but if
> > > >> not we will have to consider how and whether to address the issue
> > > >> in 7530.  Unfortunately, that would be kind of big for an errata.
> > > >>
> > > >> If no existing servers have a substantial issue, then we do have
> > > >> the option of doing a longer-term RFC updating 7530.  This could
> > > >> be directed to the (probably now) hypothetical case of a server
> > > >> not using boot time and maintaining 64-bit global persistent
> > > >> counts.
> > > >>
> > > >> Such servers would have a higher probability of clientid4
> > > >> conflict and verifier conflict than the ones you mention, while
> > > >> staying within the rfc7530 requirements.
> > > >>
> > > >>> and I wouldn't be surprised if other servers have similar issues.
> > > >>
> > > >> Those who read this list have been notified of the issue by
> > > >> your message.
> > > >>
> > > >> The problem is with server implementers who don't read this
> > > >> list.  In theory, errata should take care of that, but aside from
> > > >> the difficulty of arriving at a suitable change and getting it
> > > >> through, it may be that many v4.0 implementers and maintainers
> > > >> may not be reading this list or paying much attention to errata
> > > >> either.  Perhaps we can discuss this in Westford, where many
> > > >> implementers, who might not be all that aware of stuff on the
> > > >> working group list, would be present.  That's also one way to see
> > > >> if there are existing servers where this is a big problem.
> > > >
> > > > OK!
> > > >
> > > > Assuming nobody's doing anything very complicated--it's probably also
> > > > not difficult to guess algorithms in testing.  Something we can do in
> > > > Westford, or maybe before if anyone has access to a variety of servers.
> > > >
> > > > --b.
> > > >
> > > >>
> > > >>
> > > >> On Wed, Sep 7, 2016 at 9:05 PM, J. Bruce Fields
> > > >> <bfields@fieldses.org> wrote:
> > > >>
> > > >>> On Wed, Sep 07, 2016 at 08:14:58PM -0400, David Noveck wrote:
> > > >>>>> But I can't find any further discussion of how a client might
> > > >>>>> make that determination.  Am I overlooking it?
> > > >>>>
> > > >>>> It's actually in section 5.8 of RFC7931.
> > > >>>
> > > >>> Oops, thanks!
> > > >>>
> > > >>> Looking at that, I think none of this is true:
> > > >>>
> > > >>>        Note that the NFSv4.0 specification requires the server
> > > >>>        to make sure that such verifiers are very unlikely to be
> > > >>>        regenerated.
> > > >>>
> > > >>> Verifiers given out by one server shouldn't be reused, sure,
> > > >>> but that's quite different from the claim that collisions
> > > >>> *between* servers are unlikely.
> > > >>>
> > > >>>        Given that it is already highly unlikely that the clientid4 XC
> > > >>>        is duplicated by distinct servers,
> > > >>>
> > > >>> Why is that highly unlikely?
> > > >>>
> > > >>>        the probability that SCn is
> > > >>>        duplicated as well has to be considered vanishingly small.
> > > >>>
> > > >>> There's no reason to believe the probability of a verifier
> > > >>> collision is uncorrelated with the probability of a clientid
> > > >>> collision.
> > > >>>
> > > >>> The Linux server is generating clientid as a (boot time,
> > > >>> counter) pair.  So collision between servers started at the same
> > > >>> time (probably not that unusual) is possible.
> > > >>>
> > > >>> Ditto for the verifier--it's a (boot time, counter) pair, with the
> > > >>> second part coming from a different counter, but normally I think
> > > >>> they'll get incremented at the same time, so clientid and
> > > >>> verifier are highly correlated.
> > > >>>
> > > >>> So, we can mitigate this by adding some randomness, OK.  But that's a
> > > >>> new requirement not apparent from 3530, and I wouldn't be
> > > >>> surprised if other servers have similar issues.
> > > >>>
> > > >>> --b.
> > > >>>
> > > >>>> I've only been keeping draft-ietf-nfsv4-migration-issues alive
> > > >>>> because of the section dealing with issues relating to v4.1.
> > > >>>> Otherwise, I would have let the thing expire.  The next time I
> > > >>>> update this, I'll probably collapse sections 4 and 5 to a short
> > > >>>> section saying that all the v4.0 issues were addressed by
> > > >>>> publication of RFC7931.
> > > >>>>
> > > >>>>> (It appears that the Linux client is trying to do that by sending
> > > >>>>> a setclientid_confirm to server1 using the (clientid, verifier)
> > > >>>>> returned from a setclientid reply from server2.  That doesn't look
> > > >>>>> correct to me.)
> > > >>>>
> > > >>>> It seems kind of weird, but the idea is that if you get the
> > > >>>> same clientid from two server IP addresses, they are probably
> > > >>>> connected to the same server (i.e. are trunked), though there
> > > >>>> is a chance that having the same clientid is a coincidence.
> > > >>>> The idea is that if these are the same server, using the
> > > >>>> verifier will work, but if they are different servers it won't
> > > >>>> work but will be harmless.
> > > >>>>
> > > >>>>
> > > >>>> On Wed, Sep 7, 2016 at 5:20 PM, J. Bruce Fields
> > > >>>> <bfields@fieldses.org> wrote:
> > > >>>>
> > > >>>>> In
> > > >>>>> https://tools.ietf.org/html/draft-ietf-nfsv4-migration-issues-10#section-5.4.2
> > > >>>>>
> > > >>>>>        In the face of possible trunking of server IP
> > > >>>>>        addresses, the client will use the receipt of the same
> > > >>>>>        clientid4 from multiple IP-addresses as an indication
> > > >>>>>        that the two IP-addresses may be trunked, and proceed
> > > >>>>>        to determine, from the observed server behavior,
> > > >>>>>        whether the two addresses are in fact trunked.
> > > >>>>>
> > > >>>>> But I can't find any further discussion of how a client might
> > > >>>>> make that determination.  Am I overlooking it?
> > > >>>>>
> > > >>>>> (It appears that the Linux client is trying to do that by
> > > >>>>> sending a setclientid_confirm to server1 using the
> > > >>>>> (clientid, verifier) returned from a setclientid reply from
> > > >>>>> server2.  That doesn't look correct to me.)
> > > >>>>>
> > > >>>>> --b.
> > > >>>>>
> > > >>>
> > > >
> > > > _______________________________________________
> > > > nfsv4 mailing list
> > > > nfsv4@ietf.org
> > > > https://www.ietf.org/mailman/listinfo/nfsv4
> > >
> >
> >