Return-Path: <davenoveck@gmail.com>
X-Original-To: nfsv4@ietfa.amsl.com
Delivered-To: nfsv4@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1])
 by ietfa.amsl.com (Postfix) with ESMTP id 5119412B0AF
 for <nfsv4@ietfa.amsl.com>; Wed, 21 Sep 2016 02:10:05 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -2.699
X-Spam-Level: 
X-Spam-Status: No, score=-2.699 tagged_above=-999 required=5
 tests=[BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1,
 DKIM_VALID_AU=-0.1, FREEMAIL_FROM=0.001, HTML_MESSAGE=0.001,
 RCVD_IN_DNSWL_LOW=-0.7, SPF_PASS=-0.001]
 autolearn=ham autolearn_force=no
Authentication-Results: ietfa.amsl.com (amavisd-new); dkim=pass (2048-bit key)
 header.d=gmail.com
Received: from mail.ietf.org ([4.31.198.44])
 by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024)
 with ESMTP id XOU9DT1J16KK for <nfsv4@ietfa.amsl.com>;
 Wed, 21 Sep 2016 02:10:00 -0700 (PDT)
Received: from mail-oi0-x236.google.com (mail-oi0-x236.google.com
 [IPv6:2607:f8b0:4003:c06::236])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (No client certificate requested)
 by ietfa.amsl.com (Postfix) with ESMTPS id 98BD612B360
 for <nfsv4@ietf.org>; Wed, 21 Sep 2016 02:10:00 -0700 (PDT)
Received: by mail-oi0-x236.google.com with SMTP id t83so52557346oie.3
 for <nfsv4@ietf.org>; Wed, 21 Sep 2016 02:10:00 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; 
 h=mime-version:in-reply-to:references:from:date:message-id:subject:to
 :cc; bh=wZ8EYXUrGwyAVWXS1xTMJ6oQq37ACEO4dcrb6r5i7Ks=;
 b=Yl3VuSXFP7KAnAZrYrSJoSABPydm3ZPmMSeOmu7wBRPfbFqY/fHNInhgVNwLQn4G9D
 O8LvZm/K1D07ty13n83fw9Y45ODS+9Mmbgvf/r0ZcBTckG1LA3wukPyXubSMOKKtwgYl
 TitnLYfxtA5CJyxpsiSHzKURGqS5S1YcQ42QoE4oHVk3gAeura10rO+gO4sIiitqzyyn
 Rl1zfUL6I9fSb0OcPI1gokK9FbZCm7vbBLLu2OlWAK5OLUm88mO4eg+gB+kipCFoqx0P
 e9VksJ5lO/UgprjfflqJNuuaPBG8SQv5D45u/LM5WBmzQYfmqL6kVMw+eQRJOnf0I28+
 MS6w==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20130820;
 h=x-gm-message-state:mime-version:in-reply-to:references:from:date
 :message-id:subject:to:cc;
 bh=wZ8EYXUrGwyAVWXS1xTMJ6oQq37ACEO4dcrb6r5i7Ks=;
 b=YFc1JHpotiGacAtv9kczC+PA2H+jMZwXk+e/dIHqaDVQmnoXDTLyHfhqGdeeaPOz6M
 dW52Rt5Ioq53+dXiXniO70n4ALQdWBbOMiVHMlCr8n3JJvM2Th+pGLDPsESBkkUTPy+r
 axjPeY7HGoIHsZ17X39M0WoKN7AoLBHIaiyg1z3wVUur5J+NiR72GUsfqmVHYlhRTqEM
 yP0yc7siAVC6M05i7yhSCW4Fouo4dbXHDltOjbRVZqayRuDMfl036YtM/EGyey06IPLT
 zFa88/J5Yc9Mz/OoCarubDglQjelVaZ9lv7+K57j2kxjRCP+pXHN1Mcnpcp0/qMt0ceh
 oFDA==
X-Gm-Message-State: AE9vXwOs41L1hUCZ7s6JJ5xboBLXK1+W8QsLTcZU2GD4mISRWfBqz134BHBkdcFlkitwkqdTjcaH51G4bsGqeg==
X-Received: by 10.202.65.10 with SMTP id o10mr42066116oia.147.1474448999770;
 Wed, 21 Sep 2016 02:09:59 -0700 (PDT)
MIME-Version: 1.0
Received: by 10.182.192.10 with HTTP; Wed, 21 Sep 2016 02:09:59 -0700 (PDT)
In-Reply-To: <20160921024531.GA17232@fieldses.org>
References: <20160907212039.GA6847@fieldses.org>
 <CADaq8jfiRU7DTRYXGHZvMALAZWeRjhqcpo8Si3_diMt_5dNSMw@mail.gmail.com>
 <20160908010532.GA10658@fieldses.org>
 <CADaq8jcnananUPDHH4Vzhv93JTxZegsZLMCtZWFD-keHheKvHA@mail.gmail.com>
 <20160910200355.GA30688@fieldses.org>
 <BBB2EBDC-6F05-44C1-B45A-C84C24A9AD7F@netapp.com>
 <20160920213931.GC12789@fieldses.org>
 <CADaq8jeqr9WPeBeV0ruxwy+5omfgHhJFjAR=tCGL3N-Yjt0PuQ@mail.gmail.com>
 <20160921024531.GA17232@fieldses.org>
From: David Noveck <davenoveck@gmail.com>
Date: Wed, 21 Sep 2016 05:09:59 -0400
Message-ID: <CADaq8jdUJG7zBwn7f2xtLO3gf4nvwjM2H00E6N2oTGP2b9Nzww@mail.gmail.com>
To: "J. Bruce Fields" <bfields@fieldses.org>
Content-Type: multipart/alternative; boundary=001a11477448defacf053d00e8c4
Archived-At: <https://mailarchive.ietf.org/arch/msg/nfsv4/LVOdDkmText3xu_CgVOr6fi7g3s>
Cc: "Adamson, Andy" <William.Adamson@netapp.com>,
 "nfsv4@ietf.org" <nfsv4@ietf.org>
Subject: Re: [nfsv4] 4.0 trunking
X-BeenThere: nfsv4@ietf.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: NFSv4 Working Group <nfsv4.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/nfsv4>,
 <mailto:nfsv4-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/nfsv4/>
List-Post: <mailto:nfsv4@ietf.org>
List-Help: <mailto:nfsv4-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/nfsv4>,
 <mailto:nfsv4-request@ietf.org?subject=subscribe>
X-List-Received-Date: Wed, 21 Sep 2016 09:10:05 -0000

--001a11477448defacf053d00e8c4
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

I think that works.  I'm going to look at how to update section 5.8 of
RFC7931.

On Tue, Sep 20, 2016 at 10:45 PM, J. Bruce Fields <bfields@fieldses.org>
wrote:

> On Tue, Sep 20, 2016 at 08:55:31PM -0400, David Noveck wrote:
> > >The thing is, though, that in that case we already have our
> > > answer at step 3:
> > > if X1 pointed to the same server as X2, then the
> > > SETCLIENTID to X1 would have to return a distinct verifier.
> >
> > True but the converse doesn't hold.
>
> Agreed, I was suggesting stopping here only in the case the verifiers
> match, and continuing otherwise.
>
> > If you get a distinct verifier,
> > it could be because X1 and X2 are the same server, but the other
> > possibility is that X1 and X2 are different servers and, if they are,
> > they may well happen to have different verifiers.
>
> Yes.
>
> > But I think you are on to something.  If the verifiers for X1 and X2 are
> > V1 and V2 respectively, if V2 works successfully then you have to go back
> > and use V1 on X1.  If the SETCLIENTID_CONFIRM to X2 using V2 succeeds, you
> > can go back and try another confirm to X1 using V2.  If X1 and X2 are
> > distinct, that should still succeed.  On the other hand, if they are the
> > same server the creation of V2 should replace V1 and it will become invalid.
>
> I think that would work, though what I was suggesting was a few less
> steps, roughly:
>
>          1. a SETCLIENTID to IP address X1 returns a clientid.  You
>          confirm it and carry on.
>
>          2.  Some time later a SETCLIENTID to IP address X2 returns the
>          same clientid, and verifier V2.  To check whether X1 and X2
>          actually point to the same server:
>
>          3. Send a callback-changing SETCLIENTID to X1.  Get back
>          (clientid, V1).  If V1 == V2, then X1 and X2 are different
>          servers.  Otherwise:
>
>          4. Send a SETCLIENTID_CONFIRM(clientid, V1) to X2.  If it
>          succeeds, they're the same, otherwise, they're different.
>
> --b.
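
The four steps above can be sketched as a small simulation.  This is only
a toy model under stated assumptions, not the Linux client or any real
server: MockServer, same_server, and the counter-based verifiers are
hypothetical names and behavior invented for illustration, and the model
assumes a server accepts only the verifier it issued most recently.

```python
import itertools


class MockServer:
    """Toy NFSv4.0 server (hypothetical): one clientid, counter verifiers."""

    def __init__(self, clientid, first_verifier):
        self.clientid = clientid
        self._verifiers = itertools.count(first_verifier)
        self.pending = None  # verifier most recently handed out

    def setclientid(self):
        # Each SETCLIENTID yields a fresh verifier replacing the previous one.
        self.pending = next(self._verifiers)
        return self.clientid, self.pending

    def setclientid_confirm(self, clientid, verifier):
        # Succeeds only for the verifier this server issued most recently.
        return clientid == self.clientid and verifier == self.pending


def same_server(x1, x2):
    """Return True if addresses x1 and x2 appear to reach one server."""
    # Step 1: SETCLIENTID to X1, confirm it and carry on.
    clientid, v = x1.setclientid()
    assert x1.setclientid_confirm(clientid, v)
    # Step 2: SETCLIENTID to X2 returns a clientid and verifier V2.
    clientid2, v2 = x2.setclientid()
    if clientid2 != clientid:
        return False  # different clientids: nothing to disambiguate
    # Step 3: callback-changing SETCLIENTID to X1 returns V1.
    _, v1 = x1.setclientid()
    if v1 == v2:
        return False  # a single server would have issued a distinct verifier
    # Step 4: confirm V1 against X2; success means one server.
    return x2.setclientid_confirm(clientid, v1)
```

With a single MockServer passed as both x1 and x2 the check reports a
match; with two servers that share a clientid and whose verifiers collide
at steps 2 and 3, it stops at step 3 and reports that they are different.
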
>
> >
> > > We can stop there.
> >
> > Or somewhere.
> >
> > On Tue, Sep 20, 2016 at 5:39 PM, J. Bruce Fields <bfields@fieldses.org>
> > wrote:
> >
> > > On Mon, Sep 12, 2016 at 12:59:05PM +0000, Adamson, Andy wrote:
> > > > IMHO we should not worry too much about NFSv4.0 trunking as NFSv4.1+
> > > > solves this issue. Trunking is simply one more reason to move from
> > > > NFSv4.0 to NFSv4.1.
> > >
> > > I tend to agree, but since it got implemented and was causing a real
> > > bug, it was hard to ignore....
> > >
> > > After a patch (55b9df93ddd6 "NFSv4/v4.1: Verify the client owner id
> > > during trunking detection", if anyone cares) the damage seems
> > > restricted to the non-default "migration" case, making this less of a
> > > concern.
> > >
> > > I wonder if there's an easy fix, though: if I'm reading rfc 7931 right,
> > > it works (slightly simplified) like this:
> > >
> > >         1. a SETCLIENTID to IP address X1 returns a clientid.  You
> > >         confirm it and carry on.
> > >
> > >         2.  Some time later a SETCLIENTID to IP address X2 returns the
> > >         same clientid.  To check whether X1 and X2 actually point to
> > >         the same server:
> > >
> > >         3. Send a callback-changing SETCLIENTID to X1.  Get back
> > >         (clientid, verifier).
> > >
> > >         4. Confirm it with a SETCLIENTID_CONFIRM to X2.
> > >
> > > If the SETCLIENTID_CONFIRM succeeds then they're the same server.
> > > Servers are required to return an error (STALE_CLIENTID, I think?) if
> > > the verifier in the SETCLIENTID_CONFIRM arguments doesn't match.
> > >
> > > But, if the SETCLIENTID at step 2 just happened to return the same
> > > verifier as the SETCLIENTID at step 3, then the SETCLIENTID_CONFIRM at
> > > step 4 will succeed even if the two servers are different.  Hence the
> > > bug.
> > >
> > > The thing is, though, that in that case we already have our answer at
> > > step 3: if X1 pointed to the same server as X2, then the SETCLIENTID to
> > > X1 would have to return a distinct verifier.  We can stop there.
> > >
> > > With that addition, don't we get the right answer every time, without
> > > any unjustified assumptions about the randomness of server's clientid
> > > and verifier generation?  Or am I missing something?
> > >
> > > --b.
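
The false positive described above can be reproduced in a toy model.
This is a sketch under assumptions, not real client or server code:
ToyServer and original_check are hypothetical names, verifiers are
modeled as (boot time, counter) tuples, and the two servers are assumed
to have booted in the same second and to hand out the same clientid.

```python
class ToyServer:
    """Hypothetical server issuing (boot_time, counter) verifiers."""

    def __init__(self, boot_time, clientid):
        self.boot_time = boot_time
        self.clientid = clientid
        self.counter = 0
        self.pending = None  # verifier most recently handed out

    def setclientid(self):
        self.counter += 1
        self.pending = (self.boot_time, self.counter)
        return self.clientid, self.pending

    def setclientid_confirm(self, clientid, verifier):
        return clientid == self.clientid and verifier == self.pending


def original_check(x1, x2):
    """Steps 1-4 as summarized above, with no verifier comparison."""
    clientid, v = x1.setclientid()               # step 1
    x1.setclientid_confirm(clientid, v)
    clientid2, _v2 = x2.setclientid()            # step 2
    if clientid2 != clientid:
        return False
    _, v1 = x1.setclientid()                     # step 3
    return x2.setclientid_confirm(clientid, v1)  # step 4


boot = 1474448999  # same boot second on both servers (assumed)
a = ToyServer(boot, clientid=(boot, 1))
b = ToyServer(boot, clientid=(boot, 1))
b.setclientid()  # b has already served one SETCLIENTID from someone else
false_positive = original_check(a, b)  # True although a and b are distinct
```

In this model the verifier issued by b at step 2 equals the one issued by
a at step 3, so the confirm at step 4 succeeds against the wrong server.
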
> > >
> > > >
> > > > —>Andy
> > > >
> > > >
> > > > > On Sep 10, 2016, at 4:03 PM, J. Bruce Fields <bfields@fieldses.org>
> > > > > wrote:
> > > > >
> > > > > On Sat, Sep 10, 2016 at 03:38:29PM -0400, David Noveck wrote:
> > > > >> Thanks for pointing this out.
> > > > >>
> > > > > >> Before I address the details of this, let me state my overall
> > > > > >> position:
> > > > > >>
> > > > > >>   - I think it is correct that RFC7931 overstates the degree of
> > > > > >>   assurance that one might reasonably have regarding the
> > > > > >>   unlikelihood of spurious collision of clientid4's and verifiers.
> > > > > >>   - Nevertheless, I don't think the situation is as bleak as you
> > > > > >>   paint it, with regard to the Linux server approach to these
> > > > > >>   matters that you describe.  This is basically because I don't
> > > > > >>   see the issue of the correlation between clientid4's and
> > > > > >>   verifiers the same way that you do. See below.
> > > > > >>   - I think it is possible to address the issue satisfactorily in
> > > > > >>   the context of an rfc7931 errata and will start working on that.
> > > > > >>   - Given the weakness of the rfc 3530/7530 requirements in this
> > > > > >>   area, we may need (yet another) RFC updating 7530 at some point.
> > > > > >>   - I see that as a longer-term effort, since the practices that
> > > > > >>   you describe will not result in large numbers of spurious
> > > > > >>   collisions and clients can adapt the algorithm to require
> > > > > >>   additional verification.
> > > > > >>   - If there are servers out there whose practices are
> > > > > >>   significantly more troublesome than the ones you describe, we
> > > > > >>   need to find out soon.  Given that many of those responsible for
> > > > > >>   v4.0 implementations may not be reading this list, I suggest we
> > > > > >>   discuss this at the October Bakeathon.
> > > > >>
> > > > >>
> > > > > >>> The Linux server is generating clientid as a (boot time, counter)
> > > > > >>> pair.
> > > > > >>
> > > > > >> I'm assuming the boot time is in the form of a 32-bit unsigned
> > > > > >> number of seconds after some fixed date/time.  Correct?
> > > > > >>
> > > > > >> If that is the case, duplicates would require that:
> > > > > >> 1. Two servers be booted during the same second (e.g. when power
> > > > > >> came back on after a disruption).
> > > > > >
> > > > > > Right, or routine maintenance, or whatever.
> > > > > >
> > > > > >> 2. Each of the servers has received the same number of client
> > > > > >> SETCLIENTIDs (mod 4 billion).
> > > > > >>
> > > > > >> Clearly this is possible although I would say that it is unlikely.
> > > > > >> Nevertheless, the "highly unlikely" in RFC7931 is overstating it.
> > > > >
> > > > > > It turns out that if you have a hundred servers that get rebooted
> > > > > > simultaneously for a kernel update or some similar routine
> > > > > > maintenance, this happens every time.  (I'm a step removed from the
> > > > > > original case here, but I believe this is more-or-less accurate and
> > > > > > not hypothetical.)
> > > > > >
> > > > > > Pretty sure you can easily hit this without that big a setup, too.
> > > > >
> > > > > >> The case that would be more worrisome would be one in which a
> > > > > >> server simply used a 64-bit word of byte-addressable persistent
> > > > > >> memory (e.g. NVDIMM, 3d xpoint) to maintain a permanent counter,
> > > > > >> dispensing with the boot time.
> > > > > >
> > > > > > That'd be something worth looking into in cases the users have the
> > > > > > right hardware (not always).  The workaround I'm going with for now
> > > > > > is initializing the counter part to something random.  (Random
> > > > > > numbers may not be completely reliable at boot time either, but I
> > > > > > suspect they'll be good enough here...).
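
To make the collision and the workaround concrete, here is a minimal
sketch.  It is not the Linux server's code: make_clientid_source is a
hypothetical helper, the clientid is modeled as a (boot time, counter)
tuple with a 32-bit counter, and Python's secrets module stands in for
whatever randomness a server would have available at boot.

```python
import secrets


def make_clientid_source(boot_time, randomize=False):
    """Return a function yielding successive (boot_time, counter) clientids.

    With randomize=False the counter starts at zero, so two servers booted
    in the same second hand out identical clientids in lockstep.  With
    randomize=True the counter starts at a random 32-bit value.
    """
    counter = secrets.randbits(32) if randomize else 0

    def next_clientid():
        nonlocal counter
        counter = (counter + 1) % 2**32  # 32-bit wraparound
        return (boot_time, counter)

    return next_clientid


boot = 1474448999  # two servers rebooted in the same second (assumed)
server_a = make_clientid_source(boot)
server_b = make_clientid_source(boot)
collides = server_a() == server_b()  # both return (boot, 1)

random_a = make_clientid_source(boot, randomize=True)
random_b = make_clientid_source(boot, randomize=True)
```

In this model the randomized sources collide only if their starting
counters happen to line up, which for two servers is on the order of one
chance in 2^32 per comparison, assuming the random source is sound.
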
> > > > >
> > > > > >> That is allowable as RFC7530 stands.  I don't expect that this is a
> > > > > >> current problem but, given where persistent memory is going (i.e.
> > > > > >> cheaper and more common), we could see this in the future.
> > > > > >>
> > > > > >> Clearly, spurious clientid4 collisions are possible, which is the
> > > > > >> point of the whole algorithm.
> > > > >>
> > > > >>> Ditto for the verifier--it's a (boot time, counter) pair,
> > > > >>
> > > > >> I presume the same boot time format is used.  Correct?
> > > > >>
> > > > >>> with the second part coming from a different counter, but
> > > > >>> normally I think they'll get incremented at the same time,
> > > > >>
> > > > > >> It is true that they will both get incremented in many common
> > > > > >> (i.e. "normal") situations.
> > > > > >>
> > > > > >> However, this common case is not the only case.
> > > > > >>
> > > > > >> In particular, in the case in which one or more clients are
> > > > > >> following the guidance in RFC7931 and there are trunked server
> > > > > >> addresses, you will have occasions in which the verifier counter
> > > > > >> will get incremented and the clientid4 counter will not.
> > > > > >> 1.  If the SETCLIENTID is used to modify the callback information
> > > > > >> 2.  If two different SETCLIENTID operations are done by the same
> > > > > >> client
> > > > >
> > > > > Sure.
> > > > >
> > > > >>> so clientid and verifier are highly correlated.
> > > > >>
> > > > > >> I don't think so.  The boot time portions are clearly the same,
> > > > > >> but given the likely occurrence of cases 1. and 2. these counters
> > > > > >> will typically have different values.  They may be correlated in
> > > > > >> the sense that the difference between the two values is likely not
> > > > > >> to be large, but that does not mean that the probability of them
> > > > > >> having the same value is necessarily substantial.
> > > > > >>
> > > > > >> Once there are any occasions in which the verifiers are
> > > > > >> incremented and the clientid's are not, these values are
> > > > > >> essentially uncorrelated.  The only way that they can collide at
> > > > > >> the same time the clientids collide is when the number of instances
> > > > > >> of the cases 1. and 2. above is a multiple of 2^32.
> > > > >
> > > > > > I'm not following.  I think you may be confusing clientid-verifier
> > > > > > collisions with verifier-verifier (from different server)
> > > > > > collisions.  The latter is what matters.
> > > > > >
> > > > > > The confusion may be my fault.  I should have said something like:
> > > > > > chance of a collision between two clientid's is correlated with
> > > > > > chance of a collision between two verifiers given out by the same
> > > > > > two servers.  So adding in a verifier comparison doesn't decrease
> > > > > > the probability as much as you'd expect.
> > > > >
> > > > > >> While I think that RFC7931's "vanishingly small" is not correct, I
> > > > > >> would argue that this brings us into "highly unlikely" territory.
> > > > > >
> > > > > > So, unfortunately, we appear to have an actual real-life case where
> > > > > > this happens all the time.
> > > > > >
> > > > > >> Also, the
> > > > > >> existing text does suggest that you can repeat the procedure any
> > > > > >> number of times to reduce the likelihood of a spurious
> > > > > >> determination that two addresses are trunked.
> > > > > >
> > > > > > My intuition is that that would mean a lot of effort for a
> > > > > > disappointing reduction in the probability of a bug, but I haven't
> > > > > > done any experiments or thought this through much.
> > > > >
> > > > > --b.
> > > > >
> > > > >>> So, we can mitigate this by adding some randomness, OK.
> > > > >>
> > > > > >> One version of that might be to start the counter fields with the
> > > > > >> nanoseconds portion of the boot time.
> > > > > >>
> > > > > >> Alternatively, you might keep the current counters each starting at
> > > > > >> zero but take advantage of the fact that clientid and verifier are
> > > > > >> used together.  To do that, the timestamp in clientid4 might be
> > > > > >> boot time in seconds while the one in the verifier would be boot
> > > > > >> time nanoseconds within the second during which the boot occurred.
> > > > > >> That would reduce the frequency of verifier collisions but this
> > > > > >> might not be necessary, if, as I expect, the algorithm will usually
> > > > > >> result in distinct verifiers anyway.
> > > > >>
> > > > >>> But that's a new requirement not apparent from 3530,
> > > > >>
> > > > > >> I think the 7931 text can be patched up via an errata, but if not
> > > > > >> we will have to consider how and whether to address the issue in
> > > > > >> 7530.  Unfortunately, that would be kind of big for an errata.
> > > > > >>
> > > > > >> If no existing servers have a substantial issue, then we do have
> > > > > >> the option of doing a longer-term RFC updating 7530.  This could be
> > > > > >> directed to the (probably now) hypothetical case of a server not
> > > > > >> using boot time and maintaining 64-bit global persistent counts.
> > > > > >>
> > > > > >> Such servers would have a higher probability of clientid4 conflict
> > > > > >> and verifier conflict than the ones you mention, while staying
> > > > > >> within the rfc7530 requirements.
> > > > >>
> > > > > >>> and I wouldn't be surprised if other servers have similar issues.
> > > > > >>
> > > > > >> Those who read this list have been notified of the issue by your
> > > > > >> message.
> > > > > >>
> > > > > >> The problem is with server implementers who don't read this list.
> > > > > >> In theory, errata should take care of that, but aside from the
> > > > > >> difficulty of arriving at a suitable change and getting it through,
> > > > > >> it may be that many v4.0 implementers and maintainers are not
> > > > > >> reading this list or paying much attention to errata either.
> > > > > >> Perhaps we can discuss this in Westford where many implementers,
> > > > > >> who might not be all that aware of stuff on the working group list,
> > > > > >> would be present.  That's also one way to see if there are existing
> > > > > >> servers where this is a big problem.
> > > > > >
> > > > > > OK!
> > > > > >
> > > > > > Assuming nobody's doing anything very complicated--it's probably
> > > > > > also not difficult to guess algorithms in testing.  Something we can
> > > > > > do in Westford, or maybe before if anyone has access to a variety of
> > > > > > servers.
> > > > >
> > > > > --b.
> > > > >
> > > > >>
> > > > >>
> > > > > >> On Wed, Sep 7, 2016 at 9:05 PM, J. Bruce Fields <bfields@fieldses.org>
> > > > > >> wrote:
> > > > >>
> > > > >>> On Wed, Sep 07, 2016 at 08:14:58PM -0400, David Noveck wrote:
> > > > > >>>>> But I can't find any further discussion of how a client might
> > > > > >>>>> make that determination.  Am I overlooking it?
> > > > >>>>
> > > > >>>> It's actually in section 5.8 of RFC7931.
> > > > >>>
> > > > >>> Oops, thanks!
> > > > >>>
> > > > >>> Looking at that, I think none of this is true:
> > > > >>>
> > > > > >>>        Note that the NFSv4.0 specification requires the server to
> > > > > >>>        make sure that such verifiers are very unlikely to be
> > > > > >>>        regenerated.
> > > > > >>>
> > > > > >>> Verifiers given out by one server shouldn't be reused, sure, but
> > > > > >>> that's quite different from the claim that collisions *between*
> > > > > >>> servers are unlikely.
> > > > >>>
> > > > > >>>        Given that it is already highly unlikely that the clientid4
> > > > > >>>        XC is duplicated by distinct servers,
> > > > > >>>
> > > > > >>> Why is that highly unlikely?
> > > > > >>>
> > > > > >>>        the probability that SCn is
> > > > > >>>        duplicated as well has to be considered vanishingly small.
> > > > > >>>
> > > > > >>> There's no reason to believe the probability of a verifier
> > > > > >>> collision is uncorrelated with the probability of a clientid
> > > > > >>> collision.
> > > > > >>>
> > > > > >>> The Linux server is generating clientid as a (boot time, counter)
> > > > > >>> pair.  So collision between servers started at the same time
> > > > > >>> (probably not that unusual) is possible.
> > > > > >>>
> > > > > >>> Ditto for the verifier--it's a (boot time, counter) pair, with the
> > > > > >>> second part coming from a different counter, but normally I think
> > > > > >>> they'll get incremented at the same time, so clientid and verifier
> > > > > >>> are highly correlated.
> > > > > >>>
> > > > > >>> So, we can mitigate this by adding some randomness, OK.  But
> > > > > >>> that's a new requirement not apparent from 3530, and I wouldn't be
> > > > > >>> surprised if other servers have similar issues.
> > > > >>>
> > > > >>> --b.
> > > > >>>
> > > > > >>>> I've only been keeping draft-ietf-nfsv4-migration-issues alive
> > > > > >>>> because of the section dealing with issues relating to v4.1.
> > > > > >>>> Otherwise, I would have let the thing expire.  The next time I
> > > > > >>>> update this, I'll probably collapse sections 4 and 5 to a short
> > > > > >>>> section saying that all the v4.0 issues were addressed by
> > > > > >>>> publication of RFC7931.
> > > > > >>>>
> > > > > >>>>> (It appears that the Linux client is trying to do that by sending
> > > > > >>>>> a setclientid_confirm to server1 using the (clientid,verifier)
> > > > > >>>>> returned from a setclientid reply from server2.  That doesn't
> > > > > >>>>> look correct to me.)
> > > > > >>>>
> > > > > >>>> It seems kind of weird but the idea is that if you get the same
> > > > > >>>> clientid from two server IP addresses they are probably connected
> > > > > >>>> to the same server (i.e. are trunked), but there is a chance that
> > > > > >>>> having the same clientid is a coincidence.  The idea is that if
> > > > > >>>> these are the same server using the verifier will work but if they
> > > > > >>>> are different servers it won't work but will be harmless.
> > > > >>>>
> > > > >>>>
> > > > > >>>> On Wed, Sep 7, 2016 at 5:20 PM, J. Bruce Fields <bfields@fieldses.org>
> > > > > >>>> wrote:
> > > > >>>>
> > > > > >>>>> In
> > > > > >>>>> https://tools.ietf.org/html/draft-ietf-nfsv4-migration-issues-10#section-5.4.2
> > > > > >>>>>
> > > > > >>>>>        In the face of possible trunking of server IP addresses,
> > > > > >>>>>        the client will use the receipt of the same clientid4 from
> > > > > >>>>>        multiple IP-addresses, as an indication that the two
> > > > > >>>>>        IP-addresses may be trunked and proceed to determine, from
> > > > > >>>>>        the observed server behavior whether the two addresses are
> > > > > >>>>>        in fact trunked.
> > > > > >>>>>
> > > > > >>>>> But I can't find any further discussion of how a client might
> > > > > >>>>> make that determination.  Am I overlooking it?
> > > > > >>>>>
> > > > > >>>>> (It appears that the Linux client is trying to do that by sending
> > > > > >>>>> a setclientid_confirm to server1 using the (clientid,verifier)
> > > > > >>>>> returned from a setclientid reply from server2.  That doesn't
> > > > > >>>>> look correct to me.)
> > > > >>>>>
> > > > >>>>> --b.
> > > > >>>>>
> > > > >>>
> > > > >
> > > > > _______________________________________________
> > > > > nfsv4 mailing list
> > > > > nfsv4@ietf.org
> > > > > https://www.ietf.org/mailman/listinfo/nfsv4
> > >
>

--001a11477448defacf053d00e8c4
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

<div dir=3D"ltr">I think that works.=C2=A0 I&#39;m going to look at how to =
update section 5.8 of RFC7931.=C2=A0</div><div class=3D"gmail_extra"><br><d=
iv class=3D"gmail_quote">On Tue, Sep 20, 2016 at 10:45 PM, J. Bruce Fields =
<span dir=3D"ltr">&lt;<a href=3D"mailto:bfields@fieldses.org" target=3D"_bl=
ank">bfields@fieldses.org</a>&gt;</span> wrote:<br><blockquote class=3D"gma=
il_quote" style=3D"margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-lef=
t:1ex"><span class=3D"">On Tue, Sep 20, 2016 at 08:55:31PM -0400, David Nov=
eck wrote:<br>
&gt; &gt;The thing is, though, that in that case we already have our<br>
&gt; &gt; answer at step 3:<br>
&gt; &gt; if X1 pointed to the same server as X2, then the<br>
&gt; &gt; SETCLIENTID to X1 would have to return a distinct verifier.<br>
&gt;<br>
&gt; True but the converse doesn&#39;t hold.<br>
<br>
</span>Agreed, I was suggesting stopping here only in the case the verifier=
s<br>
match, and continuing otherwise.<br>
<span class=3D""><br>
&gt; If you get a distinct verifier,<br>
&gt; it could be because X1 and X2 are the same server, but the other<br>
&gt; possibility is that X1 and X2 are different servers and, if they are, =
they<br>
&gt; may well happen to have different verifiers.<br>
<br>
</span>Yes.<br>
<span class=3D""><br>
&gt; But I think you are on to something.=C2=A0 If the verifiers for X1 and=
 X2 are V1<br>
&gt; and V2 respectively, if V2 works successfully then you have to go back=
 and<br>
&gt; use V1 on X1.=C2=A0 If the SETCLIENTID_CONFIRM to X2 using V2 succeeds=
, you can<br>
&gt; go back and try another confirm to X1 using V2.=C2=A0 If X1 and X2 are=
 distinct,<br>
&gt; that should still succeed.=C2=A0 On the other hand, if they are the sa=
me server<br>
&gt; the creation of V2 should replace V1 and it will become invalid.<br>
<br>
</span>I think that would work, though what I was suggesting was a few less=
<br>
steps, roughly:<br>
<span class=3D""><br>
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A01. a SETCLIENTID to IP address X1 returns=
 a clientid.=C2=A0 You<br>
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0confirm it and carry on.<br>
<br>
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A02.=C2=A0 Some time later a SETCLIENTID to=
 IP address X2 returns the<br>
</span>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0same clientid, and verifier V2.=C2=
=A0 To check whether X1 and X2<br>
<span class=3D"">=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0actually point to the sa=
me server:<br>
<br>
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A03. Send a callback-changing SETCLIENTID t=
o X1.=C2=A0 Get back<br>
</span>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0(clientid, V1).=C2=A0 If V1 =3D=3D=
 V2, then X1 and X2 are different<br>
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0servers.=C2=A0 Otherwise:<br>
<br>
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A04. Send a SETCLIENTID_CONFIRM(clientid, V=
1) to X2.=C2=A0 If it<br>
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0succeeds, they&#39;re the same, otherwise=
, they&#39;re different.<br>
<span class=3D"HOEnZb"><font color=3D"#888888"><br>
--b.<br>
</font></span><div class=3D"HOEnZb"><div class=3D"h5"><br>
&gt;<br>
&gt; &gt; We can stop there.<br>
&gt;<br>
&gt; Or somewhere.<br>
&gt;<br>
&gt; On Tue, Sep 20, 2016 at 5:39 PM, J. Bruce Fields &lt;<a href=3D"mailto=
:bfields@fieldses.org">bfields@fieldses.org</a>&gt;<br>
&gt; wrote:<br>
&gt;<br>
&gt; &gt; On Mon, Sep 12, 2016 at 12:59:05PM +0000, Adamson, Andy wrote:<br=
>
&gt; &gt; &gt; IMHO we should not worry too much about NFSv4.0 trunking as =
NFSv4.1+<br>
&gt; &gt; solves this issue. Trunking is simply one more reason to move fro=
m NFSv4.0<br>
&gt; &gt; to NFSv4.1.<br>
&gt; &gt;<br>
&gt; &gt; I tend to agree, but since it got implemented an was causing a re=
al bug,<br>
&gt; &gt; it was hard to ignore....<br>
&gt; &gt;<br>
&gt; &gt; After a patch (55b9df93ddd6 &quot;NFSv4/v4.1: Verify the client o=
wner id<br>
&gt; &gt; during trunking detection&quot;, if anyone cares) the damage seem=
s restricted<br>
&gt; &gt; to the non-default &quot;migration&quot; case, making this less o=
f a concern.<br>
&gt; &gt;<br>
&gt; &gt; I wonder if there&#39;s an easy fix, though: if I&#39;m reading r=
fc 7931 right,<br>
&gt; &gt; it works (slightly simplified) like this:<br>
&gt; &gt;<br>
&gt; &gt;=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A01. a SETCLIENTID to IP address X=
1 returns a clientid.=C2=A0 You<br>
&gt; &gt;=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0confirm it and carry on.<br>
&gt; &gt;<br>
&gt; &gt;=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A02.=C2=A0 Some time later a SETCL=
IENTID to IP address X2 returns the<br>
&gt; &gt;=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0same clientid.=C2=A0 To check wh=
ether X1 and X2 actually point to the<br>
&gt; &gt;=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0same server:<br>
&gt; &gt;<br>
&gt; &gt;=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A03. Send a callback-changing SETC=
LIENTID to X1.=C2=A0 Get back<br>
&gt; &gt;=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0(clientid, verifer).<br>
&gt; &gt;<br>
&gt; &gt;=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A04. Confirm it with a SETCLIENTID=
_CONFIRM to X2.<br>
&gt; &gt;<br>
&gt; &gt; If the SETCLIENTID_CONFIRM succeeds then they&#39;re the same ser=
ver.<br>
&gt; &gt; Servers are required to return an error (STALE_CLIENTID, I think?=
) if<br>
&gt; &gt; the verifier in the SETCLIENTID_CONFIRM arguments doesn&#39;t mat=
ch.<br>
&gt; &gt;<br>
&gt; &gt; But, if the SETCLIENTID at step 2 just happened to return the sam=
e<br>
&gt; &gt; verifier as the SETCLIENTID at step 3, then the SETCLIENTID_CONFI=
RM at<br>
&gt; &gt; step 4 will succeed even if the two servers are different.=C2=A0 =
Hence the<br>
&gt; &gt; bug.<br>
&gt; &gt;<br>
&gt; &gt; The thing is, though, that in that case we already have our answe=
r at<br>
&gt; &gt; step 3: if X1 pointed to the same server as X2, then the SETCLIEN=
TID to<br>
&gt; &gt; X1 would have to return a distinct verifier.=C2=A0 We can stop th=
ere.<br>
&gt; &gt;<br>
&gt; &gt; With that addition, don&#39;t we get the right answer every time,=
 without<br>
&gt; &gt; any unjustified assumptions about the randomness of server&#39;s =
clientid<br>
&gt; &gt; and verifier generation?=C2=A0 Or am I missing something?<br>
&gt; &gt;<br>
&gt; &gt; --b.<br>
&gt; &gt;<br>
&gt; &gt; &gt;<br>
&gt; &gt; &gt; =E2=80=94&gt;Andy<br>
&gt; &gt; &gt;<br>
&gt; &gt; &gt;<br>
&gt; &gt; &gt; &gt; On Sep 10, 2016, at 4:03 PM, J. Bruce Fields &lt;<a hre=
f=3D"mailto:bfields@fieldses.org">bfields@fieldses.org</a>&gt;<br>
&gt; &gt; wrote:<br>
&gt; &gt; &gt; &gt;<br>
&gt; &gt; &gt; &gt; On Sat, Sep 10, 2016 at 03:38:29PM -0400, David Noveck =
wrote:<br>
&gt; &gt; &gt; &gt;&gt; Thanks for pointing this out.<br>
&gt; &gt; &gt; &gt;&gt;<br>
&gt; &gt; &gt; &gt;&gt; Before I address the details of this, let me state =
my overall<br>
&gt; &gt; position:<br>
&gt; &gt; &gt; &gt;&gt;<br>
&gt; &gt; &gt; &gt;&gt;=C2=A0 =C2=A0- I think it is correct that RFC7931 ov=
erstates the degree of<br>
&gt; &gt; assurance<br>
&gt; &gt; &gt; &gt;&gt;=C2=A0 =C2=A0that one might reasonably have regardin=
g the unlikelihood of<br>
&gt; &gt; spurious<br>
&gt; &gt; &gt; &gt;&gt;=C2=A0 =C2=A0collision of clientid4&#39;s and verifi=
ers.<br>
&gt; &gt; &gt; &gt;&gt;=C2=A0 =C2=A0- Nevertheless, I don&#39;t think the s=
ituation is as bleak as you<br>
&gt; &gt; paint it,<br>
&gt; &gt; &gt; &gt;&gt;=C2=A0 =C2=A0with regard to the Linux server approac=
h to these matters that you<br>
&gt; &gt; &gt; &gt;&gt;=C2=A0 =C2=A0describe.=C2=A0 This is basically becau=
se I don&#39;t see the issue of the<br>
&gt; &gt; &gt; &gt;&gt;=C2=A0 =C2=A0correlation between clientid4&#39;s and=
 verifiers the same way that you<br>
&gt; &gt; do. See<br>
&gt; &gt; &gt; &gt;&gt;=C2=A0 =C2=A0below.<br>
&gt; &gt; &gt; &gt;&gt;=C2=A0 =C2=A0- I think it is possible to address the=
 issue satisfactorily in the<br>
&gt; &gt; &gt; &gt;&gt;=C2=A0 =C2=A0context of an rfc7931 errata and=
 will start working on that.<br>
&gt; &gt; &gt; &gt;&gt;=C2=A0 =C2=A0- Given the weakness of the rfc 3530/75=
30 requirements in this<br>
&gt; &gt; area, we<br>
&gt; &gt; &gt; &gt;&gt;=C2=A0 =C2=A0may need (yet another) RFC updating 753=
0 at some point.<br>
&gt; &gt; &gt; &gt;&gt;=C2=A0 =C2=A0- I see that as a longer-term effort, s=
ince the practices that you<br>
&gt; &gt; &gt; &gt;&gt;=C2=A0 =C2=A0describe will not result in large numbe=
rs of spurious collisions and<br>
&gt; &gt; &gt; &gt;&gt;=C2=A0 =C2=A0clients can adapt the algorithm to requ=
ire additional verification.<br>
&gt; &gt; &gt; &gt;&gt;=C2=A0 =C2=A0- If there are servers out there whose =
practices are significantly<br>
&gt; &gt; more<br>
&gt; &gt; &gt; &gt;&gt;=C2=A0 =C2=A0troublesome than the ones you describe,=
 we need to find out soon.<br>
&gt; &gt; Given<br>
&gt; &gt; &gt; &gt;&gt;=C2=A0 =C2=A0that many of those responsible for v4.0=
 implementations may not be<br>
&gt; &gt; reading<br>
&gt; &gt; &gt; &gt;&gt;=C2=A0 =C2=A0this list, I suggest we discuss this at=
 the October Bakeathon.<br>
&gt; &gt; &gt; &gt;&gt;<br>
&gt; &gt; &gt; &gt;&gt;<br>
&gt; &gt; &gt; &gt;&gt;&gt; The Linux server is generating clientid as a (b=
oot time, counter)<br>
&gt; &gt; pair.<br>
&gt; &gt; &gt; &gt;&gt;<br>
&gt; &gt; &gt; &gt;&gt; I&#39;m assuming the boot time is in the form of a =
32-bit unsigned number<br>
&gt; &gt; of<br>
&gt; &gt; &gt; &gt;&gt; seconds after some fixed date/time.=C2=A0 Correct?<=
br>
&gt; &gt; &gt; &gt;&gt;<br>
&gt; &gt; &gt; &gt;&gt; If that is the case, duplicates would require that:=
<br>
&gt; &gt; &gt; &gt;&gt; 1. Two servers be booted during the same second (e.=
g. when power came<br>
&gt; &gt; back<br>
&gt; &gt; &gt; &gt;&gt; on after a disruption).<br>
&gt; &gt; &gt; &gt;<br>
&gt; &gt; &gt; &gt; Right, or routine maintenance, or whatever.<br>
&gt; &gt; &gt; &gt;<br>
&gt; &gt; &gt; &gt;&gt; 2. Each of the servers has received the same number=
 of client<br>
&gt; &gt; SETCLIENTIDs<br>
&gt; &gt; &gt; &gt;&gt; (mod 4 billion).<br>
&gt; &gt; &gt; &gt;&gt;<br>
&gt; &gt; &gt; &gt;&gt; Clearly this is possible although I would say that =
it is unlikely.<br>
&gt; &gt; &gt; &gt;&gt; Nevertheless, the &quot;highly unlikely&quot; in RF=
C7931 is overstating it.<br>
&gt; &gt; &gt; &gt;<br>
&gt; &gt; &gt; &gt; It turns out that if you have a hundred servers that ge=
t rebooted<br>
&gt; &gt; &gt; &gt; simultaneously for a kernel update or some similar rout=
ine maintenance,<br>
&gt; &gt; &gt; &gt; this happens every time.=C2=A0 (I&#39;m a step removed =
from the original case<br>
&gt; &gt; &gt; &gt; here, but I believe this is more-or-less accurate and n=
ot<br>
&gt; &gt; hypothetical.)<br>
&gt; &gt; &gt; &gt;<br>
&gt; &gt; &gt; &gt; Pretty sure you can easily hit this without that big a =
setup, too.<br>
&gt; &gt; &gt; &gt;<br>
&gt; &gt; &gt; &gt;&gt; The case that would be more worrisome would be one =
in which a server<br>
&gt; &gt; simply<br>
&gt; &gt; &gt; &gt;&gt; used a 64-bit word of byte-addressable persistent m=
emory (e.g.<br>
&gt; &gt; NVDIMM, 3d<br>
&gt; &gt; &gt; &gt;&gt; xpoint) to maintain a permanent counter, dispensing=
 with the boot<br>
&gt; &gt; time.<br>
&gt; &gt; &gt; &gt;<br>
&gt; &gt; &gt; &gt; That&#39;d be something worth looking into in cases =
where the users have the<br>
&gt; &gt; right<br>
&gt; &gt; &gt; &gt; hardware (not always).=C2=A0 The workaround I&#39;m goi=
ng with for now is<br>
&gt; &gt; &gt; &gt; initializing the counter part to something random.=C2=
=A0 (Random numbers may<br>
&gt; &gt; &gt; &gt; not be completely reliable at boot time either, but I s=
uspect they&#39;ll<br>
&gt; &gt; be<br>
&gt; &gt; &gt; &gt; good enough here...).<br>
&gt; &gt; &gt; &gt;<br>
&gt; &gt; &gt; &gt;&gt; That is allowable as RFC7530 stands.=C2=A0 I don&#3=
9;t expect that this is a<br>
&gt; &gt; current<br>
&gt; &gt; &gt; &gt;&gt; problem but, given where persistent memory is going=
 (i.e. cheaper and<br>
&gt; &gt; more<br>
&gt; &gt; &gt; &gt;&gt; common), we could see this in the future.<br>
&gt; &gt; &gt; &gt;&gt;<br>
&gt; &gt; &gt; &gt;&gt; Clearly, spurious clientid4 collisions are possible=
, which is the<br>
&gt; &gt; point of<br>
&gt; &gt; &gt; &gt;&gt; the whole algorithm.<br>
&gt; &gt; &gt; &gt;&gt;<br>
&gt; &gt; &gt; &gt;&gt;&gt; Ditto for the verifier--it&#39;s a (boot time, =
counter) pair,<br>
&gt; &gt; &gt; &gt;&gt;<br>
&gt; &gt; &gt; &gt;&gt; I presume the same boot time format is used.=C2=A0 =
Correct?<br>
&gt; &gt; &gt; &gt;&gt;<br>
&gt; &gt; &gt; &gt;&gt;&gt; with the second part coming from a different co=
unter, but<br>
&gt; &gt; &gt; &gt;&gt;&gt; normally I think they&#39;ll get incremented at=
 the same time,<br>
&gt; &gt; &gt; &gt;&gt;<br>
&gt; &gt; &gt; &gt;&gt; It is true that they will both get incremented in m=
any common (i.e.<br>
&gt; &gt; &gt; &gt;&gt; &quot;normal&quot;) situations.<br>
&gt; &gt; &gt; &gt;&gt;<br>
&gt; &gt; &gt; &gt;&gt; However, this common case is not the only case.<br>
&gt; &gt; &gt; &gt;&gt;<br>
&gt; &gt; &gt; &gt;&gt; In particular, in the case in which one or more cli=
ents are following<br>
&gt; &gt; the<br>
&gt; &gt; &gt; &gt;&gt; guidance in RFC7931 and there are trunked server ad=
dresses, you will<br>
&gt; &gt; have<br>
&gt; &gt; &gt; &gt;&gt; occasions in which the verifier counter will get in=
cremented and the<br>
&gt; &gt; &gt; &gt;&gt; clientid4 counter will not.<br>
&gt; &gt; &gt; &gt;&gt; 1.=C2=A0 If the SETCLIENTID is used to modify the c=
allback information<br>
&gt; &gt; &gt; &gt;&gt; 2.=C2=A0 If two different SETCLIENTID operations ar=
e done by the same<br>
&gt; &gt; client<br>
&gt; &gt; &gt; &gt;<br>
&gt; &gt; &gt; &gt; Sure.<br>
&gt; &gt; &gt; &gt;<br>
&gt; &gt; &gt; &gt;&gt;&gt; so clientid and verifier are highly correlated.=
<br>
&gt; &gt; &gt; &gt;&gt;<br>
&gt; &gt; &gt; &gt;&gt; I don&#39;t think so.=C2=A0 The boot time portions =
are clearly the same, but<br>
&gt; &gt; given<br>
&gt; &gt; &gt; &gt;&gt; the likely occurrence of cases 1. and 2. these coun=
ters will typically<br>
&gt; &gt; have<br>
&gt; &gt; &gt; &gt;&gt; different values.=C2=A0 They may be correlated in t=
he sense that the<br>
&gt; &gt; difference<br>
&gt; &gt; &gt; &gt;&gt; between the two values is likely not to be large, b=
ut that does not<br>
&gt; &gt; mean<br>
&gt; &gt; &gt; &gt;&gt; that the probability of them having the same value =
is necessarily<br>
&gt; &gt; substantial.<br>
&gt; &gt; &gt; &gt;&gt;<br>
&gt; &gt; &gt; &gt;&gt; Once there are any occasions in which the verifiers=
 are incremented<br>
&gt; &gt; and the<br>
&gt; &gt; &gt; &gt;&gt; clientid&#39;s are not, these values are essentiall=
y uncorrelated.=C2=A0 The<br>
&gt; &gt; only<br>
&gt; &gt; &gt; &gt;&gt; way that they can collide at the same time the clie=
ntids collide is<br>
&gt; &gt; when<br>
&gt; &gt; &gt; &gt;&gt; the number of instances of the cases 1. and 2. abov=
e is a multiple of<br>
&gt; &gt; &gt; &gt;&gt; 2^32.<br>
&gt; &gt; &gt; &gt;<br>
&gt; &gt; &gt; &gt; I&#39;m not following.=C2=A0 I think you may be confusi=
ng clientid-verifier<br>
&gt; &gt; &gt; &gt; collisions with verifier-verifier (from different serve=
r) collisions.<br>
&gt; &gt; &gt; &gt; The latter is what matters.<br>
&gt; &gt; &gt; &gt;<br>
&gt; &gt; &gt; &gt; The confusion may be my fault.=C2=A0 I should have said=
 something like:<br>
&gt; &gt; &gt; &gt; chance of a collision between two clientid&#39;s is cor=
related with chance<br>
&gt; &gt; &gt; &gt; of a collision between two verifiers given out by the s=
ame two servers.<br>
&gt; &gt; &gt; &gt; So adding in a verifier comparison doesn&#39;t decrease=
 the probability as<br>
&gt; &gt; &gt; &gt; much as you&#39;d expect.<br>
&gt; &gt; &gt; &gt;<br>
&gt; &gt; &gt; &gt;&gt; While I think that RFC7931&#39;s &quot;vanishingly =
small&quot; is not correct, I<br>
&gt; &gt; &gt; &gt;&gt; would argue that this brings us into &quot;highly u=
nlikely&quot; territory.<br>
&gt; &gt; &gt; &gt;<br>
&gt; &gt; &gt; &gt; So, unfortunately, we appear to have an actual real-lif=
e case where<br>
&gt; &gt; this<br>
&gt; &gt; &gt; &gt; happens all the time.<br>
&gt; &gt; &gt; &gt;<br>
&gt; &gt; &gt; &gt;&gt; Also, the<br>
&gt; &gt; &gt; &gt;&gt; existing text does suggest that you can repeat the =
procedure any<br>
&gt; &gt; number of<br>
&gt; &gt; &gt; &gt;&gt; times to reduce the likelihood of a spurious determ=
ination that two<br>
&gt; &gt; &gt; &gt;&gt; addresses are trunked.<br>
&gt; &gt; &gt; &gt;<br>
&gt; &gt; &gt; &gt; My intuition is that that would mean a lot of effort fo=
r a<br>
&gt; &gt; disappointing<br>
&gt; &gt; &gt; &gt; reduction in the probability of a bug, but I haven&#39;=
t done any<br>
&gt; &gt; &gt; &gt; experiments or thought this through much.<br>
&gt; &gt; &gt; &gt;<br>
&gt; &gt; &gt; &gt; --b.<br>
&gt; &gt; &gt; &gt;<br>
&gt; &gt; &gt; &gt;&gt;&gt; So, we can mitigate this by adding some randomn=
ess, OK.<br>
&gt; &gt; &gt; &gt;&gt;<br>
&gt; &gt; &gt; &gt;&gt; One version of that might be to start the counter f=
ields with the<br>
&gt; &gt; &gt; &gt;&gt; nanoseconds portion of the boot time.<br>
&gt; &gt; &gt; &gt;&gt;<br>
&gt; &gt; &gt; &gt;&gt; Alternatively, you might keep the current counters=
 each starting at<br>
&gt; &gt; zero<br>
&gt; &gt; &gt; &gt;&gt; but take advantage of the fact that clientid and ve=
rifier are used<br>
&gt; &gt; &gt; &gt;&gt; together.=C2=A0 To do that, the timestamp in client=
id4 might be boot time<br>
&gt; &gt; in<br>
&gt; &gt; &gt; &gt;&gt; seconds while the one in the verifier would be boot=
 time nanoseconds<br>
&gt; &gt; within<br>
&gt; &gt; &gt; &gt;&gt; the second during which the boot occurred. That wou=
ld reduce the<br>
&gt; &gt; frequency<br>
&gt; &gt; &gt; &gt;&gt; of verifier collisions but this might not be necess=
ary, if, as I<br>
&gt; &gt; expect,<br>
&gt; &gt; &gt; &gt;&gt; the algorithm will usually result in distinct verif=
iers anyway.<br>
&gt; &gt; &gt; &gt;&gt;<br>
&gt; &gt; &gt; &gt;&gt;&gt; But that&#39;s a new requirement not apparent f=
rom 3530,<br>
&gt; &gt; &gt; &gt;&gt;<br>
&gt; &gt; &gt; &gt;&gt; I think the 7931 text can be patched up via an erra=
ta, but if not we<br>
&gt; &gt; will<br>
&gt; &gt; &gt; &gt;&gt; have to consider how and whether to address the iss=
ue in 7530.<br>
&gt; &gt; &gt; &gt;&gt; Unfortunately, that would be kind of big for an err=
ata.<br>
&gt; &gt; &gt; &gt;&gt;<br>
&gt; &gt; &gt; &gt;&gt; If no existing servers have a substantial issue, th=
en we do have the<br>
&gt; &gt; option<br>
&gt; &gt; &gt; &gt;&gt; of doing a longer-term RFC updating 7530.=C2=A0 Thi=
s could be directed to<br>
&gt; &gt; the<br>
&gt; &gt; &gt; &gt;&gt; (probably now) hypothetical case of a server not us=
ing boot time and<br>
&gt; &gt; &gt; &gt;&gt; maintaining 64-bit global persistent counts.<br>
&gt; &gt; &gt; &gt;&gt;<br>
&gt; &gt; &gt; &gt;&gt; Such servers would have a higher probability of cli=
entid4 conflict and<br>
&gt; &gt; &gt; &gt;&gt; verifier conflict than the ones you mention, while =
staying within the<br>
&gt; &gt; &gt; &gt;&gt; rfc7530 requirements.<br>
&gt; &gt; &gt; &gt;&gt;<br>
&gt; &gt; &gt; &gt;&gt;&gt; and I wouldn&#39;t be surprised if other server=
s have similar issues.<br>
&gt; &gt; &gt; &gt;&gt;<br>
&gt; &gt; &gt; &gt;&gt; Those who read this list have been notified of the =
issue by your<br>
&gt; &gt; message.<br>
&gt; &gt; &gt; &gt;&gt;<br>
&gt; &gt; &gt; &gt;&gt; The problem is with server implementers who don&#39=
;t read this list.=C2=A0 In<br>
&gt; &gt; &gt; &gt;&gt; theory, errata should take care of that, but aside =
from the<br>
&gt; &gt; difficulty of<br>
&gt; &gt; &gt; &gt;&gt; arriving at a suitable change and getting it throug=
h, it may be that<br>
&gt; &gt; many<br>
&gt; &gt; &gt; &gt;&gt; v4.0 implementers and maintainers may not be readin=
g this list or<br>
&gt; &gt; paying<br>
&gt; &gt; &gt; &gt;&gt; much attention to errata either.=C2=A0 Perhaps we c=
an discuss this in<br>
&gt; &gt; Westford<br>
&gt; &gt; &gt; &gt;&gt; where many implementers, who might not be all that =
aware of stuff on<br>
&gt; &gt; the<br>
&gt; &gt; &gt; &gt;&gt; working group list, would be present.=C2=A0 That&#3=
9;s also one way to see if<br>
&gt; &gt; there<br>
&gt; &gt; &gt; &gt;&gt; are existing servers where this is a big=
 problem.<br>
&gt; &gt; &gt; &gt;<br>
&gt; &gt; &gt; &gt; OK!<br>
&gt; &gt; &gt; &gt;<br>
&gt; &gt; &gt; &gt; Assuming nobody&#39;s doing anything very complicated--=
it&#39;s probably also<br>
&gt; &gt; &gt; &gt; not difficult to guess algorithms in testing.=C2=A0 Som=
ething we can do in<br>
&gt; &gt; &gt; &gt; Westford, or maybe before if anyone has access to a var=
iety of servers.<br>
&gt; &gt; &gt; &gt;<br>
&gt; &gt; &gt; &gt; --b.<br>
&gt; &gt; &gt; &gt;<br>
&gt; &gt; &gt; &gt;&gt;<br>
&gt; &gt; &gt; &gt;&gt;<br>
&gt; &gt; &gt; &gt;&gt; On Wed, Sep 7, 2016 at 9:05 PM, J. Bruce Fields &lt=
;<a href=3D"mailto:bfields@fieldses.org">bfields@fieldses.org</a><br>
&gt; &gt; &gt;<br>
&gt; &gt; &gt; &gt;&gt; wrote:<br>
&gt; &gt; &gt; &gt;&gt;<br>
&gt; &gt; &gt; &gt;&gt;&gt; On Wed, Sep 07, 2016 at 08:14:58PM -0400, David=
 Noveck wrote:<br>
&gt; &gt; &gt; &gt;&gt;&gt;&gt;&gt; But I can&#39;t find any further discus=
sion of how a client might make<br>
&gt; &gt; that<br>
&gt; &gt; &gt; &gt;&gt;&gt;&gt;&gt; determination.=C2=A0 Am I overlooking i=
t?<br>
&gt; &gt; &gt; &gt;&gt;&gt;&gt;<br>
&gt; &gt; &gt; &gt;&gt;&gt;&gt; It&#39;s actually in section 5.8 of RFC7931=
.<br>
&gt; &gt; &gt; &gt;&gt;&gt;<br>
&gt; &gt; &gt; &gt;&gt;&gt; Oops, thanks!<br>
&gt; &gt; &gt; &gt;&gt;&gt;<br>
&gt; &gt; &gt; &gt;&gt;&gt; Looking at that, I think none of this is true:<=
br>
&gt; &gt; &gt; &gt;&gt;&gt;<br>
&gt; &gt; &gt; &gt;&gt;&gt;=C2=A0 =C2=A0 =C2=A0 =C2=A0 Note that the NFSv4.=
0 specification requires the server to<br>
&gt; &gt; make<br>
&gt; &gt; &gt; &gt;&gt;&gt;=C2=A0 =C2=A0 =C2=A0 =C2=A0 sure that such verif=
iers are very unlikely to be regenerated.<br>
&gt; &gt; &gt; &gt;&gt;&gt;<br>
&gt; &gt; &gt; &gt;&gt;&gt; Verifiers given out by one server shouldn&#39;t=
 be reused, sure, but<br>
&gt; &gt; that&#39;s<br>
&gt; &gt; &gt; &gt;&gt;&gt; quite different from the claim that collisions =
*between* servers are<br>
&gt; &gt; &gt; &gt;&gt;&gt; unlikely.<br>
&gt; &gt; &gt; &gt;&gt;&gt;<br>
&gt; &gt; &gt; &gt;&gt;&gt;=C2=A0 =C2=A0 =C2=A0 =C2=A0 Given that it is alr=
eady highly unlikely that the clientid4 XC<br>
&gt; &gt; &gt; &gt;&gt;&gt;=C2=A0 =C2=A0 =C2=A0 =C2=A0 is duplicated by dis=
tinct servers,<br>
&gt; &gt; &gt; &gt;&gt;&gt;<br>
&gt; &gt; &gt; &gt;&gt;&gt; Why is that highly unlikely?<br>
&gt; &gt; &gt; &gt;&gt;&gt;<br>
&gt; &gt; &gt; &gt;&gt;&gt;=C2=A0 =C2=A0 =C2=A0 =C2=A0 the probability that=
 SCn is<br>
&gt; &gt; &gt; &gt;&gt;&gt;=C2=A0 =C2=A0 =C2=A0 =C2=A0 duplicated as well h=
as to be considered vanishingly small.<br>
&gt; &gt; &gt; &gt;&gt;&gt;<br>
&gt; &gt; &gt; &gt;&gt;&gt; There&#39;s no reason to believe the probabilit=
y of a verifier collision<br>
&gt; &gt; is<br>
&gt; &gt; &gt; &gt;&gt;&gt; uncorrelated with the probability of a clientid=
 collision.<br>
&gt; &gt; &gt; &gt;&gt;&gt;<br>
&gt; &gt; &gt; &gt;&gt;&gt; The Linux server is generating clientid as a (b=
oot time, counter)<br>
&gt; &gt; pair.<br>
&gt; &gt; &gt; &gt;&gt;&gt; So collision between servers started at the sam=
e time (probably not<br>
&gt; &gt; that<br>
&gt; &gt; &gt; &gt;&gt;&gt; unusual) is possible.<br>
&gt; &gt; &gt; &gt;&gt;&gt;<br>
&gt; &gt; &gt; &gt;&gt;&gt; Ditto for the verifier--it&#39;s a (boot time, =
counter) pair, with the<br>
&gt; &gt; &gt; &gt;&gt;&gt; second part coming from a different counter, bu=
t normally I think<br>
&gt; &gt; &gt; &gt;&gt;&gt; they&#39;ll get incremented at the same time, s=
o clientid and verifier<br>
&gt; &gt; are<br>
&gt; &gt; &gt; &gt;&gt;&gt; highly correlated.<br>
&gt; &gt; &gt; &gt;&gt;&gt;<br>
&gt; &gt; &gt; &gt;&gt;&gt; So, we can mitigate this by adding some randomn=
ess, OK.=C2=A0 But that&#39;s a<br>
&gt; &gt; &gt; &gt;&gt;&gt; new requirement not apparent from 3530, and I w=
ouldn&#39;t be surprised<br>
&gt; &gt; if<br>
&gt; &gt; &gt; &gt;&gt;&gt; other servers have similar issues.<br>
&gt; &gt; &gt; &gt;&gt;&gt;<br>
&gt; &gt; &gt; &gt;&gt;&gt; --b.<br>
&gt; &gt; &gt; &gt;&gt;&gt;<br>
&gt; &gt; &gt; &gt;&gt;&gt;&gt; I&#39;ve only been keeping draft-ietf-nfsv4=
-migration-<wbr>issues alive<br>
&gt; &gt; because<br>
&gt; &gt; &gt; &gt;&gt;&gt; of<br>
&gt; &gt; &gt; &gt;&gt;&gt;&gt; the section dealing with issues relating to=
 v4.1. Otherwise, I would<br>
&gt; &gt; have<br>
&gt; &gt; &gt; &gt;&gt;&gt;&gt; let the thing expire.=C2=A0 The next time I=
 update this, I&#39;ll probably<br>
&gt; &gt; &gt; &gt;&gt;&gt; collapse<br>
&gt; &gt; &gt; &gt;&gt;&gt;&gt; sections 4 and 5 to a short section saying =
that all the<br>
&gt; &gt; &gt; &gt;&gt;&gt;&gt; v4.0 issues were addressed by publication o=
f RFC7931.<br>
&gt; &gt; &gt; &gt;&gt;&gt;&gt;<br>
&gt; &gt; &gt; &gt;&gt;&gt;&gt;&gt; (It appears that the Linux client is tr=
ying to do that by sending<br>
&gt; &gt; &gt; &gt;&gt;&gt;&gt;&gt; a setclientid_confirm to server1 using =
the (clientid,verifier)<br>
&gt; &gt; &gt; &gt;&gt;&gt;&gt;&gt; returned from a setclientid reply from =
server2.=C2=A0 That doesn&#39;t look<br>
&gt; &gt; &gt; &gt;&gt;&gt;&gt;&gt; correct to me.)<br>
&gt; &gt; &gt; &gt;&gt;&gt;&gt;<br>
&gt; &gt; &gt; &gt;&gt;&gt;&gt; It seems kind of weird but the idea is that=
 if you get the same<br>
&gt; &gt; clientid<br>
&gt; &gt; &gt; &gt;&gt;&gt;&gt; from two server IP addresses, they are prob=
ably connected to the same<br>
&gt; &gt; server<br>
&gt; &gt; &gt; &gt;&gt;&gt;&gt; (i.e., are trunked), but there is a chance =
that having the same<br>
&gt; &gt; clientid<br>
&gt; &gt; &gt; &gt;&gt;&gt; is a<br>
&gt; &gt; &gt; &gt;&gt;&gt;&gt; coincidence.=C2=A0 The idea is that if thes=
e are the same server using<br>
&gt; &gt; the<br>
&gt; &gt; &gt; &gt;&gt;&gt;&gt; verifier will work but if they are differen=
t servers it won&#39;t work<br>
&gt; &gt; but<br>
&gt; &gt; &gt; &gt;&gt;&gt; will<br>
&gt; &gt; &gt; &gt;&gt;&gt;&gt; be harmless.<br>
&gt; &gt; &gt; &gt;&gt;&gt;&gt;<br>
&gt; &gt; &gt; &gt;&gt;&gt;&gt;<br>
&gt; &gt; &gt; &gt;&gt;&gt;&gt; On Wed, Sep 7, 2016 at 5:20 PM, J. Bruce Fi=
elds &lt;<br>
&gt; &gt; <a href=3D"mailto:bfields@fieldses.org">bfields@fieldses.org</a>&=
gt;<br>
&gt; &gt; &gt; &gt;&gt;&gt;&gt; wrote:<br>
&gt; &gt; &gt; &gt;&gt;&gt;&gt;<br>
&gt; &gt; &gt; &gt;&gt;&gt;&gt;&gt; In<br>
&gt; &gt; &gt; &gt;&gt;&gt;&gt;&gt; <a href=3D"https://tools.ietf.org/html/=
draft-ietf-nfsv4-migration-" rel=3D"noreferrer" target=3D"_blank">https://t=
ools.ietf.org/html/<wbr>draft-ietf-nfsv4-migration-</a><br>
&gt; &gt; &gt; &gt;&gt;&gt;&gt;&gt; issues-10#section-5.4.2<br>
&gt; &gt; &gt; &gt;&gt;&gt;&gt;&gt;<br>
&gt; &gt; &gt; &gt;&gt;&gt;&gt;&gt;=C2=A0 =C2=A0 =C2=A0 =C2=A0 In the face =
of possible trunking of server IP addresses, the<br>
&gt; &gt; &gt; &gt;&gt;&gt;&gt;&gt;=C2=A0 =C2=A0 =C2=A0 =C2=A0 client will =
use the receipt of the same clientid4 from<br>
&gt; &gt; multiple<br>
&gt; &gt; &gt; &gt;&gt;&gt;&gt;&gt;=C2=A0 =C2=A0 =C2=A0 =C2=A0 IP-addresses=
, as an indication that the two IP- addresses<br>
&gt; &gt; may<br>
&gt; &gt; &gt; &gt;&gt;&gt; be<br>
&gt; &gt; &gt; &gt;&gt;&gt;&gt;&gt;=C2=A0 =C2=A0 =C2=A0 =C2=A0 trunked and =
proceed to determine, from the observed server<br>
&gt; &gt; &gt; &gt;&gt;&gt;&gt;&gt;=C2=A0 =C2=A0 =C2=A0 =C2=A0 behavior whe=
ther the two addresses are in fact trunked.<br>
&gt; &gt; &gt; &gt;&gt;&gt;&gt;&gt;<br>
&gt; &gt; &gt; &gt;&gt;&gt;&gt;&gt; But I can&#39;t find any further discus=
sion of how a client might make<br>
&gt; &gt; that<br>
&gt; &gt; &gt; &gt;&gt;&gt;&gt;&gt; determination.=C2=A0 Am I overlooking i=
t?<br>
&gt; &gt; &gt; &gt;&gt;&gt;&gt;&gt;<br>
&gt; &gt; &gt; &gt;&gt;&gt;&gt;&gt; (It appears that the Linux client is tr=
ying to do that by sending a<br>
&gt; &gt; &gt; &gt;&gt;&gt;&gt;&gt; setclientid_confirm to server1 using th=
e (clientid,verifier)<br>
&gt; &gt; returned<br>
&gt; &gt; &gt; &gt;&gt;&gt;&gt;&gt; from a setclientid reply from server2.=
=C2=A0 That doesn&#39;t look correct<br>
&gt; &gt; to<br>
&gt; &gt; &gt; &gt;&gt;&gt;&gt;&gt; me.)<br>
&gt; &gt; &gt; &gt;&gt;&gt;&gt;&gt;<br>
&gt; &gt; &gt; &gt;&gt;&gt;&gt;&gt; --b.<br>
&gt; &gt; &gt; &gt;&gt;&gt;&gt;&gt;<br>
&gt; &gt; &gt; &gt;&gt;&gt;<br>
&gt; &gt; &gt; &gt;<br>
&gt; &gt; &gt; &gt; ______________________________<wbr>_________________<br=
>
&gt; &gt; &gt; &gt; nfsv4 mailing list<br>
&gt; &gt; &gt; &gt; <a href=3D"mailto:nfsv4@ietf.org">nfsv4@ietf.org</a><br=
>
&gt; &gt; &gt; &gt; <a href=3D"https://www.ietf.org/mailman/listinfo/nfsv4"=
 rel=3D"noreferrer" target=3D"_blank">https://www.ietf.org/mailman/<wbr>lis=
tinfo/nfsv4</a><br>
&gt; &gt; &gt;<br>
&gt; &gt;<br>
&gt; &gt;<br>
</div></div></blockquote></div><br></div>

--001a11477448defacf053d00e8c4--

