Re: [nfsv4] 4.0 trunking

David Noveck <davenoveck@gmail.com> Sat, 10 September 2016 19:38 UTC

In-Reply-To: <20160908010532.GA10658@fieldses.org>
References: <20160907212039.GA6847@fieldses.org> <CADaq8jfiRU7DTRYXGHZvMALAZWeRjhqcpo8Si3_diMt_5dNSMw@mail.gmail.com> <20160908010532.GA10658@fieldses.org>
From: David Noveck <davenoveck@gmail.com>
Date: Sat, 10 Sep 2016 15:38:29 -0400
Message-ID: <CADaq8jcnananUPDHH4Vzhv93JTxZegsZLMCtZWFD-keHheKvHA@mail.gmail.com>
To: "J. Bruce Fields" <bfields@fieldses.org>
Content-Type: multipart/alternative; boundary="001a1141bee857c886053c2c6857"
Archived-At: <https://mailarchive.ietf.org/arch/msg/nfsv4/8MCAo0qh40oRMKEM2rOxprhy6tE>
Cc: "nfsv4@ietf.org" <nfsv4@ietf.org>
Subject: Re: [nfsv4] 4.0 trunking

Thanks for pointing this out.

Before I address the details of this, let me state my overall position:

   - I think it is correct that RFC7931 overstates the degree of assurance
   that one might reasonably have regarding the unlikelihood of spurious
   collision of clientid4's and verifiers.
   - Nevertheless, I don't think the situation is as bleak as you paint it,
   with regard to the Linux server approach to these matters that you
   describe.  This is basically because I don't see the issue of the
   correlation between clientid4's and verifiers the same way that you do. See
   below.
   - I think it is possible to address the issue satisfactorily in the
   context of an RFC7931 errata and will start working on that.
   - Given the weakness of the RFC 3530/7530 requirements in this area, we
   may need (yet another) RFC updating 7530 at some point.
   - I see that as a longer-term effort, since the practices that you
   describe will not result in large numbers of spurious collisions and
   clients can adapt the algorithm to require additional verification.
   - If there are servers out there whose practices are significantly more
   troublesome than the ones you describe, we need to find out soon.  Given
   that many of those responsible for v4.0 implementations may not be reading
   this list, I suggest we discuss this at the October Bakeathon.


> The Linux server is generating clientid as a (boot time, counter) pair.

I'm assuming the boot time is in the form of a 32-bit unsigned number of
seconds after some fixed date/time.  Correct?

If that is the case, duplicates would require that:
1. Two servers be booted during the same second (e.g. when power came back
on after a disruption).
2. Each of the servers has received the same number of client SETCLIENTIDs
(mod 4 billion).

Clearly this is possible although I would say that it is unlikely.
Nevertheless, the "highly unlikely" in RFC7931 is overstating it.
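
As a rough illustration of the two conditions above, here is a minimal Python
sketch.  It assumes the layout guessed at above (a 32-bit boot time in seconds
in the upper half of the clientid4 and a 32-bit counter in the lower half).
The function names are made up for illustration and this is not the Linux
server code.

```python
MASK32 = 0xFFFFFFFF

def make_clientid(boot_secs, counter):
    """Pack an assumed (boot time, counter) pair into a 64-bit clientid4."""
    return ((boot_secs & MASK32) << 32) | (counter & MASK32)

def clientids_collide(boot_a, count_a, boot_b, count_b):
    """True if two servers with the given state hand out the same clientid4."""
    return make_clientid(boot_a, count_a) == make_clientid(boot_b, count_b)

BOOT = 1473536310  # arbitrary example boot time, in seconds

# Condition 1 and 2 both hold: same boot second, same SETCLIENTID count.
same_second_same_count = clientids_collide(BOOT, 7, BOOT, 7)

# The counts only need to agree mod 2^32 because the counter wraps.
same_second_wrapped = clientids_collide(BOOT, 7, BOOT, 7 + 2**32)

# Either condition failing is enough to avoid the collision.
different_second = clientids_collide(BOOT, 7, BOOT + 1, 7)
different_count = clientids_collide(BOOT, 7, BOOT, 8)
```

In this model the first two results are collisions and the last two are not,
which is just conditions 1 and 2 restated.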

The case that would be more worrisome would be one in which a server simply
used a 64-bit word of byte-addressable persistent memory (e.g. NVDIMM, 3D
XPoint) to maintain a permanent counter, dispensing with the boot time.
That is allowable as RFC7530 stands.  I don't expect that this is a current
problem but, given where persistent memory is going (i.e. cheaper and more
common), we could see this in the future.

Clearly, spurious clientid4 collisions are possible, which is the point of
the whole algorithm.

> Ditto for the verifier--it's a (boot time, counter) pair,

I presume the same boot time format is used.  Correct?

>with the second part coming from a different counter, but
> normally I think they'll get incremented at the same time,

It is true that they will both get incremented in many common (i.e.
"normal") situations.

However, this common case is not the only case.

In particular, in the case in which one or more clients are following the
guidance in RFC7931 and there are trunked server addresses, you will have
occasions in which the verifier counter will get incremented and the
clientid4 counter will not:
1.  If the SETCLIENTID is used to modify the callback information.
2.  If two different SETCLIENTID operations are done by the same client.

>so clientid and verifier are highly correlated.

I don't think so.  The boot time portions are clearly the same, but given
the likely occurrence of cases 1. and 2., these counters will typically have
different values.  They may be correlated in the sense that the difference
between the two values is likely not to be large, but that does not mean
that the probability of them having the same value is necessarily substantial.
Once there are any occasions in which the verifiers are incremented and the
clientid's are not, these values are essentially uncorrelated.  The only
way that they can collide at the same time the clientids collide is when
the number of instances of the cases 1. and 2. above is a multiple of
2^32.  While I think that RFC7931's "vanishingly small" is not correct, I
would argue that this brings us into "highly unlikely" territory. Also, the
existing text does suggest that you can repeat the procedure any number of
times to reduce the likelihood of a spurious determination that two
addresses are trunked.
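
The argument above can be sketched with a toy model.  The assumptions here are
loud ones: every SETCLIENTID draws a fresh verifier, only a SETCLIENTID from a
previously unknown client draws a fresh clientid4, and both values are packed
as (boot seconds, 32-bit counter).  The class and its names are hypothetical
and are not taken from any real server.

```python
MASK32 = 0xFFFFFFFF

class ModelServer:
    """Toy server with separate clientid4 and verifier counters."""

    def __init__(self, boot_secs):
        self.boot_secs = boot_secs & MASK32
        self.clientid_counter = 0
        self.verifier_counter = 0

    def setclientid(self, new_client):
        # Assumed behavior: a new client gets a new clientid4, while a
        # known client (cases 1. and 2. above) keeps its existing one.
        if new_client:
            self.clientid_counter = (self.clientid_counter + 1) & MASK32
        # Assumed behavior: every SETCLIENTID gets a new verifier.
        self.verifier_counter = (self.verifier_counter + 1) & MASK32
        clientid = (self.boot_secs << 32) | self.clientid_counter
        verifier = (self.boot_secs << 32) | self.verifier_counter
        return clientid, verifier

# Two distinct servers booted during the same second.
a = ModelServer(1473536310)
b = ModelServer(1473536310)

a.setclientid(new_client=True)    # some other client arrives at A
a.setclientid(new_client=False)   # that client updates its callback (case 1.)
id_a, ver_a = a.setclientid(new_client=True)   # our client, talking to A

b.setclientid(new_client=True)    # some other client arrives at B
id_b, ver_b = b.setclientid(new_client=True)   # our client, talking to B
```

Here id_a and id_b are equal (a spurious clientid4 collision), but ver_a and
ver_b differ because A has seen one extra SETCLIENTID.  In this model the
verifiers would also collide only if the two servers' counts of such extra
SETCLIENTIDs agreed mod 2^32.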

> So, we can mitigate this by adding some randomness, OK.

One version of that might be to start the counter fields with the
nanoseconds portion of the boot time.

Alternatively, you might keep the current counters, each starting at zero,
but take advantage of the fact that clientid and verifier are used
together.  To do that, the timestamp in clientid4 might be boot time in
seconds while the one in the verifier would be boot time nanoseconds within
the second during which the boot occurred.  That would reduce the frequency
of verifier collisions but this might not be necessary if, as I expect,
the algorithm will usually result in distinct verifiers anyway.
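
Both variants can be sketched as follows.  This is only an illustration under
assumed layouts (32-bit timestamp in the upper half, 32-bit counter in the
lower half, boot time available in nanoseconds); the helper names are
hypothetical.

```python
MASK32 = 0xFFFFFFFF
NSEC_PER_SEC = 1_000_000_000

def split_boot_time(boot_ns):
    """Split a boot time in nanoseconds into (seconds, nanoseconds-within-second)."""
    return divmod(boot_ns, NSEC_PER_SEC)

def seeded_counter_start(boot_ns):
    """Variant 1: start the counters at the nanoseconds portion of boot time."""
    _, nsec = split_boot_time(boot_ns)
    return nsec & MASK32  # nsec < 10^9 < 2^32, so nothing is lost

def clientid_and_verifier(boot_ns, clientid_count, verifier_count):
    """Variant 2: seconds stamp the clientid4, nanoseconds stamp the verifier."""
    secs, nsec = split_boot_time(boot_ns)
    clientid = ((secs & MASK32) << 32) | (clientid_count & MASK32)
    verifier = (nsec << 32) | (verifier_count & MASK32)
    return clientid, verifier

# Two servers booted within the same second, at different nanosecond offsets.
boot_a = 1473536310 * NSEC_PER_SEC + 123_456_789
boot_b = 1473536310 * NSEC_PER_SEC + 987_654_321

# Variant 1: the counters start far apart even though the seconds match.
start_a = seeded_counter_start(boot_a)
start_b = seeded_counter_start(boot_b)

# Variant 2: identical counter values still give distinct verifiers.
id_a, ver_a = clientid_and_verifier(boot_a, 5, 5)
id_b, ver_b = clientid_and_verifier(boot_b, 5, 5)
```

With variant 2 the clientid4 values still collide in this example, but the
verifiers differ, which is all the client-side check needs.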

> But that's a new requirement not apparent from 3530,

I think the 7931 text can be patched up via an errata, but if not we will
have to consider how and whether to address the issue in 7530.
Unfortunately, that would be kind of big for an errata.

If no existing servers have a substantial issue, then we do have the option
of doing a longer-term RFC updating 7530.  This could be directed to the
(probably now) hypothetical case of a server not using boot time and
maintaining 64-bit global persistent counts.

Such servers would have a higher probability of clientid4 conflict and
verifier conflict than the ones you mention, while staying within the
RFC7530 requirements.

> and I wouldn't be surprised if other servers have similar issues.

Those who read this list have been notified of the issue by your message.

The problem is with server implementers who don't read this list.  In
theory, errata should take care of that, but aside from the difficulty of
arriving at a suitable change and getting it through, it may be that many
v4.0 implementers and maintainers are not reading this list or paying
much attention to errata either.  Perhaps we can discuss this in Westford
where many implementers, who might not be all that aware of stuff on the
working group list, would be present.  That's also one way to see if there
are existing servers where this is a big problem.


On Wed, Sep 7, 2016 at 9:05 PM, J. Bruce Fields <bfields@fieldses.org>
wrote:

> On Wed, Sep 07, 2016 at 08:14:58PM -0400, David Noveck wrote:
> > > But I can't find any further discussion of how a client might make that
> > > determination.  Am I overlooking it?
> >
> > It's actually in section 5.8 of RFC7931.
>
> Oops, thanks!
>
> Looking at that, I think none of this is true:
>
>         Note that the NFSv4.0 specification requires the server to make
>         sure that such verifiers are very unlikely to be regenerated.
>
> Verifiers given out by one server shouldn't be reused, sure, but that's
> quite different from the claim that collisions *between* servers are
> unlikely.
>
>         Given that it is already highly unlikely that the clientid4 XC
>         is duplicated by distinct servers,
>
> Why is that highly unlikely?
>
>         the probability that SCn is
>         duplicated as well has to be considered vanishingly small.
>
> There's no reason to believe the probability of a verifier collision is
> uncorrelated with the probability of a clientid collision.
>
> The Linux server is generating clientid as a (boot time, counter) pair.
> So collision between servers started at the same time (probably not that
> unusual) is possible.
>
> Ditto for the verifier--it's a (boot time, counter) pair, with the
> second part coming from a different counter, but normally I think
> they'll get incremented at the same time, so clientid and verifier are
> highly correlated.
>
> So, we can mitigate this by adding some randomness, OK.  But that's a
> new requirement not apparent from 3530, and I wouldn't be surprised if
> other servers have similar issues.
>
> --b.
>
> > I've only been keeping draft-ietf-nfsv4-migration-issues alive because of
> > the section dealing with issues relating to v4.1. Otherwise, I would have
> > let the thing expire.  The next time I update this, I'll probably collapse
> > sections 4 and 5 to a short section saying that all the
> > v4.0 issues were addressed by publication of RFC7931.
> >
> > > (It appears that the Linux client is trying to do that by sending
> > > a setclientid_confirm to server1 using the (clientid,verifier)
> > > returned from a setclientid reply from server2.  That doesn't look
> > > correct to me.)
> >
> > It seems kind of weird but the idea is that if you get the same clientid
> > from two server IP addresses they are probably connected to the same server
> > (i.e. are trunked), but there is a chance that having the same clientid is a
> > coincidence.  The idea is that if these are the same server using the
> > verifier will work but if they are different servers it won't work but will
> > be harmless.
> >
> >
> > On Wed, Sep 7, 2016 at 5:20 PM, J. Bruce Fields <bfields@fieldses.org>
> > wrote:
> >
> > > In
> > > https://tools.ietf.org/html/draft-ietf-nfsv4-migration-issues-10#section-5.4.2
> > >
> > >         In the face of possible trunking of server IP addresses, the
> > >         client will use the receipt of the same clientid4 from multiple
> > >         IP-addresses, as an indication that the two IP-addresses may be
> > >         trunked and proceed to determine, from the observed server
> > >         behavior whether the two addresses are in fact trunked.
> > >
> > > But I can't find any further discussion of how a client might make that
> > > determination.  Am I overlooking it?
> > >
> > > (It appears that the Linux client is trying to do that by sending a
> > > setclientid_confirm to server1 using the (clientid,verifier) returned
> > > from a setclientid reply from server2.  That doesn't look correct to
> > > me.)
> > >
> > > --b.
> > >
>