Re: [nfsv4] 4.0 trunking

"Adamson, Andy" <William.Adamson@netapp.com> Mon, 12 September 2016 13:00 UTC

From: "Adamson, Andy" <William.Adamson@netapp.com>
To: "J. Bruce Fields" <bfields@fieldses.org>
Date: Mon, 12 Sep 2016 12:59:05 +0000
Message-ID: <BBB2EBDC-6F05-44C1-B45A-C84C24A9AD7F@netapp.com>
References: <20160907212039.GA6847@fieldses.org> <CADaq8jfiRU7DTRYXGHZvMALAZWeRjhqcpo8Si3_diMt_5dNSMw@mail.gmail.com> <20160908010532.GA10658@fieldses.org> <CADaq8jcnananUPDHH4Vzhv93JTxZegsZLMCtZWFD-keHheKvHA@mail.gmail.com> <20160910200355.GA30688@fieldses.org>
In-Reply-To: <20160910200355.GA30688@fieldses.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/nfsv4/fL0ig44I5hZ7i6_yC3325MEE6-0>
Cc: "nfsv4@ietf.org" <nfsv4@ietf.org>
Subject: Re: [nfsv4] 4.0 trunking
List-Id: NFSv4 Working Group <nfsv4.ietf.org>

IMHO we should not worry too much about NFSv4.0 trunking as NFSv4.1+ solves this issue. Trunking is simply one more reason to move from NFSv4.0 to NFSv4.1.

—>Andy


> On Sep 10, 2016, at 4:03 PM, J. Bruce Fields <bfields@fieldses.org> wrote:
> 
> On Sat, Sep 10, 2016 at 03:38:29PM -0400, David Noveck wrote:
>> Thanks for pointing this out.
>> 
>> Before I address the details of this, let me state my overall position:
>> 
>>   - I think it is correct that RFC7931 overstates the degree of assurance
>>   that one might reasonably have regarding the unlikelihood of spurious
>>   collision of clientid4's and verifiers.
>>   - Nevertheless, I don't think the situation is as bleak as you paint it,
>>   with regard to the Linux server approach to these matters that you
>>   describe.  This is basically because I don't see the issue of the
>>   correlation between clientid4's and verifiers the same way that you do. See
>>   below.
>>   - I think it is possible to address the issue satisfactorily in the
>>   context of an rfc7931 erratum and will start working on that.
>>   - Given the weakness of the rfc 3530/7530 requirements in this area, we
>>   may need (yet another) RFC updating 7530 at some point.
>>   - I see that as a longer-term effort, since the practices that you
>>   describe will not result in large numbers of spurious collisions and
>>   clients can adapt the algorithm to require additional verification.
>>   - If there are servers out there whose practices are significantly more
>>   troublesome than the ones you describe, we need to find out soon.  Given
>>   that many of those responsible for v4.0 implementations may not be reading
>>   this list, I suggest we discuss this at the October Bakeathon.
>> 
>> 
>>> The Linux server is generating clientid as a (boot time, counter) pair.
>> 
>> I'm assuming the boot time is in the form of a 32-bit unsigned number of
>> seconds after some fixed date/time.  Correct?
>> 
>> If that is the case, duplicates would require that:
>> 1. Two servers be booted during the same second (e.g. when power came back
>> on after a disruption).
> 
> Right, or routine maintenance, or whatever.
> 
>> 2. Each of the servers has received the same number of client SETCLIENTIDs
>> (mod 4 billion).
>> 
>> Clearly this is possible although I would say that it is unlikely.
>> Nevertheless, the "highly unlikely" in RFC7931 is overstating it.
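The (boot time, counter) scheme under discussion, and the collision condition above, can be sketched as follows. This is only an illustrative model of the scheme as described in the thread; the class layout and field packing are assumptions, not the actual Linux nfsd code.

```python
import itertools

class Server:
    """Illustrative sketch of a server handing out (boot time, counter)
    clientid4 values, per the scheme described in this thread."""

    def __init__(self, boot_time: int):
        # Boot time as a 32-bit count of seconds (assumed layout).
        self.boot_time = boot_time & 0xFFFFFFFF
        # Per-boot counter, starting at zero.
        self.counter = itertools.count()

    def setclientid(self) -> int:
        # clientid4 modeled as 64 bits: high word = boot time,
        # low word = counter value.
        return (self.boot_time << 32) | (next(self.counter) & 0xFFFFFFFF)

# Two servers booted in the same second that have handled the same
# number of SETCLIENTIDs hand out identical clientid4 values:
a, b = Server(boot_time=1473685147), Server(boot_time=1473685147)
assert a.setclientid() == b.setclientid()
```

This makes concrete why simultaneous reboots (conditions 1 and 2 above) are sufficient for a spurious clientid4 collision across distinct servers.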
> 
> It turns out that if you have a hundred servers that get rebooted
> simultaneously for a kernel update or some similar routine maintenance,
> this happens every time.  (I'm a step removed from the original case
> here, but I believe this is more-or-less accurate and not hypothetical.)
> 
> Pretty sure you can easily hit this without that big a setup, too.
> 
>> The case that would be more worrisome would be one which a server simply
>> used a 64-bit word of byte-addressable persistent memory (e.g. NVDIMM, 3d
>> xpoint) to maintain a permanent counter, dispensing with the boot time.
> 
> That'd be something worth looking into in cases where the users have the
> right
> hardware (not always).  The workaround I'm going with for now is
> initializing the counter part to something random.  (Random numbers may
> not be completely reliable at boot time either, but I suspect they'll be
> good enough here...).
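The workaround described above, seeding the counter with random bits at boot, can be sketched like this. Again this is a hedged model, not the actual kernel patch; the field layout is an assumption.

```python
import itertools
import os

class Server:
    """Sketch of the random-initialization workaround: same
    (boot time, counter) layout, but the counter starts at a
    random 32-bit value instead of zero."""

    def __init__(self, boot_time: int):
        self.boot_time = boot_time & 0xFFFFFFFF
        # Seed the counter randomly so that servers booted in the same
        # second no longer start from identical state.
        start = int.from_bytes(os.urandom(4), "big")
        self.counter = itertools.count(start)

    def setclientid(self) -> int:
        return (self.boot_time << 32) | (next(self.counter) & 0xFFFFFFFF)

# Servers booted in the same second now collide only if their random
# starting values happen to match (roughly a 2^-32 chance per pair):
a, b = Server(1473685147), Server(1473685147)
maybe_collision = a.setclientid() == b.setclientid()
```

As Bruce notes, entropy quality at boot time is its own question, but even weak randomness breaks the lockstep between identically provisioned servers.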
> 
>> That is allowable as RFC7530 stands.  I don't expect that this is a current
>> problem but, given where persistent memory is going (i.e. cheaper and more
>> common), we could see this in the future.
>> 
>> Clearly, spurious clientid4 collisions are possible, which is the point of
>> the whole algorithm.
>> 
>>> Ditto for the verifier--it's a (boot time, counter) pair,
>> 
>> I presume the same boot time format is used.  Correct?
>> 
>>> with the second part coming from a different counter, but
>>> normally I think they'll get incremented at the same time,
>> 
>> It is true that they will both get incremented in many common (i.e.
>> "normal") situations.
>> 
>> However, this common case is not the only case.
>> 
>> In particular, in the case in which one or more clients are following the
>> guidance in RFC7931 and there are trunked server addresses, you will have
>> occasions in which the verifier counter will get incremented and the
>> clientid4 counter will not.
>> 1.  If the SETCLIENTID is used to modify the callback information
>> 2.  If two different SETCLIENTID operations are done by the same client
> 
> Sure.
> 
>>> so clientid and verifier are highly correlated.
>> 
>> I don't think so.  The boot time portions are clearly the same, but given
>> the likely occurrence of cases 1. and 2., these counters will typically have
>> different values.  They may be correlated in the sense that the difference
>> between the two values is likely not to be large, but that does not mean
>> that the probability of them having the same value is necessarily substantial.
>> 
>> Once there are any occasions in which the verifiers are incremented and the
>> clientid's are not, these values are essentially uncorrelated.  The only
>> way that they can collide at the same time the clientids collide is when
>> the number of instances of the cases 1. and 2. above is a multiple of
>> 2^32.
> 
> I'm not following.  I think you may be confusing clientid-verifier
> collisions with verifier-verifier (from different server) collisions.
> The latter is what matters.
> 
> The confusion may be my fault.  I should have said something like:
> chance of a collision between two clientid's is correlated with chance
> of a collision between two verifiers given out by the same two servers.
> So adding in a verifier comparison doesn't decrease the probability as
> much as you'd expect.
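Bruce's point here, that the verifier comparison adds little when both counters move in lockstep, can be illustrated with a small sketch. The lockstep increment is an assumption taken from the thread's description of the common case, not from the actual implementation.

```python
import itertools

class Server:
    """Sketch of the common case discussed above: the clientid and
    verifier counters are incremented together on each SETCLIENTID."""

    def __init__(self, boot_time: int):
        self.boot_time = boot_time & 0xFFFFFFFF
        self.cid_counter = itertools.count()
        self.verf_counter = itertools.count()

    def setclientid(self):
        # Both values share the boot-time word; each call advances
        # both counters, keeping them in lockstep.
        cid = (self.boot_time << 32) | (next(self.cid_counter) & 0xFFFFFFFF)
        verf = (self.boot_time << 32) | (next(self.verf_counter) & 0xFFFFFFFF)
        return cid, verf

# With the counters in lockstep, a clientid4 collision between two
# servers implies a verifier collision as well, so the verifier check
# rules out almost nothing:
a, b = Server(1473685147), Server(1473685147)
(cid_a, v_a), (cid_b, v_b) = a.setclientid(), b.setclientid()
assert cid_a == cid_b
assert v_a == v_b
```

The two collision events are correlated precisely because the values are derived from the same underlying state, which is the weakness the verifier comparison was assumed to guard against.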
> 
>> While I think that RFC7931's "vanishingly small" is not correct, I
>> would argue that this brings us into "highly unlikely" territory.
> 
> So, unfortunately, we appear to have an actual real-life case where this
> happens all the time.
> 
>> Also, the
>> existing text does suggest that you can repeat the procedure any number of
>> times to reduce the likelihood of a spurious determination that two
>> addresses are trunked.
> 
> My intuition is that that would mean a lot of effort for a disappointing
> reduction in the probability of a bug, but I haven't done any
> experiments or thought this through much.
> 
> --b.
> 
>>> So, we can mitigate this by adding some randomness, OK.
>> 
>> One version of that might be to start the counter fields with the
>> nanoseconds portion of the boot time.
>> 
>> Alternatively, you might keep the current counters, each starting at
>> zero, but take advantage of the fact that clientid and verifier are
>> used together.  To do that, the timestamp in clientid4 might be boot
>> time in seconds while the one in the verifier would be the nanoseconds
>> within the second during which the boot occurred.  That would reduce
>> the frequency of verifier collisions, but this might not be necessary
>> if, as I expect, the algorithm will usually result in distinct
>> verifiers anyway.
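The seconds/nanoseconds split proposed above can be sketched as follows. This is only a model of the proposal in this thread; the packing of the fields is an illustrative assumption.

```python
import itertools

class Server:
    """Sketch of the proposal above: clientid4 carries boot time in
    seconds; the verifier carries the nanoseconds within that second."""

    def __init__(self, boot_sec: int, boot_nsec: int):
        self.boot_sec = boot_sec & 0xFFFFFFFF
        self.boot_nsec = boot_nsec & 0xFFFFFFFF  # 0..999_999_999
        self.cid_counter = itertools.count()
        self.verf_counter = itertools.count()

    def setclientid(self):
        clientid = (self.boot_sec << 32) | (next(self.cid_counter) & 0xFFFFFFFF)
        verifier = (self.boot_nsec << 32) | (next(self.verf_counter) & 0xFFFFFFFF)
        return clientid, verifier

# Two servers booted in the same second still collide on clientid4,
# but their verifiers differ unless the nanosecond parts also match:
a = Server(boot_sec=1473685147, boot_nsec=123456789)
b = Server(boot_sec=1473685147, boot_nsec=987654321)
(cid_a, v_a), (cid_b, v_b) = a.setclientid(), b.setclientid()
assert cid_a == cid_b and v_a != v_b
```

The design choice here is to decorrelate the two values by deriving them from different parts of the boot timestamp, so that a clientid4 collision no longer implies a verifier collision.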
>> 
>>> But that's a new requirement not apparent from 3530,
>> 
>> I think the 7931 text can be patched up via an errata, but if not we will
>> have to consider how and whether to address the issue in 7530.
>> Unfortunately, that would be kind of big for an errata.
>> 
>> If no existing servers have a substantial issue, then we do have the option
>> of doing a longer-term RFC updating 7530.  This could be directed to the
>> (probably now) hypothetical case of a server not using boot time and
>> maintaining 64-bit global persistent counts.
>> 
>> Such servers would have a higher probability of clientid4 conflict and
>> verifier conflict than the ones you mention, while staying within the
>> rfc7530 requirements.
>> 
>>> and I wouldn't be surprised if other servers have similar issues.
>> 
>> Those who read this list have been notified of the issue by your message.
>> 
>> The problem is with server implementers who don't read this list.  In
>> theory, errata should take care of that, but aside from the difficulty of
>> arriving at a suitable change and getting it through, it may be that many
>> v4.0 implementers and maintainers may not be reading this list or paying
>> much attention to errata either.  Perhaps we can discuss this in Westford
>> where many implementers, who might not be all that aware of stuff on the
>> working group list, would be present.  That's also one way to see if there
>> are existing servers where this is a big problem.
> 
> OK!
> 
> Assuming nobody's doing anything very complicated--it's probably also
> not difficult to guess algorithms in testing.  Something we can do in
> Westford, or maybe before if anyone has access to a variety of servers.
> 
> --b.
> 
>> 
>> 
>> On Wed, Sep 7, 2016 at 9:05 PM, J. Bruce Fields <bfields@fieldses.org>
>> wrote:
>> 
>>> On Wed, Sep 07, 2016 at 08:14:58PM -0400, David Noveck wrote:
>>>>> But I can't find any further discussion of how a client might make that
>>>>> determination.  Am I overlooking it?
>>>> 
>>>> It's actually in section 5.8 of RFC7931.
>>> 
>>> Oops, thanks!
>>> 
>>> Looking at that, I think none of this is true:
>>> 
>>>        Note that the NFSv4.0 specification requires the server to make
>>>        sure that such verifiers are very unlikely to be regenerated.
>>> 
>>> Verifiers given out by one server shouldn't be reused, sure, but that's
>>> quite different from the claim that collisions *between* servers are
>>> unlikely.
>>> 
>>>        Given that it is already highly unlikely that the clientid4 XC
>>>        is duplicated by distinct servers,
>>> 
>>> Why is that highly unlikely?
>>> 
>>>        the probability that SCn is
>>>        duplicated as well has to be considered vanishingly small.
>>> 
>>> There's no reason to believe the probability of a verifier collision is
>>> uncorrelated with the probability of a clientid collision.
>>> 
>>> The Linux server is generating clientid as a (boot time, counter) pair.
>>> So collision between servers started at the same time (probably not that
>>> unusual) is possible.
>>> 
>>> Ditto for the verifier--it's a (boot time, counter) pair, with the
>>> second part coming from a different counter, but normally I think
>>> they'll get incremented at the same time, so clientid and verifier are
>>> highly correlated.
>>> 
>>> So, we can mitigate this by adding some randomness, OK.  But that's a
>>> new requirement not apparent from 3530, and I wouldn't be surprised if
>>> other servers have similar issues.
>>> 
>>> --b.
>>> 
>>>> I've only been keeping draft-ietf-nfsv4-migration-issues alive because
>>>> of the section dealing with issues relating to v4.1.  Otherwise, I
>>>> would have let the thing expire.  The next time I update this, I'll
>>>> probably collapse sections 4 and 5 to a short section saying that all
>>>> the v4.0 issues were addressed by publication of RFC7931.
>>>> 
>>>>> (It appears that the Linux client is trying to do that by sending
>>>>> a setclientid_confirm to server1 using the (clientid,verifier)
>>>>> returned from a setclientid reply from server2.  That doesn't look
>>>>> correct to me.)
>>>> 
>>>> It seems kind of weird, but the idea is that if you get the same
>>>> clientid from two server IP addresses, they are probably connected to
>>>> the same server (i.e. are trunked), but there is a chance that having
>>>> the same clientid is a coincidence.  The idea is that if these are the
>>>> same server, using the verifier will work, but if they are different
>>>> servers it won't work but will be harmless.
>>>> 
>>>> 
>>>> On Wed, Sep 7, 2016 at 5:20 PM, J. Bruce Fields <bfields@fieldses.org>
>>>> wrote:
>>>> 
>>>>> In https://tools.ietf.org/html/draft-ietf-nfsv4-migration-issues-10#section-5.4.2
>>>>> 
>>>>>        In the face of possible trunking of server IP addresses, the
>>>>>        client will use the receipt of the same clientid4 from multiple
>>>>>        IP-addresses, as an indication that the two IP-addresses may be
>>>>>        trunked and proceed to determine, from the observed server
>>>>>        behavior whether the two addresses are in fact trunked.
>>>>> 
>>>>> But I can't find any further discussion of how a client might make that
>>>>> determination.  Am I overlooking it?
>>>>> 
>>>>> (It appears that the Linux client is trying to do that by sending a
>>>>> setclientid_confirm to server1 using the (clientid,verifier) returned
>>>>> from a setclientid reply from server2.  That doesn't look correct to
>>>>> me.)
>>>>> 
>>>>> --b.
>>>>> 
>>> 
> 
> _______________________________________________
> nfsv4 mailing list
> nfsv4@ietf.org
> https://www.ietf.org/mailman/listinfo/nfsv4