Re: [nfsv4] 4.0 trunking
"Adamson, Andy" <William.Adamson@netapp.com> Mon, 12 September 2016 13:00 UTC
From: "Adamson, Andy" <William.Adamson@netapp.com>
To: "J. Bruce Fields" <bfields@fieldses.org>
Date: Mon, 12 Sep 2016 12:59:05 +0000
Message-ID: <BBB2EBDC-6F05-44C1-B45A-C84C24A9AD7F@netapp.com>
References: <20160907212039.GA6847@fieldses.org> <CADaq8jfiRU7DTRYXGHZvMALAZWeRjhqcpo8Si3_diMt_5dNSMw@mail.gmail.com> <20160908010532.GA10658@fieldses.org> <CADaq8jcnananUPDHH4Vzhv93JTxZegsZLMCtZWFD-keHheKvHA@mail.gmail.com> <20160910200355.GA30688@fieldses.org>
In-Reply-To: <20160910200355.GA30688@fieldses.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/nfsv4/fL0ig44I5hZ7i6_yC3325MEE6-0>
Cc: "nfsv4@ietf.org" <nfsv4@ietf.org>
Subject: Re: [nfsv4] 4.0 trunking
IMHO we should not worry too much about NFSv4.0 trunking, as NFSv4.1+ solves this issue. Trunking is simply one more reason to move from NFSv4.0 to NFSv4.1.

—>Andy

> On Sep 10, 2016, at 4:03 PM, J. Bruce Fields <bfields@fieldses.org> wrote:
>
> On Sat, Sep 10, 2016 at 03:38:29PM -0400, David Noveck wrote:
>> Thanks for pointing this out.
>>
>> Before I address the details of this, let me state my overall position:
>>
>> - I think it is correct that RFC7931 overstates the degree of assurance
>>   that one might reasonably have regarding the unlikelihood of spurious
>>   collisions of clientid4's and verifiers.
>> - Nevertheless, I don't think the situation is as bleak as you paint it
>>   with regard to the Linux server approach to these matters that you
>>   describe. This is basically because I don't see the issue of the
>>   correlation between clientid4's and verifiers the same way that you
>>   do. See below.
>> - I think it is possible to address the issue satisfactorily in the
>>   context of an RFC7931 errata and will start working on that.
>> - Given the weakness of the RFC 3530/7530 requirements in this area, we
>>   may need (yet another) RFC updating 7530 at some point.
>> - I see that as a longer-term effort, since the practices that you
>>   describe will not result in large numbers of spurious collisions, and
>>   clients can adapt the algorithm to require additional verification.
>> - If there are servers out there whose practices are significantly more
>>   troublesome than the ones you describe, we need to find out soon.
>>   Given that many of those responsible for v4.0 implementations may not
>>   be reading this list, I suggest we discuss this at the October
>>   Bakeathon.
>>
>>> The Linux server is generating clientid as a (boot time, counter) pair.
>>
>> I'm assuming the boot time is in the form of a 32-bit unsigned number
>> of seconds after some fixed date/time. Correct?
>>
>> If that is the case, duplicates would require that:
>>
>> 1. Two servers be booted during the same second (e.g. when power came
>>    back on after a disruption).
>
> Right, or routine maintenance, or whatever.
>
>> 2. Each of the servers has received the same number of client
>>    SETCLIENTIDs (mod 4 billion).
>>
>> Clearly this is possible, although I would say that it is unlikely.
>> Nevertheless, the "highly unlikely" in RFC7931 is overstating it.
>
> It turns out that if you have a hundred servers that get rebooted
> simultaneously for a kernel update or some similar routine maintenance,
> this happens every time. (I'm a step removed from the original case
> here, but I believe this is more-or-less accurate and not hypothetical.)
>
> Pretty sure you can easily hit this without that big a setup, too.
>
>> The case that would be more worrisome would be one in which a server
>> simply used a 64-bit word of byte-addressable persistent memory (e.g.
>> NVDIMM, 3D XPoint) to maintain a permanent counter, dispensing with the
>> boot time.
>
> That'd be something worth looking into in cases where the users have the
> right hardware (not always). The workaround I'm going with for now is
> initializing the counter part to something random. (Random numbers may
> not be completely reliable at boot time either, but I suspect they'll
> be good enough here...)
>
>> That is allowable as RFC7530 stands. I don't expect that this is a
>> current problem but, given where persistent memory is going (i.e.
>> cheaper and more common), we could see this in the future.
>>
>> Clearly, spurious clientid4 collisions are possible, which is the point
>> of the whole algorithm.
>>
>>> Ditto for the verifier--it's a (boot time, counter) pair,
>>
>> I presume the same boot time format is used. Correct?
>>
>>> with the second part coming from a different counter, but
>>> normally I think they'll get incremented at the same time,
>>
>> It is true that they will both get incremented in many common (i.e.
>> "normal") situations.
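[Editor's note: the same-boot-second collision above, and the random-start workaround Bruce mentions, can be sketched in a toy model. This is illustrative Python, not the actual nfsd code; the (seconds << 32 | counter) layout is an assumption.]

```python
import secrets

class ToyServer:
    """Toy model of a server handing out clientid4 values built from a
    (boot time in seconds, 32-bit counter) pair.  Layout is assumed."""
    def __init__(self, boot_seconds, randomize=False):
        self.boot = boot_seconds & 0xFFFFFFFF
        # Workaround discussed above: start the counter at a random value
        # so two servers booted in the same second still diverge.
        self.counter = secrets.randbits(32) if randomize else 0

    def setclientid(self):
        self.counter = (self.counter + 1) & 0xFFFFFFFF
        return (self.boot << 32) | self.counter

boot = 1473685147                       # two servers up in the same second
a, b = ToyServer(boot), ToyServer(boot)
print(a.setclientid() == b.setclientid())   # True: spurious collision
c, d = ToyServer(boot, randomize=True), ToyServer(boot, randomize=True)
print(c.setclientid() == d.setclientid())   # False except with prob. ~2^-32
```

With zero-started counters, the servers stay in lockstep forever as long as they see the same number of SETCLIENTIDs; randomizing the start makes the counters independent even on identical boot seconds.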
>>
>> However, this common case is not the only case.
>>
>> In particular, in the case in which one or more clients are following
>> the guidance in RFC7931 and there are trunked server addresses, you
>> will have occasions in which the verifier counter will get incremented
>> and the clientid4 counter will not:
>>
>> 1. If the SETCLIENTID is used to modify the callback information
>> 2. If two different SETCLIENTID operations are done by the same client
>
> Sure.
>
>>> so clientid and verifier are highly correlated.
>>
>> I don't think so. The boot time portions are clearly the same, but
>> given the likely occurrence of cases 1 and 2, these counters will
>> typically have different values. They may be correlated in the sense
>> that the difference between the two values is likely not to be large,
>> but that does not mean that the probability of them having the same
>> value is necessarily substantial.
>>
>> Once there are any occasions in which the verifiers are incremented and
>> the clientids are not, these values are essentially uncorrelated. The
>> only way that they can collide at the same time the clientids collide
>> is when the number of instances of cases 1 and 2 above is a multiple
>> of 2^32.
>
> I'm not following. I think you may be confusing clientid-verifier
> collisions with verifier-verifier (from different server) collisions.
> The latter is what matters.
>
> The confusion may be my fault. I should have said something like: the
> chance of a collision between two clientids is correlated with the
> chance of a collision between two verifiers given out by the same two
> servers. So adding in a verifier comparison doesn't decrease the
> probability as much as you'd expect.
>
>> While I think that RFC7931's "vanishingly small" is not correct, I
>> would argue that this brings us into "highly unlikely" territory.
>
> So, unfortunately, we appear to have an actual real-life case where this
> happens all the time.
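[Editor's note: the counter drift described in cases 1 and 2 above can be modeled in the same toy style; the `known` set standing in for the server's client-record table is an assumption, not the real data structure.]

```python
class ToyCounters:
    """Toy model: the verifier counter moves on every SETCLIENTID, but
    the clientid counter moves only when a new client record appears."""
    def __init__(self):
        self.clientid_ctr = 0
        self.verifier_ctr = 0
        self.known = set()               # stands in for the client table

    def setclientid(self, client_name):
        self.verifier_ctr += 1           # verifier always changes
        if client_name not in self.known:
            self.known.add(client_name)  # new client: new clientid too
            self.clientid_ctr += 1
        return self.clientid_ctr, self.verifier_ctr

s = ToyCounters()
s.setclientid("clientA")          # (1, 1)
s.setclientid("clientA")          # e.g. callback update: (1, 2)
print(s.setclientid("clientB"))   # (2, 3): the counters have drifted
```

Once a single repeat SETCLIENTID happens, the two counters never agree again, which is the sense in which the clientid and verifier counter values decouple.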
>
>> Also, the existing text does suggest that you can repeat the procedure
>> any number of times to reduce the likelihood of a spurious
>> determination that two addresses are trunked.
>
> My intuition is that that would mean a lot of effort for a disappointing
> reduction in the probability of a bug, but I haven't done any
> experiments or thought this through much.
>
> --b.
>
>>> So, we can mitigate this by adding some randomness, OK.
>>
>> One version of that might be to start the counter fields with the
>> nanoseconds portion of the boot time.
>>
>> Alternatively, you might keep the current counters, each starting at
>> zero, but take advantage of the fact that clientid and verifier are
>> used together. To do that, the timestamp in the clientid4 might be
>> boot time in seconds, while the one in the verifier would be the
>> nanoseconds within the second during which the boot occurred. That
>> would reduce the frequency of verifier collisions, but this might not
>> be necessary if, as I expect, the algorithm will usually result in
>> distinct verifiers anyway.
>>
>>> But that's a new requirement not apparent from 3530,
>>
>> I think the 7931 text can be patched up via an errata, but if not we
>> will have to consider how and whether to address the issue in 7530.
>> Unfortunately, that would be kind of big for an errata.
>>
>> If no existing servers have a substantial issue, then we do have the
>> option of doing a longer-term RFC updating 7530. This could be directed
>> to the (probably now) hypothetical case of a server not using boot time
>> and maintaining 64-bit global persistent counts.
>>
>> Such servers would have a higher probability of clientid4 conflict and
>> verifier conflict than the ones you mention, while staying within the
>> RFC7530 requirements.
>>
>>> and I wouldn't be surprised if other servers have similar issues.
>>
>> Those who read this list have been notified of the issue by your
>> message.
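[Editor's note: the seconds/nanoseconds split suggested above might look like the following sketch. The field layouts are assumed for illustration; on the wire a verifier is just an opaque 8-byte value.]

```python
def make_ids(boot_ns, clientid_ctr, verifier_ctr):
    """Put boot seconds in the clientid4 and the nanoseconds within that
    second in the verifier, so servers booted in the same second can
    still hand out distinct verifiers.  Layouts are assumptions."""
    secs = (boot_ns // 1_000_000_000) & 0xFFFFFFFF
    nsecs = boot_ns % 1_000_000_000
    clientid = (secs << 32) | (clientid_ctr & 0xFFFFFFFF)
    verifier = (nsecs << 32) | (verifier_ctr & 0xFFFFFFFF)
    return clientid, verifier

# Two servers booted in the same second but different nanoseconds:
c1, v1 = make_ids(1473685147_000001000, 1, 1)
c2, v2 = make_ids(1473685147_000002000, 1, 1)
print(c1 == c2, v1 == v2)   # True False: clientids collide, verifiers don't
```

The point of the split is that the clientid4 collision (same second, same counter) no longer implies a verifier collision, since the verifier now carries sub-second entropy the clientid lacks.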
>>
>> The problem is with server implementers who don't read this list. In
>> theory, errata should take care of that, but aside from the difficulty
>> of arriving at a suitable change and getting it through, it may be that
>> many v4.0 implementers and maintainers are not reading this list or
>> paying much attention to errata either. Perhaps we can discuss this in
>> Westford, where many implementers who might not be all that aware of
>> stuff on the working group list would be present. That's also one way
>> to see if there are existing servers where this is a big problem.
>
> OK!
>
> Assuming nobody's doing anything very complicated--it's probably also
> not difficult to guess algorithms in testing. Something we can do in
> Westford, or maybe before if anyone has access to a variety of servers.
>
> --b.
>
>>
>> On Wed, Sep 7, 2016 at 9:05 PM, J. Bruce Fields <bfields@fieldses.org>
>> wrote:
>>
>>> On Wed, Sep 07, 2016 at 08:14:58PM -0400, David Noveck wrote:
>>>>> But I can't find any further discussion of how a client might make
>>>>> that determination. Am I overlooking it?
>>>>
>>>> It's actually in section 5.8 of RFC7931.
>>>
>>> Oops, thanks!
>>>
>>> Looking at that, I think none of this is true:
>>>
>>>    Note that the NFSv4.0 specification requires the server to make
>>>    sure that such verifiers are very unlikely to be regenerated.
>>>
>>> Verifiers given out by one server shouldn't be reused, sure, but
>>> that's quite different from the claim that collisions *between*
>>> servers are unlikely.
>>>
>>>    Given that it is already highly unlikely that the clientid4 XC
>>>    is duplicated by distinct servers,
>>>
>>> Why is that highly unlikely?
>>>
>>>    the probability that SCn is duplicated as well has to be
>>>    considered vanishingly small.
>>>
>>> There's no reason to believe the probability of a verifier collision
>>> is uncorrelated with the probability of a clientid collision.
>>>
>>> The Linux server is generating clientid as a (boot time, counter)
>>> pair. So collision between servers started at the same time (probably
>>> not that unusual) is possible.
>>>
>>> Ditto for the verifier--it's a (boot time, counter) pair, with the
>>> second part coming from a different counter, but normally I think
>>> they'll get incremented at the same time, so clientid and verifier
>>> are highly correlated.
>>>
>>> So, we can mitigate this by adding some randomness, OK. But that's a
>>> new requirement not apparent from 3530, and I wouldn't be surprised
>>> if other servers have similar issues.
>>>
>>> --b.
>>>
>>>> I've only been keeping draft-ietf-nfsv4-migration-issues alive
>>>> because of the section dealing with issues relating to v4.1.
>>>> Otherwise, I would have let the thing expire. The next time I update
>>>> this, I'll probably collapse sections 4 and 5 into a short section
>>>> saying that all the v4.0 issues were addressed by publication of
>>>> RFC7931.
>>>>
>>>>> (It appears that the Linux client is trying to do that by sending
>>>>> a setclientid_confirm to server1 using the (clientid,verifier)
>>>>> returned from a setclientid reply from server2. That doesn't look
>>>>> correct to me.)
>>>>
>>>> It seems kind of weird, but the idea is that if you get the same
>>>> clientid from two server IP addresses, they are probably connected
>>>> to the same server (i.e. are trunked), but there is a chance that
>>>> having the same clientid is a coincidence. The idea is that if these
>>>> are the same server, using the verifier will work, but if they are
>>>> different servers it won't work but will be harmless.
>>>>
>>>> On Wed, Sep 7, 2016 at 5:20 PM, J. Bruce Fields <bfields@fieldses.org>
>>>> wrote:
>>>>
>>>>> In
>>>>> https://tools.ietf.org/html/draft-ietf-nfsv4-migration-issues-10#section-5.4.2
>>>>>
>>>>>    In the face of possible trunking of server IP addresses, the
>>>>>    client will use the receipt of the same clientid4 from multiple
>>>>>    IP-addresses, as an indication that the two IP-addresses may be
>>>>>    trunked and proceed to determine, from the observed server
>>>>>    behavior, whether the two addresses are in fact trunked.
>>>>>
>>>>> But I can't find any further discussion of how a client might make
>>>>> that determination. Am I overlooking it?
>>>>>
>>>>> (It appears that the Linux client is trying to do that by sending a
>>>>> setclientid_confirm to server1 using the (clientid,verifier)
>>>>> returned from a setclientid reply from server2. That doesn't look
>>>>> correct to me.)
>>>>>
>>>>> --b.
>>>>>
>
> _______________________________________________
> nfsv4 mailing list
> nfsv4@ietf.org
> https://www.ietf.org/mailman/listinfo/nfsv4
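[Editor's note: for reference, the probe the thread is debating can be sketched as follows. The `setclientid`/`setclientid_confirm` callables are hypothetical stand-ins for the actual RPCs; this is the described client behavior, not the Linux client code.]

```python
def addresses_trunked(addr1, addr2, setclientid, setclientid_confirm):
    """Return True if addr1 and addr2 appear to lead to the same server.
    Sketch of the client heuristic discussed in this thread."""
    cid1, _ver1 = setclientid(addr1)
    cid2, ver2 = setclientid(addr2)
    if cid1 != cid2:
        return False      # different clientid4s: treat as not trunked
    # The same clientid4 may be coincidence.  Confirm against addr1 with
    # the (clientid, verifier) that addr2 handed out: on one server this
    # succeeds; on distinct servers it should fail harmlessly -- unless,
    # as this thread shows, the verifiers collide too.
    return setclientid_confirm(addr1, cid2, ver2)

# Fake transport with one server behind both addresses:
state = {"cid": 7, "ver": 42}
same = addresses_trunked(
    "10.0.0.1", "10.0.0.2",
    lambda addr: (state["cid"], state["ver"]),
    lambda addr, cid, ver: (cid, ver) == (state["cid"], state["ver"]))
print(same)   # True
```

The thread's bug report is exactly the failure mode of the last step: two distinct servers that hand out colliding clientid4s *and* colliding verifiers make the confirm succeed spuriously, so the client wrongly treats the addresses as trunked.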