Re: [nfsv4] 4.0 trunking
"J. Bruce Fields" <bfields@fieldses.org> Sat, 10 September 2016 20:03 UTC
Date: Sat, 10 Sep 2016 16:03:55 -0400
From: "J. Bruce Fields" <bfields@fieldses.org>
To: David Noveck <davenoveck@gmail.com>
Message-ID: <20160910200355.GA30688@fieldses.org>
References: <20160907212039.GA6847@fieldses.org> <CADaq8jfiRU7DTRYXGHZvMALAZWeRjhqcpo8Si3_diMt_5dNSMw@mail.gmail.com> <20160908010532.GA10658@fieldses.org> <CADaq8jcnananUPDHH4Vzhv93JTxZegsZLMCtZWFD-keHheKvHA@mail.gmail.com>
In-Reply-To: <CADaq8jcnananUPDHH4Vzhv93JTxZegsZLMCtZWFD-keHheKvHA@mail.gmail.com>
Cc: "nfsv4@ietf.org" <nfsv4@ietf.org>
Subject: Re: [nfsv4] 4.0 trunking
On Sat, Sep 10, 2016 at 03:38:29PM -0400, David Noveck wrote:
> Thanks for pointing this out.
>
> Before I address the details of this, let me state my overall position:
>
> - I think it is correct that RFC7931 overstates the degree of assurance
> that one might reasonably have regarding the unlikelihood of spurious
> collisions of clientid4's and verifiers.
> - Nevertheless, I don't think the situation is as bleak as you paint it
> with regard to the Linux server approach to these matters that you
> describe.  This is basically because I don't see the issue of the
> correlation between clientid4's and verifiers the same way that you do.
> See below.
> - I think it is possible to address the issue satisfactorily in the
> context of an rfc7931 errata and will start working on that.
> - Given the weakness of the rfc 3530/7530 requirements in this area, we
> may need (yet another) RFC updating 7530 at some point.
> - I see that as a longer-term effort, since the practices that you
> describe will not result in large numbers of spurious collisions and
> clients can adapt the algorithm to require additional verification.
> - If there are servers out there whose practices are significantly more
> troublesome than the ones you describe, we need to find out soon.  Given
> that many of those responsible for v4.0 implementations may not be
> reading this list, I suggest we discuss this at the October Bakeathon.
>
> > The Linux server is generating clientid as a (boot time, counter) pair.
>
> I'm assuming the boot time is in the form of a 32-bit unsigned number of
> seconds after some fixed date/time.  Correct?
>
> If that is the case, duplicates would require that:
> 1. Two servers be booted during the same second (e.g. when power came
> back on after a disruption).

Right, or routine maintenance, or whatever.

> 2. Each of the servers has received the same number of client
> SETCLIENTIDs (mod 4 billion).
>
> Clearly this is possible, although I would say that it is unlikely.
> Nevertheless, the "highly unlikely" in RFC7931 is overstating it.

It turns out that if you have a hundred servers that get rebooted
simultaneously for a kernel update or some similar routine maintenance,
this happens every time.  (I'm a step removed from the original case
here, but I believe this is more-or-less accurate and not hypothetical.)

Pretty sure you can easily hit this without that big a setup, too.
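To make the failure mode concrete, here's the kind of scheme we're
talking about, as an illustrative userspace sketch (not the actual
knfsd code; all names here are made up):

/*
 * Illustrative sketch only, not the actual knfsd code: a clientid4
 * built as (32-bit boot time in seconds, 32-bit per-boot counter).
 * A hundred servers rebooted in the same second all start out with
 * identical state, so any two that have handled the same number of
 * SETCLIENTIDs (mod 2^32) hand out the same clientid4.
 */
#include <stdint.h>
#include <time.h>

static uint32_t boot_time;		/* seconds since the epoch */
static uint32_t clientid_counter;	/* restarts at zero every boot */

void clientid_init(void)
{
	boot_time = (uint32_t)time(NULL);
	clientid_counter = 0;
}

uint64_t clientid_next(void)
{
	/* high 32 bits: boot time; low 32 bits: counter */
	return ((uint64_t)boot_time << 32) | clientid_counter++;
}

The verifier is built the same way from a second counter, so when two
servers collide on the clientid4, their verifier counters start out
just as synchronized.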
> The case that would be more worrisome would be one in which a server
> simply used a 64-bit word of byte-addressable persistent memory (e.g.
> NVDIMM, 3D XPoint) to maintain a permanent counter, dispensing with the
> boot time.

That'd be something worth looking into in cases where the users have
the right hardware (not always).

The workaround I'm going with for now is initializing the counter part
to something random.  (Random numbers may not be completely reliable at
boot time either, but I suspect they'll be good enough here...)
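In outline, that workaround looks something like this (again a made-up
userspace sketch, not the actual patch; getrandom() stands in for
whatever randomness source the server would really use, e.g.
get_random_bytes() in the kernel):

/*
 * Sketch of the workaround: keep the (boot time, counter) layout, but
 * start each counter at a random value per boot, so servers booted in
 * the same second no longer generate identical sequences.
 */
#include <stdint.h>
#include <time.h>
#include <sys/random.h>		/* getrandom() */

static uint32_t boot_time;
static uint32_t clientid_counter;
static uint32_t verifier_counter;

void clientid_init(void)
{
	uint32_t r[2] = { 0, 0 };

	boot_time = (uint32_t)time(NULL);
	/*
	 * Early-boot entropy may be imperfect, but as noted above it
	 * only has to make cross-server collisions unlikely.
	 */
	(void)getrandom(r, sizeof(r), 0);
	clientid_counter = r[0];
	verifier_counter = r[1];
}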
> That is allowable as RFC7530 stands.  I don't expect that this is a
> current problem but, given where persistent memory is going (i.e.
> cheaper and more common), we could see this in the future.
>
> Clearly, spurious clientid4 collisions are possible, which is the point
> of the whole algorithm.
>
> > Ditto for the verifier--it's a (boot time, counter) pair,
>
> I presume the same boot time format is used.  Correct?
>
> > with the second part coming from a different counter, but
> > normally I think they'll get incremented at the same time,
>
> It is true that they will both get incremented in many common (i.e.
> "normal") situations.
>
> However, this common case is not the only case.
>
> In particular, in the case in which one or more clients are following
> the guidance in RFC7931 and there are trunked server addresses, you
> will have occasions in which the verifier counter will get incremented
> and the clientid4 counter will not:
> 1. If the SETCLIENTID is used to modify the callback information.
> 2. If two different SETCLIENTID operations are done by the same client.

Sure.

> > so clientid and verifier are highly correlated.
>
> I don't think so.  The boot time portions are clearly the same, but
> given the likely occurrence of cases 1. and 2., these counters will
> typically have different values.  They may be correlated in the sense
> that the difference between the two values is likely not to be large,
> but that does not mean that the probability of them having the same
> value is necessarily substantial.
>
> Once there are any occasions in which the verifiers are incremented and
> the clientid's are not, these values are essentially uncorrelated.  The
> only way that they can collide at the same time the clientids collide
> is when the number of instances of the cases 1. and 2. above is a
> multiple of 2^32.

I'm not following.  I think you may be confusing clientid-verifier
collisions with verifier-verifier (from different servers) collisions.
The latter is what matters.

The confusion may be my fault.  I should have said something like: the
chance of a collision between two clientid's is correlated with the
chance of a collision between two verifiers given out by the same two
servers.  So adding in a verifier comparison doesn't decrease the
probability as much as you'd expect.
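For reference, the verifier comparison in question, sketched as
pseudocode (nfs4_setclientid() and nfs4_setclientid_confirm() are
stand-in names and signatures, not the real Linux client's):

/*
 * Pseudocode sketch of the client-side trunking probe discussed in
 * this thread.  The two nfs4_*() prototypes below are hypothetical
 * stand-ins returning 0 on success.
 */
#include <stdbool.h>
#include <stdint.h>

struct scid_res {
	uint64_t clientid;	/* clientid4 from the SETCLIENTID reply */
	uint64_t verifier;	/* confirm verifier from the same reply */
};

int nfs4_setclientid(const char *addr, struct scid_res *res);
int nfs4_setclientid_confirm(const char *addr, uint64_t clientid,
			     uint64_t verifier);

bool probably_trunked(const char *addr1, const char *addr2)
{
	struct scid_res r1, r2;

	if (nfs4_setclientid(addr1, &r1) || nfs4_setclientid(addr2, &r2))
		return false;
	if (r1.clientid != r2.clientid)
		return false;	/* definitely distinct servers */

	/*
	 * Same clientid4: either trunked addresses or a collision.
	 * Confirm against addr1 with the state handed out by addr2;
	 * success suggests one server behind both addresses, and
	 * failure is harmless.  As discussed above, this is only as
	 * strong as the chance that two colliding servers also hand
	 * out colliding verifiers.
	 */
	return nfs4_setclientid_confirm(addr1, r2.clientid,
					r2.verifier) == 0;
}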
> While I think that RFC7931's "vanishingly small" is not correct, I
> would argue that this brings us into "highly unlikely" territory.

So, unfortunately, we appear to have an actual real-life case where
this happens all the time.

> Also, the existing text does suggest that you can repeat the procedure
> any number of times to reduce the likelihood of a spurious
> determination that two addresses are trunked.

My intuition is that that would mean a lot of effort for a
disappointing reduction in the probability of a bug, but I haven't done
any experiments or thought this through much.

--b.

> > So, we can mitigate this by adding some randomness, OK.
>
> One version of that might be to start the counter fields with the
> nanoseconds portion of the boot time.
>
> Alternatively, you might keep the current counters, each starting at
> zero, but take advantage of the fact that clientid and verifier are
> used together.  To do that, the timestamp in the clientid4 might be the
> boot time in seconds, while the one in the verifier would be the
> nanoseconds within the second during which the boot occurred.  That
> would reduce the frequency of verifier collisions, but this might not
> be necessary if, as I expect, the algorithm will usually result in
> distinct verifiers anyway.
>
> > But that's a new requirement not apparent from 3530,
>
> I think the 7931 text can be patched up via an errata, but if not we
> will have to consider how and whether to address the issue in 7530.
> Unfortunately, that would be kind of big for an errata.
>
> If no existing servers have a substantial issue, then we do have the
> option of doing a longer-term RFC updating 7530.  This could be
> directed to the (probably now) hypothetical case of a server not using
> boot time and maintaining 64-bit global persistent counts.
>
> Such servers would have a higher probability of clientid4 conflict and
> verifier conflict than the ones you mention, while staying within the
> rfc7530 requirements.
>
> > and I wouldn't be surprised if other servers have similar issues.
>
> Those who read this list have been notified of the issue by your
> message.
>
> The problem is with server implementers who don't read this list.  In
> theory, errata should take care of that, but aside from the difficulty
> of arriving at a suitable change and getting it through, it may be that
> many v4.0 implementers and maintainers may not be reading this list or
> paying much attention to errata either.  Perhaps we can discuss this in
> Westford, where many implementers who might not be all that aware of
> stuff on the working group list would be present.  That's also one way
> to see if there are existing servers where this is a big problem.

OK!  Assuming nobody's doing anything very complicated, it's probably
also not difficult to guess algorithms in testing.  Something we can do
in Westford, or maybe before if anyone has access to a variety of
servers.

--b.

> On Wed, Sep 7, 2016 at 9:05 PM, J. Bruce Fields <bfields@fieldses.org>
> wrote:
>
> > On Wed, Sep 07, 2016 at 08:14:58PM -0400, David Noveck wrote:
> > > > But I can't find any further discussion of how a client might
> > > > make that determination.  Am I overlooking it?
> > >
> > > It's actually in section 5.8 of RFC7931.
> >
> > Oops, thanks!
> >
> > Looking at that, I think none of this is true:
> >
> >     Note that the NFSv4.0 specification requires the server to make
> >     sure that such verifiers are very unlikely to be regenerated.
> >
> > Verifiers given out by one server shouldn't be reused, sure, but
> > that's quite different from the claim that collisions *between*
> > servers are unlikely.
> >
> >     Given that it is already highly unlikely that the clientid4 XC
> >     is duplicated by distinct servers,
> >
> > Why is that highly unlikely?
> >
> >     the probability that SCn is
> >     duplicated as well has to be considered vanishingly small.
> >
> > There's no reason to believe the probability of a verifier collision
> > is uncorrelated with the probability of a clientid collision.
> >
> > The Linux server is generating clientid as a (boot time, counter)
> > pair.  So collision between servers started at the same time
> > (probably not that unusual) is possible.
> >
> > Ditto for the verifier--it's a (boot time, counter) pair, with the
> > second part coming from a different counter, but normally I think
> > they'll get incremented at the same time, so clientid and verifier
> > are highly correlated.
> >
> > So, we can mitigate this by adding some randomness, OK.  But that's
> > a new requirement not apparent from 3530, and I wouldn't be
> > surprised if other servers have similar issues.
> >
> > --b.
> >
> > > I've only been keeping draft-ietf-nfsv4-migration-issues alive
> > > because of the section dealing with issues relating to v4.1.
> > > Otherwise, I would have let the thing expire.  The next time I
> > > update this, I'll probably collapse sections 4 and 5 to a short
> > > section saying that all the v4.0 issues were addressed by
> > > publication of RFC7931.
> > >
> > > > (It appears that the Linux client is trying to do that by
> > > > sending a setclientid_confirm to server1 using the
> > > > (clientid,verifier) returned from a setclientid reply from
> > > > server2.  That doesn't look correct to me.)
> > >
> > > It seems kind of weird, but the idea is that if you get the same
> > > clientid from two server IP addresses, they are probably connected
> > > to the same server (i.e. are trunked), but there is a chance that
> > > having the same clientid is a coincidence.  The idea is that if
> > > these are the same server, using the verifier will work, but if
> > > they are different servers it won't work but will be harmless.
> > >
> > > On Wed, Sep 7, 2016 at 5:20 PM, J. Bruce Fields <bfields@fieldses.org>
> > > wrote:
> > >
> > > > In
> > > > https://tools.ietf.org/html/draft-ietf-nfsv4-migration-issues-10#section-5.4.2
> > > >
> > > >     In the face of possible trunking of server IP addresses, the
> > > >     client will use the receipt of the same clientid4 from
> > > >     multiple IP-addresses as an indication that the two
> > > >     IP-addresses may be trunked, and proceed to determine, from
> > > >     the observed server behavior, whether the two addresses are
> > > >     in fact trunked.
> > > >
> > > > But I can't find any further discussion of how a client might
> > > > make that determination.  Am I overlooking it?
> > > >
> > > > (It appears that the Linux client is trying to do that by
> > > > sending a setclientid_confirm to server1 using the
> > > > (clientid,verifier) returned from a setclientid reply from
> > > > server2.  That doesn't look correct to me.)
> > > >
> > > > --b.