Re: [nfsv4] Issues for precis
<Noveck_David@emc.com> Sat, 07 August 2010 21:03 UTC
Date: Sat, 07 Aug 2010 17:03:52 -0400
Message-ID: <BF3BB6D12298F54B89C8DCC1E4073D8002266711@CORPUSMX50A.corp.emc.com>
In-Reply-To: <C2D311A6F086424F99E385949ECFEBCB03453E71@CORPUSMX80B.corp.emc.com>
References: <BF3BB6D12298F54B89C8DCC1E4073D8001FDB584@CORPUSMX50A.corp.emc.com> <C2D311A6F086424F99E385949ECFEBCB03453502@CORPUSMX80B.corp.emc.com> <BF3BB6D12298F54B89C8DCC1E4073D8001FDB9B4@CORPUSMX50A.corp.emc.com> <C2D311A6F086424F99E385949ECFEBCB03453E71@CORPUSMX80B.corp.emc.com>
From: Noveck_David@emc.com
To: david.black@emc.com
Cc: nfsv4@ietf.org
Subject: Re: [nfsv4] Issues for precis
List-Id: NFSv4 Working Group <nfsv4.ietf.org>
List-Archive: <http://www.ietf.org/mail-archive/web/nfsv4>
I realize this set of issues is trying everyone's patience, except for those wise enough to have ignored this chapter of RFC3530, as regards both implementation and modification for RFC3530bis. If only I had been so wise.

As regards the issue of waiting for precis, David offers some good points. Although my view of the probability distribution of the possible benefits is likely to differ from David's (it is more skeptical), neither of us, nor the working group, would appreciate that debate. The more important issue is benefits vs. costs.

I had understood that there was concern about the dispatch with which RFC3530bis would be produced. There was a concern with Tom's proposed target and the fact that it differed from the one in the charter by some months, although I don't understand the motivations for very prompt attention to this matter well enough to know whether there is something there that would override the considerations that David has offered.

In light of this, we need a discussion of the benefits versus the costs of waiting. The problem is that nobody wants to discuss these messy issues in detail, and one soon finds out that details are all there is.

So my proposal is that we modify our plans so as to wait for precis, based on David's points. We can review this decision periodically at working group meetings, and also on the list if there is some change in circumstances that merits it. The charter would treat this as an ongoing area of work without being able to set a meaningful target date.

Given my perception of the working group feeling ("Do anything you want as long as I don't have to be involved", a bit overstated to be sure), we may sit here waiting for a response from somebody else. So I would suggest that if we get no response in the next three weeks, we proceed according to the plan above. I will send weekly notices about this possible change in direction so it doesn't slip by anyone.
Regarding David's input, let me first thank him for it, both that which has been on the list and his comments on various pre-drafts. However, with regard to some of the later input, I am having difficulty determining the implications, if any, for RFC3530bis.

For example, David says below that he thinks we have to "tackle a normalization-insensitive/normalization-preserving approach to filenames directly in NFSv4." The draft in question specifies this as a valid option and mentions its advantages over others, although it does not go so far as a "SHOULD". What specifically is missing or wrong in what is in the draft? Given that what I've written is so great :-), maybe that is a hard question to answer, but only if we address that iteratively will we get to where we want to be, with an RFC3530bis that really describes the protocol. I'm assuming (naively?) that that is an achievable goal.

Finally, one other area where David and I seem to be talking past one another is the issue of character mappings. David is right that the non-implementation of stringprep eliminates the issues raised by possible changes made to address the problems with the four troublesome characters. The issue that we have to come to terms with concerns the other characters in the list of characters to be mapped (I think they are all excised). If someone has a file whose name contains such a character, he would not be able to open it, and might open a different file, if precis were to adopt the stringprep mappings minus the troublesome ones (since it might be considered that there was nothing wrong with the remainder) and NFSv4 were to adopt the precis mapping. That's why we must not transfer this decision to precis. Draft-04 makes reference to the set of non-troublesome mappings by limiting the mappings that a file system may perform to a subset of that set, possibly null, as determined by the file system. If there are problems with this, we need to understand what they are.
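To make the wrong-file concern concrete, here is a minimal sketch in Python. It assumes the "excised" characters are those in stringprep table B.1 ("commonly mapped to nothing"), which Python's standard stringprep module exposes as in_table_b1. The directory contents and the helper names are hypothetical, and nothing here is taken from the bis draft or from any NFSv4 implementation.

```python
import stringprep


def strip_mapped_to_nothing(name):
    """Delete every character that stringprep table B.1 maps to nothing."""
    return "".join(ch for ch in name if not stringprep.in_table_b1(ch))


# Hypothetical directory, populated locally with no mapping applied.
# The first name contains U+00AD SOFT HYPHEN, a table B.1 character.
directory = {
    "co\u00adop": "file A",
    "coop": "file B",
}


def lookup_as_is(name):
    """Look the name up exactly as the caller supplied it."""
    return directory.get(name)


def lookup_with_mapping(name):
    """Look the name up after a newly imposed B.1-style mapping."""
    return directory.get(strip_mapped_to_nothing(name))


requested = "co\u00adop"
print(lookup_as_is(requested))         # the file the user meant
print(lookup_with_mapping(requested))  # a different file, or None if "coop" were absent
```

With the mapping in place, a request for the name containing the soft hyphen reaches the other file. If "coop" did not exist, the same request would find nothing at all, which is the "can't open your file" case.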
-----Original Message-----
From: Black, David
Sent: Sunday, August 01, 2010 4:54 PM
To: Noveck, David
Cc: 'nfsv4@ietf.org'
Subject: RE: Issues for precis

See other message for suggestions on how to proceed. I don't have the patience to respond blow-by-blow to this thread, so some high level considerations:

Two reasons to wait for precis:
- precis has to tackle user names. It would be good for NFSv4 to align with an IETF-wide approach to user names.
- precis should do a better job of future-proofing vs. Unicode changes than NFSv4 is likely to do on our own.

Character mappings should not be an issue because the i18n portion of 3530 was never implemented, hence we shouldn't have to worry about backwards compatibility with non-existent code that implemented broken mappings.

As noted elsewhere, I think we have to tackle a normalization-insensitive/normalization-preserving approach to filenames directly in NFSv4.

Thanks,
--David

> -----Original Message-----
> From: Noveck, David
> Sent: Thursday, July 29, 2010 8:41 AM
> To: Black, David
> Cc: 'nfsv4@ietf.org'
> Subject: RE: Issues for precis
>
> > The precis meeting has just concluded. One of the things we
> > will need to do for NFSv4 is resurrect your analysis of all
> > the different classes of NFSv4 strings to which i18n applies.
>
> They're in the bis draft. I wasn't aware that they were considered dead. I will send some mail
> extracting the relevant portions of the draft and summarizing them if that will help things.
>
> > Discussion in the precis meeting made it clear that user names
> > and domain names are special (domain names have to use IDNAbis
> > as specified, and changes to user names need to be conservative).
>
> Filename changes need to be very conservative.
In fact, they need to be so conservative they make > Rand Paul look like a socialist :-) > > > The precis effort appears to be headed in the direction of > > identifying multiple classes of strings that are used in protocols > > and providing starting points for appropriate i18n treatment of > > them. The current RFC 3530 structure of 3 stringprep profiles > > based on case sensitivity is almost certainly the wrong structure > > for this, > > Almost certainly. > > > and hence I would suggest that NFSv4 cooperate with that effort for > > now. > > We have something in the bis which at least is not almost certainly wrong. We can either work on that > base or co-operate with and presumably wait for precis. I'm not sure what the group consensus would > be on this issue, but I think we need to find out. It needs to be understood that if we wait for > precis "for now" that could be a while and we should drop all pretext of dates for RFC3530bis. If > that's what's necessary, it is, but let us be clear what is happening. We will shift effort toward > 4.2 and that's not necessarily wrong. > > > We need to keep in mind that NFSv4 has a number of other strings > > beyond filenames that have i18n requirements. Turning to your three > > issues: > > True but filenames are by far the most important, we have an answer for domains, etc. > > We also have users and groups which I've put in the same category as filenames: up to the > implementation with some normalization rules to aid interoperability. > > That leaves tags and I hope, precis or not, we can just agree to let them be. When suitably > inebriated I could tell you a story of how I was debugging something and thought the tag contained e > accent egu but all of a sudden realized it was really Pinyin e in second tone, and so knew what was > wrong :-). Maybe not. 
Anyway, tags are compared but the comparison happens in your mind and so I > hope we can consider that comparison out-of-scope and declare this a non-precis exception from i18n > rules. > > > > 1) Limitations on character set, whether in the interests > > > of limiting confusion or otherwise. > > > > > > This is the most critical one. Filesystems contain files > > > with names having the characters they were assigned, whether > > > by creation with some other network protocol or during local > > > use. If Precis or anyone else restricts that, then you can't > > > open your file. > > > There have to be limitations somewhere because Unicode > > allows different representations of the same character. > > That does not follow at all. If you have an n-i/n-p filesystem which prevents you, for example from > having a file whose name is an e accent egu and also one in the same directory whose name is e > followed by a combining acute accent, that's a normalization-related constraint but it doesn't > restrict the character set. I tried to deal with normalization in 3). > > Let me ask this. Do you see any other, non-normalization-related need to impose restrictions on the > character sets that may appear in filename accessed by NFSv4? > > Because if so, the problem I noted above remains. If there are files that have names containing those > characters, then imposing such a restriction will mean you can't open your files, and we agree that > that is bad. > > > A physical filesystem that has two files whose names that > > differ only in the Unicode representations of the same > > character is an i18n-busted FS that needs attention. NFS > > is not the place to set out to fix that FS, but if NFS > > happens to do something that exposes the i18n-busted-ness > > of that FS, the result should be a problem for that FS and > > not NFS. > > It sounds like we are in agreement (except for the terminology issue noted below). Note the issue in > the spec and go on. 
Normalization-sensitive filesystems are "i18n-busted" and the group will discuss > the proper way to say that. > > As to the terminology, the string pair that we are talking about, e-with-acute-accent and e followed > by combining-acute-accent are not two representations of the same character. They are two different > strings which have the same graphic representation and which have been defined by unicode to be > canonically equivalent. > > > > 2) Character mapping including the troublesome eszett and word > > > boundary and non-boundary characters. > > > There are four troublesome characters - German eszett, Greek > > final sigma and the two zero-width characters (joiner and > > non-joiner) whose mappings changed between stringprep and IDNAbis. > > The current stringprep mappings of those characters are "bugs" > > that I expect precis to "fix" by picking up the IDNAbis treatment. > > To the extent that stringprep has never been implemented for NFS, > > the transition to the "fix" will be easier. > > Depends what the fix is. Let's call your list of four T, and the list of all the others in that list > (all of which are mapped to the null string, I think) U. > > If the fix is the one in current bis draft, that you may only do mappings within T+U, should not do > any T, and are not required to do any, you are fine. > > If you are implying that deleting only the mapping in T will solve the problem, it won't. If there > are files with characters in U, you won't be able to open them. > > So that "fix" (if that is what precis winds up with) would only be untroublesome for a filesystem > which implemented stringprep but had the prescience to not include the troublesome characters. > > > > 3) Normalization issues as they apply to existing clients and > > > file systems. > > > > > > I think here we have a big problem in that some clients and > > > some servers like NFC and some like NFD. If we impose one, > > > a client will have the name of his file changed. 
Clearly > > > n-i/n-p is the best choice, although I didn't go so far as raise > > > this to a formal SHOULD in the bis draft. > > > For filenames, my current inclination is to go with something > > like the normalization-insensitive / normalization-preserving > > approach that pushes the string comparison problem into the > > physical filesystem. > > If the comparison is part of the filesystem implementation, then we need to be clear about a few > restrictions that are imposed. The main one is that a string may not be rejected because of > normalization form. > > > The result is still exposed to a risk that i18n code above the > > NFSv4 client will interact badly with i18n string compare code > > in the filesystem, but at least NFSv4 won't have made things worse. > > That may wind up being the best that's possible - if so, this > > conclusion will need to be checked multiple times before we're > > done ;-). > > OK. Asking people for discussion of this issue to see if they see any problems is part of that > process. > > > Another advance warning from the precis meeting is that > > case-folding and bidirectional functionality are going > > to be painful to work through. > > Indeed. Expect multiple people to confess to shooting JFK, if only that will cause it to stop. > > As far as bidirectional string rejection for NFSv4, that can go away. I just allowed people to have > file systems which rejected bidirectional strings in case there were such. If there aren't, we can > just drop this. > > > The sorting out of classes of strings may help with case folding. > > If the class of filenames where we are communicating with a filesystem which is doing the case folding > is recognized, then yes. > > Here we have the same problem. If precis decides on something in any way different from what your > filesystem implemented long before precis existed, then you are likely to be unable to open your file. 
> I think we have the class of strings that have a legacy structure that we cannot change and precis > cannot change without impacting functionality. We need precis to recognize this and stay its hand > with regard to those strings. > > > > -----Original Message----- > From: Black, David > Sent: Thursday, July 29, 2010 5:09 AM > To: Noveck, David > Cc: 'nfsv4@ietf.org' > Subject: RE: Issues for precis > > Dave, > > The precis meeting has just concluded. One of the things we will need to do for NFSv4 is resurrect > your analysis of all the different classes of NFSv4 strings to which i18n applies. Discussion in the > precis meeting made it clear that user names and domain names are special (domain names have to use > IDNAbis as specified, and changes to user names need to be conservative). The precis effort appears > to be headed in the direction of identifying multiple classes of strings that are used in protocols > and providing starting points for appropriate i18n treatment of them. The current RFC 3530 structure > of 3 stringprep profiles based on case sensitivity is almost certainly the wrong structure for this, > and hence I would suggest that NFSv4 cooperate with that effort for now. > > We need to keep in mind that NFSv4 has a number of other strings beyond filenames that have i18n > requirements. Turning to your three issues: > > > 1) Limitations on character set, whether in the interests > > of limiting confusion or otherwise. > > > > This is the most critical one. Filesystems contain files > > with names having the characters they were assigned, whether > > by creation with some other network protocol or during local > > use. If Precis or anyone else restricts that, then you can't > > open your file. > > There have to be limitations somewhere because Unicode allows different representations of the same > character. 
A physical filesystem that has two files whose names that differ only in the Unicode > representations of the same character is an i18n-busted FS that needs attention. NFS is not the place > to set out to fix that FS, but if NFS happens to do something that exposes the i18n-busted-ness of > that FS, the result should be a problem for that FS and not NFS. > > > 2) Character mapping including the troublesome eszett and word > > boundary and non-boundary characters. > > There are four troublesome characters - German eszett, Greek final sigma and the two zero-width > characters (joiner and non-joiner) whose mappings changed between stringprep and IDNAbis. The current > stringprep mappings of those characters are "bugs" that I expect precis to "fix" by picking up the > IDNAbis treatment. To the extent that stringprep has never been implemented for NFS, the transition to > the "fix" will be easier. > > > 3) Normalization issues as they apply to existing clients and > > file systems. > > > > I think here we have a big problem in that some clients and > > some servers like NFC and some like NFD. If we impose one, > > a client will have the name of his file changed. Clearly > > n-i/n-p is the best choice, although I didn't go so far as raise > > this to a formal SHOULD in the bis draft. > > For filenames, my current inclination is to go with something like the normalization-insensitive / > normalization-preserving approach that pushes the string comparison problem into the physical > filesystem. The result is still exposed to a risk that i18n code above the NFSv4 client will interact > badly with i18n string compare code in the filesystem, but at least NFSv4 won't have made things > worse. That may wind up being the best that's possible - if so, this conclusion will need to be > checked multiple times before we're done ;-). > > Another advance warning from the precis meeting is that case-folding and bidirectional functionality > are going to be painful to work through. 
The sorting out of classes of strings may help with case > folding. > > Thanks, > --David > > > > -----Original Message----- > > From: Noveck, David > > Sent: Wednesday, July 28, 2010 4:45 PM > > To: Black, David > > Cc: nfsv4@ietf.org > > Subject: Issues for precis > > > > I've listed the follow as issues which seem to me things > > that precis would need to address to handle the concerns > > you raised about being able to open the right file (by > > which is meant the one you intended). I've given these > > numbers. There are a few other remarks about these > > issues which are identified by other characters (keeping > > within ascii). > > > > 1) Limitations on character set, whether in the interests > > of limiting confusion or otherwise. > > > > This is the most critical one. Filesystems contain files > > with names having the characters they were assigned, whether > > by creation with some other network protocol or during local > > use. If Precis or anyone else restricts that, then you can't > > open your file. > > > > And in terms of the open-the-right-file test, it is much more > > important that the character set as seen through NFSv4 matches > > that seen via local access, then it is that the character set > > on my file system matches the one on yours. That's leaving > > aside the fact that we have no way of arranging that last in > > any case. > > > > 2) Character mapping including the troublesome eszett and word > > boundary and non-boundary characters. > > > > This needs to be dealt with. One important fact is that in the > > case of case-insensitive/case-preserving mapping, the mapping is > > within the file system and choices may be different from those > > specified within RFC3530. The most important point is that if > > we try to change the implementations now, we will have the > > wrong-file-opened problem in spades. 
> > > > Even though RFC3530 indicates eszett is to be mapped to "ss", > > a file system implementing case-insensitive case-preserving > > mapping, or one mapping to lower case (which V4.0) allows, would > > have no need to map eszett at all. "ss" "SS" "sS" and "Ss" > > would be in one class of equivalent file names and eszett in > > another. It is only the confluence of an assumption of mapping > > to upper case combined with the pre-unicode-5.1 absence of an > > upper-case eszett that created the mapping issue. > > > > Here the most critical issue is that different mappings exist > > (whether implemented in the file system or the server) and that > > changing them now, whatever arguments might be adduced for the > > new mapping's abstract goodness, is a recipe for the occurrence > > of the wrong-file-opened problem > > > > 3) Normalization issues as they apply to existing clients and > > file systems. > > > > I think here we have a big problem in that some clients and > > some servers like NFC and some like NFD. If we impose one, > > a client will have the name of his file changed. Clearly > > n-i/n-p is the best choice, although I didn't go so far as raise > > this to a formal SHOULD in the bis draft. The problem we have > > is that file systems that prefer both NFC and NFD do exist as do > > those that are normalization-insensitive. Respecting that fact, > > rather than changing it, seems the only possible choice at this > > point. The bis draft adopts the rule that filenames may not be > > rejected on account of normalization. This assures interoperability > > in the face of clients and servers with different preferences in > > this area. > > > > The one area where I've-opened-the-wrong-file seems possible > > is in the case of normalization-sensitive file systems. Two > > canonically equivalent names can exist within the same directory > > directory. 
If I am working with two client systems that have > > different preferences in this area, then which of the two files > > I get will depend on the system I am using at the time and that > > is bad. The question is what to do about. There are many such > > file systems even though they normally have very cases where two > > files with canonically equivalent exist in the same directory. > > > > Given that sort of file system has this issue, I think the group > > should consider what to say about file systems with > > normalization-sensitive file name handling. "SHOULD NOT" seems > > excessive to me but I think there should be group discussion of > > the issue. > > > > !) I think you mentioned the idea of forbidding the client from > > doing mapping (or maybe advising strongly against it: "If > > you ever want to be able to store a file and get it back, > > don't mess with the file name, bozo.") What is currently > > in the bis draft is: > > > > o On the client, file names may be processed as part of forming NFS > > version 4 requests. Any such processing will reflect specific > > needs of the client's environment and will be treated as out-of- > > scope from the viewpoint of this specification. > > > > In principle, I would stick with that. Since the name the client > > starts with comes from somewhere we don't have any idea about, > > making statements about what the client should or shouldn't do > > to it to make it suitable for transmission to the server, kind > > of uncertain. But I do see the logic of what you are saying so > > how about something like: > > > > o On the client, file names may be processed as part of forming > > NFS version 4 requests. Modification of the name string will > > reflect the nature of the string that is started with, as that > > will define what must be done to arrive at the actual file name. 
> > However, changing the name by mapping some characters to others > > or by deleting certain characters, whether in the interests of > > reducing the possibility of confusions, or otherwise, MUST NOT > > be done. Any such tasks are the responsibility of the server > > and the associated file system and for clients to make this > > sort of change will only confuse matters. One category of > > string modification that is acceptable is normalization by > > which is meant the replacement of the string by a canonically > > equivalent one. > > > > *) This leaves the issue of principle, that was mentioned. > > > > I don't want to over stress our disagreement on this, since I > > Think that fully addressing 1), 2), 3) and coming to some > > agreement about !) are sufficient to get us to where we want > > to be, but still I think the issue of principle that was not > > addressed when RFC3530 was done needs to be set out clearly, > > even if we agree to disagree. > > > > The responsibility of addressing the possibility of the > > o-my-I've-got-the-wrong-file problem is that of the file systems > > and they need to do that in the case of local file access. The > > responsibility of NFSv4 is not to make the problem worse. Most > > of the issues that made RFC3530 unimplementable arose from an > > attempt to take over that responsibility in a context in which > > there was no possibility of making things work in the stringprep > > framework because client and server implementers do not generally > > control file system implementations. > > > > That's the fundamental problem of prescriptive frameworks like > > stringprep in this context, at least as I see it.
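
For readers who want to see what "normalization-insensitive / normalization-preserving" (n-i/n-p) means in practice, the following Python sketch models a toy directory. It is an illustration under stated assumptions, not how any particular file system or NFSv4 server works: the class NormInsensitiveDir and its methods are hypothetical, and the choice of NFC as the internal comparison form is arbitrary (NFD would serve equally well).

```python
import unicodedata


class NormInsensitiveDir:
    """Toy directory: names compare by canonical equivalence,
    but are stored and returned exactly as they were created."""

    def __init__(self):
        self._entries = {}  # comparison key -> name as originally created

    @staticmethod
    def _key(name):
        # Assumption: NFC is used only as an internal comparison key.
        return unicodedata.normalize("NFC", name)

    def create(self, name):
        key = self._key(name)
        if key in self._entries:
            # A canonically equivalent name already exists in this directory.
            raise FileExistsError(name)
        self._entries[key] = name

    def lookup(self, name):
        # No name is rejected because of its normalization form.
        return self._entries[self._key(name)]

    def readdir(self):
        return list(self._entries.values())


NFC_NAME = "caf\u00e9"   # e-with-acute-accent as a single code point
NFD_NAME = "cafe\u0301"  # e followed by combining acute accent

d = NormInsensitiveDir()
d.create(NFD_NAME)            # a client that prefers NFD creates the file
stored = d.lookup(NFC_NAME)   # a client that prefers NFC still finds it
print(stored == NFD_NAME)     # the stored name keeps the creator's form
```

A normalization-sensitive directory would instead key on the raw string, so both forms could coexist and the file a user gets would depend on which client was in use, which is the wrong-file case described under issue 3).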