Re: [nfsv4] Going forward on I18N in RFC3530 bis

> I'll freely admit I haven't been following this discussion, so apologies
> if I'm misunderstanding what you ask.

No apology needed - that's definitely an example of "running code" ...
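
For anyone skimming the thread, here is a minimal sketch of the A-label vs. U-label distinction at issue. It assumes Python's built-in "idna" codec, which implements the older IDNA2003 rules rather than whatever the draft ends up referencing, and the hostname is purely illustrative (it does not come from this thread):

```python
# Hypothetical referral hostname containing a non-ASCII label (U-label form).
u_label_name = "b\u00fccher.example"

# A-label form: the ASCII-compatible encoding that a non-IDN-aware DNS
# interface can consume directly.
a_label_name = u_label_name.encode("idna")
print(a_label_name)  # b'xn--bcher-kva.example'

# The A-label form converts back to the U-label form.
round_trip = a_label_name.decode("idna")

# What a UTF-8 capable protocol could carry instead: the U-label as UTF-8.
utf8_wire = u_label_name.encode("utf-8")
```

A client that hands the referral name straight to a non-IDN-aware resolver only works with the first form, which is the "running code" concern.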

Thanks,
--David

> -----Original Message-----
> From: J. Bruce Fields [mailto:bfields@fieldses.org]
> Sent: Wednesday, September 22, 2010 2:06 PM
> To: Black, David
> Cc: Noveck, David; nfsv4@ietf.org
> Subject: Re: [nfsv4] Going forward on I18N in RFC3530 bis
> 
> On Wed, Sep 22, 2010 at 11:48:44AM -0400, david.black@emc.com wrote:
> > Dave,
> >
> > I reviewed the i18n material in -04 (Section 12).  It looks fairly
> > good, but the details are now beyond my level of i18n expertise.  I
> > suggest that we get a real i18n expert to review this section in the
> > next version of the draft - I have a couple of candidate reviewers in
> > mind.  Many thanks for the extensive effort that has clearly gone into
> > this.
> >
> > I have one basic disagreement that should not come as a surprise ;-)
> > ...
> >
> > My current view of A-labels vs. U-labels is that I'm going to (try to)
> > insist on no A-labels, *unless* there is important "running code" that
> > depends on A-labels on the wire and that needs to be grandfathered.
> 
> Last time I checked, the linux referral code would only follow a
> referral if it was an A-label:
> 
> 	http://www.ietf.org/mail-archive/web/nfsv4/current/msg07863.html
> 
> That may have changed.  However, it wouldn't be surprising if it were
> common behavior to pass the referral name directly to non-idn-aware dns
> interfaces...
> 
> > A-labels exist because the DNS infrastructure is fundamentally ASCII.
> > Since NFSv4 is UTF-8 capable, A-labels on the wire are just plain
> > wrong in principle, IMHO.  FWIW, I don't care whether it's possible to
> > get the current A-label approach blessed by the IETF's i18n gurus.
> > This turns up in 12.6 as "MAY be in the form of an A-label".  My
> > preference is that A-labels on the wire be "MUST NOT"
> 
> ... and therefore that a server conforming to such a "MUST NOT" would
> likely fail to operate with client implementations.
> 
> I'll freely admit I haven't been following this discussion, so apologies
> if I'm misunderstanding what you ask.
> 
> --b.
> 
> > - if there's
> > important "running code", I might settle for "SHOULD NOT" with an
> > explanation of the "running code" that requires ignoring that "SHOULD
> > NOT" in order to keep that "running code" happy.
> 
> 
> >
> > Comments:
> >
> > For strings that SHOULD be UTF-8, but aren't, what's the protocol
> > requirement?  I think the requirement is 8-bit clean (e.g., MUST NOT
> > force the most significant bit of any octet to zero, unless the string
> > MUST be ASCII).  That should be stated as part of the string
> > classification.
> >
> > The redefinition of "SHOULD" in 12.2.2 is an invitation to confusion.  I suggest:
> > 	SHOULD -> USHOULD, VSHOULD -> UVSHOULD & VMUST -> UVMUST
> > plus use of capitalized SHOULD/MUST in defining these terms.
> >
> > The first paragraph of 12.3 does not distinguish utf8_should strings
> > from utf8val_should strings - the "SHOULD" requirement to return an
> > error if the string is not UTF-8 conflicts with the statement that
> > utf8_should strings are not checked for UTF-8 validity - I think that
> > error return requirement applies only to utf8val_should strings.
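
A minimal sketch of the distinction as I read it, assuming the only difference between the two classes is whether the server validates the octets. The function name and the boolean flag are hypothetical; the NFS4ERR_INVAL value is the one assigned in RFC 3530:

```python
NFS4_OK = 0
NFS4ERR_INVAL = 22  # error code value from RFC 3530


def check_string(raw: bytes, validate_utf8: bool) -> int:
    """Return an NFSv4 status for an incoming string (hypothetical helper).

    validate_utf8=False models utf8_should: octets pass through untouched
    (8-bit clean) with no validity check.
    validate_utf8=True models utf8val_should: invalid UTF-8 is rejected.
    """
    if not validate_utf8:
        return NFS4_OK
    try:
        raw.decode("utf-8")  # strict decoding rejects malformed sequences
    except UnicodeDecodeError:
        return NFS4ERR_INVAL
    return NFS4_OK


latin1_name = b"caf\xe9"     # ISO 8859-1 octets, not valid UTF-8
utf8_name = b"caf\xc3\xa9"   # the same name encoded as UTF-8
```

Under this reading, latin1_name is accepted as a utf8_should string and rejected only when the string class calls for validation.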
> >
> > 12.4.2 suggests that NFSv4 supports hex-encoded text forms of IPv4
> > addresses.  Is that correct and/or needed?  The usual textual form of
> > IPv4 addresses is decimal encoding.
> >
> > 12.7.1.2:
> >
> >    However, in any of the following situations, file names have to be
> >    treated as strings of characters and servers MUST return
> >    NFS4ERR_INVAL when file names that are not in UTF-8 format:
> >
> > Would "characters" -> "Unicode characters" be consistent with what was
> > intended?  If so, that change would make the text clearer.  If not, I'm
> > confused.
> >
> > 12.7.1.3 uses lower-case "must" and "should".  Is that deliberate, as
> > opposed to upper-case?  In general, double-check all uses of lower-case
> > "must" and "should" to make sure that they are intended.
> >
> > 12.7.1.5.2 would be improved by examples of what clients should and/or
> > should not do in order to improve interoperability with servers that do
> > not handle normalization in the fashion that the client expects.
> >
> > 12.7.2 - If link text is utf8_should, servers aren't supposed to check
> > for valid UTF8.  Based on 12.2.3, it looks like link text is
> > utf8val_should, for which this check is appropriate.
> >
> > Nits:
> > - Saw one instance of NFKC garbled to NKFC.
> >
> > Thanks,
> > --David
> >
> > > -----Original Message-----
> > > From: nfsv4-bounces@ietf.org [mailto:nfsv4-bounces@ietf.org] On Behalf Of david.noveck@emc.com
> > > Sent: Thursday, September 09, 2010 6:36 PM
> > > To: nfsv4@ietf.org
> > > Subject: [nfsv4] Going forward on I18N in RFC3530 bis
> > >
> > > David Black (the man behind NFSv4.2 :-) has asked me to summarize the
> > > situation with regard to I18N in RFC3530 and the current plan about what
> > > to do about it going forward in handling it in RFC3530bis.
> > >
> > > ---- First some pointers:
> > >
> > >     The description of I18N is in chapter 11 of RFC3530,
> > >     page 122 of http://www.ietf.org/rfc/rfc3530.txt
> > >
> > >     The current draft replacement is in the latest draft
> > >     of RFC3530bis, that is, in chapter 12 (pages 160-179)
> > >     http://tools.ietf.org/id/draft-ietf-nfsv4-rfc3530bis-04.txt.
> > >     This is pretty much a rewrite of chapter 12 of the
> > >     previous draft-03, so looking at the diff is not much help.
> > >
> > > ---- Background:
> > >
> > > The basic problem with chapter 11 of RFC3530 is that it has almost no
> > > relation to what has been actually implemented.  The current form of
> > > chapter 11 reflects political pressures at the time of RFC approval
> > > within the IETF to conform to the stringprep paradigm, and so it is
> > > organized around that.  But implementations started without it, and
> > > never were adjusted to conform to that model, for good reasons,
> > > discussed below.
> > >
> > > In the meantime, problems within stringprep have become manifest.  Even
> > > more important is the fact that for the most important string type
> > > subject to I18N issues, filename components, the stringprep-style
> > > approach in its totality does not match the needs of NFSv4.  The issue
> > > is that you can think of the server as a single thing (including the
> > > server code and the file system you are talking to) in which case it
> > > makes sense to define, in exquisite detail, character mapping, and
> > > repertoire rules, so as to provide interoperability down to the most
> > > recondite character-handling details.
> > >
> > > However, server implementations and file systems are in fact separate
> > > things.  One cannot enforce detailed character-handling rules on the
> > > file systems, and if one tries, one unacceptably limits the file
> > > systems that can be used.  And if one enforces such rules in front of
> > > the file systems, we interfere with another major goal of NFSv4, proper
> > > interoperability with other network file systems and with local use of
> > > those file systems.  If the protocol imposes rules that are not imposed
> > > locally, there may be valid files you can't get at over NFSv4.
> > >
> > > As a result, NFSv4, at least in this regard, is better described as a
> > > protocol to pass names from the client to the remote server file system,
> > > making as few modifications as we can.  In fact, this is what people
> > > actually implemented and it differs in a major way from what is
> > > described in chapter 11 of RFC3530.  Thus the need to describe the
> > > reality that clients and servers implement in RFC3530bis.
> > >
> > > ---- Changes:
> > >
> > > This is a brief summary of the changes I introduced.  It is a high-level
> > > summary and I may have forgotten a few things.
> > >
> > > Re-organize the string types.  In RFC3530, these had been organized
> > > around stringprep profiles, basically around whether strings are
> > > case-sensitive, case-insensitive, or partially case-sensitive.  This
> > > resulted in very strange conclusions, such as UTF-8 checking and
> > > checking for characters outside Unicode 3.1 being applied to tags.
> > >
> > > Tags are treated opaquely with no UTF-8 checking, Unicode repertoire
> > > checking, or normalization-related checking.
> > >
> > > There is more clarity about various sorts of strings.  In particular,
> > > strings which, for various reasons, do not require internationalization
> > > handling are explicitly called out.
> > >
> > > Adopting IDNA handling for domains and servers and simply referencing
> > > those docs for what is OK.  There is the issue of U-labels vs. A-labels.
> > > We allow A-labels or UTF-8 strings whether canonicalized or not.  There
> > > has been some discussion about changing that to U-labels only but that
> > > will only be done if there is working group consensus.
> > >
> > > Extensive discussion of the fact that our ability to legislate character
> > > handling for file systems is limited.
> > >
> > > Change UTF-8 requirement for filenames from MUST to SHOULD to match
> > > NFSv4.1.
> > >
> > > Get rid of requirement that everything be within Unicode 3.1.  Get rid
> > > of requirements that large sets of characters within Unicode 3.1 be
> > > rejected for various reasons.
> > >
> > > Get rid of requirement to map various characters.  SHOULD NOT do
> > > mappings which are problematic for stringprep (German eszett mapped to
> > > 'ss', zero-width joiner and non-joiner characters mapped to nothing,
> > > causing issues for Farsi) but MAY use other mappings from stringprep
> > > (and by implication no mappings outside it).
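
To make the two problematic mappings concrete, here is a minimal sketch using the tables from Python's standard `stringprep` module (RFC 3454 tables B.1 "mapped to nothing" and B.2 case folding). The helper name and the sample strings are illustrative assumptions, not text from the draft:

```python
import stringprep


def apply_b1_b2(name: str) -> str:
    """Apply only the RFC 3454 B.1 and B.2 mappings (hypothetical helper)."""
    out = []
    for ch in name:
        if stringprep.in_table_b1(ch):
            continue  # B.1: character is mapped to nothing
        out.append(stringprep.map_table_b2(ch))  # B.2: case folding
    return "".join(out)


# German eszett is folded to "ss", so the original spelling is lost.
german = apply_b1_b2("Stra\u00dfe")
print(german)  # strasse

# A Farsi word containing ZERO WIDTH NON-JOINER (U+200C); the mapping
# silently deletes the joiner, changing how the word is rendered.
farsi_in = "\u0645\u06cc\u200c\u062e\u0648\u0627\u0647\u0645"
farsi_out = apply_b1_b2(farsi_in)
```

Both results differ from what the user typed, which is why a server that applies them may return names the client no longer recognizes.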
> > >
> > > New treatment of normalization.  Allow normalization-sensitive servers
> > > (but warn of difficulties without saying SHOULD NOT), allow
> > > servers/file-systems to choose to normalize to NFC or NFD (but not
> > > reject filenames in the "wrong" normalization as was implied by
> > > RFC3530), and also mention/allow for the first time
> > > normalization-insensitive/normalization-preserving handling of names
> > > (best choice but no SHOULD because this is a big change to the file
> > > system and thus not really the spec's business).
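
A minimal sketch of what normalization-insensitive, normalization-preserving name handling could look like, assuming a directory modeled as an in-memory dictionary keyed by the NFC form. The class and method names are hypothetical and not taken from any implementation:

```python
import unicodedata

# The same visible name in two canonically equivalent forms.
name_nfc = unicodedata.normalize("NFC", "caf\u00e9")  # e-acute as one code point
name_nfd = unicodedata.normalize("NFD", "caf\u00e9")  # "e" + combining acute


class Directory:
    """Toy directory: lookups ignore normalization, stored names keep it."""

    def __init__(self):
        self._entries = {}  # NFC key -> name exactly as the creator sent it

    def create(self, name: str) -> None:
        key = unicodedata.normalize("NFC", name)
        if key in self._entries:
            raise FileExistsError(name)
        self._entries[key] = name  # preserve the creator's form

    def lookup(self, name: str) -> str:
        # Insensitive: any canonically equivalent form finds the entry.
        return self._entries[unicodedata.normalize("NFC", name)]


d = Directory()
d.create(name_nfd)            # created by a client that sends NFD
found = d.lookup(name_nfc)    # looked up by a client that sends NFC
```

A normalization-sensitive server would instead treat name_nfc and name_nfd as two different files, which is the difficulty the text warns about.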
> > >
> > > Discussion of how symlink text should be processed and where the
> > > handling differs from file component names.
> > >
> > > New treatment of user/group names.  Each domain establishes its own list
> > > of these so there are no repertoire rules.  There is a discussion about
> > > why you should match these based on canonical equivalence, but there is
> > > no n/i-n/p option for these because it would require fs to save 2 (or
> > > sometimes many more) variants of that same user and group in the user
> > > and group attributes and in ACLs.   Nobody is going to do that nor
> > > should they.
> > >
> > > I'm sure I've missed some things.  If you notice them, let me know, as
> > > it would be good to maintain somewhere a summary of what was done in
> > > this chapter.
> > >
> > > ---- Discussion on call:
> > >
> > > The big issue discussed was whether we should wait for precis to finish
> > > this up.
> > >
> > > David Black came to the conclusion that since precis work was not
> > > proceeding very fast, we should go ahead based on the current draft plus
> > > working group comments, with the potential of an additional update
> > > (RFC3530tris?) when the precis work is finished and can be applied to
> > > NFSv4.
> > >
> > > There were arguments about his use of the word "patch" and the probable
> > > relative proportions of updates in RFC3530bis and its successor but no
> > > fundamental disagreement on the basic approach.
> > >
> > > I will be working on an update to chapter 12 that will go into a new
> > > draft of rfc3530-bis, targeted at the Beijing deadline.  I may be able
> > > to get it out earlier, but I hope people will have a chance to look at
> > > the current draft and give me their comments.
> > > _______________________________________________
> > > nfsv4 mailing list
> > > nfsv4@ietf.org
> > > https://www.ietf.org/mailman/listinfo/nfsv4