Re: [nfsv4] My probable non-participation in IETF114 meeting -- missing agenda items

David Noveck <davenoveck@gmail.com> Wed, 17 August 2022 11:24 UTC

From: David Noveck <davenoveck@gmail.com>
Date: Wed, 17 Aug 2022 07:24:40 -0400
Message-ID: <CADaq8jfzzqOYgXJ7PEyqz+5xX8feUomFFGaHoPDDBEVFEVUjxw@mail.gmail.com>
To: "Black, David" <David.Black@dell.com>
Cc: Rick Macklem <rmacklem@uoguelph.ca>, NFSv4 <nfsv4@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/nfsv4/9eaxFOZ2pSAa7fRGPQCiH_HTcIM>

In attempting to rewrite section 12 on an as-implemented basis, I find I
need some information about how existing servers and clients deal with
internationalized domain names.

With regard to servers, I need to know how they deal with domain strings
beginning with "xn--":
   - do they simply treat such strings as
     presented, ignoring the IDNA
     convention?
   - do they translate the portion past "xn--"
     from Punycode to Unicode?
   - do they follow the recommended
     approach in RFC 7530 (unlikely but
     possible)?
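
To make the first two alternatives concrete, here is a minimal Python
sketch. It is not taken from any existing server: the function names and
the "mode" parameter are hypothetical, and it relies only on Python's
built-in "punycode" codec. The RFC 7530 approach (ToASCII/ToUnicode) is
deliberately not sketched.

```python
def interpret_label(label: str, mode: str) -> str:
    """Hypothetical helper: interpret one label of a domain string.

    mode == "verbatim": return the label exactly as presented.
    mode == "decode":   if the label starts with "xn--", decode the rest
                        from Punycode to Unicode (RFC 3492).
    """
    if mode == "verbatim" or not label.lower().startswith("xn--"):
        return label
    # Raises UnicodeError if the remainder is not valid Punycode output.
    return label[4:].encode("ascii").decode("punycode")


def interpret_domain(domain: str, mode: str) -> str:
    """Apply interpret_label to every dot-separated label."""
    return ".".join(interpret_label(label, mode) for label in domain.split("."))


if __name__ == "__main__":
    name = "xn--mnchen-3ya.example"
    print(interpret_domain(name, "verbatim"))
    print(interpret_domain(name, "decode"))
```

A server using the "verbatim" behavior would return on GETATTR exactly
what it was given on SETATTR; one using "decode" would not.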

I also need to know what checks are done for label validity; in particular,
whether and how checks for valid UTF-8 are done.
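
As a point of comparison, the rejection rules in the RFC 7530 text quoted
further down this thread could be sketched roughly as follows in Python.
This is only an illustration under stated assumptions: the function name
and the LDH regular expression are mine, empty labels (including a
trailing dot) are rejected as a simplification, and non-ASCII labels are
accepted without any U-label checking.

```python
import re

# Letters, digits, hyphen; no leading or trailing hyphen; at most 63 octets.
LDH_LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def domain_is_acceptable(raw: bytes) -> bool:
    """Hypothetical check; False is where a server would return NFS4ERR_INVAL."""
    try:
        domain = raw.decode("utf-8")  # strict decoding rejects invalid UTF-8
    except UnicodeDecodeError:
        return False
    for label in domain.split("."):
        if not label.isascii():
            continue  # assumption: non-ASCII labels are not examined further
        if not LDH_LABEL.fullmatch(label):
            return False  # ASCII label that is not a valid LDH label
        if label.lower().startswith("xn--"):
            try:
                label[4:].encode("ascii").decode("punycode")
            except UnicodeError:
                return False  # characters after "xn--" are not valid Punycode
    return True
```

Knowing which of these three checks existing servers actually perform is
the information I am after.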

With regard to clients, I need to know whether any generate domain strings
beginning with "xn--".






On Tue, Aug 9, 2022, 11:06 AM David Noveck <davenoveck@gmail.com> wrote:

>
>
> On Tue, Aug 2, 2022, 9:33 AM Black, David <David.Black@dell.com> wrote:
>
>> >>>       As a result, the domain string returned on a GETATTR of
>> >>>       the user id MUST be the same as that used when setting the
>> >>>       user id by the SETATTR.
>> >>
>> >>I would expect server implementations to do exactly that, but it'd be
>> >>good to check - would compliance with that "MUST" be a problem for any
>> >>known implementations?
>> >>
>> >>If so, please explain how the GETATTR and SETATTR user id values could
>> >>differ.
>> > For the FreeBSD name<->id mapping daemon, the domain string is
>> > considered to be case independent.
>> > i.e., UoGuelph.ca is considered the same as uoguelph.ca
>> > I think I did this because that is how DNS host domain names have
>> > traditionally been handled.
>>
>
> It appears they can't be now.  How disruptive would it be to change this
> for ASCII names?
>
>>
>> Unfortunately, the concept of case independence is significantly more
>> complex in Unicode.
>
>
> It is not only more complex. You cannot do case conversion using the text
> string alone.
>
>> A common example is that in ASCII, 'i' is the lower-case counterpart of
>> 'I', but that's not always the case for Unicode, e.g., in Turkish, the
>> lower case counterpart of 'I' is U+0131 (ı), LATIN SMALL LETTER
>> DOTLESS I (https://en.wikipedia.org/wiki/Dotless_I), and applying ASCII
>> case independence produces incorrect results, as there are words in Turkish
>> that differ only in dotted-i vs. dotless-i.
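
The dotless-i point can be seen in a few lines of Python. This is a
sketch only: lower_turkish is a hypothetical, simplified helper that
covers just the two Turkish I mappings, and Python's str.lower() and
str.upper() apply the default (locale-independent) Unicode mappings.

```python
DOTLESS_I = "\u0131"    # LATIN SMALL LETTER DOTLESS I
DOTTED_CAP_I = "\u0130" # LATIN CAPITAL LETTER I WITH DOT ABOVE


def roundtrip_default(s: str) -> str:
    """Upper-case then lower-case using the default Unicode mappings."""
    return s.upper().lower()


def lower_turkish(s: str) -> str:
    """Hypothetical Turkish-tailored lower-casing (I mappings only)."""
    return s.replace(DOTTED_CAP_I, "i").replace("I", DOTLESS_I).lower()


if __name__ == "__main__":
    # The default mappings fold dotless i into dotted i on a round trip.
    print(roundtrip_default(DOTLESS_I) == "i")
    # The same input 'I' lower-cases differently depending on the language.
    print("I".lower() == "i", lower_turkish("I") == DOTLESS_I)
```

Nothing in the string "I" says which of the two results is wanted, which
is why the conversion cannot be done from the text alone.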
>>
>> The "method one" that is proposed for elimination dealt with Unicode case
>> insensitivity via use of ToASCII and ToUnicode - Unicode case mapping is
>> buried in the NAMEPREP step.  Unfortunately, the entire NAMEPREP framework
>> (including those mechanisms) is obsolete for a number of good reasons,
>> including its dependence on a specific version of Unicode.  Case
>> insensitivity of domain names has been moved to input processing, as
>> indicated by this item in Appendix A of RFC 5891 (
>> https://datatracker.ietf.org/doc/html/rfc5891#appendix-A):
>
>
> Method one has to go.
>
>
>>
>>    4.   Remove the mapping and normalization steps from the protocol and
>>         have them, instead, done by the applications themselves,
>>         possibly in a local fashion, before invoking the protocol.
>>
>
> This doesn't address cases in which the domain name is formulated by the
> server. There is no obvious "application" on the server.  The important
> case is multiple domains in multi-server namespace.
>
>
>> For NFS, the likely upshot is that language/locale-dependent (case)
>> mapping and normalization has to be done by the client or code above it.
>
>
> That works for mount and the client side of referrals.  To deal with this
> in full, we should simply say the protocol does modify the domain to either
> an equivalent one with a changed case or a canonically equivalent one.
>
>> ASCII case insensitivity would be safe for a server that enforces a
>> restriction to 7-bit ASCII, but it's not safe in general for UTF-8.
>>
>
> Yes, but I am not sure what the spec should say about that case.
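
One possible reading of the ASCII-only case, sketched in Python purely
for discussion: compare case-insensitively only when both domain strings
are 7-bit ASCII, and otherwise require an exact match. The function names
are hypothetical and this is not a proposal for spec text.

```python
def ascii_lower(s: str) -> str:
    """Lower-case A-Z only; every other code point is left untouched."""
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in s)


def domains_match(a: str, b: str) -> bool:
    """Hypothetical comparison used by a name<->id mapping service."""
    if a.isascii() and b.isascii():
        return ascii_lower(a) == ascii_lower(b)
    # Any non-ASCII content: no case mapping is attempted.
    return a == b


if __name__ == "__main__":
    print(domains_match("UoGuelph.ca", "uoguelph.ca"))
    print(domains_match("I\u015f\u0131k.example", "\u0131\u015f\u0131k.example"))
```

With this reading, Rick's UoGuelph.ca example still matches, while names
containing non-ASCII characters are compared as presented.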
>
>
>
>> Thanks, --David.
>
>
> Thanks for the guidance.  Will provide an updated section for discussion
> in about a week. After that, will try to produce internationalization-02.
>
>>
>> -----Original Message-----
>> From: Rick Macklem <rmacklem@uoguelph.ca>
>> Sent: Monday, August 1, 2022 8:37 PM
>> To: Black, David; David Noveck
>> Cc: NFSv4
>> Subject: Re: [nfsv4] My probable non-participation in IETF114 meeting --
>> missing agenda items
>>
>>
>> [EXTERNAL EMAIL]
>>
>> Black, David <David.Black=40dell.com@dmarc.ietf.org> wrote:
>> >Dave,
>> >
>> >> The core issue derives from the following text in RFC7530:
>> >
>> >[ ... SNIP ...]
>> >
>> >> The above text was suggested by David Black based on discussions with
>> >> internationalization experts, and had no problems getting accepted by the
>> >> IESG.
>> >
>> >In light of what has (not) transpired since then ...
>> >
>> >> The first method references RFC 3490, now obsolete, so cannot be
>> >> transferred to a new NFSv4 internationalization document
>> >>
>> >> This reference should, almost certainly, not have appeared in RFC7530,
>> >> but I'm not sure how it was approved.
>> >>
>> >> It seems very unlikely that the first method was ever implemented by
>> >> any NFSv4 server, despite the fact it is recommended above.
>> >
>> >... while I could patch the text to deal with RFC 3490 being obsolete, I
>> >think "running code" wins this one ... i.e., I suggest that the WG proceed
>> >to:
>> >
>> >>    - eliminate the use of method one.
>> >
>> >That creates a small item to deal with, as method 2 contains a "MUST"
>> >requirement that would apply to all implementations:
>> >
>> >
>> >>       As a result, the domain string returned on a GETATTR of
>> >>       the user id MUST be the same as that used when setting the
>> >>       user id by the SETATTR.
>> >
>> >I would expect server implementations to do exactly that, but it'd be
>> >good to check - would compliance with that "MUST" be a problem for any
>> >known implementations?
>> >
>> >If so, please explain how the GETATTR and SETATTR user id values could
>> >differ.
>> For the FreeBSD name<->id mapping daemon, the domain string is
>> considered to be case independent.
>> i.e., UoGuelph.ca is considered the same as uoguelph.ca
>> I think I did this because that is how DNS host domain names have
>> traditionally been handled.
>>
>> rick
>>
>> Thanks, --David
>>
>> From: nfsv4 <nfsv4-bounces@ietf.org> On Behalf Of David Noveck
>> Sent: Thursday, July 28, 2022 9:57 AM
>> To: Benjamin Kaduk
>> Cc: Noveck, David; NFSv4
>> Subject: Re: [nfsv4] My probable non-participation in IETF114 meeting --
>> missing agenda items
>>
>>
>> [EXTERNAL EMAIL]
>> The core issue derives from the following text in RFC7530:
>>
>>
>>    string sent SHOULD be in the form of one or more U-labels as
>>    defined by [RFC5890].  If that is impractical, it can instead be in
>>    the form of one or more LDH labels [RFC5890] or a UTF-8 domain name
>>    that contains labels that are not properly formatted U-labels.  The
>>    receiver needs to be able to accept domain and server names in any of
>>    the formats allowed.  The server MUST reject, using the error
>>    NFS4ERR_INVAL, a string that is not valid UTF-8, or that contains an
>>    ASCII label that is not a valid LDH label, or that contains an
>>    XN-label (begins with "xn--") for which the characters after "xn--"
>>    are not valid output of the Punycode algorithm [RFC3492].
>>
>>    When a domain string is part of id@domain or group@domain, there are
>>    two possible approaches:
>>
>>    1.  The server treats the domain string as a series of U-labels.  In
>>        cases where the domain string is a series of A-labels or
>>        Non-Reserved LDH (NR-LDH) labels, it converts them to U-labels
>>        using the Punycode algorithm [RFC3492].  In cases where the
>>        domain string is a series of other sorts of LDH labels, the
>>        server can use the ToUnicode function defined in [RFC3490] to
>>        convert the string to a series of labels that generally conform
>>        to the U-label syntax.  In cases where the domain string is a
>>        UTF-8 string that contains non-U-labels, the server can attempt
>>        to use the ToASCII function defined in [RFC3490] and then the
>>        ToUnicode function on the string to convert it to a series of
>>        labels that generally conform to the U-label syntax.  As a
>>        result, the domain string returned within a user id on a GETATTR
>>        may not match that sent when the user id is set using SETATTR,
>>        although when this happens, the domain will be in the form that
>>        generally conforms to the U-label syntax.
>>
>>    2.  The server does not attempt to treat the domain string as a
>>        series of U-labels; specifically, it does not map a domain string
>>        that is not a U-label into a U-label using the methods described
>>        above.  As a result, the domain string returned on a GETATTR of
>>        the user id MUST be the same as that used when setting the
>>        user id by the SETATTR.
>>
>>    A server SHOULD use the first method.
>>
>>
>>
>>
>>
>>
>> The above text was suggested by David Black based on discussions with
>> internationalization experts, and had no problems getting accepted by the
>> IESG.
>>
>> The first method references RFC 3490, now obsolete, so cannot be
>> transferred to a new NFSv4 internationalization document.
>>
>> This reference should, almost certainly, not have appeared in RFC7530,
>> but I'm not sure how it was approved.
>>
>> It seems very unlikely that the first method was ever implemented by any
>> NFSv4 server, despite the fact it is recommended above.
>>
>> The basis of the recommendation is quite unclear and it is not easy to
>> determine a situation in which the use of the first method would be
>> needed/desirable.  Further, the use of "SHOULD" leaves unanswered the
>> question of what are valid reasons to bypass the recommendation.
>>
>> The existing handling is not transferrable to other NFSv4.  We need to
>> do one of the following:
>>
>>    - eliminate the use of method one.
>>
>>    - provide an alternative to the process in method 1 that does not
>>      depend on RFC3490.
>>
>>
>>
>> On Mon, Jul 25, 2022, 10:21 PM Benjamin Kaduk <kaduk@mit.edu> wrote:
>> Hi David,
>>
>> On Mon, Jul 18, 2022 at 01:05:25PM +0000, Noveck, David wrote:
>> >
>> >   *   draft-ietf-nfsv4-internationalization is now expired.  In order
>> > to get it anywhere near wglc, I have to address the issues in section 12.
>> >
>> > I haven't been able to get idna information from the working group or
>> > the internationalization people. Please provide viable sources on email.
>> > If no willing experts are to be found, will need to do further research and
>> > may have to update 7530 to not support idna, if that is possible.
>>
>> While I don't consider myself an internationalization person, I do know a
>> few who would probably qualify.  Please forgive my only sporadic
>> attentiveness to this list -- is there a clean summary of the open issues
>> that I could send to people and ask for help?
>>
>> Thanks,
>>
>> Ben
>>
>>
>