Re: [IPv6] Case-sensitivity issue of zone identifiers in draft-ietf-6man-rfc6874bis-05

Shang Ye <yesh25@mail2.sysu.edu.cn> Fri, 17 March 2023 12:37 UTC

Message-ID: <0588A2F132B43B8C+199a2d6e-7879-c615-9146-7fae54f3f449@mail2.sysu.edu.cn>
Date: Fri, 17 Mar 2023 20:36:46 +0800
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101 Thunderbird/102.9.0
From: Shang Ye <yesh25@mail2.sysu.edu.cn>
To: Brian E Carpenter <brian.e.carpenter@gmail.com>
Cc: IPv6 List <ipv6@ietf.org>
References: <tencent_3D3F07CC3D59EE52093D35F2@qq.com> <95cc5bf1-bd2c-40bf-185f-a65730aa3e80@gmail.com> <CA+9kkMDgu8b79CconJ54R2BbkfFe6yxBRQRmJu6nNhAsTJYK5Q@mail.gmail.com> <1A9A7C1EB187822F+6bce53d1-7a0c-3e39-a101-0fec25aa7bf5@mail2.sysu.edu.cn> <2cb32e10-5f79-f03e-79a7-7910d9eddb3e@gmail.com>
Content-Language: en-US
In-Reply-To: <2cb32e10-5f79-f03e-79a7-7910d9eddb3e@gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/ipv6/l3rBR724_vM6uxioXywykxLpFyM>
Subject: Re: [IPv6] Case-sensitivity issue of zone identifiers in draft-ietf-6man-rfc6874bis-05

Now that I think about it, there could be another solution to the issue:

State that the zone ID in a URI is case-sensitive, that implementations 
MAY treat it as case-insensitive and normalize it to lower case, and 
that the use of uppercase letters in a zone ID is NOT RECOMMENDED.

I agree that we could state that the zone ID in a URI is
case-insensitive and is normalized to lower case, but that raises a
question:

Can we make applications use the zone ID in a URI only after it is 
normalized to lower case?

AFAIK, many URI parsing libraries (especially zero-copy ones) perform
no normalization when parsing a URI string. Instead, they provide the
index range (or substring) of each component within the string.
Applications may then use the components however they like, including
passing a zone ID containing uppercase letters directly to `if_nametoindex`.

For this reason, I personally feel that it's better for these libraries 
to treat zone IDs as case-sensitive when updated to support the zone ID 
format, for the sake of consistency.
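
To illustrate the zero-copy behaviour described above, here is a minimal
Python sketch. It is not taken from any particular library: the names
`zone_id_span` and `resolve_zone` are hypothetical, and it assumes the
unencoded "%" delimiter used in the draft.

```python
import socket


def zone_id_span(uri: str):
    """Return the (start, end) index range of the zone ID in `uri`, or None.

    Hypothetical helper: it only locates the text between "%" and "]"
    inside an IP-literal and performs no validation or normalization.
    """
    open_br = uri.find("[")
    if open_br == -1:
        return None
    close_br = uri.find("]", open_br + 1)
    if close_br == -1:
        return None
    pct = uri.find("%", open_br + 1, close_br)
    if pct == -1 or pct + 1 == close_br:
        return None  # no zone ID, or an empty one
    return pct + 1, close_br


def resolve_zone(uri: str) -> int:
    """Map the zone ID to an interface index exactly as written in the URI."""
    span = zone_id_span(uri)
    if span is None:
        raise ValueError("URI has no zone ID")
    start, end = span
    # The substring is passed through untouched, so "En1" and "en1" are
    # looked up as different interface names on a case-sensitive system.
    return socket.if_nametoindex(uri[start:end])


uri = "http://[fe80::1%En1]/"
start, end = zone_id_span(uri)
print(uri[start:end])          # zone ID as the application sees it
print(uri.lower()[start:end])  # zone ID after whole-URI lowercasing
```

The two printed values differ only in letter case, which is exactly the
difference that reaches `if_nametoindex` when an application calls
something like `resolve_zone` on the original versus the lowercased URI.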

Considering the past discussions on the issue, I provide a possible 
replacement for the paragraph in question:

    The zone identifier in a URI is case-sensitive. However, considering
    the broad promise in RFC 3986 that the host subcomponent in a URI is
    case-insensitive, implementations MAY treat the zone identifier in a
    URI as case-insensitive and normalize it to lower case. It is
    therefore NOT RECOMMENDED to assign to an interface a zone
    identifier that contains uppercase letters, to avoid compatibility
    issues with the URI syntax.

This paragraph may be moved a bit earlier in the document if approved.
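
For what it's worth, the rule in the proposed paragraph could be sketched
as follows in Python. The function names are made up for illustration,
and I am assuming that "lower case" refers to ASCII letters only.

```python
import string

# ASCII-only case folding; an assumption about what "lower case" means here.
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def normalize_zone_id(zone_id: str, case_insensitive: bool = False) -> str:
    """Case-sensitive by default; lowercase only if the implementation opts in (the MAY)."""
    if case_insensitive:
        return zone_id.translate(_ASCII_LOWER)
    return zone_id


def is_recommended_zone_id(zone_id: str) -> bool:
    """True if the zone ID has no uppercase ASCII letters (the NOT RECOMMENDED part)."""
    return not any(ch in string.ascii_uppercase for ch in zone_id)


for zone in ("eth0", "En1"):
    strict = normalize_zone_id(zone)
    relaxed = normalize_zone_id(zone, case_insensitive=True)
    # Both kinds of implementation agree only for recommended zone IDs.
    print(zone, is_recommended_zone_id(zone), strict == relaxed)
```

A zone ID that passes `is_recommended_zone_id` is handled identically by
both kinds of implementation, which is the point of discouraging
uppercase letters.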

Regards,
Shang

On 2023/3/16 3:54, Brian E Carpenter wrote:
> The authors will wait for AD and WG Chair guidance after the IESG
> ballot is complete.
>
> Personally I prefer David Farmer's suggestion: "For clarity and to avoid
> compatibility issues with the URI syntax, this specification RECOMMENDS
> the use of lowercase letters only when assigning Zone IDs to interfaces."
>
> Regards
>    Brian Carpenter
>
> On 16-Mar-23 05:32, Shang Ye wrote:
>> Ted and Brian,
>>
>> RFC 8820 says it "targets the authors of specifications that constrain
>> the syntax or structure of URIs or parts of them", and is thus, I believe,
>> irrelevant when it comes to extending the URI syntax in RFC 3986. IMHO
>> my proposal to recommend treating a zone ID as case-sensitive is just as
>> backward compatible as the choice of "%" as a delimiter preceding a zone
>> ID, because the Zone ID format wasn't allowed at all in RFC 3986 and thus
>> we're free to specify the way a parser MUST handle it, let alone a SHOULD,
>> which is all we're discussing.
>>
>> I understand Brian's concerns about browsers not being able to follow
>> along, but since neither the draft nor RFC 4007 strictly forbids the use
>> of a zone ID that contains uppercase letters, there ought to be at least
>> some clarification on how parsers should handle such a zone ID, so as to
>> avoid inconsistencies in parsing a specific URI before and after
>> normalization. Such inconsistencies may include, for example:
>>
>>      When a URI, say <http://[fe80::1%En1]>, is parsed without
>>      normalization, an application is able to reach the server `fe80::1`
>>      through the interface `En1` provided by the parser; however, when the
>>      same URI is normalized (zone ID being lowercased) and parsed again,
>>      the application may fail to reach the same server through the
>>      interface `en1` because such an interface doesn't exist or the two
>>      interfaces are different.
>>
>> Also, the text in RFC 3986 doesn't really favor either lower or upper
>> case for a zone ID. The statement "the scheme and host are
>> case-insensitive and therefore should be normalized to lowercase" in
>> Section 6.2.2.1 is in fact a bit misleading, considering that
>> percent-encoded octets are normalized to *uppercase* within a host
>> component.
>>
>> If there weren't percent-encoding in RFC 3986, we would be free to choose
>> whether to normalize percent-encoded octets to lower or upper case when
>> they are included, as either way the change would be backward compatible.
>> I think the current situation is no different from this, except that the
>> zone ID is case-sensitive and a zone ID in either case is perfectly valid
>> on certain operating systems.
>>
>> Regards,
>> Shang
>>
>> On 2023/3/15 17:50, Ted Hardie wrote:
>>> On Tue, Mar 14, 2023 at 8:42 PM Brian E Carpenter 
>>> <brian.e.carpenter@gmail.com> wrote:
>>>
>>>     Shang,
>>>
>>>     You are correct, and I feel that the underlying issue here is that
>>>     RFC 4007 under-specifies the Zone ID format, which is very hard to
>>>     fix. But the URI syntax is very explicit that "the scheme and host
>>>     are case-insensitive." I don't see that we can really override that
>>>     with a SHOULD and expect the browsers to follow along.
>>>
>>> I think that it is true, and that the guidance on this is pretty
>>> clear. RFC 8820 has this text in Section 2.2:
>>>
>>> Scheme definitions define the presence, format, and semantics of an 
>>> authority component in URIs; all other Specifications MUST NOT 
>>> constrain or define the structure or the semantics for URI 
>>> authorities, unless they update the scheme registration itself or 
>>> the structures it relies upon (e.g., DNS name syntax, as defined in 
>>> Section 3.5 <https://www.rfc-editor.org/rfc/rfc1034#section-3.5> of 
>>> [RFC1034 <https://www.rfc-editor.org/rfc/rfc8820#RFC1034>]). For
>>> example, an Extension or Application cannot say that the "foo" 
>>> prefix in "https://foo_app.example.com" is meaningful or triggers 
>>> special handling in URIs, unless they update either the "http" URI 
>>> scheme or the DNS hostname syntax. Applications can nominate or
>>> constrain the port they use, when applicable. For example, BarApp 
>>> could run over port nnnn (provided that it is properly registered).
>>>
>>> regards,
>>>
>>> Ted Hardie
>>>
>>>     I understand
>>>     that it's relatively easy for a stand-alone parser to do this, but
>>>     the browser implementations are a different story.
>>>
>>>         Brian
>>>
>>>     On 14-Mar-23 19:54, Shang Ye wrote:
>>>     > Hi Brian,
>>>     >
>>>     > I notice that the current version of the draft mentions the
>>>     > case-sensitivity issue of zone identifiers in this paragraph:
>>>     >
>>>     >> RFC 3986 also states that the host subcomponent of a URI is
>>>     >> case-insensitive and is normalised to lower case.  The mechanism
>>>     >> described here will therefore fail for zone identifiers that
>>>     >> contain upper case letters, since RFC 4007 implies
>>>     >> case-sensitivity.
>>>     >
>>>     > Indeed it has pointed out the issue, but it doesn't really tell
>>>     > people how a zone identifier that contains upper case letters
>>>     > should be resolved. Should they pass it directly to a mapping
>>>     > function, say `if_nametoindex`, normalize it to lower case first,
>>>     > or "fail" to resolve it as implied in the paragraph?
>>>     >
>>>     > When I modified my own URI parser to support rfc6874bis, I felt
>>>     > that the only correct option was to pass the zone identifier as it
>>>     > is, considering the fact that interface names are case-sensitive on
>>>     > Linux. Personally I feel it is quite wrong to either reject an
>>>     > uppercase zone identifier or turn it into lower case first.
>>>     >
>>>     > Also, I notice that you suspected that making zone identifiers
>>>     > case-sensitive would cause a major problem for the URI parsers in
>>>     > every browser, especially in Firefox. I admit that some browsers
>>>     > may have their *URL* parsing code so convoluted that it can be very
>>>     > hard to make the change, but I believe that it isn't the case with
>>>     > those relatively simple *URI* parsers that serve other use cases.
>>>     >
>>>     > Therefore, I'd suggest we instead say somewhere in Section 3 that
>>>     > "URI parsers SHOULD treat a zone identifier as case-sensitive and
>>>     > preserve the letter case of a zone identifier when it is normalized
>>>     > as part of a URI or mapped into a numeric zone index." This way
>>>     > it'd be okay if browser people decide it's unreasonably hard to
>>>     > follow this specific requirement, and it would benefit the users of
>>>     > other conforming implementations by ensuring that everything goes
>>>     > just as expected.
>>>     >
>>>     > Regards,
>>>     > Shang
>>>     >
>>> --------------------------------------------------------------------
>>>     IETF IPv6 working group mailing list
>>> ipv6@ietf.org
>>>     Administrative Requests: https://www.ietf.org/mailman/listinfo/ipv6
>>> --------------------------------------------------------------------
>>>