Re: [IPv6] Case-sensitivity issue of zone identifiers in draft-ietf-6man-rfc6874bis-05

Brian E Carpenter <brian.e.carpenter@gmail.com> Fri, 17 March 2023 20:46 UTC

Message-ID: <db99670f-f47d-39f9-dd2f-3b3529c93cca@gmail.com>
Date: Sat, 18 Mar 2023 09:46:46 +1300
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101 Thunderbird/91.10.0
Content-Language: en-US
To: Shang Ye <yesh25@mail2.sysu.edu.cn>
Cc: IPv6 List <ipv6@ietf.org>
References: <tencent_3D3F07CC3D59EE52093D35F2@qq.com> <95cc5bf1-bd2c-40bf-185f-a65730aa3e80@gmail.com> <CA+9kkMDgu8b79CconJ54R2BbkfFe6yxBRQRmJu6nNhAsTJYK5Q@mail.gmail.com> <1A9A7C1EB187822F+6bce53d1-7a0c-3e39-a101-0fec25aa7bf5@mail2.sysu.edu.cn> <2cb32e10-5f79-f03e-79a7-7910d9eddb3e@gmail.com> <0588A2F132B43B8C+199a2d6e-7879-c615-9146-7fae54f3f449@mail2.sysu.edu.cn>
From: Brian E Carpenter <brian.e.carpenter@gmail.com>
In-Reply-To: <0588A2F132B43B8C+199a2d6e-7879-c615-9146-7fae54f3f449@mail2.sysu.edu.cn>
Content-Type: text/plain; charset="UTF-8"; format="flowed"
Content-Transfer-Encoding: base64
Archived-At: <https://mailarchive.ietf.org/arch/msg/ipv6/M4jgSSrBRCwYIn0iva2jjb2ixg0>
Subject: Re: [IPv6] Case-sensitivity issue of zone identifiers in draft-ietf-6man-rfc6874bis-05

On 18-Mar-23 01:36, Shang Ye wrote:
> Now that I think about it, there could be another solution to the issue, that is:
> 
> State that the zone ID in a URI is case-sensitive, that implementations MAY treat it as case-insensitive and normalize it to lower case, and that the use of uppercase letters in a zone ID is NOT RECOMMENDED.

Yes, I think that works. I think we've also established that the complex interface names with constructs like A/1/2/3 will be considered out of scope, because of their specialized use in infrastructure devices.
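
As a rough illustration of the rule quoted above (not text from the draft), the check below flags zone IDs that would change if an implementation exercised the MAY and lowercased them. The function name `survives_lowercasing` is made up for this sketch, and it only considers ASCII letters.

```python
def survives_lowercasing(zone_id: str) -> bool:
    """Return True if ASCII-lowercasing the zone ID leaves it unchanged."""
    # Only ASCII uppercase letters are considered; treating everything
    # else as unaffected is an assumption of this sketch.
    return not any("A" <= ch <= "Z" for ch in zone_id)


for name in ("en1", "eth0", "En1"):
    verdict = "safe" if survives_lowercasing(name) else "NOT RECOMMENDED"
    print(f"{name}: {verdict}")
```

A zone ID that fails this check is one for which a normalizing and a non-normalizing implementation could end up disagreeing.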

> 
> I agree that it's possible that we state that the zone ID in a URI is case-insensitive and is normalized to lower case, but then comes the question:
> 
> Can we make applications use the zone ID in a URI only after it is normalized to lower case?
> 
> AFAIK, many URI parsing libraries (especially those zero-copy ones) perform no normalization when parsing a URI string. They instead provide index ranges (or substrings) of each component within the string. Applications may then use the components however they like, including passing a zone ID containing uppercase letters directly to `if_nametoindex`.

Historically, I think the case-insensitive rule for the host part is directly inherited from DNS, so indeed normalization is not required. I think we can note that normalization is not required, but imposing that as a rule on browser implementations is not going to work. (It was the assertions in RFC6874 about browser behaviour that most annoyed the browser community.)
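
To make the concern concrete, here is a minimal Python sketch of a parser-style split that hands back the zone ID exactly as written, followed by the lookup an application might then do. `split_zone` and `zone_to_index` are hypothetical helpers for this illustration only; `socket.if_nametoindex` is the standard-library wrapper around the C function of the same name, and the sketch assumes the rfc6874bis form with a bare "%" delimiter.

```python
import socket
from typing import Optional, Tuple


def split_zone(ip_literal: str) -> Tuple[str, Optional[str]]:
    """Split the inside of an IP-literal such as 'fe80::1%En1'.

    Returns (address, zone ID). The zone ID is returned exactly as
    written, with no case normalization, or None if there is no '%'.
    """
    address, sep, zone = ip_literal.partition("%")
    return address, (zone if sep else None)


def zone_to_index(zone: str) -> Optional[int]:
    """Map a zone ID to an interface index, or None if no such interface."""
    try:
        return socket.if_nametoindex(zone)
    except OSError:
        return None


if __name__ == "__main__":
    address, zone = split_zone("fe80::1%En1")
    # Lookup with the zone ID as written versus after lowercasing.
    print(address, zone, zone_to_index(zone))
    print(address, zone.lower(), zone_to_index(zone.lower()))
```

On a system where interface names are case-sensitive, the two lookups can give different results, which is the inconsistency being discussed in this thread.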

Thanks
    Brian

> 
> For this reason, I personally feel that it's better for these libraries to treat zone IDs as case-sensitive when updated to support the zone ID format, for the sake of consistency.
> 
> Considering the past discussions on the issue, I provide a possible replacement for the paragraph in question:
> 
>     The zone identifier in a URI is case-sensitive. However, considering the broad promise in RFC 3986 that the host subcomponent in a URI is case-insensitive, implementations MAY treat the zone identifier in a URI as case-insensitive and normalize it to lower case. It is therefore NOT RECOMMENDED to assign to an interface a zone identifier that contains uppercase letters, to avoid compatibility issues with the URI syntax.
> 
> Which may be moved a bit up in the document if approved.
> 
> Regards,
> Shang
> 
> On 2023/3/16 3:54, Brian E Carpenter wrote:
>> The authors will wait for AD and WG Chair guidance after the IESG ballot is
>> complete.
>>
>> Personally I prefer David Farmer's suggestion: "For clarity and to avoid
>> compatibility issues with the URI syntax, this specification RECOMMENDS the
>> use of lowercase letters only when assigning Zone IDs to interfaces."
>>
>> Regards
>>    Brian Carpenter
>>
>> On 16-Mar-23 05:32, Shang Ye wrote:
>>> Ted and Brian,
>>>
>>> RFC 8820 says it "targets the authors of specifications that constrain
>>> the syntax or structure of URIs or parts of them", and is thus, I believe,
>>> irrelevant when it comes to extending the URI syntax in RFC 3986. IMHO
>>> my proposal to recommend treating a zone ID as case-sensitive is just as
>>> backward compatible as the choice of "%" as a delimiter preceding a zone ID,
>>> because the Zone ID format wasn't allowed at all in RFC 3986 and thus we're
>>> free to specify the way a parser MUST handle it, all the more so since we're
>>> only discussing a SHOULD.
>>>
>>> I understand Brian's concerns for browsers not being able to follow along,
>>> but since neither the draft nor RFC 4007 strictly forbids the use of a zone ID
>>> that contains uppercase letters, there ought to be at least some clarification
>>> on how parsers should handle such a zone ID, so as to avoid inconsistencies
>>> in parsing a specific URI before and after normalization.
>>> Such inconsistencies may include, for example:
>>>
>>>      When a URI, say <http://[fe80::1%En1]>, is parsed without normalization,
>>>      an application is able to reach the server `fe80::1` through the interface `En1`
>>>      provided by the parser; however, when the same URI is normalized (zone ID
>>>      being lowercased) and parsed again, the application may fail to reach the
>>>      same server through the interface `en1` because such an interface doesn't
>>>      exist or the two interfaces are different.
>>>
>>> Also, the text in RFC 3986 doesn't really favor either lower or upper case for
>>> a zone ID. The statement "the scheme and host are case-insensitive and
>>> therefore should be normalized to lowercase" in Section 6.2.2.1 is in fact
>>> a bit misleading, considering that percent-encoded octets are normalized
>>> to *uppercase* within a host component.
>>>
>>> If there weren't percent-encoding in RFC 3986, we would be free to choose
>>> whether to normalize percent-encoded octets to lower or upper case when
>>> they are included, as either way the change would be backward compatible.
>>> I think the current situation is no different from this, except that the zone ID
>>> is case-sensitive and a zone ID in either case is perfectly valid on certain
>>> operating systems.
>>>
>>> Regards,
>>> Shang
>>>
>>> On 2023/3/15 17:50, Ted Hardie wrote:
>>>> On Tue, Mar 14, 2023 at 8:42 PM Brian E Carpenter <brian.e.carpenter@gmail.com> wrote:
>>>>
>>>>     Shang,
>>>>
>>>>     You are correct, and I feel that the underlying issue here is that
>>>>     RFC 4007 under-specifies the Zone ID format, which is very hard to
>>>>     fix. But the URI syntax is very explicit that "the scheme and host
>>>>     are case-insensitive." I don't see that we can really override that
>>>>     with a SHOULD and expect the browsers to follow along.
>>>>
>>>> I think that it is true, and that the guidance on this is pretty clear. RFC 8820 has this text in Section 2.2:
>>>>
>>>> Scheme definitions define the presence, format, and semantics of an authority component in URIs; all other Specifications MUST NOT constrain or define the structure or the semantics for URI authorities, unless they update the scheme registration itself or the structures it relies upon (e.g., DNS name syntax, as defined in Section 3.5 <https://www.rfc-editor.org/rfc/rfc1034#section-3.5> of [RFC1034 <https://www.rfc-editor.org/rfc/rfc8820#RFC1034>]). For example, an Extension or Application cannot say that the "foo" prefix in "https://foo_app.example.com" is meaningful or triggers special handling in URIs, unless they update either the "http" URI scheme or the DNS hostname syntax. Applications can nominate or constrain the port they use, when applicable. For example, BarApp could run over port nnnn (provided that it is properly registered).
>>>>
>>>> regards,
>>>>
>>>> Ted Hardie
>>>>
>>>>     I understand
>>>>     that it's relatively easy for a stand-alone parser to do this, but
>>>>     the browser implementations are a different story.
>>>>
>>>>         Brian
>>>>
>>>>     On 14-Mar-23 19:54, Shang Ye wrote:
>>>>     > Hi Brian,
>>>>     >
>>>>     > I notice that the current version of the draft mentions the case-sensitivity issue
>>>>     > of zone identifiers in this paragraph:
>>>>     >
>>>>     > > RFC 3986 also states that the host subcomponent of a URI is
>>>>     > > case-insensitive and is normalised to lower case.  The mechanism
>>>>     > > described here will therefore fail for zone identifiers that contain
>>>>     > > upper case letters, since RFC 4007 implies case-sensitivity.
>>>>     >
>>>>     > It does point out the issue, but it doesn't really tell people how a
>>>>     > zone identifier that contains upper case letters should be resolved. Should
>>>>     > they pass it directly to a mapping function, say `if_nametoindex`, normalize
>>>>     > it to lower case first, or "fail" to resolve it as implied in the paragraph?
>>>>     >
>>>>     > When I modified my own URI parser to support rfc6874bis, I felt that the only
>>>>     > correct option is passing the zone identifier as it is, considering the fact that
>>>>     > interface names are case-sensitive on Linux. Personally I feel it quite wrong
>>>>     > to either reject an uppercase zone identifier or turn it into lower case first.
>>>>     >
>>>>     > Also, I notice that you doubted that making zone identifiers case-sensitive
>>>>     > would cause a major problem for the URI parsers in every browser, especially
>>>>     > in Firefox. I admit that some browsers may have their *URL* parsing code so
>>>>     > convoluted that it can be very hard to make the change, but I believe that
>>>>     > it isn't the case with those relatively simple *URI* parsers that serve
>>>>     > other use cases.
>>>>     >
>>>>     > Therefore, I'd suggest we instead say somewhere in Section 3 that "URI
>>>>     > parsers SHOULD treat a zone identifier as case-sensitive and preserve the
>>>>     > letter case of a zone identifier when it is normalized as part of a URI or
>>>>     > mapped into a numeric zone index."
>>>>     > This way it'd be okay if browser people decide it's unreasonably hard to
>>>>     > follow this specific requirement, and it would benefit the users of other conforming
>>>>     > implementations by ensuring that everything goes just as expected.
>>>>     >
>>>>     > Regards,
>>>>     > Shang
>>>>     >
