Re: [IPv6] Case-sensitivity issue of zone identifiers in draft-ietf-6man-rfc6874bis-05

Shang Ye <yesh25@mail2.sysu.edu.cn> Wed, 15 March 2023 16:32 UTC

Message-ID: <1A9A7C1EB187822F+6bce53d1-7a0c-3e39-a101-0fec25aa7bf5@mail2.sysu.edu.cn>
Date: Thu, 16 Mar 2023 00:32:11 +0800
From: Shang Ye <yesh25@mail2.sysu.edu.cn>
To: Ted Hardie <ted.ietf@gmail.com>, Brian E Carpenter <brian.e.carpenter@gmail.com>
Cc: ipv6@ietf.org
References: <tencent_3D3F07CC3D59EE52093D35F2@qq.com> <95cc5bf1-bd2c-40bf-185f-a65730aa3e80@gmail.com> <CA+9kkMDgu8b79CconJ54R2BbkfFe6yxBRQRmJu6nNhAsTJYK5Q@mail.gmail.com>
Content-Language: en-US
In-Reply-To: <CA+9kkMDgu8b79CconJ54R2BbkfFe6yxBRQRmJu6nNhAsTJYK5Q@mail.gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/ipv6/dLPtIMgFzYyI-OiLj4dFNrsl6Jw>
Subject: Re: [IPv6] Case-sensitivity issue of zone identifiers in draft-ietf-6man-rfc6874bis-05

Ted and Brian,

RFC 8820 says it "targets the authors of specifications that constrain
the syntax or structure of URIs or parts of them", and is thus, I believe,
irrelevant when it comes to extending the URI syntax in RFC 3986. IMHO
my proposal to recommend treating a zone ID as case-sensitive is just as
backward compatible as the choice of "%" as a delimiter preceding a zone ID,
because the zone ID format wasn't allowed at all in RFC 3986. We are
therefore free to specify how a parser MUST handle it, and here we are
only discussing a SHOULD.

I understand Brian's concerns about browsers not being able to follow along,
but since neither the draft nor RFC 4007 strictly forbids a zone ID that
contains uppercase letters, there ought to be at least some clarification
on how parsers should handle such a zone ID, so as to avoid inconsistencies
in parsing a specific URI before and after normalization.
Such inconsistencies may include, for example:

     When a URI, say <http://[fe80::1%En1]>, is parsed without normalization,
     an application is able to reach the server `fe80::1` through the
     interface `En1` provided by the parser; however, when the same URI is
     normalized (zone ID being lowercased) and parsed again, the application
     may fail to reach the same server through the interface `en1` because
     such an interface doesn't exist or the two interfaces are different.
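
To make the scenario above concrete, here is a minimal Python sketch. The
helper names `split_zone` and `normalize_ip_literal` are hypothetical and
are not taken from the draft or from any existing parser; the sketch only
assumes the bare "%" delimiter discussed in this thread.

```python
from typing import Optional, Tuple


def split_zone(ip_literal: str) -> Tuple[str, Optional[str]]:
    """Split 'fe80::1%En1' into ('fe80::1', 'En1'); zone is None if absent."""
    address, sep, zone = ip_literal.partition("%")
    return address, (zone if sep else None)


def normalize_ip_literal(ip_literal: str, preserve_zone_case: bool) -> str:
    """Lowercase the address part; lowercase the zone ID only if asked to.

    preserve_zone_case=False models applying the RFC 3986 host rule to the
    whole literal; preserve_zone_case=True models the SHOULD proposed in
    this thread (an assumption, not current draft text).
    """
    address, zone = split_zone(ip_literal)
    address = address.lower()
    if zone is None:
        return address
    if not preserve_zone_case:
        zone = zone.lower()
    return f"{address}%{zone}"


# The zone ID an application would look up differs between the two policies.
lowercased = normalize_ip_literal("fe80::1%En1", preserve_zone_case=False)
preserved = normalize_ip_literal("fe80::1%En1", preserve_zone_case=True)
print(split_zone(lowercased)[1], split_zone(preserved)[1])
```

On a system where interface names are case-sensitive, the two resulting
zone IDs may name different interfaces, or one of them may not exist at all.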

Also, the text in RFC 3986 doesn't really favor either lower or upper case
for a zone ID. The statement "the scheme and host are case-insensitive and
therefore should be normalized to lowercase" in Section 6.2.2.1 is in fact
a bit misleading, considering that percent-encoded octets are normalized
to *uppercase* within a host component.
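
As an illustration of that point, the following Python sketch applies both
rules from Section 6.2.2.1 to a registered name: letters are lowercased,
while the hex digits of percent-encoded triplets are uppercased. The
function name `normalize_reg_name` is hypothetical, and the sketch covers
registered names only, not IP literals with a zone ID.

```python
import re

# Matches one percent-encoded triplet such as "%c3" or "%A9".
_PCT_TRIPLET = re.compile(r"%[0-9A-Fa-f]{2}")


def normalize_reg_name(host: str) -> str:
    """Case-normalize a reg-name in the way RFC 3986 Section 6.2.2.1 describes."""
    lowered = host.lower()
    # Re-uppercase the hex digits inside every percent-encoded triplet.
    return _PCT_TRIPLET.sub(lambda m: m.group(0).upper(), lowered)


print(normalize_reg_name("EXAMPLE%c3%a9.COM"))
```

So a "lowercased" host can still legitimately contain uppercase characters
after normalization.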

If there were no percent-encoding in RFC 3986, we would be free to choose
whether to normalize percent-encoded octets to lower or upper case when
introducing them, as either way the change would be backward compatible.
I think the current situation is no different, except that the zone ID
is case-sensitive and a zone ID in either case is perfectly valid on certain
operating systems.

Regards,
Shang

On 2023/3/15 17:50, Ted Hardie wrote:
> On Tue, Mar 14, 2023 at 8:42 PM Brian E Carpenter 
> <brian.e.carpenter@gmail.com> wrote:
>
>     Shang,
>
>     You are correct, and I feel that the underlying issue here is that
>     RFC 4007 under-specifies the Zone ID format, which is very hard to
>     fix. But the URI syntax is very explicit that "the scheme and host
>     are case-insensitive." I don't see that we can really override that
>     with a SHOULD and expect the browsers to follow along. 
>
>
> I think that it is true, and that the guidance on this is pretty
> clear. RFC 8820 has this text in Section 2.2:
>
> Scheme definitions define the presence, format, and semantics of an
> authority component in URIs; all other Specifications MUST NOT
> constrain or define the structure or the semantics for URI
> authorities, unless they update the scheme registration itself or the
> structures it relies upon (e.g., DNS name syntax, as defined in
> Section 3.5 <https://www.rfc-editor.org/rfc/rfc1034#section-3.5> of
> [RFC1034 <https://www.rfc-editor.org/rfc/rfc8820#RFC1034>]). For
> example, an Extension or Application cannot say that the "foo" prefix
> in "https://foo_app.example.com" is meaningful or triggers special
> handling in URIs, unless they update either the "http" URI scheme or
> the DNS hostname syntax. Applications can nominate or constrain the
> port they use, when applicable. For example, BarApp could run over
> port nnnn (provided that it is properly registered).
>
> regards,
>
> Ted Hardie
>
>     I understand
>     that it's relatively easy for a stand-alone parser to do this, but
>     the browser implementations are a different story.
>
>         Brian
>
>     On 14-Mar-23 19:54, Shang Ye wrote:
>     > Hi Brian,
>     >
>     > I notice that the current version of the draft mentions the
>     > case-sensitivity issue of zone identifiers in this paragraph:
>     >
>     >> RFC 3986 also states that the host subcomponent of a URI is
>     >> case-insensitive and is normalised to lower case.  The mechanism
>     >> described here will therefore fail for zone identifiers that
>     >> contain upper case letters, since RFC 4007 implies
>     >> case-sensitivity.
>     >
>     > Indeed it has pointed out the issue, but it doesn't really tell
>     > people how a zone identifier that contain upper case letters may
>     > be resolved. Should they pass it directly to a mapping function,
>     > say `if_nametoindex`, normalize it to lower case first, or, "fail"
>     > to resolve it as implied in the paragraph?
>     >
>     > When I modified my own URI parser to support rfc6874bis, I felt
>     > that the only correct option is passing the zone identifier as it
>     > is, considering the fact that interface names are case-sensitive
>     > on Linux. Personally I feel it quite wrong to either reject an
>     > uppercase zone identifier or turn it into lower case first.
>     >
>     > Also, I notice that you doubted that making zone identifiers
>     > case-sensitive would cause a major problem for the URI parsers in
>     > every browser, especially in Firefox.
>     > I admit that some browsers may have their *URL* parsing code so
>     > convoluted that it can be very hard to make the change, but I
>     > believe that it isn't the case with those relatively simple *URI*
>     > parsers that serve other use cases.
>     >
>     > Therefore, I'd suggest we instead say somewhere in Section 3 that
>     > "URI parsers SHOULD treat a zone identifier as case-sensitive and
>     > preserve the letter case of a zone identifier when it is
>     > normalized as part of a URI or mapped into a numeric zone index."
>     > This way it'd be okay if browser people decide it's unreasonably
>     > hard to follow this specific requirement, and it would benefit the
>     > users of other conforming implementations by ensuring that
>     > everything goes just as expected.
>     >
>     > Regards,
>     > Shang
>     >
