Re: [IPv6] Case-sensitivity issue of zone identifiers in draft-ietf-6man-rfc6874bis-05

Brian E Carpenter <brian.e.carpenter@gmail.com> Wed, 15 March 2023 19:54 UTC

Message-ID: <2cb32e10-5f79-f03e-79a7-7910d9eddb3e@gmail.com>
Date: Thu, 16 Mar 2023 08:54:46 +1300
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101 Thunderbird/91.10.0
Content-Language: en-US
To: Shang Ye <yesh25@mail2.sysu.edu.cn>, Ted Hardie <ted.ietf@gmail.com>
Cc: ipv6@ietf.org
References: <tencent_3D3F07CC3D59EE52093D35F2@qq.com> <95cc5bf1-bd2c-40bf-185f-a65730aa3e80@gmail.com> <CA+9kkMDgu8b79CconJ54R2BbkfFe6yxBRQRmJu6nNhAsTJYK5Q@mail.gmail.com> <1A9A7C1EB187822F+6bce53d1-7a0c-3e39-a101-0fec25aa7bf5@mail2.sysu.edu.cn>
From: Brian E Carpenter <brian.e.carpenter@gmail.com>
In-Reply-To: <1A9A7C1EB187822F+6bce53d1-7a0c-3e39-a101-0fec25aa7bf5@mail2.sysu.edu.cn>
Content-Type: text/plain; charset="UTF-8"; format="flowed"
Content-Transfer-Encoding: base64
Archived-At: <https://mailarchive.ietf.org/arch/msg/ipv6/0aGgN0htAi2uCaI_78y6juv751s>
Subject: Re: [IPv6] Case-sensitivity issue of zone identifiers in draft-ietf-6man-rfc6874bis-05

The authors will wait for AD and WG Chair guidance after the IESG ballot is
complete.

Personally I prefer David Farmer's suggestion: "For clarity and to avoid
compatibility issues with the URI syntax, this specification RECOMMENDS the
use of lowercase letters only when assigning Zone IDs to interfaces."
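
As an illustration of why a lowercase-only recommendation sidesteps the problem, here is a minimal Python sketch (not text from the draft). It assumes a normalizer that lowercases the entire host, zone ID included, which is one reading of the RFC 3986 case-normalization rule discussed in this thread. The helper names `split_zone`, `lowercase_host`, and `zone_survives_lowercasing` are hypothetical and exist only for this example.

```python
def split_zone(host: str):
    """Split a host such as 'fe80::1%En1' into (address, zone_id).

    The zone ID is '' when no '%' delimiter is present.
    """
    address, _, zone = host.partition("%")
    return address, zone


def lowercase_host(host: str) -> str:
    """Case-normalize a host by lowercasing all of it, zone ID included."""
    return host.lower()


def zone_survives_lowercasing(host: str) -> bool:
    """Return True if lowercasing the host leaves the zone ID unchanged."""
    _, zone = split_zone(host)
    _, normalized_zone = split_zone(lowercase_host(host))
    return zone == normalized_zone


for host in ("fe80::1%en1", "fe80::1%En1"):
    print(host, "->", lowercase_host(host), zone_survives_lowercasing(host))
```

Under that assumption, `fe80::1%en1` is unchanged by normalization, while `fe80::1%En1` becomes `fe80::1%en1` and may then name a different interface or none at all.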

Regards
    Brian Carpenter

On 16-Mar-23 05:32, Shang Ye wrote:
> Ted and Brian,
> 
> RFC 8820 says it "targets the authors of specifications that constrain
> the syntax or structure of URIs or parts of them", and is thus, I believe,
> irrelevant when it comes to extending the URI syntax in RFC 3986. IMHO
> my proposal to recommend treating a zone ID as case-sensitive is just as
> backward compatible as the choice of "%" as a delimiter preceding a zone ID,
> because the Zone ID format wasn't at all allowed in RFC 3986 and thus we're
> free to specify the way a parser MUST handle it, all the more so because we're
> only discussing a SHOULD.
> 
> I understand Brian's concerns for browsers not being able to follow along,
> but since neither the draft nor RFC 4007 strictly forbids the use of a zone ID
> that contains uppercase letters, there ought to be at least some clarification
> on how parsers should handle such a zone ID, so as to avoid inconsistencies
> in parsing a specific URI before and after normalization.
> Such inconsistencies may include, for example:
> 
>      When a URI, say <http://[fe80::1%En1]>, is parsed without normalization,
>      an application is able to reach the server `fe80::1` through the interface `En1`
>      provided by the parser; however, when the same URI is normalized (zone ID
>      being lowercased) and parsed again, the application may fail to reach the
>      same server through the interface `en1` because such an interface doesn't
>      exist or the two interfaces are different.
> 
> Also, the text in RFC 3986 doesn't really favor either lower or upper case for
> a zone ID. The statement "the scheme and host are case-insensitive and
> therefore should be normalized to lowercase" in Section 6.2.2.1 is in fact
> a bit misleading, considering that percent-encoded octets are normalized
> to *uppercase* within a host component.
> 
> If there weren't percent-encoding in RFC 3986, we would be free to choose
> whether to normalize percent-encoded octets to lower or upper case when
> they are included, as either way the change would be backward compatible.
> I think the current situation is no different from this, except that the zone ID
> is case-sensitive and a zone ID in either case is perfectly valid on certain
> operating systems.
> 
> Regards,
> Shang
> 
> On 2023/3/15 17:50, Ted Hardie wrote:
>> On Tue, Mar 14, 2023 at 8:42 PM Brian E Carpenter <brian.e.carpenter@gmail.com> wrote:
>>
>>     Shang,
>>
>>     You are correct, and I feel that the underlying issue here is that
>>     RFC 4007 under-specifies the Zone ID format, which is very hard to
>>     fix. But the URI syntax is very explicit that "the scheme and host
>>     are case-insensitive." I don't see that we can really override that
>>     with a SHOULD and expect the browsers to follow along. 
>>
>>
>> I think that it is true, and that the guidance on this is pretty clear. RFC 8820 has this text in Section 2.2:
>>
>> Scheme definitions define the presence, format, and semantics of an authority component in URIs; all other Specifications MUST NOT constrain or define the structure or the semantics for URI authorities, unless they update the scheme registration itself or the structures it relies upon (e.g., DNS name syntax, as defined in Section 3.5 <https://www.rfc-editor.org/rfc/rfc1034#section-3.5> of [RFC1034 <https://www.rfc-editor.org/rfc/rfc8820#RFC1034>]).
>>
>> For example, an Extension or Application cannot say that the "foo" prefix in "https://foo_app.example.com" is meaningful or triggers special handling in URIs, unless they update either the "http" URI scheme or the DNS hostname syntax.
>>
>> Applications can nominate or constrain the port they use, when applicable. For example, BarApp could run over port nnnn (provided that it is properly registered).
>>
>> regards,
>>
>> Ted Hardie
>>
>>     I understand
>>     that it's relatively easy for a stand-alone parser to do this, but
>>     the browser implementations are a different story.
>>
>>         Brian
>>
>>     On 14-Mar-23 19:54, Shang Ye wrote:
>>     > Hi Brian,
>>     >
>>     > I notice that the current version of the draft mentions the case-sensitivity issue
>>     > of zone identifiers in this paragraph:
>>     >
>>     >> RFC 3986 also states that the host subcomponent of a URI is
>>     >> case-insensitive and is normalised to lower case.  The mechanism described
>>     >> here will therefore fail for zone identifiers that contain upper case
>>     >> letters, since RFC 4007 implies case-sensitivity.
>>     >
>>     > Indeed it has pointed out the issue, but it doesn't really tell people how a
>>     > zone identifier that contains upper case letters should be resolved. Should
>>     > they pass it directly to a mapping function, say `if_nametoindex`, normalize
>>     > it to lower case first, or "fail" to resolve it as implied in the paragraph?
>>     >
>>     > When I modified my own URI parser to support rfc6874bis, I felt that the only
>>     > correct option was to pass the zone identifier as is, considering the fact that
>>     > interface names are case-sensitive on Linux. Personally I feel it is quite wrong
>>     > to either reject an uppercase zone identifier or turn it into lower case first.
>>     >
>>     > Also, I notice that you suspected that making zone identifiers case-sensitive
>>     > would cause a major problem for the URI parsers in every browser, especially
>>     > in Firefox. I admit that some browsers may have *URL* parsing code so
>>     > convoluted that the change would be very hard to make, but I believe that isn't
>>     > the case for the relatively simple *URI* parsers that serve other use cases.
>>     >
>>     > Therefore, I'd suggest we instead say somewhere in Section 3 that "URI parsers SHOULD treat a
>>     > zone identifier as case-sensitive and preserve the letter case of a zone identifier when
>>     > it is normalized as part of a URI or mapped into a numeric zone index."
>>     > This way it'd be okay if browser people decide it's unreasonably hard to
>>     > follow this specific requirement, and it would benefit the users of other conforming
>>     > implementations by ensuring that everything goes just as expected.
>>     >
>>     > Regards,
>>     > Shang
>>     >

>>     --------------------------------------------------------------------
>>     IETF IPv6 working group mailing list
>>     ipv6@ietf.org
>>     Administrative Requests: https://www.ietf.org/mailman/listinfo/ipv6
>>     --------------------------------------------------------------------
>>
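
The three options raised earlier in the thread for a zone identifier that contains uppercase letters (pass it unchanged to a mapping function such as `if_nametoindex`, lowercase it first, or refuse to resolve it) can be compared side by side. What follows is a minimal Python sketch under stated assumptions, not text from the draft or from any participant's parser. `resolve_zone`, the policy names, and the injectable `lookup` parameter are hypothetical names introduced for illustration; `socket.if_nametoindex` is the standard-library wrapper, which raises `OSError` when no interface has the given name.

```python
import socket
from typing import Callable, Optional


def resolve_zone(zone: str, policy: str = "preserve",
                 lookup: Optional[Callable[[str], int]] = None) -> int:
    """Map a zone identifier to a numeric interface index.

    policy (hypothetical names, for illustration only):
      "preserve"  - pass the zone identifier to the lookup unchanged
      "lowercase" - lowercase the zone identifier before the lookup
      "reject"    - refuse any zone identifier containing uppercase letters
    """
    if lookup is None:
        # Resolved lazily so the sketch still imports on platforms
        # where socket.if_nametoindex is unavailable.
        lookup = socket.if_nametoindex
    if policy == "preserve":
        return lookup(zone)
    if policy == "lowercase":
        return lookup(zone.lower())
    if policy == "reject":
        if zone != zone.lower():
            raise ValueError("zone identifier contains uppercase letters: %r" % zone)
        return lookup(zone)
    raise ValueError("unknown policy: %r" % policy)


def fake_lookup(name: str) -> int:
    """Stand-in for a host whose only interface is named 'En1' (assumed)."""
    table = {"En1": 7}
    if name not in table:
        raise OSError("no interface with this name")
    return table[name]


print(resolve_zone("En1", "preserve", fake_lookup))
```

With the assumed interface table, only the "preserve" policy reaches `En1`; "lowercase" looks up `en1` and raises `OSError`, and "reject" raises `ValueError` before any lookup happens.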