Re: [Last-Call] [art] Artart last call review of draft-ietf-6man-rfc6874bis-02

Brian E Carpenter <brian.e.carpenter@gmail.com> Mon, 21 November 2022 21:09 UTC

Message-ID: <0c1e7186-a50b-1e8f-bd42-a712eb24ef41@gmail.com>
Date: Tue, 22 Nov 2022 10:09:46 +1300
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101 Thunderbird/91.10.0
Content-Language: en-US
From: Brian E Carpenter <brian.e.carpenter@gmail.com>
To: "Martin J. Dürst" <duerst@it.aoyama.ac.jp>, "Dale R. Worley" <worley@ariadne.com>, vasilenko.eduard=40huawei.com@dmarc.ietf.org, fielding@gbiv.com, mt@lowentropy.net, art@ietf.org, draft-ietf-6man-rfc6874bis.all@ietf.org, ipv6@ietf.org, last-call@ietf.org
References: <878rlv5bz6.fsf@hobgoblin.ariadne.com> <0481e0d2-a6ac-659a-659b-aa699e36dae5@gmail.com> <d06a6566-b3c0-653a-3d5d-57c1424601c2@it.aoyama.ac.jp> <f4af85e4-43b5-eb25-c1b4-e8a83ed570c5@gmail.com>
In-Reply-To: <f4af85e4-43b5-eb25-c1b4-e8a83ed570c5@gmail.com>
Content-Type: text/plain; charset="UTF-8"; format="flowed"
Content-Transfer-Encoding: base64
Archived-At: <https://mailarchive.ietf.org/arch/msg/last-call/atBeswZzDudX92pJkhaaDy7-gZI>
Subject: Re: [Last-Call] [art] Artart last call review of draft-ietf-6man-rfc6874bis-02

A slight update on curl.

On Windows, it simply ignores the %xxx and uses the default zone.

On Linux, it *requires* the %xxx because Linux has no default zone. So actually it's Linux curl that already supports rfc6874bis completely, and Windows curl is weak.

Linux curl *also* supports the RFC6874 notation %25xxx, which creates an issue if one names a zone as "25".
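
To illustrate that ambiguity, here is a minimal Python sketch. It is not curl's actual code; the helper names split_zone and candidate_zones are hypothetical, and the parsing rule (first "%" is a bare delimiter) is an assumption based on the rfc6874bis notation discussed in this thread. It shows that a parser accepting both notations cannot tell whether a leading "25" belongs to the zone name or is the RFC6874 escape for "%".

```python
def split_zone(ip_literal: str):
    """Split a bracketed IPv6 literal such as '[fe80::a%eth0]' into (address, zone).

    Assumption: rfc6874bis-style reading, where the first '%' is a bare
    delimiter and everything after it is the zone identifier, taken verbatim.
    """
    inner = ip_literal.strip("[]")
    address, sep, zone = inner.partition("%")
    return address, (zone if sep else None)


def candidate_zones(ip_literal: str):
    """Return (address, list of zone identifiers a lenient parser might consider).

    Hypothetical helper: if both notations are accepted, a zone string that
    starts with '25' has two plausible readings.
    """
    address, zone = split_zone(ip_literal)
    if zone is None:
        return address, []
    candidates = [zone]  # rfc6874bis reading: zone exactly as written
    if zone.startswith("25") and len(zone) > 2:
        candidates.append(zone[2:])  # RFC6874 reading: '%25' encodes '%'
    return address, candidates


if __name__ == "__main__":
    for literal in ("[fe80::a%eth0]", "[fe80::a%25eth0]", "[fe80::a%2525]"):
        print(literal, candidate_zones(literal))
```

With this sketch, "[fe80::a%eth0]" has a single reading, while "[fe80::a%25eth0]" could mean zone "25eth0" or zone "eth0", and "[fe80::a%2525]" could mean zone "2525" or zone "25".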

Regards
    Brian Carpenter

On 22-Nov-22 09:07, Brian E Carpenter wrote:
> Hi Martin,
> 
> Please see a few comments in line:
> 
> On 21-Nov-22 17:48, Martin J. Dürst wrote:
>> Sorry to be late with this answer, but there are a few points below that
>> need to be corrected.
>>
>> On 2022-10-05 12:02, Brian E Carpenter wrote:
>>> Hi Dale,
>>>
>>> Note that since the Last Call has ended, there's now a -03 draft
>>> that attempts to respond to the actionable comments from various
>>> reviews.
>>> (https://datatracker.ietf.org/doc/draft-ietf-6man-rfc6874bis/)
>>>
>>> More in line:
>>>
>>> On 05-Oct-22 13:51, Dale R. Worley wrote:
>>>> I'm not an expert in this area, but it seems that these points can be
>>>> made:
>>
>>>> 2. It doesn't seem to be a philosophical problem that we define a type
>>>> of URI that can only be properly interpreted within a very small part of
>>>> the Internet.
>>
>> This is definitely correct. There's no requirement that a URI has to be
>> dereferenceable everywhere. But while there's no philosophical problem
>> with that, it seems quite strange to change a very fundamental part of
>> URIs (percent escaping) for such a small use case.
>>
>>>> There's no requirement that URIs be universally
>>>> interpretable; "U" stands for "uniform" as in uniform syntax.
>>
>> Yes. And part of that uniform syntax is the uniformity of percent
>> escaping, which this proposal squarely ignores.
> 
> I believe that is not correct. I agree that there is some subtlety in
> the description of percent encoding, and this is one of the reasons
> that RFC6874 took a different (and wrong) approach. We assumed that
> this sentence in RFC3986:
> 
> "2.1.  Percent-Encoding
> 
>      A percent-encoding mechanism is used to represent a data octet in a
>      component when that octet's corresponding character is outside the
>      allowed set or is being used as a delimiter of, or within, the
>      component."
> 
> would apply to IP-literal, but it doesn't, because the proposed BNF
> says it doesn't. Then we have:
> 
> "2.4.  When to Encode or Decode
> 
>      Under normal circumstances, the only time when octets within a URI
>      are percent-encoded is during the process of producing the URI from
>      its component parts.  This is when an implementation determines which
>      of the reserved characters are to be used as subcomponent delimiters
>      and which can be safely used as data."
> 
> An implementation (which in this case is usually a human!) has no reason
> to determine that "fe80::a%eth0" needs percent-encoding, because the
> proposed BNF says it doesn't. In that terminology, the "%" is
> acting as a subcomponent delimiter, not as data, so it doesn't need
> encoding.
> 
> (It was Andrew Cady who first clearly pointed this out last year:
> https://mailarchive.ietf.org/arch/msg/ipv6/ocNXw2Tl7YnOXOVjnUJ_VS7PI88 )
> 
> In fact, neither our interactions with browser implementors, nor my brief
> experience patching wget, have shown up any problems with this.
> 
> Incidentally, I just thought to try this command on my Windows box:
> 
> C:\WINDOWS\system32>curl http://[fe80::2e3a:fdff:fea4:cce7%7]
> 
> (That's the link-local address of my Fritz Box, slightly obfuscated.)
> 
> And guess what, it replied:
> 
> <!DOCTYPE html>
> <html lang="en">
> <head>
> ...
> </script>
> </body>
> </html>
> 
> Thus, curl on Windows 10 already supports draft-ietf-6man-rfc6874bis,
> and the Fritz Box web server seems happy with it.
> 
>>
>>>> Or for
>>>> that matter, that they might be interpreted differently in different
>>>> places.  There is vast elasticity regarding what it means to "identify"
>>>> a "resource".  (I've been involved in a working group that defined URNs
>>>> that were abstract properties, and would only be realized by comparing a
>>>> prioritized sequence of URNs against the signals that a device was
>>>> capable of producing.)
>>>
>>> We agree. The -03 draft makes this point.
>>>
>>>>
>>>> 3. Given #2, it's not a problem that many implementations would be
>>>> unable to parse these URLs because their syntax is not
>>>> upward-compatible, as long as the beneficial use cases are generally
>>>> implemented.
>>
>> It depends on what "unable to parse" means. Many parsers and other
>> software are written so that edge cases and errors get processed just
>> 'somehow', possibly producing unexpected results.
> 
> True. But this doesn't matter in practice, as the -05 draft explains at
> https://www.ietf.org/archive/id/draft-ietf-6man-rfc6874bis-05.html#name-scope-and-deployment
> 
>>
>>
>>>> 5. There's a significant amount of trouble because RFC 4007 chose "%" as
>>>> the delimiter for zone indexes but "%" has a special syntax in URLs.  In
>>>> principle, this shouldn't be a problem.  "%" is used as the first
>>>> character of "%xx" escapes, but within URLs, that's just a constraint on
>>>> the contexts in which "%" may be used.  Unfortunately, many people are
>>>> sloppy and e.g. consider the URL "http://example.com/foo-bar" to be
>>>> equivalent to "http://example.com/foo%2dbar", which leads to a lot of
>>>> software attempting to "normalize" URIs that contain "%".  But the fact
>>>> that such software would choke on URLs containing zone indexes doesn't
>>>> seem to be important, as we expect zone indexes to have limited use.
>>
>> Such software isn't sloppy at all, it follows RFC 3986. Please see in
>> particular Section 2.4
>> (https://www.rfc-editor.org/rfc/rfc3986.html#section-2.4).
>> "http://example.com/foo-bar" and "http://example.com/foo%2dbar" are
>> equivalent. See also Section 6.2.2
>> (https://www.rfc-editor.org/rfc/rfc3986.html#section-6.2.2).
> 
> I discussed 2.4 above. I don't see 6.2.2 as relevant to the various use
> cases.
> 
>>
>>> Correct. That's exactly why the necessary patch to wget is two lines of C.
>>> (https://github.com/becarpenter/wget6/blob/main/wget-6874bis.md)
>>> It's *significantly* harder for the browsers, since their parsers are much
>>> more complex than wget, but your analysis seems to be spot on.
>>>
>>>> The unfortunate circumstance is that RFC 4007 has pretty much frozen "%"
>>>> as the delimiter character.  If we could change that, life would be
>>>> easier.  But there's a lot of deployed software and current practice
>>>> that would have to be changed.
>>>
>>> Exactly. It's unfortunate, but at the time of RFC4007, nobody noticed
>>> this gotcha.
>>
>> The gotcha can't be fixed anymore. But for those who created it, it
>> might at least be possible to acknowledge it and compromise in a greater
>> context. If all the Windows users who are accustomed to '\' as a path
>> separator can change that to '/' in URIs, why is it so difficult for the
>> very rare and localized case of zone ids to find a character other than '%'?
> 
> That was discussed, actually prior to RFC6874, and the consensus in 6MAN
> was pretty clear - people want cut-and-paste, which means accepting "%".
> Today that is even harder to change than it was a few years ago.
> 
> Thanks
>       Brian
> 
>>
>> Regards,   Martin.
>>
>>>> 6. To get full usage of the new syntax, both the browsers and servers
>>>> that would be accessed by link-local addresses need to be changed.  In
>>>> practice, the browsers are likely to be general-purpose but the servers
>>>> are likely to be resident on a small subset of devices that are
>>>> self-consciously network devices.
>>>
>>> That's correct. As now noted in the draft, in some use cases even an
>>> HTTP error response is a fine result for diagnostic purposes, because
>>> it confirms connectivity.
>>>
>>> Regards
>>>        Brian
>>>