Re: [apps-discuss] Potential issues in RFC 3986

Sam Ruby <rubys@intertwingly.net> Fri, 02 January 2015 15:58 UTC

Archived-At: http://mailarchive.ietf.org/arch/msg/apps-discuss/TYlLwYPpTTZVpFUCqoP6WsEY3wI

On 01/02/2015 10:37 AM, Julian Reschke wrote:
> On 2015-01-02 16:18, Sam Ruby wrote:
>> ...
>>> I fully accept that there may be desirable agent behaviours that are not
>>> covered here, and that an additional document may be desired to describe
>>> these, particularly where the behaviours impact interoperability.
>>
>> I would like to discuss that topic too.
>>
>> Whether that document is separate or not will depend on the outcome of
>> the discussion as to whether RFC 3986 matches current, deployed
>> applications.
>> ...
>
> It could be separate, even if we find out that we want to update RFC 3986.

Agreed.

Keeping it separate would only become more difficult if the IETF did
not agree to update RFC 3986.

>> ...
>>> For me, the question of what URI.parse *does* goes beyond what the core
>>> URI spec needs to define.  But I agree about operating system specific
>>> behaviours of file: URIs being outside the desirable scope of that core
>>> spec.
>>
>> Can I get you to explain what you mean by this?  We can ignore operating
>> system specific behaviors for the moment.  I would think that the basic
>> operation of identifying the scheme, path, fragment, etc for a given
>> input is exactly what a URI spec needs to define.  Why do you think
>> otherwise and/or what am I missing?
>> ...
>
> RFC 3986 does that for valid URIs, as far as I can tell. If you take the
> non-normative regexp in the appendix into account, it even does that for
> many more strings.

If you limit it to pure US ASCII inputs, it certainly gives precise 
answers.  The question is whether it gives accurate answers.
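
To make "precise" concrete, here is a minimal sketch in Python that applies the regular expression from RFC 3986, Appendix B.  The function name split_uri and the sample inputs are my own choices for illustration; only the expression itself comes from the RFC.  It shows what the regexp reports, not whether that matches what deployed software does.

```python
import re

# Regular expression from RFC 3986, Appendix B (non-normative).
URI_RE = re.compile(r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?")

def split_uri(ref):
    """Split a URI reference into the five components named in Appendix B.

    split_uri is a hypothetical helper name, not something the RFC defines.
    Components that are absent from the input come back as None.
    """
    m = URI_RE.match(ref)  # every sub-pattern is optional, so this always matches
    return {
        "scheme": m.group(2),
        "authority": m.group(4),
        "path": m.group(5),
        "query": m.group(7),
        "fragment": m.group(9),
    }

# The example used in Appendix B itself.
print(split_uri("http://www.ics.uci.edu/pub/ietf/uri/#Related"))

# A string that is not a valid URI (it contains spaces) still gets split.
print(split_uri("http://exa mple.com/a b"))
```

The second call is the interesting one: the regexp happily returns an authority and a path for an input that the grammar would reject.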

>> ...
>>> In a brief sampling, I couldn't see any divergence which is likely to be
>>> resolvable by changing the URI spec.
>>
>> I encourage you to spend more time with that data.  An example of a
>> concrete problem is handling of hosts in a UTS-46 compliant manner.
>> ...
>
> Which test, specifically?

Most of the tests in the range of 242..263 deal with IDNA issues in some 
manner or another.  A particularly fun one is test 261:

https://url.spec.whatwg.org/interop/test-results/e99bce2c50

<aside>ordinal test numbers may change over time.  The hash, however, 
will not</aside>.

I encourage you to scan the text of RFC 3986 for the term "IDNA".  In 
the decade between when RFC 3986 was published and today, things have 
progressed a bit here.
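
As a rough illustration of the kind of host processing at issue, the sketch below converts a non-ASCII host to its ASCII form in Python.  Two assumptions to flag loudly: the host name is made up, and Python's built-in "idna" codec implements the older RFC 3490 rules rather than UTS-46, so it only hints at the problem and says nothing about what the tests above expect.

```python
# Illustrative only: Python's built-in "idna" codec follows RFC 3490
# (IDNA 2003), not UTS-46, so other implementations may give different
# answers for other inputs.
host = "b\u00fccher.example"  # hypothetical host name containing U+00FC

ascii_host = host.encode("idna")
print(ascii_host)
```

A parser that only splits the string into components, as the Appendix B regexp does, would hand back the host unchanged; whether and how this conversion happens is where implementations can diverge.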

> Best regards, Julian

- Sam Ruby