Re: [apps-discuss] presumption that RFC3986 is correct

Stephen Farrell <stephen.farrell@cs.tcd.ie> Sat, 03 January 2015 21:05 UTC

Message-ID: <54A8597E.9090206@cs.tcd.ie>
Date: Sat, 03 Jan 2015 21:05:02 +0000
From: Stephen Farrell <stephen.farrell@cs.tcd.ie>
To: Sam Ruby <rubys@intertwingly.net>
Archived-At: http://mailarchive.ietf.org/arch/msg/apps-discuss/58yZb8o8gUyFoFSep29rSI58aT8
Cc: apps-discuss@ietf.org
Subject: Re: [apps-discuss] presumption that RFC3986 is correct

Hiya,

On 03/01/15 20:46, Sam Ruby wrote:
> On 01/03/2015 01:56 PM, Stephen Farrell wrote:
>>
>> Hi Sam,
>>
>> On 03/01/15 17:54, Sam Ruby wrote:
>>>
>>> I intend to work with implementors, providing patches and/or new
>>> implementations along the way.  And I'll continue to document and
>>> publish findings.  One such place I have published such work is at the
>>> W3C:
>>>
>>>    http://www.w3.org/TR/url/
>>
>> I have at least one question about how you (or W3C, or any of us)
>> plan to head towards some reasonable level of completeness with
>> that work. (This may be a bit of an aside in the current discussion,
>> or maybe not, I'm not sure.)
>>
>> The draft at the URL above includes [1], which is a risibly small
>> and fixed (?) subset of an IANA registry. [2] What's the plan for
>> making that sensible? I would assume pointing at the IANA registry
>> is the simple and obvious fix there, but am puzzled as to why that
>> hasn't been done in the few years this text has been around.
>>
>> Is that just an oversight? Or is your work really only covering
>> exactly that particular subset of schemes? Or something else?
> 
> This is a valid question, and the subject of an open bug:
> 
> https://www.w3.org/Bugs/Public/show_bug.cgi?id=27233
> 
> So the short answer is: it is a known issue, and suggestions are welcome.

My suggestion: a) reference the IANA registry for schemes, b) do the
same for port numbers, and then c) list which subsets of those the
additions to 3986 (*) in the W3C document cover. That is, I don't think
you need to try to provide additional rules/text for all schemes, but you
do need to specify which ones you cover.

(*) I'm assuming a sensible model where [1] ends up as a set of
additional specifications on top of 3986 (or on top of 3986+errata
or on top of a 3986bis if one ends up being needed).

   [1] http://www.w3.org/TR/url/

> 
> The longer answer isn't all that much longer.  Given that every modern
> programming language (and for that matter, every browser) will have as
> part of their runtime library a concept of either a URI or a URL, and a
> method to parse a string into such a structure, the question you pose is
> equivalent to: "how should URI.parse methods handle unknown schemes"?
> 
> Possible answers include: treat the content as hierarchical, and treat
> the content as opaque.  There may be other answers.
> 
> What there probably needs to be is a sane default, and a way to register
> new schemes.  At the moment, the URL Working Draft treats unknown
> schemes as opaque.  The bug suggests that hierarchical might be a better
> choice.
> 
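To make the two candidate defaults concrete, here is a minimal Python sketch. It is not the parser from the URL draft: the function name parse(), the "unknown" parameter, and the KNOWN_HIERARCHICAL set are all hypothetical and chosen only for illustration, and the hierarchical branch simply leans on the standard library's urlsplit().

```python
from urllib.parse import urlsplit

# Illustrative assumption: a small fixed set of schemes that the parser
# knows to be hierarchical. A real spec would define this list itself
# or point at a registry.
KNOWN_HIERARCHICAL = {"http", "https", "ftp", "ws", "wss"}


def parse(url, unknown="opaque"):
    """Split a URL string; `unknown` selects the policy applied to
    schemes outside KNOWN_HIERARCHICAL ("opaque" or "hierarchical")."""
    scheme, sep, rest = url.partition(":")
    if not sep:
        raise ValueError("no scheme found")
    scheme = scheme.lower()

    if scheme in KNOWN_HIERARCHICAL or unknown == "hierarchical":
        # Hierarchical reading: authority, path and query are separated.
        parts = urlsplit(url)
        return {
            "scheme": scheme,
            "host": parts.hostname,
            "path": parts.path,
            "query": parts.query,
        }

    # Opaque reading: everything after the first ":" is kept as one blob.
    return {"scheme": scheme, "opaque": rest}


if __name__ == "__main__":
    print(parse("foo://example.com/a/b?x=1"))
    print(parse("foo://example.com/a/b?x=1", unknown="hierarchical"))
```

With the opaque default, an unknown scheme such as foo: yields no host or path to resolve relative references against; with the hierarchical default it does, which is roughly the trade-off the bug discusses.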
> As to registration, at the moment that is undefined.  The spec literally
> says "..." at this point:
> 
>   http://www.w3.org/TR/url/#url-writing
> 
> The hope is to work together with the authors of the following
> Internet-Draft:
> 
>   https://tools.ietf.org/html/draft-ietf-appsawg-uri-scheme-reg

That's a good plan. It's news to me, though welcome news. Are the
editors of [1] and that draft in contact already?

> 
> This is mentioned in bullet 3 of the following section:
> 
>   https://tools.ietf.org/html/draft-ruby-url-problem-00#section-4
> 
> Meanwhile, patches are welcome!  It may be that there are certain URI
> schemes that defy conventional classification (file: certainly comes to
> mind, there may be others) that need to be specified explicitly in the
> specification.
> 
> The easiest way to participate 

That depends on who you are I guess:-) For me, that is not the best
way (nor easiest) to participate.

Cheers,
S.

> is to propose tests in the form of input
> strings, base strings, and expected results.  That data will be added to:
> 
> https://github.com/w3c/web-platform-tests/blob/master/url/urltestdata.txt
> 
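As a rough illustration of tests expressed as (input string, base string, expected result), here is a small Python sketch. The tuple layout and the run() helper are assumptions made for this example and are not the actual urltestdata.txt format; the sample cases are taken from the reference resolution examples in RFC 3986 section 5.4, and the resolver under test is the standard library's urljoin().

```python
from urllib.parse import urljoin

# (input, base, expected) triples; these come from RFC 3986 section 5.4.
CASES = [
    ("g", "http://a/b/c/d;p?q", "http://a/b/c/g"),
    ("../g", "http://a/b/c/d;p?q", "http://a/b/g"),
    ("g?y", "http://a/b/c/d;p?q", "http://a/b/c/g?y"),
    ("//g", "http://a/b/c/d;p?q", "http://g"),
]


def run(cases, resolve=urljoin):
    """Resolve each input against its base and collect mismatches.

    `resolve` takes (base, input) and returns the resolved string, so a
    different implementation can be swapped in for comparison.
    """
    failures = []
    for inp, base, expected in cases:
        actual = resolve(base, inp)
        if actual != expected:
            failures.append((inp, base, expected, actual))
    return failures


if __name__ == "__main__":
    for failure in run(CASES):
        print("MISMATCH:", failure)
```

Running the same triples through several implementations and comparing the mismatch lists is one simple way to see where parsers disagree.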
> And I'll use that data to update:
> 
> https://url.spec.whatwg.org/interop/test-results/
> 
> Should you feel inclined to suggest changes to the URL spec, I'd
> encourage you to look at the following which contains an incomplete but
> significantly reworked parser:
> 
> https://specs.webplatform.org/url/webspecs/develop/
> 
> That repository has a bunch of other things, including the evaluation
> scripts and a reference implementation.  More information can be found
> here:
> 
> https://github.com/webspecs/url#the-url-standard
> 
>> Thanks,
>> S.
> 
> - Sam Ruby
> 
>> [1] http://www.w3.org/TR/url/#relative-scheme
>> [2] https://www.iana.org/assignments/uri-schemes/uri-schemes.xhtml
> 
>