Re: [apps-discuss] Fwd: FW: New Version Notification for draft-kerwin-file-scheme-13.txt

Graham Klyne <> Sat, 03 January 2015 12:11 UTC

Date: Sat, 03 Jan 2015 12:10:46 +0000
From: Graham Klyne <>
To: Sam Ruby <>


Rather than continue the blow-by-blow exchange of points, let me try to set out
in one place where I think we stand:

1. I agree that draft-kerwin-file-scheme *should* work for everyone.  And that 
probably means reducing the scope of anything there that may be considered
normative.

    But I also believe it can be useful to document other behaviours 
*informatively*.  I think this is discussed elsewhere and anticipate evolution 
in this direction.  This could mean, to develop your example, describing 
Microsoft Windows specific behaviours without any expectation that such 
behaviours would be implemented by Apple.

2. I think our disagreement may lie primarily in the area of what should be the 
scope of RFC 3986 or its successor.  There is (I claim) a substantial developer 
community who are familiar with RFC3986 as it stands, and creating a new 
document to cover the same material with enlarged scope is unnecessary and 
disruptive.  The additional scope coverage could be in a new document that 
builds upon what RFC3986 does specify.

    Part of this disagreement is that I don't think the URI core spec needs to 
describe the operation of splitting a URI into its components.  (I regard as merely informative.)  The 
syntax specification is sufficient to associate substrings of a well-formed URI 
with named syntax productions.
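
A minimal sketch of that association, assuming the regular expression given in
RFC 3986 Appendix B is an acceptable (informative) way to do the split.  The
helper name `split_uri` is hypothetical and used only for illustration.

```python
import re

# Regular expression from RFC 3986, Appendix B.  It matches any string, so it
# splits a reference into components but does not validate it.
URI_RE = re.compile(r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?")


def split_uri(uri):
    """Return (scheme, authority, path, query, fragment).

    Components that are absent from the input come back as None; the path is
    always a string (possibly empty).
    """
    m = URI_RE.match(uri)
    return (m.group(2), m.group(4), m.group(5), m.group(7), m.group(9))


if __name__ == "__main__":
    print(split_uri("file:///etc/hosts#top"))
```

Note that nothing here is specific to any one scheme: the split follows from
the generic syntax alone.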

    I also think the core URI spec should not be describing how to turn non-URI 
strings (some of which may be system-dependent forms) into valid URIs.  (Except 
URI-references, which are covered by the resolution specification.)
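
As a small illustration of the resolution case, the sketch below resolves a few
relative references against a file: base URI.  It assumes Python's
`urllib.parse.urljoin` as one readily available resolver (it is not claimed to
match RFC 3986 in every edge case), and the paths are made up for the example.

```python
from urllib.parse import urljoin

# Hypothetical base URI; any well-formed hierarchical URI would do.
BASE = "file:///home/user/docs/report.txt"


def resolve_all(base, references):
    """Resolve each relative reference against the base URI."""
    return {ref: urljoin(base, ref) for ref in references}


if __name__ == "__main__":
    refs = ["notes.txt", "../shared/data.csv", "#section-2"]
    for ref, target in resolve_all(BASE, refs).items():
        print(ref, "->", target)
```

The resolution works on the URI strings alone, with no reference to the
underlying file system.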

3. Where there is divergence between implementations and RFC3986, these indeed 
should be considered on a case-by-case basis, but with (IMO) the presumption 
that RFC3986 is correct.  I.e. it is for those who think there is a problem with 
RFC3986 to make the case.

    You ask me to spend time with your data.  That's a big ask.  If some 
implementers think there is a problem with RFC 3986 then I think it is they who 
should be making the specific arguments about where RFC3986 is problematic.  I 
accept that UTS-46 host names is such an area, concerning which there should be 
a considered and focused debate (and about which I have insufficient knowledge 
to make a meaningful contribution).

4. I agree with your point about qualifying my statement about system-dependent 
forms conforming to core URI syntax.  While forms such as "C:\Program Files 
(x86)" might be described as variations, I don't think they should be considered 
to be valid URIs.
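
A rough sketch of why such a form fails as a URI, assuming only a
character-level check against the RFC 3986 character set (this is not a full
grammar validation).  The helper names `leading_scheme` and `disallowed_chars`
are hypothetical.

```python
import re
import string

# RFC 3986 'scheme' production: ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
SCHEME_RE = re.compile(r"^([A-Za-z][A-Za-z0-9+.-]*):")

# Characters that may appear literally in a URI: unreserved, gen-delims,
# sub-delims, plus "%" as the percent-encoding introducer.
URI_CHARS = set(
    string.ascii_letters + string.digits + "-._~" + ":/?#[]@" + "!$&'()*+,;=" + "%"
)


def leading_scheme(text):
    """Return whatever prefix would be read as a scheme, or None."""
    m = SCHEME_RE.match(text)
    return m.group(1) if m else None


def disallowed_chars(text):
    """Return the characters in text that cannot appear literally in a URI."""
    return sorted(set(text) - URI_CHARS)


if __name__ == "__main__":
    candidate = r"C:\Program Files (x86)"
    print(leading_scheme(candidate))    # the drive letter looks like a scheme
    print(disallowed_chars(candidate))  # space and backslash are not allowed
```

The drive letter is syntactically indistinguishable from a one-letter scheme,
and the space and backslash fall outside the URI character set, so some
system-dependent conversion step is needed before the string can be treated as
a URI.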

I'm supportive of the strategy you outline (repeated here for ease of 
reference), which I don't think is so different from what I've argued for:

A strategy that is more likely to be successful would be to identify URIs as 
being completely system independent, and URLs as being mostly system 
independent, and for there to be a well known and documented mechanism for 
converting from URLs to URIs.  Even that is not likely to be completely achieved 
-- the conversion may end up being (at least partially) system dependent, but in 
such cases we should be able to define the problematic set of the inputs as
non-conforming.

Where I may diverge is that I don't think the "well known and documented 
mechanism for converting from URLs to URIs" should be part of the URI 
specification (cf. my point 2 above).

5. The previous point also raises the question of what should be covered by the
file: scheme document.  I think it may be appropriate to describe some commonly 
occurring system-dependent file: URL forms, but I'm less convinced that this is 
the place to describe how to map them to URIs.  Any normative specification of 
file: URI formats should be restricted to forms that comply fully with RFC3986.


On 02/01/2015 15:18, Sam Ruby wrote:
> On 01/02/2015 09:27 AM, Graham Klyne wrote:
>> On 01/01/2015 19:08, Sam Ruby wrote:
>>>> This raises two points for me:
>>>> 1. 'file: should be seen as an "escape hatch"'.
>>>> I disagree.  For me, the value of file: URIs is to provide a file naming
>>>> structure that can be used in libraries that unify local and web access
>>>> to resources.  So I think it's important that file: URIs follow common
>>>> URI syntax and resolution mechanisms, even if their interpretation for
>>>> the purposes of dereferencing, etc., is defined locally.
>>>> (This is not to impose "normative statements on OS vendors for their own
>>>> software" - unless the vendors choose to use URIs natively for file
>>>> naming.)
>>> I'll contrast "can be used in libraries that..." (immediately above),
>>> and "works for everyone" (previous point).
>>> If the intent of draft-kerwin-file-scheme is only to be valid in a
>>> subset of libraries, then it should say so.  If it is intended to also
>>> match other user agent behaviors (e.g. browsers), then we collectively
>>> have considerably more work to do.
>> I never made that claim.
> I'm confused.  Let me try again.
> I am making the claim that draft-kerwin-file-scheme does "work for everyone".
> Either there needs to be considerably more work done, or it needs to reduce its
> scope.
>> I also think it is not the role of the URI spec to describe "agent
>> behaviour" (beyond relative reference resolution, if that is a
>> "behaviour").
>> I think it is the role of the URI spec to:
>> (a) define what constitutes a valid URI, and URI reference, and
>> (b) describe how to combine a valid base URI with a valid URI reference
>> to yield a valid resulting URI.
>> Which is, of course, what RFC3986 does (how well is up for discussion).
> I, indeed, would like to discuss that topic.
>> I fully accept that there may be desirable agent behaviours that are not
>> covered here, and that an additional document may be desired to describe
>> these, particularly where the behaviours impact interoperability.
> I would like to discuss that topic too.
> Whether that document is separate or not will depend on the outcome of the
> discussion as to whether RFC 3986 matches current, deployed applications.
>>>> 2. Use of vendor-specific documentation
>>>> I agree with this, specifically: "The right way ... is to write the RFC
>>>> in such a way that OS-specific variations are not required for
>>>> RFC-compliance in the first place"
>>>> So it's clear to me that there are aspects of file: URI handling that
>>>> are local-context dependent (e.g. how to actually dereference).  But I
>>>> think other activities (such as relative reference resolution) should be
>>>> possible without regard to the underlying file system implementation -
>>>> and this is the level of commonality that a file: URI scheme RFC should
>>>> aim to provide.
>>> I'll note that draft-kerwin-file-scheme includes such constructs as
>>> windows-path and unc-path.
>> Sure, but I'm not sure what point you're making here.
> Let me try to make that point in a different way then.  I'm skeptical that Apple
> will be interested in implementing any portion of a specification that is
> specific to Microsoft Windows.  This goes back to the point of trying to build a
> specification that "works for everyone".
>> I would want to see all such system-specific forms conform to standard
>> URI syntax, and to yield the desired results when resolved using a
>> standard resolution algorithm.
> We are going to need to qualify that statement considerably.
> Here is an example of a system-specific form: "C:\Program Files (x86)".
> A strategy that is more likely to be successful would be to identify URIs as
> being completely system independent, and URLs as being mostly system
> independent, and for there to be a well known and documented mechanism for
> converting from URLs to URIs.  Even that is not likely to be completely achieved
> -- the conversion may end up being (at least partially) system dependent, but in
> such cases we should be able to define the problematic set of the inputs as
> non-conforming.
> An example of a restriction we should consider: valid schemes must have at least
> two characters.
>>>>> It is my hope that by working together I can feel confident enough to
>>>>> remove that red box.  As it is, I don't feel that either spec matches
>>>>> widely deployed applications.
>>>> Fair enough.  Which suggests to me that focusing on a single focused
>>>> spec and aligning around that might be a productive way to tackle this.
>>> What I am focused on is the following question: what should a
>>> "URI.parse" method do?  In some ways that question is more general (in
>>> that it isn't file: scheme specific).  In some ways that question is
>>> more focused (in that it doesn't attempt to describe the operating
>>> system specific interpretations of the results).
>> For me, the question of what URI.parse *does* goes beyond what the core
>> URI spec needs to define.  But I agree about operating system specific
>> behaviours of file: URIs being outside the desirable scope of that core
>> spec.
> Can I get you to explain what you mean by this?  We can ignore operating system
> specific behaviors for the moment.  I would think that the basic operation of
> identifying the scheme, path, fragment, etc for a given input is exactly what a
> URI spec needs to define.  Why do you think otherwise and/or what am I missing?
>>>> <aside>
>>>> A problem for me is (as I've said before in other forums) that RFC3986
>>>> is a perfectly good specification of URI syntax and I don't see the need
>>>> to consult any other.  So why should I put energy into so doing?  I make
>>>> this point on the presumption that I'm not the only one who is OK with
>>>> RFC 3986.
>>>> Now, if the community decides that some other spec is the True
>>>> Pronouncement about URI syntax, I shall have to reconsider.  But I don't
>>>> see why I should be asked to put energy into reviewing a specification
>>>> which doesn't give me anything I don't already have.
>>>> This doesn't mean I oppose this specification in its goals to cover
>>>> areas that are not covered by RFC3986.  But, speaking personally, I'd
>>>> really like to be assured that any valid RFC3986 URI will be acceptable
>>>> according to the syntax you describe.  That way I don't have to read the
>>>> other document if I don't want the extra capabilities it offers.
>>> I have evidence that RFC 3986 doesn't match a variety of user agent
>>> behavior.
>>> Agents that aren't limited to browsers, but also to libraries that are
>>> used by what you would consider "middleware".
>>> Here is a filtered list of test results that only considers RFC 3986
>>> valid URI references as inputs:
>> I took a brief look, but haven't delved into the details of your
>> results.  At that superficial level, the list suggests to me that there
>> are many cases where implementations are buggy, and in different ways.
>> It doesn't tell me what are the problems in RFC3986.
> We can agree that implementations don't match RFC 3986.  In such cases, where
> the bug is would need to be determined on a case-by-case basis.
>> In a brief sampling, I couldn't see any divergence which is likely to be
>> resolvable by changing the URI spec.
> I encourage you to spend more time with that data.  An example of a concrete
> problem is handling of hosts in a UTS-46 compliant manner.
>> #g
>> --
> - Sam Ruby