Re: [Uri-review] Request for review

Timothy Mcsweeney <tim@dropnumber.com> Fri, 29 May 2020 18:54 UTC

Date: Fri, 29 May 2020 14:54:47 -0400
From: Timothy Mcsweeney <tim@dropnumber.com>
Reply-To: Timothy Mcsweeney <tim@dropnumber.com>
To: uri-review@ietf.org
Message-ID: <1729337515.289325.1590778487527@email.ionos.com>
In-Reply-To: <CA+9kkMDUiFHLYqx-nTchvSkQE0VUEGkvXni0cuYHvPYr8YoLHA@mail.gmail.com>
References: <491516506.246380.1589851279474@email.ionos.com> <5EC9B257.31362.CC5E003@dan.tobias.name> <1783049000.100771.1590323508943@email.ionos.com> <5ECA8A94.23977.101292FE@dan.tobias.name> <1426881880.158099.1590335585858@email.ionos.com> <94368b41-c15b-da2c-421d-fdd9300be6e9@dret.net> <1310141163.159340.1590344745080@email.ionos.com> <BL0PR2101MB102738EF50D7C8AD647E10BBA3B20@BL0PR2101MB1027.namprd21.prod.outlook.com> <1081815563.141711.1590624311343@email.ionos.com> <BL0PR2101MB102762C4CAFACC383412D5D8A38E0@BL0PR2101MB1027.namprd21.prod.outlook.com> <BL0PR2101MB10278A5360398EFF2E73FC0BA38E0@BL0PR2101MB1027.namprd21.prod.outlook.com> <117630321.142251.1590627970509@email.ionos.com> <8ae1641a-74c8-6c2d-7092-6cf53e745fb7@ninebynine.org> <797476254.282655.1590770737009@email.ionos.com> <CA+9kkMDUiFHLYqx-nTchvSkQE0VUEGkvXni0cuYHvPYr8YoLHA@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Archived-At: <https://mailarchive.ietf.org/arch/msg/uri-review/i9chMPRc2JGq2O8c8eGgehkSI0Y>
Subject: Re: [Uri-review] Request for review

Hi Ted,
>For a URI parser, the method used to identify the scheme within any example is basically: "it's the bit to the left of the colon".

Is it basically this, or exactly this?

> The colon is the "end of label" marker, and that's why folks are insisting on it here; without it, you can't use the URI API to select a branch

Is the colon the only "end of label" marker?

Tim
On May 29, 2020 at 2:22 PM Ted Hardie <ted.ietf@gmail.com> wrote:

Hi Tim,

Thanks for sharing your thoughts on this.  Let me suggest a slightly different formulation, which is not based on the colon itself, but on the scheme. 

The scheme indicates which type of URI a particular  example is and thus governs a bunch of the behavior you describe below (whether it is used to dereference resources and, if so, what protocol(s) are used, if there are specific parameters and, if so, whether any are mandatory, and so on).  For a URI parser, the method used to identify the scheme within any example is basically: "it's the bit to the left of the colon".  If you don't have that specific marker, then a standard URI parser won't find a scheme and thus won't know what behavior to match to the example. 

A common early metaphor for this was to identify the network retrieval protocols as different parts of the plumbing of the Web, with the scheme names as the labels on the taps.  That plumbing metaphor doesn't really work with abstract identifiers and is stretched by schemes like HTTPS which might have more than one set of underlying protocols.  But the labeling still does: you can think of the schemes as the labels that tell the API which branch of behavior to invoke.  The colon is the "end of label" marker, and that's why folks are insisting on it here; without it, you can't use the URI API to select a branch.

best regards,

Ted


On Fri, May 29, 2020 at 9:46 AM Timothy Mcsweeney < tim@dropnumber.com> wrote:
Hi Graham,

I would never treat suggestions from this list as arbitrary, quite the contrary.  
I want to change the format of this reply just for a minute to express my deductions. 

This is an excerpt from the conversation in my head:


If you take away all the components and subcomponents of a URI, what's leftover? 
The colon. 
And what governs the colon? 
The dereferencing algorithm. 
Does http use a colon in its dereferencing? 
It does. 
What about a URN? 
It does. 
FTP and Mailto? 
Yup the same. 
So if you change the colon to a number sign, would you get the same output?
Yes.
All of them? 
Yes.
Can you prove it?
Yes.  
Why do all the delimiters have quotes around them?
Because they are interchangeable.
Interchangeable everywhere?
No, just within the scope of their placement.  That's why URNs can use a bunch of colons and not interfere with the first colon after the URN scheme name.
But it says the colon is required doesn't it?
I cannot pinpoint the sentence that says that.
But section 3, the colon is in the generic syntax, you can see that right?
Yes but the title of section 3 is "Syntax Components" and the colon is not a component. 
Wait, what does generic mean?
Not specific. 
So the generic syntax is not specific?
That's right.
So [RFC3986] is a specification that is defining something that is not specific?
Yup, says it right there in the abstract.

From here my mental conversation took a left turn.  But I wanted to put this out here so that the members of this list don't think my intent is purely self-interest, but that we can all use what's here.

Tim

On May 29, 2020 at 6:01 AM Graham Klyne < gk@ninebynine.org> wrote:


Hmmm... I find that bit of RFC3986 isn't immediately clear. But on closer study,
I think it's simply saying that the characters are "safe" in the sense that
they are protected from change by URI normalization, hence that when used as
delimiters there is no risk that the interpretation of the URI is affected by
URI normalization (see also section 6 of RFC 3986).

But some of these reserved characters already have defined purposes in URI
structure, and any scheme-dependent use needs to take care not to interfere with
such use. For example, using "#" as a delimiter within a URI path would
interfere with its already-defined purpose to delimit a fragment.

Also, current URI structure *requires* that the ":" is used to delimit the
scheme name from the rest of the URI. Suggestions by others on this list to use
":" rather than "#" are not entirely arbitrary.

As a rule of thumb, I would suggest that if you do need scheme-specific
delimiters (and it's not clear to me that you do), then using one from the
"sub-delims" set is more likely to avoid conflicts with generic URI syntax and
interpretation.
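
The difference between a gen-delim such as "#" and a sub-delim such as "!" can be seen with a generic parser. The following is a rough sketch using Python's standard `urllib.parse.urlsplit`; "drop" is the scheme proposed in this thread, while the identifiers "abc", "def" and the choice of "!" are made up for illustration.

```python
from urllib.parse import urlsplit

# "drop" is the scheme under discussion; the identifiers after the
# colon are invented for this example.
with_hash = urlsplit("drop:abc#def")
with_bang = urlsplit("drop:abc!def")

# "#" is a gen-delim: the generic parser cuts the string there and
# treats the remainder as a fragment, so the path loses "def".
print("path:", with_hash.path, "| fragment:", with_hash.fragment)

# "!" is a sub-delim: the generic parser leaves it alone, so the whole
# string after the colon stays in the path for the scheme to interpret.
print("path:", with_bang.path, "| fragment:", with_bang.fragment)
```

A scheme that wants its own internal delimiter keeps control of its data in the second case, whereas in the first the generic syntax has already split it.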

#g
--


On 28/05/2020 02:06, Timothy Mcsweeney wrote:
Hi Dave,

By "safe" I meant like ".....

safe to be
used by scheme-specific and producer-specific algorithms for
delimiting data subcomponents within a URI"

Like it says in section 2.2 of RFC3986.

Tim

On May 27, 2020 at 8:48 PM Dave Thaler < dthaler@microsoft.com> wrote:
>> s/URL/URI/ in both cases in my response :-)
>>
>> *From:*Uri-review < uri-review-bounces@ietf.org> *On Behalf Of *Dave Thaler
>> *Sent:* Wednesday, May 27, 2020 5:47 PM
>> *To:* Timothy Mcsweeney < tim@dropnumber.com>; uri-review@ietf.org
>> *Subject:* Re: [Uri-review] Request for review
>>
>>
>> I don’t understand your question.   The URL syntax is fixed by that RFC.
>>
>> I don’t know what you mean by “safe” or “valid”.
>>
>> If by “valid” you mean “allowed by RFC 3986”, the answer is that they may only
>> appear in a URL literally
>>
>> if they have the exact meaning in the RFC, otherwise they must be pct-encoded.
>>
>> *From:*Uri-review < uri-review-bounces@ietf.org
>> <mailto: uri-review-bounces@ietf.org>> *On Behalf Of *Timothy Mcsweeney
>> *Sent:* Wednesday, May 27, 2020 5:05 PM
>> *Subject:* Re: [Uri-review] Request for review
>>
>>
>> Hi Dave,
>>
>>
>> If the other six gen-delims from the reserved set were safe and valid, would
>> you oppose their use in URIs?
>>
>>
>> Tim
>>
>>
>>
>>
>> On May 24, 2020 at 6:08 PM Dave Thaler < dthaler@microsoft.com
>> <mailto: dthaler@microsoft.com>> wrote:
>>
>> Hi Tim,
>>
>> Correct, the colon is not part of the hier-part; the hier-part is what
>> comes after the colon.  RFC 3986 says:
>>
>> URI         = scheme ":" hier-part [ "?" query ] [ "#" fragment ]
>>
>> Only strings that conform to the above are URIs.
>>
>> So “drop#sd54g54” is not a URI because it does not conform to the above
>> syntax, as it has no “:”
>>
>> “drop:sd54g54” on the other hand would be a valid URI.
>>
>> This is what folks are saying when they say if you just change the “#” to
>> a “:” in your draft then it becomes legal.
>>
>> Dave
>>
>> *From:*Uri-review < uri-review-bounces@ietf.org
>> <mailto: uri-review-bounces@ietf.org>> *On Behalf Of *Timothy Mcsweeney
>> *Sent:* Sunday, May 24, 2020 11:26 AM
>> *To:* Erik Wilde < erik.wilde@dret.net <mailto: erik.wilde@dret.net>>;
>> *Subject:* Re: [Uri-review] Request for review
>>
>>
>> Hi Erik,
>>
>>
>> Thank you, I will have another look at my reference to section 3.
>>
>> Would you agree that in "https://ietf.org"
>> the colon is not part of the hier-part?
>>
>> On May 24, 2020 at 12:02 PM Erik Wilde < erik.wilde@dret.net
>> <mailto: erik.wilde@dret.net>> wrote:
>>
>>
>>
>> hey tim.
>>
>>
>> On 2020-05-24 17:53, Timothy Mcsweeney wrote:
>>
>> Yes, I agree and understand that the same way as you.   But when
>> the "#"
>>
>> leaves the client it is not leaving as a fragment,
>>
>> what people are telling you is that "#" and anything following it never
>>
>> leaves the client, by definition..
>>
>>
>> it is leaving as a
>>
>> way to separate the URI components, <scheme> and <path> or for http it
>>
>> would be separating <scheme> and <authority>.  It is this that
>> makes me
>>
>> believe that even if the colon is required for http resolution, it is
>>
>> not necessarily required for all URI.
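
The point that "#" and everything after it stays on the client can be illustrated with Python's standard `urllib.parse.urldefrag`, which separates a reference into the part a client would use for retrieval and the fragment it keeps for itself. This is only a sketch; the example.com URL is a placeholder and does not come from the thread.

```python
from urllib.parse import urldefrag

# Placeholder reference, not taken from the thread.
reference = "http://example.com/page#section"

# urldefrag returns the reference without its fragment, plus the fragment.
sent, kept_locally = urldefrag(reference)

print("used for the request:", sent)
print("kept by the client:  ", kept_locally)
```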
>>
>> this discussion could be more productive if you had a brief look at the
>>
>> specs you're depending on. the very first rule shown in
>>
>> is
>>
>>
>> URI = scheme ":" hier-part [ "?" query ] [ "#" fragment ]
>>
>>
>> each URI is defined like this and must have a colon.
>>
>>
>> cheers,
>>
>>
>> dret.
>>
>>
>> --
>>
>> erik wilde | mailto: erik.wilde@dret.net <mailto: erik.wilde@dret.net> |
>>
>> | http://dret.net/netdret
>> |
>>
>> | http://twitter.com/dret
>> |
>>
>>

_______________________________________________
Uri-review mailing list
Uri-review@ietf.org
https://www.ietf.org/mailman/listinfo/uri-review