Re: p1: HTTP(S) URIs and fragment identifiers

"Martin J. Dürst" <> Mon, 22 April 2013 08:10 UTC

Date: Mon, 22 Apr 2013 17:08:29 +0900
From: "\"Martin J. Dürst\"" <>
Organization: Aoyama Gakuin University
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv: Gecko/20100722 Eudora/3.0.4
MIME-Version: 1.0
To: Zhong Yu <>
CC: Mark Nottingham <>, Julian Reschke <>, " Group" <>

On 2013/04/21 1:39, Zhong Yu wrote:
> I like Mark's idea better. In many contexts when people use the term
> HTTP URI they would consider fragment a legit optional part, so the
> spec should not appear to exclude it in general.

This problem was detected for the general case by the IRI WG last summer.
Please see:

The IRI WG is now closed, and it is unclear when or whether any of the
work will be continued/completed. But this is a clear issue that those
defining URI schemes (whether anew, as the text in the tracker assumes,
or as a spec update, as for http and https) have to understand and deal
with.

From that issue:

> 1) Fragment identifiers are part of URIs, and scheme definitions
> cannot and MUST NOT disallow fragments on specific schemes (even if
> the usability of a fragment id on the particular scheme being defined
> seems questionable at the time the scheme definition is made).

> 2) Fragment identifiers are independent of schemes, depending on MIME
> media types, and therefore scheme definitions cannot define anything
> about fragment identifiers.
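
As a rough illustration of point 2 (a sketch, not part of the tracker
issue): a generic URI parser splits off the fragment before it looks at
anything scheme-specific. Python's urllib.parse is used here only as a
convenient generic parser, and the example URIs are made up.

```python
from urllib.parse import urlsplit

# Hypothetical example URIs; they differ in scheme, not in how "#" is handled.
examples = [
    "http://example.com/doc.html#section-2",
    "ftp://example.com/pub/readme.txt#line-10",
    "urn:example:thing#part",
]

# The fragment is separated generically, regardless of the scheme.
fragments = {urlsplit(u).scheme: urlsplit(u).fragment for u in examples}
print(fragments)
```

What the fragment then means depends on the media type of the retrieved
representation, which the parser knows nothing about.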

> On Sat, Apr 20, 2013 at 3:16 AM, Mark Nottingham wrote:
>> On 20/04/2013, at 6:07 PM, Julian Reschke wrote:
>>> On 2013-04-20 09:30, Mark Nottingham wrote:
>>>> On 20/04/2013, at 5:28 PM, Julian Reschke wrote:
>>>>> On 2013-04-20 06:07, Mark Nottingham wrote:
>>>>>> P1 sections 2.7.1 and 2.7.2 define the HTTP and HTTPS URI schemes without fragment identifiers.
>>>>>> While it's true that HTTP sends these URIs without fragids "on the wire" in the request-target, the schemes *do* allow fragids pretty much everywhere else they're used (including some places in HTTP, e.g., the Location header).
>>>>>> Given that this is going to be the definition for these URI schemes, and we already require that the fragid be omitted in the request-target, shouldn't the syntax allow a fragment identifier?
>>>>> No.
>>>>> Fragment identifiers are allowed for *any* URI scheme; the scheme definition doesn't need to include it.
>>>> Then why do we include query?
>>> Because we are defining a<>.

Because the query is scheme-specific. For people who mostly work with
HTTP, the query part seems like a sure thing, but there are many schemes
where query parts are not allowed, and others where query parts have
very specific protocol implications (e.g. mailto:), whereas in HTTP,
it's pretty much just "something sent to the server", and details are
left to client and server internals.
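
To make the contrast concrete, here is a small sketch (the addresses and
parameter names are invented for illustration). In a mailto: URI the
query fields name message header fields; in an http: URI the query is
just passed along as part of the request-target.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical URIs, for illustration only.
mail = urlsplit("mailto:user@example.org?subject=Hello&cc=other@example.org")
web = urlsplit("http://example.org/search?q=fragment&lang=en")

# mailto: the query fields correspond to message header fields (subject, cc, ...).
mail_fields = parse_qs(mail.query)

# http: the query is opaque to the protocol and is simply sent to the server.
request_target = web.path + "?" + web.query

print(mail_fields)
print(request_target)
```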

>> But this section *isn't* defining just a single protocol element; it's defining the form of HTTP URIs.
>> I suspect the right thing to do here is to specify that HTTP(s) URIs use the path-abempty form of the hier-part, give some examples, and leave the rest of the ABNF to RFC3986 (or its successors).

That's one way to do it. And of course it has to say that the <scheme> 
part is "http" for http URIs and "https" for https URIs. That's pretty 
obvious to everybody, but we are writing the spec that's nailing this down.

Another alternative is to say that the http-URI and https-URI 
productions correspond to (are concretizations of) the absolute-URI 
production in RFC 3986.

Another solution is to just leave the current text.

But in any of the three above cases (and in particular in the last one), 
it might be very helpful to add a note that of course these schemes can 
take fragments, and point to for details.

If the fragment part is included, then it would be better to rename the 
productions to something like http-URI-reference and 
https-URI-reference, and say that they correspond to URI-reference in 
RFC 3986.
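
A minimal sketch of how the two forms relate, assuming a made-up example
URI: removing the fragment from the "reference" form gives the
fragment-less form that the current productions describe (and that goes
into the request-target).

```python
from urllib.parse import urldefrag

# Hypothetical URI in the form that includes a fragment.
reference = "https://example.com/spec/p1?rev=22#section-2.7.1"

# urldefrag() returns the URI without its fragment, plus the fragment itself.
uri, fragment = urldefrag(reference)

print(uri)
print(fragment)
```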

>> At any rate, I don't see the http-uri or https-uri rules actually used anywhere normatively,

Do you mean inside the HTTP spec suite, or have you checked other specs?
If not the productions, then at least the actual syntax is used VERY widely.

Regards,   Martin.

>> so I *think* this is editorial. I.e., it's on your conscience :)
>> Cheers,
>> --
>> Mark Nottingham