RFC1808 bug? (fragment syntax)
Dan Connolly <connolly@w3.org> Mon, 23 September 1996 16:50 UTC
Date: Fri, 20 Sep 1996 22:22:30 GMT
Message-Id: <199609202222.WAA01129@beach.w3.org>
From: Dan Connolly <connolly@w3.org>
To: uri@bunyip.com
Subject: RFC1808 bug? (fragment syntax)
Sender: owner-uri@bunyip.com
Precedence: bulk

The URL RFCs are definitely getting crufty...

------- Start of forwarded message -------
From: MACRIDES@SCI.WFBR.EDU (Foteos Macrides)
Subject: Re: Extended URL for frames
To: www-html@w3.org
Date: 16 Sep 1996 00:09:52 +0200
Message-ID: <01I9IM1G4HH4005LBE@SCI.WFBR.EDU>

jrd@netcom.com (Jon Degenhardt) wrote:
>Daniel W. Connolly <connolly@w3.org> writes:
>> Hmmm... I'm pretty sure I've seen implementations that scan
>> from the right for the first #, and consider that to be the
>> split between the URL and the fragment identifier.
>
>This is the first step in the parsing algorithm described in RFC 1808,
>"Relative Uniform Resource Locators"
>(http://www.w3.org/pub/WWW/Addressing/rfc1808.txt).

        RFC1738 was written back in the days when the assumption was
that there'd be only one '#', as a fragment delimiter, and that in all
other cases it would be hex escaped.  It also recommends hex escaping
for URL schemes which do not normally have a fragment.  So based on
that, the direction of parsing for the '#' is irrelevant.

        The libwww, through the current W3C Reference Library, parses
for the fragment from right to left, and thus will use the last one if
there is more than one unescaped '#'.  One would expect right to left
parsing for NS, since it started off as a rewrite of XMosaic, for the
Mosaics themselves, for Arena, probably Amaya, and for most if not all
old and moderately old browsers (i.e., they'll "fail" the "test" as
did NS).

        The wording of RFC1808 instead indicates left to right parsing,
and implies that non-hex escaped '#'s can occur to the right of the one
which delimits the fragment.  The MSIE parser probably was guided by
RFC1808, which is probably why the "test" "worked" with it.

                                Fote

=========================================================================
 Foteos Macrides            Worcester Foundation for Biomedical Research
 MACRIDES@SCI.WFBR.EDU         222 Maple Avenue, Shrewsbury, MA 01545
=========================================================================

------- End of forwarded message -------
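
The difference between the two scanning directions described in the
forwarded message only shows up when a URL carries more than one
unescaped '#'. A minimal sketch in Python, not taken from libwww or any
browser; the function names and the example.org URL are hypothetical
and chosen for illustration only:

```python
def split_fragment_rightmost(url):
    """Split at the LAST unescaped-looking '#' (right-to-left scan).

    This mirrors the behaviour the message attributes to libwww and
    older browsers; it is an illustration, not their actual code.
    """
    base, sep, fragment = url.rpartition("#")
    if not sep:
        # No '#' at all: the whole string is the URL, no fragment.
        return url, None
    return base, fragment


def split_fragment_leftmost(url):
    """Split at the FIRST '#' (left-to-right scan).

    This mirrors the reading of RFC 1808 given in the message, where
    further '#' characters may appear inside the fragment.
    """
    base, sep, fragment = url.partition("#")
    if not sep:
        return url, None
    return base, fragment


# Hypothetical URL with two unescaped '#' characters.
url = "http://example.org/doc.html#a#b"
print(split_fragment_rightmost(url))  # ('http://example.org/doc.html#a', 'b')
print(split_fragment_leftmost(url))   # ('http://example.org/doc.html', 'a#b')
```

With a single '#', or with every other '#' hex escaped as %23, both
functions return the same split, which is the RFC1738-era case where
the scan direction does not matter.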