Re: [apps-discuss] Scope of RFC3986 and successor - what is a URI?

Sam Ruby <rubys@intertwingly.net> Sat, 17 January 2015 19:00 UTC

Message-ID: <54BAB143.1080006@intertwingly.net>
Date: Sat, 17 Jan 2015 14:00:19 -0500
From: Sam Ruby <rubys@intertwingly.net>
To: Graham Klyne <gk@ninebynine.org>, apps-discuss@ietf.org
References: <20140926010029.26660.82167.idtracker@ietfa.amsl.com> <54B18B61.8010308@seantek.com> <54B19435.8070401@intertwingly.net> <54B1B211.3050807@seantek.com> <54B1B682.3070609@intertwingly.net> <012001d02d91$6ec42300$4001a8c0@gateway.2wire.net> <54B2781C.4040505@intertwingly.net> <018e01d02dc6$1d03b0a0$4001a8c0@gateway.2wire.net> <54B2CC75.5080900@intertwingly.net> <54B79930.3070009@ninebynine.org> <54B7AEC2.9010109@intertwingly.net> <CAKHUCzz=jZAF-i2_pwGpkER5vNhv95CMwdBCMwigPJ0FA_t4_A@mail.gmail.com> <54B7BD4A.1090803@intertwingly.net> <54B7CF28.7060408@gmx.de> <54B7D605.2060307@intertwingly.net> <f5boaq0gdw5.fsf@troutbeck.inf.ed.ac.uk> <54B806A2.8020803@intertwingly.net>, <CAKHUCzzN4Eu6R_f2Sf8EtiAp-8w3ds5Yp3-PBHK+B0wGRxEtmw@mail.gmail.com> <54b9381b.8ca1e00a.243f.ffffcae4@mx.google.com> <alpine.DEB.2.00.150116172024 0.20283@tvnag.unkk.fr> <98A81DE7-1845-46EC-A3EB-F00438863ECB@seantek.com> <54B93F2A.5070900@intertwingly.net> <54BA7EE2.1040102@ninebynine.org>
In-Reply-To: <54BA7EE2.1040102@ninebynine.org>
Archived-At: <http://mailarchive.ietf.org/arch/msg/apps-discuss/wQ7_8Mo-4jhOCGZj82NHePe6Bw8>
Subject: Re: [apps-discuss] Scope of RFC3986 and successor - what is a URI?

On 01/17/2015 10:25 AM, Graham Klyne wrote:
> On 16/01/2015 16:41, Sam Ruby wrote:
>> As to what the breakage is, that is less clear to me.  There are existing
>> parsers that don't percent encode square brackets when they occur at some
>> point after a question mark is encountered in the input. Perhaps those
>> parsers lose the ability to claim that they are "RFC 3986 compliant".
>
> Surely, it's not the role of a *parser* to %-encode, but a *generator*
> of URIs?
>
> The primary role of a URI parser is to simply decide if a given string
> is or is not a valid URI.  A parser can only be RFC3986-compliant to the
> extent that it correctly makes this determination in accordance with
> RFC3986.  Of course, parsers may do more than this, but the detail of
> such behaviour is not specified by RFC3986.
>
> (I would say that a *generator* of URIs that does not %-encode square
> brackets in fragments is not RFC3986 compliant.)

As many people have pointed out, nomenclature seems to be a big problem 
here.  I started to write a reply that spells this out, but I realized 
that I was repeating things that I've said before, and figured it made 
sense to pull it out into a separate blog post that I can point to:

http://intertwingly.net/blog/2015/01/17/RFC-3986bis

TL;DR: URL parsers consume URLs and generate URIs.  Such URIs are not 
RFC 3986 compliant.  I’d like to fix that.
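
To make the square-bracket case concrete, here is a minimal Python sketch. It assumes only the RFC 3986 grammar for the query and fragment components (pchar plus "/" and "?"), under which "[" and "]" may not appear unescaped. The function names are hypothetical and are not taken from the blog post or from any existing parser.

```python
import re
from urllib.parse import quote

# Characters RFC 3986 permits unescaped in a query or fragment:
# unreserved ("-._~" plus letters and digits), sub-delims, and ":@/?".
# Letters and digits are always left alone by quote(), so they are omitted.
_ALLOWED_PUNCTUATION = "-._~" + "!$&'()*+,;=" + ":@/?"

# A whole component must consist of allowed characters or %HH triplets.
_QUERY_OR_FRAGMENT = re.compile(
    r"(?:[A-Za-z0-9\-._~!$&'()*+,;=:@/?]|%[0-9A-Fa-f]{2})*"
)


def is_valid_query_or_fragment(component: str) -> bool:
    """Return True if the string matches the RFC 3986 query/fragment grammar."""
    return _QUERY_OR_FRAGMENT.fullmatch(component) is not None


def encode_query_or_fragment(component: str) -> str:
    """Percent-encode characters the grammar does not allow.

    "%" is kept as-is so existing %HH triplets are not double-encoded;
    a stray "%" therefore stays invalid and needs separate handling.
    """
    return quote(component, safe=_ALLOWED_PUNCTUATION + "%")


raw_query = "a[]=1"
print(is_valid_query_or_fragment(raw_query))   # brackets are not allowed
print(encode_query_or_fragment(raw_query))     # brackets become %5B and %5D
```

A parser that passes "a[]=1" through unchanged emits a query that the first function rejects; running the second function over the component is one way a generator could produce output that the RFC 3986 grammar accepts.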

> #g
> --

- Sam Ruby