Re: [apps-discuss] Scope of RFC3986 and successor - what is a URI?

Larry Masinter <masinter@adobe.com> Sun, 18 January 2015 21:09 UTC

From: Larry Masinter <masinter@adobe.com>
To: Sam Ruby <rubys@intertwingly.net>
Date: Sun, 18 Jan 2015 21:09:32 +0000
Message-ID: <DM2PR0201MB09600E751795BF09A6237674C34D0@DM2PR0201MB0960.namprd02.prod.outlook.com>
References: <20140926010029.26660.82167.idtracker@ietfa.amsl.com> <54B79930.3070009@ninebynine.org> <54B7AEC2.9010109@intertwingly.net> <CAKHUCzz=jZAF-i2_pwGpkER5vNhv95CMwdBCMwigPJ0FA_t4_A@mail.gmail.com> <54B7BD4A.1090803@intertwingly.net> <f5ba91kjdt0.fsf@troutbeck.inf.ed.ac.uk> <54B7D851.7060201@intertwingly.net> <CAL0qLwbx3gSr1fJ1iw5QMk3dj2Dm4JMQzsUV_fnr9ef+M2T19g@mail.gmail.com> <54B7E32A.9090800@intertwingly.net> <CAL0qLwbJrcpKhsCAsD_CLAqQQ9rR8GhtpG2xeGO4mGQLriAcYQ@mail.gmail.com> <54B82FD7.9060208@intertwingly.net> <DM2PR0201MB0960CBA360C126703D002895C34E0@DM2PR0201MB0960.namprd02.prod.outlook.com> <54B86E01.5000607@intertwingly.net> <DM2PR0201MB096068FC251451B76FBA859CC34C0@DM2PR0201MB0960.namprd02.prod.outlook.com> <54B9B78F.3060000@intertwingly.net> <DM2PR0201MB0960802D3C63137E802FEEFFC34D0@DM2PR0201MB0960.namprd02.prod.outlook.com> <54BB2AB3.5090805@intertwingly.net> <DM2PR0201MB096099AB8117CDB65A29FA51C34D0@DM2PR0201MB0960.namprd02.prod.outlook.com> <54BBC640.5000607@intertwingly.net>
In-Reply-To: <54BBC640.5000607@intertwingly.net>
Archived-At: <http://mailarchive.ietf.org/arch/msg/apps-discuss/cU-hKbTMgH5RDP-Hb0x2h8qcHZs>
Cc: "apps-discuss@ietf.org" <apps-discuss@ietf.org>
Subject: Re: [apps-discuss] Scope of RFC3986 and successor - what is a URI?

>         The primary role of a URI parser is to simply decide if a given
>         string is or is not a valid URI.  A parser can only be
>         RFC3986-compliant in the extent to which it correctly makes 
>         this determination in accordance with RFC3986.

I disagree with both these sentences, at least as I understand the
terms "primary role", "parser", "valid", "compliant".

The primary role of a parser (http://en.wikipedia.org/wiki/Parsing#Parser)
is to take input data and produce a structural representation of it,
checking for correct syntax in the process.

Deciding validity is the role of a validator
(http://en.wikipedia.org/wiki/Validator).

RFC 3986 mainly defines conformance for URIs, and I don't
find any RFC 2119 MUST/MAY/SHOULD requirements for parsers, so
any parser that accepts valid-per-3986 URI input and parses
it successfully could be called "conforming", even if it also
accepts invalid input.
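
To make the parser/validator distinction concrete, here is a minimal
sketch in Python. The parser uses the regular expression given in
RFC 3986 Appendix B. The names parse_uri_reference and looks_valid are
hypothetical, and the validity check is deliberately partial (scheme
syntax and character repertoire only), not the full RFC 3986 grammar.

```python
import re

# Regular expression from RFC 3986, Appendix B. Every group is optional,
# so this "parser" never rejects input; it only exposes structure.
_APPENDIX_B = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)

def parse_uri_reference(s):
    """Hypothetical helper: split a string into the five generic components."""
    m = _APPENDIX_B.match(s)
    return {
        "scheme": m.group(2),
        "authority": m.group(4),
        "path": m.group(5),
        "query": m.group(7),
        "fragment": m.group(9),
    }

# Partial validity check (an assumption for illustration, not the full ABNF):
# the scheme must be well formed and every character must belong to the
# unreserved / reserved / "%" repertoire.
_SCHEME = re.compile(r"[A-Za-z][A-Za-z0-9+.\-]*")
_URI_CHARS = re.compile(r"[A-Za-z0-9\-._~:/?#\[\]@!$&'()*+,;=%]*")

def looks_valid(s):
    """Hypothetical, partial validator; a separate job from parsing."""
    scheme = parse_uri_reference(s)["scheme"]
    if scheme is None or _SCHEME.fullmatch(scheme) is None:
        return False
    return _URI_CHARS.fullmatch(s) is not None
```

With this sketch, parse_uri_reference("http://example.com/a b#x<y")
still returns a structure (path "/a b", fragment "x<y") even though
looks_valid reports the string as invalid. That is the sense in which a
lenient parser can handle every valid URI while also accepting
invalid input.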



> LDAP schemas that make use of IA5String can be said to be spec compliant 
> in that they accept all RFC 3986 compliant URIs.  But that's quite a 
> different matter than saying that they implement RFC 3986 in any 
> meaningful way.
 
Pass-through applications that embed URIs, and implementations
that only do minimal validation (but don't reject all invalid input)
should be counted as meaningful use cases and implementations
when considering whether changes are compatible.

RFC 3986 is an Internet Standard; see "Revising a Standard",
https://tools.ietf.org/html/rfc2026#section-6.3
 
Changes would be best accomplished by making a new
standard rather than updating the old one, unless it's clear
that the overwhelming majority of the installed base
wouldn't be affected.  I don't think your examples
meet that criterion.
It's possible that other updates to 3986 might. We
should discuss:

* Changes that make previously invalid strings valid URIs
  (e.g., allowing additional reserved ASCII characters like #
  and <> in the fragment?)

* Changes that make previously valid strings invalid
  (e.g., disallowing 258.258.258.258 as a host?)

* Changes that affect how the structure of URI strings
  is described, i.e., the output of parsing. These might be
  OK, but so many other specs make normative
  reference into 3986 that it's worthwhile being cautious.
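
The two examples in the first two bullets can be checked against the
RFC 3986 grammar directly. The sketch below transcribes the fragment,
IPv4address and reg-name rules into Python regular expressions; the
function names are hypothetical and not taken from any library. Note
that 258.258.258.258 fails IPv4address but still matches reg-name,
which is why it currently counts as a syntactically valid host.

```python
import re

# Building blocks transcribed from the RFC 3986 ABNF.
PCT_ENCODED = r"%[0-9A-Fa-f]{2}"
UNRESERVED = r"[A-Za-z0-9\-._~]"
SUB_DELIMS = r"[!$&'()*+,;=]"

# fragment = *( pchar / "/" / "?" )
# pchar    = unreserved / pct-encoded / sub-delims / ":" / "@"
FRAGMENT = re.compile(rf"(?:{UNRESERVED}|{PCT_ENCODED}|{SUB_DELIMS}|[:@/?])*")

# dec-octet covers 0-255 with no leading zeros.
DEC_OCTET = r"(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])"
IPV4ADDRESS = re.compile(rf"{DEC_OCTET}(?:\.{DEC_OCTET}){{3}}")

# reg-name = *( unreserved / pct-encoded / sub-delims )
REG_NAME = re.compile(rf"(?:{UNRESERVED}|{PCT_ENCODED}|{SUB_DELIMS})*")

def is_fragment(s):
    """Hypothetical helper: does s match the fragment rule?"""
    return FRAGMENT.fullmatch(s) is not None

def is_ipv4address(s):
    """Hypothetical helper: does s match the IPv4address rule?"""
    return IPV4ADDRESS.fullmatch(s) is not None

def is_reg_name(s):
    """Hypothetical helper: does s match the reg-name rule?"""
    return REG_NAME.fullmatch(s) is not None
```

Under these rules is_fragment("a#b") and is_fragment("a<b>") are both
False today, so allowing those characters would make previously invalid
strings valid. Conversely, is_ipv4address("258.258.258.258") is False
while is_reg_name("258.258.258.258") is True, so rejecting that host
would make a previously valid string invalid.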

Updating or replacing 3987 I_IRI to align with W_URLWW
is a different story because 3987 has a different status.

Larry
--
http://larry.masinter.net