Re: use of IRIs in Atom

Mark Nottingham <mnot@mnot.net> Wed, 01 August 2012 16:43 UTC

Subject: Re: use of IRIs in Atom
From: Mark Nottingham <mnot@mnot.net>
In-Reply-To: <50195AF5.4000308@stpeter.im>
Date: Wed, 01 Aug 2012 09:37:22 -0700
Cc: James M Snell <jasnell@gmail.com>, atom-syntax@imc.org, masinter@adobe.com, duerst@it.aoyama.ac.jp, chris@lookout.net
Message-Id: <F768EEB0-FC62-432C-BE3F-C2AA18AEF85E@mnot.net>
References: <50185435.5020300@stpeter.im> <CABP7RbfYv0oLJ5KDpVoTVBKz0FHh69Jm365VFts76QZ94fb8mg@mail.gmail.com> <67CECBD6-0810-4C21-832A-EA2CDFD24F9B@mnot.net> <50195AF5.4000308@stpeter.im>
To: Peter Saint-Andre <stpeter@stpeter.im>

Well, indeed; all they really need to be is a string that's unique. The important aspect is that they're compared character-by-character, and IMHO that's made most apparent when they're just strings.
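
As a rough illustration of "just strings", here is a minimal sketch in Python. The function names and the example.org identifiers are made up for illustration and are not taken from any Atom implementation; the urn:uuid form is only one way to get a unique string.

```python
import uuid


def new_entry_id() -> str:
    # Hypothetical helper: any unique string would do; a urn:uuid is
    # one easy way to get one.
    return uuid.uuid4().urn


def same_entry_id(a: str, b: str) -> bool:
    # Character-by-character, case-sensitive comparison: no case
    # folding, no percent-decoding, no Unicode normalization.
    return a == b


# Identifiers that a URL-aware comparison might treat as equivalent
# are distinct when compared as plain strings.
print(same_entry_id("http://example.org/entry/1", "http://example.org/entry/1"))
print(same_entry_id("http://example.org/entry/1", "http://Example.org/entry/1"))
print(same_entry_id("http://example.org/%7Ejoe", "http://example.org/~joe"))
```

Only the first comparison is true; the other two pairs differ in at least one character, so they count as different identifiers.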

Anyway, like you say…


On 01/08/2012, at 9:36 AM, Peter Saint-Andre <stpeter@stpeter.im> wrote:

> And why even URLs? They could have been UUIDs or somesuch. But that's
> water under the bridge, I suppose...
> 
> On 8/1/12 10:26 AM, Mark Nottingham wrote:
>> My .02 - if we were to do Atom again, I wouldn't have used IRIs for identifiers; they're not presented to users, so they don't really offer any benefit over URIs.
>> 
>> Cheers,
>> 
>> 
>> On 31/07/2012, at 3:11 PM, James M Snell <jasnell@gmail.com> wrote:
>> 
>>> IRIs in Atom are generally used for two purposes:
>>> 
>>> 1. Opaque identifiers
>>> 2. Links
>>> 
>>> When used as identifiers, the IRIs will typically be validated to generally conform to Absolute IRI structure requirements but are otherwise treated as opaque strings. Comparison is performed character-by-character in a case-sensitive manner without any further processing.
>>> 
>>> When used as links, the IRIs will typically be handed off to existing HTTP stacks -- in some cases converted to URIs first. Most Atom stacks that I have seen do not perform any particular processing of such URLs beyond the possible IRI-to-URI conversion. Atom does support the use of relative references and the establishment of base URLs resolved against either the HTTP context (e.g. request URI, Content-Location header, etc.) or the xml:base attribute. Beyond ASCII conversion and relative reference resolution, however, I would wager that the majority of Atom stacks typically treat such IRIs as generally opaque and depend on the underlying HTTP stack to "process" the IRI.
>>> 
>>> - James
>>> 
>>> On Tue, Jul 31, 2012 at 2:55 PM, Peter Saint-Andre <stpeter@stpeter.im> wrote:
>>> 
>>> The IETF's IRI WG is working to update RFC 3987 (Internationalized
>>> Resource Identifiers). We're doing some informal research to determine
>>> how IRIs are used in existing protocols [1], and Atom seems like an
>>> interesting case. In particular with respect to IRI processing, we're
>>> curious whether Atom implementations (a) strictly follow the rules from
>>> RFC 3987, (b) use the same processing algorithms as other XML
>>> applications, or (c) use the same processing algorithms as HTML
>>> applications. If you have insights into this issue, please do let us know.
>>> 
>>> Thanks!
>>> 
>>> Peter
>>> 
>>> [1] http://lists.w3.org/Archives/Public/public-iri/2012Jul/0060.html
>>> 
>>> 
>>> 
> 
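
For the link case James describes above (relative reference resolution followed by an optional IRI-to-URI conversion before handing off to the HTTP stack), a minimal Python sketch might look like the following. The function names, parameters and example.org URLs are assumptions for illustration only. The conversion simply percent-encodes the UTF-8 bytes of every non-ASCII character; it does not attempt IDNA handling of host names or Unicode normalization, which a real stack may need.

```python
from typing import Optional
from urllib.parse import urljoin


def iri_to_uri(iri: str) -> str:
    # Simplified mapping: leave ASCII untouched and percent-encode the
    # UTF-8 bytes of every non-ASCII character (no IDNA, no NFC).
    out = []
    for ch in iri:
        if ord(ch) < 128:
            out.append(ch)
        else:
            out.extend(f"%{b:02X}" for b in ch.encode("utf-8"))
    return "".join(out)


def resolve_link(href: str, request_uri: str,
                 xml_base: Optional[str] = None) -> str:
    # The base comes from the HTTP context; an xml:base value, if
    # present, is itself resolved against that context first.
    base = urljoin(request_uri, xml_base) if xml_base else request_uri
    # Resolve the (possibly relative) reference, then convert to a URI.
    return iri_to_uri(urljoin(base, href))


print(resolve_link("entries/1", "http://example.org/feeds/main.atom"))
print(resolve_link("../images/a.png", "http://example.org/feeds/main.atom",
                   xml_base="http://example.org/blog/"))
print(resolve_link("caf\u00e9", "http://example.org/feed/"))
```

Everything past this point (connection handling, redirects and so on) would be left to the underlying HTTP stack.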

--
Mark Nottingham
http://www.mnot.net/