Re: URN UUID question

"Martin J. Dürst" <> Wed, 26 March 2014 02:36 UTC

Date: Wed, 26 Mar 2014 11:35:57 +0900
From: "Martin J. Dürst" <>
Organization: Aoyama Gakuin University
To: Sandro Hawke <>, Joel Kalvesmaki <>, "Dale R. Worley" <>
Subject: Re: URN UUID question

Hello Sandro,

On 2014/03/24 20:44, Sandro Hawke wrote:
> On 03/24/2014 12:48 PM, "Martin J. Dürst" wrote:
>> On 2014/03/20 09:15, Joel Kalvesmaki wrote:
>>> I need unique, persistent names. The name needs to be a single
>>> IRI/URI to
>>> allow any version of any document to be named easily in any RDF
>>> declarations any third party might want to make.
>>> After reading the specs on the tag URN, I'm very impressed, and think
>>> that
>>> it will suit the XML model nicely. Tag URNs provide two extra bonuses I
>>> hadn't anticipated: human readability and decentralized unique agent
>>> identification.
>>> I do wish IRI forms of tag URNs had gotten off the ground, but maybe
>>> that
>>> will come some day?
>> Looking at, that indeed seems to
>> be an unfortunate oversight.
>> Instead of
>> >>>>
>>    In the interests of tractability to humans, tags SHOULD NOT be minted
>>    with percent-encoded parts.  However, the tag syntax does allow
>>    percent-encoded characters in the "pchar" elements (defined in RFC
>>    3986 [1]).
>> >>>>
>> It should allow percent-encoded parts also in the authorityName part,
>> and specify that in all cases, such percent-encoded parts must be
>> created and interpreted using UTF-8. After all, that's what RFC 3986
>> (which is heavily cited) says for authority names.
> Interesting.   I'm trying to remember the motivations here.
> Certainly unnecessary percent encoding is a problem because it causes
> confusion about whether two URIs are the same.   (If you have to ask
> that, they are not.   But people may not realize that.   Some people
> might think ",2014:A" and
> ",2014:%41" are the same, but they are not.)

Actually, they are not the same for XML namespaces or for RDF, but for
HTTP and for search engines they are (pretty much) the same. tag: URIs
are probably used more in RDF than with HTTP, so your point is
certainly important.
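
To make the two views concrete, here is a minimal Python sketch. The authority "example.org" and the helper name normalize_unreserved are made up for illustration; they are not from this thread or from any spec. The helper applies only one normalization step (decoding escapes of unreserved characters), which is roughly what HTTP-oriented software tends to do, while RDF and XML namespaces compare the strings as they are.

```python
import re

# Characters that RFC 3986 calls "unreserved"; escaping them is never required.
UNRESERVED = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-._~"
)


def normalize_unreserved(uri: str) -> str:
    """Decode %XX escapes of unreserved characters; uppercase the hex of the rest."""
    def fix(match: re.Match) -> str:
        char = chr(int(match.group(1), 16))
        return char if char in UNRESERVED else match.group(0).upper()

    return re.sub(r"%([0-9A-Fa-f]{2})", fix, uri)


# Hypothetical tag URIs; the authority is an assumption for this example.
a = "tag:example.org,2014:A"
b = "tag:example.org,2014:%41"

# RDF / XML namespace view: plain string comparison.
same_as_strings = a == b

# HTTP-style view: compare after normalizing percent-encoding.
same_after_normalization = normalize_unreserved(a) == normalize_unreserved(b)

print(same_as_strings, same_after_normalization)
```

The first comparison says the two identifiers differ, the second says they match, which is exactly the confusion Sandro describes.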

> On the authorityName, if it's a DNSName, presumably you'd use punycode,
> not percent encoding, right?

Well, if it's an IRI, just use the original characters :-).
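
As a rough illustration of the three spellings being discussed, the Python sketch below uses a made-up domain, "bücher.example". It relies on Python's built-in "idna" codec (which implements the older IDNA 2003 rules) and on urllib.parse.quote; neither is prescribed by the tag URI spec.

```python
from urllib.parse import quote

# Hypothetical internationalized domain name, chosen only for illustration.
name = "bücher.example"

# 1. IRI form: just the original characters.
iri_form = name

# 2. Punycode (ASCII-compatible encoding), as used in the DNS.
punycode_form = name.encode("idna").decode("ascii")

# 3. Percent-encoded UTF-8, as RFC 3986 describes for registered names.
percent_form = quote(name, safe="")

print(iri_form)
print(punycode_form)
print(percent_form)

# The punycode form converts back to the original characters.
round_trip = punycode_form.encode("ascii").decode("idna")
```

All three denote the same domain, but as plain strings they are different, so a tag minted with one spelling will not compare equal to a tag minted with another.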

> If it's an emailAddress, presumably you'd
> use punycode for the DNSname part of it.  I don't know what one's
> supposed to use for the part before the @ in an email address?      I
> haven't kept up on the email standards.    Is there consensus about that?

Please see and Essentially, UTF-8 is used in the 
mail protocol and format. For URIs, see So again, just using the actual 
characters is the best thing to do.
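
A small Python sketch of why the encoding has to be pinned down to UTF-8 whenever percent-encoding is used. The local part "jürgen" is an invented example, and urllib.parse is simply a convenient standard-library tool here, not something the mail or URI specs require.

```python
from urllib.parse import quote, unquote

# Hypothetical local part of an internationalized email address.
local_part = "jürgen"

# Percent-encoding the UTF-8 bytes (the default for quote()).
utf8_escaped = quote(local_part, safe="")

# The same characters escaped from Latin-1 bytes give a different string.
latin1_escaped = quote(local_part, safe="", encoding="latin-1")

# Decoding with the matching encoding restores the original ...
decoded_ok = unquote(utf8_escaped, encoding="utf-8")

# ... while decoding UTF-8 escapes as Latin-1 silently produces mojibake.
decoded_wrong = unquote(utf8_escaped, encoding="latin-1")

print(utf8_escaped, latin1_escaped, decoded_ok, decoded_wrong)
```

If creator and consumer do not agree on UTF-8, the escaped form is ambiguous; using the actual characters in an IRI avoids the question altogether.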

>> Sandro, Tim, is there a chance this can be fixed sooner or later?
> I'm not using or endorsing tag: URIs at all these days.   From my
> perspective, http or https URLs are better in very-nearly every
> situation.    But I wouldn't be opposed to someone else updating the tag
> URI spec.


Regards,   Martin.