Re: internationalization of URIs

Ted Hardie <> Wed, 24 October 2007 17:56 UTC

Date: Wed, 24 Oct 2007 10:55:43 -0700
To: Martin Duerst <>, Thomas Narten <>,
From: Ted Hardie <>
Subject: Re: internationalization of URIs
List-Id: general discussion of application-layer protocols <>

Hi Martin,
	Some comments below.  I've kept your distribution list,
but I'd suggest we pick one place to continue the discussion; I
propose public-iri for follow-ups.

>- There is in principle nothing that would prevent the IETF from
>  using IRIs as protocol elements in a new protocol. In my opinion,
>  this would actually be the right thing. The conversion to URIs
>  as protocol elements is there first and foremost for existing
>  protocols that are based on URIs.

I agree that there is nothing to prevent the IETF from using IRIs
as protocol elements.  For the large majority of IETF protocols,
though, this isn't the optimal choice.  For most protocol slots,
the aim isn't to present the element to a human, but to enable
protocol processing to proceed in a deterministic way.  This
means that most protocol slots don't benefit in any meaningful
way from the greatly increased set of unreserved characters.
IRIs do introduce increased complexity in protocol processing
(I believe your "Ladder of comparison" section describes some
of the most critical ways), and when human interaction does not
call for that complexity, it is simpler to avoid it.
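
To make the processing cost concrete, here is a minimal Python sketch
of the IRI-to-URI step for a path component only.  The function name
is hypothetical, and the sketch assumes UTF-8 percent-encoding of
non-ASCII characters; it does not handle the host part or any
scheme-specific rules.  It shows that two strings a human would read
as identical can yield different URIs unless they are normalized first.

```python
import unicodedata
from urllib.parse import quote

# ASCII characters left untouched; '%' is included so that sequences
# that are already percent-encoded are not encoded a second time.
SAFE_ASCII = "/:@!$&'()*+,;=-._~%?#[]"


def iri_path_to_uri_path(path: str) -> str:
    """Percent-encode the non-ASCII characters of a path as UTF-8 octets.

    Hypothetical helper for illustration; not a full IRI-to-URI mapping.
    """
    return quote(path, safe=SAFE_ASCII, encoding="utf-8")


# The same visible text in two Unicode forms.
composed = "/r\u00e9sum\u00e9"                          # precomposed e-acute
decomposed = unicodedata.normalize("NFD", composed)     # 'e' + combining acute

uri_a = iri_path_to_uri_path(composed)
uri_b = iri_path_to_uri_path(decomposed)

# Without normalization the two URIs differ octet for octet.
print(uri_a == uri_b)

# Normalizing to NFC before encoding makes the comparison agree.
uri_c = iri_path_to_uri_path(unicodedata.normalize("NFC", decomposed))
print(uri_a == uri_c)
```

A slot that only ever carries the URI form can compare octets and
never needs the normalization step.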

Any protocol that wishes to use IRIs is also strongly advised
to consider what restrictions on the UCS it wishes to impose;
as you point out in IRI-bis, section 6.1, a scheme-specific
definition or processor may need a number of rules that exclude
look-alikes among both characters and delims.  Those rules will
add complexity as well, especially if they vary from scheme to scheme.
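
As an illustration of what one such rule might look like, here is a
rough Python sketch that rejects a label mixing letters from more
than one script.  The rule itself and the function names are
assumptions made for this example, not something taken from IRI-bis,
and the script is approximated from the first word of the Unicode
character name because the standard library does not expose the
Script property.

```python
import unicodedata


def scripts_used(label: str) -> set:
    """Return the approximate scripts of the letters in a label.

    Crude approximation: the first word of the Unicode character name
    ("LATIN", "CYRILLIC", "GREEK", ...) stands in for the script.
    """
    scripts = set()
    for ch in label:
        if ch.isalpha():
            name = unicodedata.name(ch, "UNKNOWN")
            scripts.add(name.split(" ")[0])
    return scripts


def is_single_script(label: str) -> bool:
    """Hypothetical scheme-specific rule: allow at most one script."""
    return len(scripts_used(label)) <= 1


plain = "paypal"
spoofed = "p\u0430ypal"   # second letter is CYRILLIC SMALL LETTER A

print(is_single_script(plain))
print(is_single_script(spoofed))
```

Each scheme that adopts a different rule of this kind needs its own
check, which is where the scheme-to-scheme variation becomes costly.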

>>When I read Martin's comments about drop-downs, elided scheme
>>names, and similar tricks, my protocol-geek hat tightened on my head
>>and gave me a pretty severe headache.  Taking it off for a moment,
>>though, showed me things are still okay.  As presentation elements,
>>things like drop-downs, inference of scheme by an initial www, and
>>similar tricks are more reasonable.
>Detail: the scheme isn't inferenced by an initial www. A very quick
>test on one single browser showed that a leading 'ftp' label inferences
>ftp://, but there is no need for an initial www to infer http://.

I didn't actually mean browser behavior; sorry for being unclear.  I meant
that humans now see "www" and assume that they should plug it into a web browser;
"infer a scheme" was probably not the ideal way of describing that, since most
users don't think of the scheme in that process (though many browsers do
prepend the scheme in the final display).

>>We also have agreed, as a community, to take on work on some work
>>that does not rely on a presentation layer separation from the protocol.
>>We have agreed to work on email addresses, as one example,
>>and that working group decided not to use a pure presentation layer
>Yes. I expect this way of designing protocols to become more frequent.
>For new protocols, for me, it would be a non-brainer. For existing
>protocols, the decision is of course much more difficult, and
>will once go one way, and once the other way.

For new protocol slots that will be exposed to humans, I agree that
this will be more frequent.  The tricky bit is always figuring out
which slots will be exposed.

Thanks for your comments,