Re: "Difficult Characters" draft

"Martin J. Duerst" <> Wed, 07 May 1997 09:18 UTC

Date: Wed, 07 May 1997 10:48:48 +0200
From: "Martin J. Duerst" <>
To: Alain LaBont/e'/ <>
cc: URI mailing list <>
Subject: Re: "Difficult Characters" draft
In-Reply-To: <>
Message-ID: <Pine.SUN.3.96.970507103030.245X-100000@enoshima>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset="US-ASCII"
Precedence: bulk

On Mon, 21 Apr 1997, Alain LaBont/e'/ wrote:

> >Actually, you can write
> >
> >> http://wWw.lAmUtUeLlE.CoM/agent/home.htm?aid=S200569
> >
> >and it will still work. But please leave the part after the first
> >single slash alone.
> All right... That is not very user friendly. Totally inconsistent... from a
> user perspective, undesirable...

Yes, it's definitely not very consistent. But one has to be aware
that URLs are patched together from an enormous variety of things;
they are a kind of microcosm reflecting a large number of Internet
and other software mechanisms. All of these have different
histories, different requirements, and so on, and this shows up
in URLs.

To take up the above example, because DNS ignores case differences,
it is impossible for URLs to decree that
should not match with
On the other hand, because case differences in forms and query parts
can be extremely relevant, it is impossible to decree, on the URL
side, that
are equivalent, even though they might be made equivalent on the
server side. So for URLs, and likewise for other kinds of identifiers,
a general case equivalence policy is not possible. The only consistent
and reliable message to users is:

	When transcribing URLs, take care of case (and all other details)!

If you respect this message, you will not run into bad surprises. The
fact that in some cases some mistakes are tolerated has to be taken
as a gift and is in no way guaranteed. And users who are not taught
this message will find out quickly.
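
To make the host/path distinction concrete, here is a minimal sketch
in Python. It is only an illustration under my own assumptions, not
something any URL specification prescribes: the function names
lowercase_host and same_after_host_folding are made up, and the two
URLs at the end are variations on the example quoted above. It folds
case only in the host part, where DNS ignores it, and leaves
everything after the first single slash exactly as written.

```python
from urllib.parse import urlsplit, urlunsplit


def lowercase_host(url: str) -> str:
    """Fold case in the host part only (hypothetical helper).

    The path, query, and fragment are returned exactly as written,
    because their case may matter to the server.
    """
    parts = urlsplit(url)  # urlsplit itself lowercases the scheme
    # Keep any "user:password@" prefix untouched; only host[:port] is folded.
    userinfo, at, hostport = parts.netloc.rpartition("@")
    return urlunsplit(parts._replace(netloc=userinfo + at + hostport.lower()))


def same_after_host_folding(a: str, b: str) -> bool:
    """True if two URLs differ at most in the case of their host part."""
    return lowercase_host(a) == lowercase_host(b)


# Illustrative URLs only, derived from the example quoted earlier.
mixed = "http://wWw.lAmUtUeLlE.CoM/agent/home.htm?aid=S200569"
lower = "http://www.lamutuelle.com/agent/home.htm?aid=S200569"
query_changed = "http://www.lamutuelle.com/agent/home.htm?aid=s200569"

print(same_after_host_folding(mixed, lower))          # host case differs only
print(same_after_host_folding(lower, query_changed))  # query case differs
```

The first comparison treats the two URLs as the same, the second does
not. A server may still choose to accept the second variant, but
nothing on the URL side can promise that.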

> >In case there is indeed equivalence, as we currently have it in domain
> >names, it will be the task of domain name internationalization to
> >decide what to do about it, whether to make the usual domain names
> >case sensitive or whether to introduce case equivalences for characters
> >outside ASCII or whatever. There is no problem with any kind of
> >URL scheme or mechanism to introduce additional equivalences where
> >they see fit, but we can't introduce them for all URLs.
> I'm puzzled that the notion of consistency is neglected... I learned
> something.

Alain - The Internet is a highly dynamic, not to say chaotic,
environment. People always try to be consistent, but they also
have their own ideas about how things should work. And the reason
that DNS does case-equivalence is maybe in part due to some idea
of user-friendliness, but mainly due to the fact that in the old
days, there were quite a few computers that had difficulties
representing both upper case and lower case.

Regards,	Martin.