Re: "Difficult Characters" draft (in URLs)

"Martin J. Duerst" <mduerst@ifi.unizh.ch> Sun, 11 May 1997 17:00 UTC

Date: Sun, 11 May 1997 18:23:38 +0200
From: "Martin J. Duerst" <mduerst@ifi.unizh.ch>
Reply-To: "Martin J. Duerst" <mduerst@ifi.unizh.ch>
To: Alain LaBont/e'/ <alb@sct.gouv.qc.ca>
cc: URI mailing list <uri@bunyip.com>
Subject: Re: "Difficult Characters" draft (in URLs)
In-Reply-To: <3.0.1.16.19970421090941.327fb71e@riq.qc.ca>
Message-ID: <Pine.SUN.3.96.970508161120.245v-100000@enoshima>

Hello Alain,

Thanks for the many facts about French uppercase accented letters
(let's abbreviate them as UCAL for the sake of this discussion).
We are definitely making progress towards sorting these things out.

On Mon, 21 Apr 1997, Alain LaBont/e'/ wrote:

> At 23:16 97-05-07 +0200, Martin J. Duerst wrote:

> >Is the vast majority of (France-)French users, at the present time,
> >able, in reasonable time and without having to consult a friend or
> >a manual and searching in menus and the like, to input an upper-case
> >accented letter?

> >If due to newer keyboards and the like, the answer is yes, then
> >of course we don't need a warning as I proposed it above.
> 
> [Alain] :
> Facts:
> 1. Keyboards exist in France to do that although they are not spread like
>    they are in Canada.

"Keyboards exist" is not very helpful. If the market penetration of such
keyboards is 10%, we had better leave out UCAL; if it is 95%, we don't
have to worry much.


> 2. You also suggested that the general practice is to copy a URL as is,
>    which is easy with current browsers and GUIs.

By this, you probably mean cut/copy/paste, which is copying inside
the computer. That's of course not a problem, but that's not our
concern. Our concern is how easy it is to transfer UCAL from paper
to the computer.


> 3. It is also easy to copy a character, say under Windows or Mac, which is
>    the real environment for a vast majority of users, even if the majority
>    of Franco-French users don't have a keyboard with all upper case accented 
>    letters.

That's point 1. But it could work for paper->computer transfer if there
is a reasonable chance that the user has a document with these characters
around. If not, he may be able to ask a French Canadian friend to send
him a mail with all these in it :-).


>    It is also possible to use macros to speed access to those
>    characters or to any character that is frequently used but that you don't 
>    have on your keyboard. Finally the <ALT><NumKeypad> is a widely-known
>    dirty but useful practice in France for which there were even protests
>    when one manufacturer removed this possibility on portables...

Yes. How well does the average user, not the DOS/Windows freak, in
France know these things? How many users just move their fingers to
the respective keys when they see the character? How many have
to ask some friend what the number for a particular letter was?
How many have to go and check the manual?


> 4. The most important: they now teach (or recommend to teach) in Franco-French
>    schools to put accents on capital letters when the hardware can reproduce
>    them (which is the case all the time with browsers and current graphic
>    technology). If you wish I could give references.

No problem; I trust you. But how many people are there who went to
school in earlier days, when this was different? How many
people are there who believe that capital letters are not supposed
to have accents, because they were told so in school?

I have to say that I am not completely unfamiliar with this problem;
we have similar things here in Switzerland. Because our typewriters
have to be usable for German, French, and Italian, they don't have
the German sharp s and the German uppercase Umlauts, and also of
course not the French UCAL. Sometimes I see secretaries writing
"Ue" or "Ae", and it's hard to convince them that they should
access the correct letter on the keyboard. Of course, for the
sharp s, no chance anymore, because we in Switzerland have no
clue about where to use it and where not (and we are happy with
it :-).


> 5. Typographers might correct what appears a spelling mistake to them in a URL 
>    in a magazine. I see this all the time. Now for typographers, upper case
>    accented letters even in France were always considered sacro-sanct, in
>    spite of what was taught in schools before because of embarrassment with 
>    mechanical typewriter technology (to avoid pupils' burn-outs (; )

So this means that, to be exact, we would have to advise against
any uppercase letter that carries an accent when written correctly,
in both forms (with or without the accent).


> The best compromise that could be done *at the limit* would be to be silent
> on this. But at least **don't** *recommend* to "avoid using upper case
> accented capitals", which will be interpreted as "avoid putting accents on
> capital letters"

If you think it will be interpreted this way, we have to be
clearer about it.

The main goal of the draft is to advise against letters that
may cause confusion and problems. In the case of UCAL, you
seem to see a conflict with another goal, namely to tell
people how to use their writing system correctly. The current
state of things is that the writing system cannot be used
correctly (ASCII only). The more we improve on this, the
better. If current keyboard deployment and the consequences
of typewriter-adapted school education over decades make
it advisable to be careful about certain letters, then
there is nothing bad about it if we make this clear.
The draft is supposed to be a guideline for creating URLs,
to avoid the bad surprises of a user not finding a resource.
Withholding a recommendation even though we know there is
a problem would be a bad idea.


> unless you recommend at once to avoid capital letters for
> everybody, not only the French-speaking people.

That's clearly an argument I can't accept. A lot of scripts don't
even have case distinctions. This is no reason that those scripts
that do have such a distinction shouldn't be able to use it. Some
scripts have more letters than others. Saying that a script A can
only use so and so many letters because B doesn't have more than
that is very strange.

The story is that *some* French in the past messed up UCAL when
they introduced typewriters. The great majority of the French
at least silently went along with it. Newer technology is on the way
to fix this. We either judge that the situation is currently
fixed well enough, or we judge that it will still take some
time. Either way, we can't let the rest of the world be affected
by the messy French situation, just as the French shouldn't
be affected by problems in other languages and scripts.

Regards,	Martin.