Re: [I18n-discuss] Comments on "troublesome-characters" from Arabic script

Andrew Sullivan <ajs@anvilwalrusden.com> Thu, 27 July 2017 21:13 UTC

Date: Thu, 27 Jul 2017 17:13:20 -0400
From: Andrew Sullivan <ajs@anvilwalrusden.com>
To: Abdulaziz Al-Zoman <azoman@citc.gov.sa>, Raed Al-Fayez <rfayez@citc.gov.sa>
Cc: "'i18n-discuss@iab.org'" <i18n-discuss@iab.org>
Message-ID: <20170727211320.dkano7pdmjxoj62h@mx4.yitter.info>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Disposition: inline
In-Reply-To: <043D7B5CFC1AB8469108EA7BB5F68BB00124076934@ry0mail1.citc.gov.sa> <EDEC5B615F83D44981FA2D0DCA9971670131709366@ry0mail1.citc.gov.sa>
Archived-At: <https://mailarchive.ietf.org/arch/msg/i18n-discuss/0t6tpegmmaZACWHHk7bCyYpchrQ>
Subject: Re: [I18n-discuss] Comments on "troublesome-characters" from Arabic script
Precedence: list

Greetings, and thank you both for the comments on the draft.

I am keen to understand something in what you are both saying, because
it suggests to me that our intent in this draft is not coming across clearly.

On Mon, Jul 24, 2017 at 04:41:34AM +0000, Abdulaziz Al-Zoman wrote:

> For example, the registry includes some essential characters
> (letters) that may result at the end useless identifiers if these
> characters are restricted or blocked (because they are part of the
> repository).   For instance, with respect to the Arabic language,
> the registry consists of a large portion of the Arabic basic
> alphabet that may result to a limited character set for creating
> identifiers

On Tue, Jul 25, 2017 at 08:20:28AM +0000, Raed Al-Fayez wrote:

> I regret to inform you that I completely opposing of including
> essential code points of the Arabic language (which are also used by
> many other languages in the Arabic script) to the repository tables
> of this standards track internet-draft. As a reader of this
> (later-to-be) RFC may consider them as "Troublesome Characters"
> while they are not!
> 
> The majority of the problematic cases listed in the internet-draft's
> table (part of the Arabic code points) were due to the misuses of
> non-spacing marks. Also, most of the cases are not used by any
> language in the Arabic script and some of the others do not make any
> sense. Therefore, it is not wise and not practical to risk essential
> code points because of not solid cases.
> 

Do you both agree that the characters to which you are referring can
properly be used for identifiers only by people who actually understand
how they relate to each other, and not without substantial care and
understanding on the part of whoever is permitting the use or
registering the identifier?  If not, do you think instead that the
characters can be used without any policies at all?

The aim of the draft is to create the conditions for guidance for
operators and applications, so that it is possible for (for instance)
my user agent to work globally, even when some parts of the global
linguistic environment are foreign to me and when there are no
in-protocol clues about the language I'm facing.  (Examples of this
sort of identifier are things like DNS names, mail names, XMPP
chatroom names, and so on.  Web pages and the like have the ability to
negotiate language, which makes the problem somewhat less acute.)  One
thing we can do to make those conditions a little bit safer, I think,
is to have evidence that whoever is in charge of the registration
permissions actually has some policy or set of rules; and so my user
agent could check to see that, even if I can't read the identifier
reliably, the person responsible for its creation has followed some
set of rules that prevents serious confusion from arising.  In other
words, the registry is not intended to "block" all of the included
code points, but rather to indicate that these are code points for
which it is even more important than usual that extra care has been
taken.

I am a little worried that the above does not seem to be the
conclusion you drew from the draft, which suggests to me that we have
not made ourselves clear enough.  Is that a fair assessment, or are
you opposed to the inclusion of these code points instead because you
do not think they warrant special attention?

Thanks and best regards,

A

-- 
Andrew Sullivan
ajs@anvilwalrusden.com
