Re: [I18nrp] Last Call: <draft-faltstrom-unicode11-05.txt> (IDNA2008 and Unicode 11.0.0) to Informational RFC

"Asmus Freytag (c)" <asmusf@ix.netcom.com> Fri, 07 December 2018 20:36 UTC

To: John C Klensin <john-ietf@jck.com>, Larry Masinter <LMM@acm.org>, 'Patrik Fältström' <paf=40frobbit.se@dmarc.ietf.org>
Cc: i18nrp@ietf.org, 'Paul Hoffman' <paul.hoffman@vpnc.org>
References: <154385119878.18333.5085298134102919486.idtracker@ietfa.amsl.com> <FF6F9EB9-C73B-4EC0-AC4F-3E3BFBABA0AB@vpnc.org> <8E20D432-01B0-4B52-80BB-3348C5FE73AF@vpnc.org> <CC73FC25-92FC-4822-B267-15C41CE450F2@frobbit.se> <D81CDFF3-8CDF-4168-9CEA-E8DC3A133B73@vpnc.org> <217ede0e-ea1f-bb31-a276-f8c618c71278@ix.netcom.com> <8885EE4C-412E-4337-A099-66354A36CEA1@vpnc.org> <EC12FDAE-4ABD-4AD3-A35A-B39D2C8A0AE0@frobbit.se> <f4417f80-fa86-11e6-baf7-2365981e18b1@ix.netcom.com> <48A2A546-4FEA-4060-8706-34D210B2ABAF@frobbit.se> <055301d48dc8$0ea95120$2bfbf360$@acm.org> <07CB0B3B-E48A-40CD-BBC9-E6CAA2FB29F0@frobbit.se> <001d01d48dee$82b415c0$881c4140$@acm.org> <1f879380-f586-cddf-ae4b-62cfc106308a@ix.netcom.com> <00f301d48e63$071e9be0$155bd3a0$@acm.org> <0D2335F6D932D325C3FBA91E@PSB>
From: "Asmus Freytag (c)" <asmusf@ix.netcom.com>
Message-ID: <6a8c84c4-a7af-9398-e706-199a6ec61d81@ix.netcom.com>
Date: Fri, 07 Dec 2018 12:36:02 -0800
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:60.0) Gecko/20100101 Thunderbird/60.3.2
MIME-Version: 1.0
In-Reply-To: <0D2335F6D932D325C3FBA91E@PSB>
Content-Type: multipart/alternative; boundary="------------1B21039557AC80C716DA191D"
Content-Language: en-US
Archived-At: <https://mailarchive.ietf.org/arch/msg/i18nrp/YkplnC1Iy5NJA8RJ3rcvoyEH6zo>
Subject: Re: [I18nrp] Last Call: <draft-faltstrom-unicode11-05.txt> (IDNA2008 and Unicode 11.0.0) to Informational RFC

On 12/7/2018 12:02 PM, John C Klensin wrote:
>
> --On Friday, December 7, 2018 11:28 -0800 Larry Masinter
> <LMM@acm.org> wrote:
>
>> ...
>> The reason for emphasizing transcription is not that there
>> aren't other operations that are possibly more frequent
>> (copy/paste a URL from one context to another, remember on a
>> bookmark list) but rather that transcription is the most
>> stringent requirement – if a user can transcribe a name
>> resulting in the same sequence of Unicode codepoints, then
>> they can display the name and distinguish it from other
>> (transcribable) names.
> Larry, I hope Asmus will respond further to this because he is
> far more expert on relevant issues across a very wide range of
> scripts and writing systems than I am, but I don't believe what
> you write above is true.

Globalization is often misunderstood - it's neither the magic bullet that
makes everything equally accessible to everyone, nor exclusively
about presenting everything to every user in their native context.

Instead, it's an interesting mixture of both.

For IDNs the goal is to allow full native support of mnemonic identifiers
in the context of the language/script native to the user - for all such
combinations of languages and scripts, so that, in principle, all users
can be supported.

But it's also about enabling access to resources outside your
native bubble - Unicode is a big step in that direction, because before it
you were often locked into a native character set baked into your system.

A label in a foreign script will not be "mnemonic" for you,
and there's nothing we can do about that in this context. However,
we should expect that labels in your script (and, where relevant, your
language) are indeed able to be mnemonic.

Otherwise, why not use digits like telephone numbers?

I think that is ultimately what Larry has in mind and what I cover by
"typable". You should be able to read a label, understand it, and
reproduce it by typing, or be able to tell someone about it.

For the Latin script (because of the features that you are attempting
to outline below) even arbitrary sequences of code points do not
generally compromise the ability of a label to be mnemonic and to
be handled by means other than clicking/pasting.
(The above is strictly true only for ASCII; once you add the full range,
including combining marks, a bit more care must be taken, as will
be evident when the draft for the Latin Root Zone LGR becomes
available. Until then, I'll spare myself the digression into details.)
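
One simple instance of the care that combining marks require (not
necessarily the one the Latin LGR draft addresses) is that the same
visible text can be spelled with different code point sequences. The
sketch below is only an illustration: the sample strings and the helper
names code_points and is_nfc are assumptions made up for this example.

```python
import unicodedata

# Two spellings that normally render identically.
precomposed = "caf\u00E9"   # 'e with acute' as the single code point U+00E9
decomposed = "cafe\u0301"   # 'e' followed by U+0301 COMBINING ACUTE ACCENT


def code_points(text):
    """Return the code points of text in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]


def is_nfc(text):
    """True if text is already in Unicode Normalization Form C."""
    return unicodedata.normalize("NFC", text) == text


if __name__ == "__main__":
    print(code_points(precomposed))
    print(code_points(decomposed))
    print("equal as typed:", precomposed == decomposed)
    print("equal after NFC:",
          unicodedata.normalize("NFC", decomposed) == precomposed)
```

The two strings compare unequal as raw code point sequences and equal
once both are brought to NFC, which is why a label that "looks right"
is not enough by itself.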

For other scripts, there's a wide variety of issues.

One example from Arabic: Quite a few letters in Arabic share positional
forms; that is, they are distinct only in certain positions, for example
at the end of a word.

Also, keyboards for some languages using the Arabic script have
only one of these letters, and not the "standard" Arabic-language
one. As a result, while users can "read" an Arabic-language name,
a geographical name, for example, they could not type that name.
If they tried, the result might even look right, but be a different label.

Therefore, a registry that attempts to support both languages in
the Arabic script, but does not support variants, will have a fraction
of its labels that are de facto unusable for some users.
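
A minimal sketch of the variant idea, under stated assumptions: the two
code point pairs below (YEH / FARSI YEH and KAF / KEHEH) are chosen
purely for illustration and are not copied from any published LGR, and
the names VARIANT_FOLD, variant_key and same_variant_set are
hypothetical. Real label generation rules are considerably richer than
a one-way folding table.

```python
import unicodedata

# Illustrative variant pairs (an assumption for this sketch): each code
# point on the left is folded to the representative on the right.
VARIANT_FOLD = {
    "\u06CC": "\u064A",  # ARABIC LETTER FARSI YEH -> ARABIC LETTER YEH
    "\u06A9": "\u0643",  # ARABIC LETTER KEHEH     -> ARABIC LETTER KAF
}


def variant_key(label):
    """Fold every code point to its representative after NFC."""
    nfc = unicodedata.normalize("NFC", label)
    return "".join(VARIANT_FOLD.get(ch, ch) for ch in nfc)


def same_variant_set(a, b):
    """True if two labels differ only by the illustrative variants."""
    return variant_key(a) == variant_key(b)


# The same three-letter string typed on two different keyboards; the
# middle letter is a yeh in medial position, where both look alike.
typed_arabic = "\u0628\u064A\u062A"   # beh, yeh, teh
typed_persian = "\u0628\u06CC\u062A"  # beh, farsi yeh, teh

if __name__ == "__main__":
    for ch in (typed_arabic[1], typed_persian[1]):
        print(f"U+{ord(ch):04X}", unicodedata.name(ch))
    print("identical code points:", typed_arabic == typed_persian)
    print("same variant set:", same_variant_set(typed_arabic, typed_persian))
```

A registry that treats the two labels as members of one variant set can
block or co-allocate them, so that a user whose keyboard has only one
of the letters does not end up at a different label.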

Understood in that sense, "typable for a native user" is a valid goal
that informs necessary features of label generation rules (registry
policies) that enable the use of labels as mnemonic, in the same
way that we take for granted for non-IDN labels.

A./

>   Some scripts --of which the Roman
> Script (Basic Latin of some millennia ago) is a good example--
> simply have characters that are more easily distinguishable from
> each other by people who cannot actually read the script or
> associated language(s) than others.  Put a hypothetical person
> who has never seen either before in front of a short string of
> characters in that script and in front of a script written with
> connected characters, complex use of ligatures, and subtle
> distinctions among characters (look at Arabic digits for two and
> three for an example of what I mean by "subtle" ... or look at
> "O" and, in many contemporary type designs, "Q").  Then supply
> that person with character pickers for the two scripts.  You
> would almost certainly get accurate transcription of the Roman
> characters and probably would not get it with the other script.
> In neither case would the ability to transcribe imply the ability
> to render and display (I'd encourage observation of five and six
> year olds learning to write), nor would it imply the ability to
> accurately copy and paste.
>
> So, ability to transcribe may be a useful goal, but it isn't,
> IMO, close to Patrik's global requirements list.
>
> best,
>     john
>
>
>
>