Re: [I18nrp] [art] Use Unicode if Using Unicode?
John C Klensin <john-ietf@jck.com> Thu, 11 October 2018 02:01 UTC
Date: Wed, 10 Oct 2018 22:00:59 -0400
From: John C Klensin <john-ietf@jck.com>
To: Shawn Steele <Shawn.Steele=40microsoft.com@dmarc.ietf.org>, art@ietf.org,
i18nrp@ietf.org
Message-ID: <FB4FE0D631E6F6D4C72B19A1@PSB>
In-Reply-To: <MW2PR2101MB0908F009734817997508274282E00@MW2PR2101MB0908.namprd21.prod.outlook.com>
References: <MW2PR2101MB0908F009734817997508274282E00@MW2PR2101MB0908.namprd21.prod.outlook.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/i18nrp/TsP3a7i2paP3toHn4KmA19Wb6nM>
Shawn,

I'm confused about what you are suggesting, so let me clarify where I'm confused and then hope you can enlighten me...

--On Wednesday, October 10, 2018 19:02 +0000 Shawn Steele <Shawn.Steele=40microsoft.com@dmarc.ietf.org> wrote:

> The draft states "It further suggests for the IETF a path
> forward regarding ensuring IDNA2008 follows the evolution of
> the Unicode Standard" and "this document requests that IANA
> update the tables to Unicode 11."

> Each Unicode version creates a data file with information from
> applying the IDNA2008 rules that can be used for IDNA mapping
> algorithms. (Indeed, that's where Windows gets the data
> from).

Taking a half-step back, IDNA imposes two requirements for new versions of Unicode, both of which are addressed by the present draft.

One is that the changes in the new version be examined to ensure that nothing new has been done that requires changes to IDNA (most likely adding exceptions or rules for particular code points) or careful explanation somewhere. No one I know of expected that provision would be exercised very often but, for a variety of reasons, the consensus was that it was an important safeguard.

The other was that the IDNA ruleset be run against the characters and properties in the Unicode Character Database to produce a table reflecting the combination of that version of Unicode and IDNA, as modified, at that time. That table was (and is) to be stored with IANA but, while we expect it to be checked carefully for accuracy, it is not actually authoritative -- only the rules and categories specified by IDNA (and the Unicode properties used to support them) are.

The discussion in RFC 5895 notwithstanding, one of the critically important properties of IDNA2008 is that U-labels and A-labels are duals: one can get from one to the other and back without any loss of information. That reversibility is true in only some cases for Unicode normalization (especially with compatibility normalization) or case folding, much less for other mapping scenarios. So I don't understand what sort of "mapping" you are talking about.

The only Unicode-created IDN data files I'm aware of are those associated with the UTS#46 effort. Because UTS#46 makes recommendations that are inconsistent with IDNA2008, if Microsoft is using those tables, its usage is non-conforming to IDNA2008. I certainly cannot prevent Microsoft from doing that (and wouldn't try), but it would certainly not be consistent with general interoperability of IDNs or what is known elsewhere as Universal Acceptance of those domain names.

> If the goal is to "follow the evolution of the Unicode
> Standard" and the Unicode Standard is providing data that
> conforms to the IDNA rules, then why not just point directly
> to the Unicode derived tables?

The simplest answer to your question is that, unless I've missed something, the conditional is false: the Unicode Standard is not providing data that conforms to the IDNA2008 rules. Instead, the data they are providing is more like one of the creative IDNA2003-IDNA2008 hybrids to which Patrik refers.

   best,
    john
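
Editorial illustration (not part of the original message): the contrast drawn above between the U-label/A-label duality and lossy mappings can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not a conforming IDNA2008 implementation: the helper names `to_a_label` and `to_u_label` are hypothetical, the sample labels are chosen only for illustration, and no IDNA2008 validity checks (code point categories, contextual rules, NFC form) are performed. It relies only on the Punycode codec and `unicodedata` module from the Python standard library.

```python
import unicodedata

ACE_PREFIX = "xn--"  # prefix that marks an A-label


def to_a_label(u_label: str) -> str:
    """Hypothetical helper: Punycode-encode a label and add the ACE prefix.

    Assumes the input is already a valid U-label; nothing is validated here.
    """
    return ACE_PREFIX + u_label.encode("punycode").decode("ascii")


def to_u_label(a_label: str) -> str:
    """Hypothetical helper: strip the ACE prefix and Punycode-decode."""
    if not a_label.startswith(ACE_PREFIX):
        raise ValueError("not an A-label: missing ACE prefix")
    return a_label[len(ACE_PREFIX):].encode("ascii").decode("punycode")


# Duality: encoding and decoding round-trip without loss of information.
u_label = "b\u00fccher"  # "bücher"
a_label = to_a_label(u_label)
round_trip_ok = to_u_label(a_label) == u_label

# Lossy mappings: two distinct inputs collapse to one output, so the
# original string cannot be recovered from the mapped result.
ligature = "\ufb01le"  # begins with U+FB01 LATIN SMALL LIGATURE FI
nfkc_mapped = unicodedata.normalize("NFKC", ligature)
folded = "Stra\u00dfe".casefold()  # "Straße"

print(a_label, round_trip_ok)
print(nfkc_mapped == "file", folded == "strasse")
```

The round trip works because Punycode is a bijective encoding of the label's code points. Compatibility normalization and case folding, by contrast, are many-to-one, which is why a mapping step placed in front of the lookup cannot be undone afterwards.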