Re: [I18ndir] draft-faltstrom-unicode and "outdated versions of Unicode" ... and the "review model" document

Asmus Freytag <asmusf@ix.netcom.com> Sat, 16 March 2019 22:03 UTC

To: i18ndir@ietf.org
References: <0DA0CBAEF3730D672CD7ADC7@PSB>
From: Asmus Freytag <asmusf@ix.netcom.com>
Message-ID: <ef0dc10e-1a19-a1b1-2c46-7ba67b960a8a@ix.netcom.com>
Date: Sat, 16 Mar 2019 15:03:14 -0700
In-Reply-To: <0DA0CBAEF3730D672CD7ADC7@PSB>
Archived-At: <https://mailarchive.ietf.org/arch/msg/i18ndir/oxWWgGhral3ZQ_01HniP4Vnm6jU>

I find the IANA tables useful myself. While I'm fully capable of
deriving them, not having to do that is a benefit. When they are
unpredictably "out-of-date", that adds uncertainty.

Two things should be avoided in the future:

 1. any consistent lag between the IANA tables and Unicode, say
    something like a 1-year delay (amounting to being at least one
    version off).
 2. any arbitrary "halt" to generating the tables.


As you write, the tables aren't normative because the application of the 
current state of the algorithm to the latest version of the Unicode 
Standard defines the normative data.

The way Unicode works is that it releases the Unicode Character Database
ahead of time for beta review. It is during this time (which will now
occur fairly predictably, on a yearly pattern for major versions)
that the review of new characters and of the effect of potential
property revisions should be performed.

At the same time a set of "beta" IANA tables can be generated (draft 
update). These can be sent for review in IETF.

If no issues are found and Unicode's beta process does not result in a 
(rare) last minute correction, then the draft tables can be released 
concurrent with Unicode's version release (or as nearly concurrent as 
practical).

If issues are found, the process of adding an exception can be started
before the final release of the Unicode version. If consensus in the
IETF converges in a timely manner (either before Unicode's final
release or within a short time window), only the IANA tables
including the exception should be posted.

However, if consensus on an exception does not converge, the
unmodified tables should be posted. While this would make visible the
"retroactive" nature of applying an exception, it only documents the
normative contents of IDNA2008: as soon as the Unicode version is
public, the properties can be calculated, and until the RFC with the new
exception is published, the original, unmodified calculation is what is
defined normatively by IDNA2008. Therefore, as far as the normative
content of IDNA2008 is concerned, any exception declared after a code
point was encoded is "retroactive". (The same applies even more so to
exceptions based on property changes.) Normatively, then, it doesn't
help to hold updates of these tables.
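
As a rough sketch of the posting logic described above: the names
(Action, release_action and its parameters) are hypothetical and not part
of any IETF or IANA procedure. The message does not say what should
happen after a last-minute Unicode correction; regenerating the draft
tables in that case is an assumption.

```python
from enum import Enum


class Action(Enum):
    POST_DRAFT_TABLES = "post the reviewed draft tables as-is"
    POST_WITH_EXCEPTION = "post only the tables that include the exception"
    POST_UNMODIFIED = "post the unmodified tables"
    REGENERATE_DRAFT = "regenerate the draft tables and review again"


def release_action(issues_found: bool,
                   consensus_converged: bool = False,
                   late_unicode_correction: bool = False) -> Action:
    """Pick which IANA tables to post once a Unicode version is released."""
    if late_unicode_correction:
        # Assumption: a (rare) last-minute change to the beta data
        # invalidates the draft tables, so they are rebuilt first.
        return Action.REGENERATE_DRAFT
    if not issues_found:
        # Beta review was clean: release concurrently with Unicode.
        return Action.POST_DRAFT_TABLES
    if consensus_converged:
        # The exception was agreed in time: never post the plain tables.
        return Action.POST_WITH_EXCEPTION
    # No consensus yet: the unmodified calculation is what IDNA2008
    # defines normatively, so holding the tables back gains nothing.
    return Action.POST_UNMODIFIED


if __name__ == "__main__":
    print(release_action(issues_found=True, consensus_converged=False).value)
```

Note that no branch returns "withhold the tables", which is the point of
the argument above.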

A./

PS: freezing any particular zone to an earlier version of Unicode is
outside the scope of IDNA2008 or the IANA tables. However, it is a major
pain. For the Root Zone it turns out that the only code point that might
have been added is, of all cases, U+08A1, which is completely harmless in
the root because none of the Arabic combining marks are allowed.
However, in the tools we use to create the MSR and the RZ-LGR
tables and documentation, we definitely have "leakage" from later
versions of Unicode. For example, the tables we generate show fonts and
annotation data from later versions, and, while marked as "excluded",
some tables show the presence of "future" characters.
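
A minimal sketch of what "freezing" a repertoire to an earlier version
amounts to. The function name and the tiny AGE table are hypothetical
and for illustration only; real Age values should be read from the
UCD's DerivedAge.txt, and this is not how the MSR/RZ-LGR tooling is
actually implemented.

```python
from typing import Dict, Iterable, List, Tuple

Version = Tuple[int, int]


def freeze_repertoire(code_points: Iterable[int],
                      age: Dict[int, Version],
                      max_version: Version) -> Tuple[List[int], List[int]]:
    """Split code points into those encoded by max_version and later ones."""
    allowed: List[int] = []
    excluded: List[int] = []
    for cp in code_points:
        # Version tuples compare element by element, so (6, 3) < (7, 0).
        if age[cp] <= max_version:
            allowed.append(cp)
        else:
            excluded.append(cp)  # a "future" character for this zone
    return allowed, excluded


# Illustrative Age values only; take real ones from DerivedAge.txt.
AGE = {0x0628: (1, 1), 0x08A1: (7, 0)}

allowed, excluded = freeze_repertoire([0x0628, 0x08A1], AGE, (6, 3))
print([f"U+{cp:04X}" for cp in allowed], [f"U+{cp:04X}" for cp in excluded])
```

The "leakage" problem is that fonts and annotation data are not
filtered this way, only the repertoire itself.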





On 3/16/2019 1:58 PM, John C Klensin wrote:
> Hi.  In the confusion of the last week, I apparently wrote one
> note that I never sent.  As we think about what needs to be said
> or clarified going forward, some of that note may be useful even
> though I think all of us are now in synch about it and hence,
> for the purposes of the directorate of
> draft-faltstrom-unicodeXX, I'm probably kicking the proverbial
> dead horse.
>
> --On Sunday, March 10, 2019 13:35 -0700 Asmus Freytag
> <asmusf@ix.netcom.com> wrote:
>
>> ...
>> "The Unicode Standard is being updated at least yearly and
>> implementors and vendors of OS, libraries and other tools are
>> aggressively updating their systems/libraries to the new
>> versions of
>> Unicode. Therefore, IETF recognizes the burden on implementers
>> to support IDNA2008 corresponding to an outdated version of
>> Unicode.
> It is probably worth having tables as part of an IANA registry.
> It is probably worth keeping them up to date (see below about
> both of those statements). But those tables are not normative
> and do not define a particular version of Unicode for use with
> IDNA2008.  The requirement on implementers is that they choose a
> version of Unicode and then work either directly with the
> [current] rules of IDNA2008 or derive or find a table that
> corresponds to that version of Unicode.  There is absolutely
> nothing about IDNA2008 that would require an implementer to work
> with an outdated version of Unicode or to "support IDNA2008
> corresponding with it" (whatever that means).
>
> I say "probably" above because, if the existence of the IANA
> tables going forward is to create confusion about this --
> especially if we don't get IANA to keep versions of each table
> corresponding to every version of Unicode from 5.2 forward -- I
> think it may be critical to either add more notes to the top of
> that IANA table or to get rid of the IANA tables entirely.
> That is something we should think about when we are working on
> the "review model" document because that would be a logical
> place to adjust the instructions to IANA going forward.
>
> Now, should some body with a claim to regulatory or contractual
> authority over some collection of zones say "you are allowed to
> only use Unicode version X" or "you must use the version of
> Unicode corresponding to the IANA tables at the time of
> registration of a label", IDNA2008 does not prohibit such a
> restriction and the decision to impose it is Not An IETF
> Problem.  I would, personally, consider such a restriction
> rather dumb unless it turned out to be a requirement of some
> more elaborate system for advising registries, but, even if
> people agreed with me, that evaluation doesn't belong in
> draft-faltstrom or any of the core IDNA2008 documents.
>
> I believe and hope that anything in the above that needs to be
> said in draft-faltstrom-unicode11 (or 12) has already been said.
> However, for work going forward, are we all in agreement about
> the above?
>
> best,
>     john