Re: [I18nrp] The evolutionary future of IDNA (was: Re: Last Call: <draft-faltstrom-unicode11-05.txt> (IDNA2008 and Unicode 11.0.0) to Informational RFC)

Asmus Freytag <asmusf@ix.netcom.com> Thu, 06 December 2018 18:26 UTC

To: John C Klensin <john-ietf@jck.com>, Paul Hoffman <paul.hoffman@vpnc.org>, i18nrp@ietf.org
Archived-At: <https://mailarchive.ietf.org/arch/msg/i18nrp/BQxVPvB_CliMO40IEnL1Ajvbnpg>

On 12/6/2018 8:01 AM, John C Klensin wrote:
> One way or another, I don't think it is appropriate to avoid
> asking whether Asmus's analysis makes a part of IDNA2008
> technically defective and therefore obligates us to fix it.  If
> the answer is that it is and we are, that may be the strongest
> argument for not advancing draft-faltstrom-unicode11 at least
> until we have the underlying issues in hand because the update
> (or confirmation that no update is needed) it represents is
> required to address contextual rules as well as code points.

I come to the opposite conclusion.

Getting Patrik's draft out is urgent. Having the status of code points
added between Unicode 6.3 and 11.0 unresolved is untenable (and soon that
will be 12.0 and 13.0 and whatever additions now come every year).

Any issues with complex scripts are not made worse by clarifying the
status of these new code points under existing RFC 5892.

Not updating the registry simply means implementers are using UTS#46
(option C3) to get the data.

Perennially yanking the emergency brake is really not the way to go
about this; it just leads to terminal paralysis and an impression that
IETF is abandoning the i18n issues, leaving them to other players.

A./

Moving forward, we can then focus on the remaining issues:

(1) in which form to present both guidance and expectations wrt
       registry policies that conform to the spirit ("support only scripts
       and languages that you fully understand").

       John has covered that in his previous post - there are different
       tacks we can take on this and we probably should discuss these
       options based on the drafts that he and I are coauthors on.

(2) whether it is advisable to augment the CONTEXTO registry and
        if so, how.

        Unicode has become more explicit in recent years in documenting
        complex scripts and in that process documenting that there are
        sequences that are deprecated. (Unlike deprecated characters, they
        don't have a formal property, but they are nevertheless listed as
        "do not use", which is equivalent to the recommendation Unicode
        makes for deprecated code points.)

          These sequences could be listed using existing CONTEXTO pseudo
          code, but that, to me, seems a terrible choice (on a practical
          level).

          It might be better to add a simple list of deprecated sequences.

  (3) whether it might not be useful to publish the CONTEXT information
         in machine-readable form (using the schema from RFC 7940).
         As of today, all existing context conditions can be modeled in the
         format described by RFC 7940.

  (4) whether there is a need/benefit to add CONTEXT restrictions that
         go beyond deprecated sequences and that model some generic
         behavior of code points for complex scripts.

         Tentatively, my gut-level reaction would be that this is going too
         far. While we should clearly express that all registry policies
         should contain such constraints suitable for the given script(s),
         this is one of the cases where one size does not fit all:
         depending on what languages are supported, tighter or looser
         constraints are required.

          The existing rules on CONTEXTJ are about as far as generic rules
           can be written - in the Root Zone we simply disallowed Joiners
           so we didn't have to evaluate whether CONTEXTJ would have
           given the correct answer.

          In either case, it is clear that the implicit model in IDNA2008,
          which largely allows arbitrary sequences, is not appropriate for
          all scripts.

  (5)    what form to give any guidance to registries. This is the question
           the "troublesome characters" draft attempts to answer. However,
           parallel to this, there is the ICANN effort to create "reference
           LGRs" that, instead of approaching the issue generically, intend
            to give specific "best practices" recommendations on the basis
           of individual scripts and languages.

            Would any activity by IETF merely be duplicative?
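
To make the "simple list of deprecated sequences" idea from point (2)
concrete, here is a minimal sketch of how such a list might be checked
against a label, as opposed to writing one CONTEXTO pseudo-code rule per
sequence. Everything here is an assumption for illustration: the names
DEPRECATED_SEQUENCES and find_deprecated_sequences are hypothetical, no
such registry exists today, and the two Devanagari entries are quoted from
memory of the "do not use" table in the Unicode Standard and should be
verified against the current text before being relied on.

```python
# Hypothetical flat list of deprecated ("do not use") code point sequences.
# Illustrative entries only; verify against the Unicode Standard.
DEPRECATED_SEQUENCES = [
    (0x0905, 0x093E),  # DEVANAGARI LETTER A + VOWEL SIGN AA (use U+0906)
    (0x0905, 0x0946),  # DEVANAGARI LETTER A + VOWEL SIGN SHORT E (use U+0904)
]


def find_deprecated_sequences(label, sequences=DEPRECATED_SEQUENCES):
    """Return (offset, sequence) pairs for every listed sequence in label.

    The label is treated as a plain sequence of code points; no
    normalization or case mapping is applied in this sketch.
    """
    cps = tuple(ord(ch) for ch in label)
    hits = []
    for seq in sequences:
        n = len(seq)
        for i in range(len(cps) - n + 1):
            if cps[i:i + n] == seq:
                hits.append((i, seq))
    return hits


def label_is_acceptable(label, sequences=DEPRECATED_SEQUENCES):
    """True if the label contains none of the listed sequences."""
    return not find_deprecated_sequences(label, sequences)
```

A flat list like this is a pure substring check with no context
predicates, which is why it seems easier to maintain than the equivalent
pseudo-code; whether that is the right form for a registry is exactly the
open question.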