Re: [I18ndir] Review of Unicode-07: Finishing

Martin J. Dürst, Fri, 22 March 2019 04:49 UTC

Hello Harald,

Many thanks for all your work. Just two very small issues below:

On 2019/03/22 01:10, Harald Alvestrand wrote:
> With the petering out of discussion, I think it's time to file our
> review of unicode-07 and call it "done". We were asked for a review, we
> have written a review. The review should be part of the public record,
> but I don't think we need to call attention to it by posting it all over
> the place, given that we have new versions of the document to work from.
> Current proposed text:
> **
> *Directorate review of draft-faltstrom-unicode11-07*
> *
> Overall conclusion: Not ready yet, needs some updates. New I-D recommended.
> [Note: As part of the discussion that resulted in this text, a new I-D
> has been issued.]
>      Context issues
> The discussion of draft-faltstrom-unicode11 in the directorate has shown
> that the directorate members share a number of concerns about the
> current state of IDNA, only some of which are directly relevant to this
> memo.
> IDNA2008 considered limits to what was reasonable to register and use in
> the DNS at a number of levels:
>    *
>      A level of “don’t register stuff that causes confusion”. This
>      requires human judgment, and reasonable people may disagree about
>      what causes confusion.
>    *
>      A level of “don’t register stuff that is structurally invalid under
>      the relevant writing system”. Aspects of this can be captured in
>      rulesets (ICANN’s RZ-LGR efforts fall not this category), but
>      requires deep expertise; this is captured in IDNA2008 as the “don’t
>      register what you don’t understand” rule.

My understanding was that ICANN’s RZ-LGR efforts would (at least 
partially or mostly) fall in this category.

>    *
>      A level of “this is stuff that you should never register, and
>      applications can reasonably choose to treat it as an error or an
>      attack if it ever shows up”. This is the distinction that is
>      captured in the classification of codepoints as DISALLOWED, and
>      where IDNA2008 (with updates) gives precise rules.
> The current document focuses on the last level only - the maintenance of
> the distinction between PVALID and DISALLOWED. (It also considers
> whether new CONTEXTO and CONTEXTJ rules are needed).
> It is clear from directorate discussion that work needs to be done at
> the other levels outlined above too, but it is not clear from the
> discussion what form that work should take or what fora that work is
> reasonably performed in; the work may or may not involve a revision of
> the basic IDNA2008 specifications.
> We suggest to insert a paragraph in the document describing the context
> of the state of IDNA2008, and explain what issues this document does not
> attempt to address. Specifically that the conclusion of the document is
> what to do regarding Unicode versions up to and including 11, and that
> this is not to be used as expectations of future versions of Unicode.
> In addition, it’s become clear that IDNA2008 does not specify the
> mechanisms and expectations of the review of new versions of Unicode in
> enough detail; with the review of a number of versions of Unicode behind
> us, we should be able to describe those procedures and expectations
> better than IDNA2008 does. However, this may need to happen in another
> document than this one.
>      Content issues
> Section 4.1 does not specify where to find the conclusion of the IETF
> discussion on U+08A1.
> It is not easy to see from the text whether the algorithms and
> procedures will render U+0628 U+0654 an illegal sequence or a legal
> sequence. No matter what the resolution is, the document should make it
> obvious what the conclusion is (and why).
> 27D0..2B4C  ; DISALLOWED

From the two lines above, it's totally unclear what the problem is, and
what fix we propose.
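
As background to the U+0628 U+0654 question in the quoted review, here is a minimal sketch using only Python's standard `unicodedata` module. It checks whether the sequence and the precomposed U+08A1 are equivalent under NFC. The helper name `nfc_equivalent` and the constant names are made up for this illustration. The sketch covers Unicode normalization only; it does not say whether either form is valid under IDNA2008, which is the open point the review asks the document to state.

```python
import unicodedata

BEH = "\u0628"             # ARABIC LETTER BEH
ALEF = "\u0627"            # ARABIC LETTER ALEF
HAMZA_ABOVE = "\u0654"     # ARABIC HAMZA ABOVE (combining mark)
BEH_WITH_HAMZA = "\u08A1"  # ARABIC LETTER BEH WITH HAMZA ABOVE
ALEF_WITH_HAMZA = "\u0623" # ARABIC LETTER ALEF WITH HAMZA ABOVE


def nfc_equivalent(a: str, b: str) -> bool:
    """Return True if the two strings are identical after NFC normalization."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# U+08A1 carries no decomposition mapping, so NFC never maps
# U+0628 U+0654 to it (or the other way round).
print(unicodedata.decomposition(BEH_WITH_HAMZA) == "")
print(nfc_equivalent(BEH + HAMZA_ABOVE, BEH_WITH_HAMZA))

# For contrast: U+0623 does decompose to U+0627 U+0654, so the
# sequence and the precomposed letter are NFC-equivalent.
print(nfc_equivalent(ALEF + HAMZA_ABOVE, ALEF_WITH_HAMZA))
```

The contrast with ALEF shows why the BEH case needs an explicit statement: the two BEH forms can look alike yet stay distinct after normalization.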

Regards,   Martin.

> Section 4.1 ought to include numbers for how many characters ended up in
> DISALLOWED vs PVALID - ideally, for each Unicode version since IDNA2008
> was issued. This may also be something that is recommended for the IANA
> tables rather than this document. Given the time that has passed since
> this work started, we should consider whether or not to include Unicode 12.
>      Nits
> These have been submitted separately to the author, and are not
> enumerated here.
> *

Prof. Martin J. Dürst
Department of Intelligent Information Technology
College of Science and Engineering
Aoyama Gakuin University
Fuchinobe 5-1-10, Chuo-ku, Sagamihara
252-5258 Japan