Re: [I18ndir] [Idna-update] FWD: Last Calls on two IDNA documents

John C Klensin <john-ietf@jck.com> Sun, 04 August 2019 04:10 UTC

Date: Sun, 04 Aug 2019 00:09:57 -0400
From: John C Klensin <john-ietf@jck.com>
To: "J-F C. Morfin" <jfc@morfin.org>
cc: i18ndir@ietf.org, idna-update@ietf.org
Message-ID: <D07A89030F4AB7A71B6A14FF@PSB>
References: <54A888A04C434DE23AAB1A67@PSB>
Archived-At: <https://mailarchive.ietf.org/arch/msg/i18ndir/L2QxzzkWCzkPec81PnJnhBWcMUQ>
Subject: Re: [I18ndir] [Idna-update] FWD: Last Calls on two IDNA documents


--On Sunday, August 4, 2019 01:53 +0200 "J-F C. Morfin"
<jfc@morfin.org> wrote:

> John,
> all of us trust Patrick's and your competence and wisdom IRT
> IDNs. And thank you for your constant updating. However, I
> observe that the retained solution regularly calls for updates.

Thanks, although more eyes, and different perspectives, on this
stuff are always helpful.  Put differently, while I'm glad you
trust us, I don't :-(.  And it does indeed call for updates.

> I am considering a polynym (strict lingual synonyms) oriented
> DDDS service (bigdata multilingual database labels for
> cross-origin data). A first idea is to build it on a similar
> basis as IDNs. However, the concerns are about Unicode
> stability over decades.

First of all, if you are building a new system, you don't need
the baggage that IDNs/IDNA are carrying around as the result of
the DNS not being well designed for non-ASCII labels at the
beginning and of many legacy applications that are even less
well designed and implemented for those situations.

> The need is for a string in any language returning its
> polynym in any other language indicated by its langtag. Would
> you have a better suggestion? The project is to try to locate
> the service at a Mediterranean group of universities.

You are facing several problems.  One is Unicode itself.  As
"universal" coding systems go, it is probably as good as or
better than anything else we can easily imagine, but its design
and implementation involved design choices that are not optimal
for all purposes (or all languages and writing systems).  A
different system might do better for some of those choices, but
would almost certainly be worse for others.  That is one of the
reasons for the update problem: as long as new code points (or
at least new code points that are not symbols or otherwise
excluded from "words") are being added, some decisions about the
properties associated with code points are going to involve
tradeoffs and those tradeoffs may not be optimal for IDNs, for
your system, or for any other one with its own set of
constraints and desirable characteristics.  So, if one is going
to have strict stability and forward compatibility for
identifiers or other strings, especially if more than just code
point assignments are involved, regular review of properties and
property assignments would seem to be a requirement. If you can
define things so that all you care about is stability of the
code point assignments and binding to abstract characters
themselves, and you have no need to define or enforce
equivalence of different encodings of "the same" string or
character within that string, then there is probably no issue.
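
To make the "different encodings of the same string" case concrete,
here is a minimal Python sketch.  It assumes only the standard
unicodedata module; the sample strings and the two helper function
names are illustrative and not taken from any IDNA document.

```python
import unicodedata

# Two encodings of "the same" abstract string (illustrative sample data):
precomposed = "caf\u00e9"   # ends in U+00E9 LATIN SMALL LETTER E WITH ACUTE
decomposed = "cafe\u0301"   # ends in "e" + U+0301 COMBINING ACUTE ACCENT


def same_code_points(a: str, b: str) -> bool:
    """Compare the raw code point sequences, with no equivalence rules."""
    return a == b


def canonically_equivalent(a: str, b: str) -> bool:
    """Compare after NFC normalization (hypothetical helper for this sketch)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


raw_match = same_code_points(precomposed, decomposed)
normalized_match = canonically_equivalent(precomposed, decomposed)

# The Unicode version of the tables bundled with this interpreter.
unicode_version = unicodedata.unidata_version

print(raw_match, normalized_match, unicode_version)
```

A system that only needs the first comparison depends on nothing
beyond the code point assignments.  One that needs the second
depends on the property tables of whatever Unicode version it was
built against, which is where the review question comes in.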

The second problem is one with which I believe you are familiar
because you've raised equivalent issues before in other
contexts.  While some technical terms that were invented only
very recently do work, there are many terms, both mononyms and
polynyms, for which exact and unique translations do not exist
or are controversial.  If you want to check that out further,
looking at the EUROVOC and AGROVOC multilingual controlled
vocabularies might be a start.  Those two examples are, however,
specialized vocabularies; more traditional multilingual thesauri
pose even more complex problems.

best,
    john



> Best
> jfc