Re: [apps-discuss] CONTEXTJ in TLD DNS-Labels (draft-liman-tld-names-05)

John C Klensin <john-ietf@jck.com> Tue, 19 July 2011 13:24 UTC

Date: Tue, 19 Jul 2011 09:24:04 -0400
From: John C Klensin <john-ietf@jck.com>
To: Behnam Esfahbod <behnam@esfahbod.info>
Message-ID: <85FB14D637D54FBC5A95D68E@PST.JCK.COM>
In-Reply-To: <CANp6Ttw4MaAJy2VRvZ8929oBju9jL3b69PkSyFLi-SC4YaNTnw@mail.gmail.com>
References: <B464B2C6607E04FD0572AA74@192.168.1.128> <CANp6Ttw4MaAJy2VRvZ8929oBju9jL3b69PkSyFLi-SC4YaNTnw@mail.gmail.com>
X-Mailer: Mulberry/4.0.8 (Win32)
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Cc: apps-discuss <apps-discuss@ietf.org>
Subject: Re: [apps-discuss] CONTEXTJ in TLD DNS-Labels (draft-liman-tld-names-05)

Behnam,

I'm sorry I was not clear.   Let me try again, first by
reference to Patrik's comment: independent of how ICANN has
formulated the variant investigation, the question remains "what
is safe across all scripts" and not "what does a particular
language need".  The ASCII ("English") examples were not
intended to justify the situation, only to point out that
restrictions have been with us for a very long time and that one
of those restrictions is that a string being a valid word in
some language does not create an entitlement to use that string
as a DNS label... and never has.  In retrospect, the terms
"domain name system" and the earlier "hostname" are misleading.
Precision would have called for substituting something like
"mnemonic" for "name".

Second, while your detailed explanation is appreciated, we fully
understand the importance of ZWNJ to writing Persian (and to
most other non-Arabic languages written in Arabic script) and,
although the use
is a little different, the importance of ZWNJ and ZWJ in writing
most Indic scripts.  CONTEXTJ was not included in IDNA2008 by
some magical accident: we (including both Patrik and myself)
fought to include it in the standard precisely to facilitate
those uses.

But, examples, explanations, and language requirements aside,
the issue remains one of whether those characters are safe in
the root.  With the understanding that this is just my opinion,
part of that safety evaluation is that the root zone almost
certainly should have a clear and simple set of rules, rules
that are easily checked and enforced by the various types of
(language-independent) software that call on the DNS.  While one
could imagine a large collection of rules based on a model of
"determine the script, guess at the language, and then interpret
and render accordingly", such a scheme is almost certainly not
feasible even if ICANN agrees to use self-discipline about
single-script labels.  First, the DNS and IDNA do not support
explicit language information, and heuristics to determine
language that work well with moderate or large blocks of text
are not reliable when strings are only a few characters long.
Second, and
equally important, we know that complex procedures based on
layers of tables are rarely implemented correctly.
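
To make the contrast concrete, here is a minimal Python sketch of
what a context-free check looks like next to one branch of a
contextual check.  It is an illustration only: the function names
and the "no join controls" rule are hypothetical and are not
taken from IDNA2008, from draft-liman-tld-names, or from any ICANN
policy.  The second function covers only the "preceded by a
Virama" case; a complete contextual test would also need
joining-type tables that the Python standard library does not
provide, which is rather the point about layered tables.

```python
import unicodedata

ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER
ZWJ = "\u200d"   # ZERO WIDTH JOINER
JOIN_CONTROLS = {ZWNJ, ZWJ}

# Canonical combining class 9 is the Unicode "Virama" class.
VIRAMA_CCC = 9


def join_control_positions(label):
    """Return the indexes at which ZWNJ or ZWJ occurs in the label."""
    return [i for i, ch in enumerate(label) if ch in JOIN_CONTROLS]


def passes_simple_rule(label):
    """Hypothetical context-free rule: no join controls at all.

    Needs no script or language knowledge and no data tables.
    """
    return not join_control_positions(label)


def follows_virama(label, index):
    """Partial contextual test (assumption: Virama branch only).

    True if the character just before label[index] has combining
    class 9.  The joining-type branch is deliberately not
    implemented here.
    """
    return index > 0 and unicodedata.combining(label[index - 1]) == VIRAMA_CCC


# Two Persian strings that differ only by an invisible ZWNJ.
WITH_ZWNJ = "\u0645\u06cc\u200c\u062e\u0648\u0627\u0647\u0645"
WITHOUT_ZWNJ = "\u0645\u06cc\u062e\u0648\u0627\u0647\u0645"

# Devanagari KA, VIRAMA, ZWJ, SSA: the ZWJ follows a Virama.
DEVANAGARI = "\u0915\u094d\u200d\u0937"
```

With these definitions, passes_simple_rule(WITHOUT_ZWNJ) is true
and passes_simple_rule(WITH_ZWNJ) is false, while
follows_virama(DEVANAGARI, 2) is true.  The first check is a few
lines that any language-independent caller of the DNS could
apply; the second is already incomplete without further tables.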

So, again returning to one of the implications of Patrik's note:
please assume that we understand the importance of this
character to most of the languages that use Arabic script (and
to most of the languages that use several of the Indic scripts)
and that, in case knowing this is helpful, we understood it long
before ICANN created the VIP program.  We also understand its
importance regardless of how (or whether) "variants" (whatever
that means in the general case) are supported.  The question is
whether the use of characters that, among other things, become
invisible if the wrong rendering engine is chosen, is safe in a
root context or can be made safe by a plausible, understandable,
and, if appropriate, enforceable set of rules.

regards,
   john




--On Monday, July 18, 2011 21:01 -0400 Behnam Esfahbod
<behnam@esfahbod.info> wrote:

> Hi John,
> 
> I think it is time to stop general pronouncements that have
> been repeated and repeated so many times over these past years
> and get down to specifics.  Here are two very concrete points
> you should note:
> 
> 1. ZWNJ is not a special quirk of Persian language, it is not
> a mnemonic tool, nor is it an optional writing-style
> device.  ZWNJ is used in the writing of MOST languages
>...