Re: [apps-discuss] CONTEXTJ in TLD DNS-Labels (draft-liman-tld-names-05)

John C Klensin <john-ietf@jck.com> Mon, 04 July 2011 12:17 UTC

Date: Mon, 04 Jul 2011 08:17:36 -0400
From: John C Klensin <john-ietf@jck.com>
To: Behnam Esfahbod <behnam@esfahbod.info>
Cc: apps-discuss <apps-discuss@ietf.org>

--On Thursday, June 30, 2011 16:56 -0400 Behnam Esfahbod
<behnam@esfahbod.info> wrote:

> Hello,
> 
> The restriction rules of the latest draft for "Top Level
> Domain Name Specification", section 4 [1], is unfortunately
> very restrictive for some languages, including Persian
> language (in Arabic script).
>...
> Hereby, I would like to request reconsidering the restriction
> rules of TLD DNS-Labels and to allow characters with CONTEXTJ
> derived property value in the labels. (Of course the IDNA2008
> contextual restrictions for CONTEXTJ characters must apply.)


Behnam,

(personal opinion only; not wearing any present or past "hats")

ICANN makes up their own rules about this sort of thing, maybe
with emphasis on "makes up".  They have fairly consistently told
IETF participants that IETF opinions are relevant only if they
impose clear technical restrictions (and maybe not then).   The
degree of attention paid in ICANN to the arguments of either RFC
3675 or RFC 4185 is perhaps indicative.

IDNA2008 imposes no special requirements on top level IDNs -- if
a string is valid at any other level of the tree, it does not
violate the protocol to use it at the top level.  So, from an
IETF standpoint, there is nothing to reconsider in order to
permit ZWNJ to be used in domain names.

That said, let me provide a different perspective on the
problem.  This perspective may be reflected in both current
ICANN policy and in draft-liman-tld-names-05.   It certainly
influenced the recommendations I made about the latter document.

First of all, while various folks around ICANN seem to forget it
regularly, there has never been an entitlement to use names or
words in the DNS.  There are many words in English that cannot
be written using the traditional "preferred name syntax" (often
called "LDH") rules.  There are also many family names, business
names, and so on that cannot be used.  While IDNs permit use of
many characters that are not part of the basic Latin-derived
alphabet, they don't change the fundamental principle that
something being a valid word or phrase in a language does not
create an automatic "right" to have it as a domain name.
Instead, almost every decision as to what strings should be
permitted to be used as labels within a given zone represents a
tradeoff (either globally across zones or within a particular
zone) between the desire to use a particular string as a
mnemonic label and the risks that string might create when used
for Internet references.

The characters listed as CONTEXTJ and CONTEXTO are inherently
tricky.  Used in the wrong context, they produce labels that
don't compare equal but that have exactly the same appearance.
To avoid those and other problems, the 2003 version of IDNA
excluded them entirely from IDNs using the "map to nothing"
technique that we now recognize as creating dangerous
ambiguities.  While we decided to permit them in appropriate
contexts in IDNA2008, the reality is that the contextual rules
themselves aren't adequate without additional restrictions that
go beyond anything we can do as a DNS or IDNA technical matter.
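
As a concrete illustration of the two points above (a minimal
sketch that is not part of the original message), the following
Python fragment uses the standard-library stringprep module, which
carries the RFC 3454 tables used by the IDNA2003 Nameprep profile.
The helper name map_to_nothing and both sample labels are made up
for this example.  It models only the "map to nothing" step, not
full Nameprep (no case folding or normalization), and it assumes a
typical Arabic-script renderer in which ALEF does not join to the
letter that follows it.

```python
import stringprep

ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER (CONTEXTJ in IDNA2008)
ZWJ = "\u200d"   # ZERO WIDTH JOINER (CONTEXTJ in IDNA2008)


def map_to_nothing(label: str) -> str:
    """Drop code points listed in RFC 3454 table B.1.

    Hypothetical helper: it models only the IDNA2003 "map to
    nothing" step, not the rest of Nameprep.
    """
    return "".join(ch for ch in label if not stringprep.in_table_b1(ch))


# Illustrative labels only.  ALEF (U+0627) does not join to the
# letter after it, so a ZWNJ placed there normally changes nothing
# on screen: this is ZWNJ used in the "wrong context".
plain = "\u0627\u0628"                # ALEF, BEH
with_zwnj = "\u0627" + ZWNJ + "\u0628"  # ALEF, ZWNJ, BEH

# The two labels are different code point sequences ...
labels_equal = plain == with_zwnj

# ... but the IDNA2003-style mapping silently collapses them.
collapsed_equal = map_to_nothing(with_zwnj) == plain

print("raw labels compare equal:", labels_equal)
print("equal after map-to-nothing:", collapsed_equal)
print("ZWNJ in table B.1:", stringprep.in_table_b1(ZWNJ))
print("ZWJ in table B.1:", stringprep.in_table_b1(ZWJ))
```

The raw comparison is what a registry or resolver sees; the
collapsed comparison is what an IDNA2003-style client would have
looked up.  Neither view says anything about whether a given
rendering engine would draw the two labels differently, which is
the part that cannot be settled inside the DNS.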

If one were using, e.g., Persian strings in a domain whose use
is largely limited to mnemonics derived from Persian and almost
certain to be treated as Persian writing style and rendered with
Persian-specific rendering engines, then there should be little
problem using ZWJ and ZWNJ.  That situation would exist, or at
least be manageable, in .IR and any Arabic script (and
Persian-language-derived) variation on that name.   But the root
zone (and hence the set of TLD names) is inherently problematic
because it has to be available for labels in all scripts and
because neither language nor writing style can be encoded
reliably in the DNS (at least without causing lots of other
problems).  However, if
a Persian string containing ZWNJ were rendered either by a
rendering engine designed for the Arabic language (or one that
was simply naive about Arabic script), the resulting situation
would be an invitation to phishing, to fraudulent certificates,
and so on.

So, while permitting ZWNJ (and ZWJ) in top-level domain names
seems attractive, it is not safe -- much less safe than
permitting the PVALID characters that are always displayed.
Striking the balance between safety and the desire to be able to
include all mnemonics based on any language in the world will
always be a difficult choice and, ultimately, a policy one, but,
as long as the community believes that security, stability, and
predictable behavior take precedence over the use of any given
character in any given script, decisions that exclude
problematic characters from the root will continue to be
justified.

best regards,
   john