Re: [DNSOP] draft-liman-tld-names-04

Tony Finch <dot@dotat.at> Fri, 26 November 2010 16:52 UTC

Date: Fri, 26 Nov 2010 16:53:34 +0000
From: Tony Finch <dot@dotat.at>
X-X-Sender: fanf2@hermes-2.csi.cam.ac.uk
To: Andrew Sullivan <ajs@shinkuro.com>
In-Reply-To: <20101125175247.GH21047@shinkuro.com>
Message-ID: <alpine.LSU.2.00.1011261558520.4075@hermes-2.csi.cam.ac.uk>
References: <20101117091928.GA30093@nic.fr> <4CE9E942.20906@dougbarton.us> <0E561274-43FE-4657-951E-74C8FF0FD307@hopcount.ca> <4CEC43DC.1060709@dougbarton.us> <E7796748-6880-4928-B96D-0024E27E98D5@hopcount.ca> <4CEC69C5.3040209@dougbarton.us> <7B9EF625-1E25-42BE-9546-61C5B7EFC6DA@hopcount.ca> <8CEF048B9EC83748B1517DC64EA130FB43E0037FD1@off-win2003-01.ausregistrygroup.local> <20101124142303.GB19441@shinkuro.com> <alpine.LSU.2.00.1011251734170.4075@hermes-2.csi.cam.ac.uk> <20101125175247.GH21047@shinkuro.com>
User-Agent: Alpine 2.00 (LSU 1167 2008-08-23)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset="US-ASCII"
Sender: Tony Finch <fanf2@hermes.cam.ac.uk>
Cc: dnsop@ietf.org
Subject: Re: [DNSOP] draft-liman-tld-names-04

On Thu, 25 Nov 2010, Andrew Sullivan wrote:

> So what aside from [...] do you want?

Something like this:


Abstract

This memo clarifies the syntax of top-level domain labels in the domain
name system as specified in RFC 1123, and how this syntax relates to the
allocation policy for TLDs. It describes the current

[...blah...]

Background

[RFC0952] defines a host name in the first paragraph under "ASSUMPTIONS",
as follows:

      A "name" ... is a text string up to 24 characters drawn from the
      alphabet (A-Z), digits (0-9), minus sign (-), and period (.).
      Note that periods are only allowed when they serve to delimit
      components of "domain style names".  (See RFC-921, "Domain Name
      System Implementation Schedule", for background).  No blank or
      space characters are permitted as part of a name.  No distinction
      is made between upper and lower case.  The first character must be
      an alpha character.  The last character must not be a minus sign
      or period.

[RFC1123] section 2.1 reaffirms this definition, but makes one change
to the syntax:

      The syntax of a legal Internet host name was specified in RFC-952
      [DNS:4].  One aspect of host name syntax is hereby changed: the
      restriction on the first character is relaxed to allow either a
      letter or a digit.  Host software MUST support this more liberal
      syntax.

In addition, the DISCUSSION in Section 2.1 says:

      'However, a valid host name can never have the dotted-decimal form
      #.#.#.#, since at least the highest-level component label will be
      alphabetic.'  [Section 2.1]

Some implementers may have understood the above phrase "will be
alphabetic" to be a protocol restriction. This is incorrect. It is in fact
a description of the TLD allocation policy at that time.

The TLD allocation policy has since had two significant syntactic changes.

On 16 November 2000 the first long TLD (.museum) was allocated, and it
was added to the root zone in June 2001.

In October 2007, the first IDNA test TLDs were added to the root zone.
These were the first TLDs with non-alphabetic characters. ICANN approved a
policy for allocating IDNA ccTLDs in October 2009 and the first production
IDNA TLDs were added to the root zone in May 2010.

Deployed software that checks DNS top-level labels for conformance with
past allocation policy is likely to reject domain names allocated after a
policy change.
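
As an illustration of the failure mode, here is a minimal Python sketch of
the kind of check that causes it. The pattern and the function name
legacy_accepts are hypothetical and are not taken from any particular
implementation; the sketch assumes a validator written when every
allocated TLD was two to four ASCII letters.

```python
import re

# Hypothetical legacy rule: a TLD is two to four ASCII letters.
# This encodes an old allocation policy, not a protocol requirement.
LEGACY_TLD_RE = re.compile(r"[A-Za-z]{2,4}")


def legacy_accepts(name: str) -> bool:
    """Return True if the last label of `name` passes the legacy rule."""
    # Drop a trailing root dot, then take the rightmost label.
    tld = name.rstrip(".").rsplit(".", 1)[-1]
    return LEGACY_TLD_RE.fullmatch(tld) is not None


if __name__ == "__main__":
    for candidate in ("example.com", "example.museum", "example.xn--p1ai"):
        print(candidate, legacy_accepts(candidate))
```

A check like this accepts example.com but rejects both a long TLD such as
.museum and any A-label TLD, although all three are legal at the protocol
level.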


Syntax of TLD labels - protocol level

All labels of a domain name have the same syntax. The syntax of TLDs is
not specially restricted at the protocol level.

   domain  = *(label ".") label ["."]

   label   = let-dig [ldh-str]

   let-dig = ALPHA / DIGIT

   ldh-str = *( ALPHA / DIGIT / "-" ) let-dig

A label can be up to 63 characters long. A domain name can be up to 255
characters long.

A domain name as a whole shall not match the dotted quad representation of
an IPv4 address.

   IPv4    = 3(digits ".") digits

   digits  = 1*DIGIT
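
The protocol-level rules in this section can be sketched as a small
checker. This is only an illustration in Python, not normative text: the
names is_valid_hostname, LABEL_RE and DOTTED_QUAD_RE are made up for the
example, the 255 limit is applied to the textual form exactly as stated
above, and treating a dotted quad with a trailing dot as invalid is an
assumption.

```python
import re

# label = let-dig [ldh-str] ; ldh-str = *( ALPHA / DIGIT / "-" ) let-dig
LABEL_RE = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?")

# Four runs of digits separated by dots: the dotted-quad shape.
DOTTED_QUAD_RE = re.compile(r"[0-9]+(?:\.[0-9]+){3}")

MAX_LABEL = 63   # characters per label
MAX_NAME = 255   # characters per name, as stated in the text


def is_valid_hostname(name: str) -> bool:
    """Return True if `name` fits the protocol-level syntax sketched above."""
    if not name or len(name) > MAX_NAME:
        return False
    # The `domain` rule allows one optional trailing dot.
    bare = name[:-1] if name.endswith(".") else name
    # Assumption: a dotted quad is rejected with or without the trailing dot.
    if DOTTED_QUAD_RE.fullmatch(bare):
        return False
    # Every label, including the rightmost one, uses the same syntax.
    return all(
        len(label) <= MAX_LABEL and LABEL_RE.fullmatch(label) is not None
        for label in bare.split(".")
    )
```

Note that the rightmost label gets no special treatment here: a name such
as example.123 passes, and only a whole name shaped like an IPv4 address
is refused.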


Syntax of TLD labels - allocation policy

The syntax of allocated TLDs is restricted in order to ensure that no
domain name can match an IPv4 dotted quad, and for compatibility with past
practice and deployed software. The policy is subject to change by ICANN.
This section describes the syntax of domain names permitted by the current
allocation policy.

IDNA encodes Unicode strings within the syntax permitted for domain name
labels. The Unicode string used by applications is known as a U-Label;
its corresponding encoding in the DNS is known as an A-Label. The terms
A-Label and U-Label are used in this document as defined in [RFC5890].
Valid A-Labels always contain non-alphabetic characters.

In order to accommodate the wish to express TLD names in scripts other
than the ASCII subset of Latin, it is necessary to allow non-alphabetic
characters in the corresponding TLD DNS-Labels.  Following past practice,
the U-label form of a TLD name is restricted by applying rules analogous
to those already imposed on ASCII TLD DNS-Labels.

ASCII TLDs have the following syntax:

   TLD = 1*63(ALPHA)

IDNA TLDs obey the following requirements:

   1.  the DNS-Label is a valid A-Label according to [RFC5890];

   2.  the derived property value of all code points, as defined by
       [RFC5892], is PVALID;

   3.  the general category of all code points is one of { Ll, Lo, Lm, Mn }.
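
The allocation-policy rules above could be checked roughly as follows.
This is a partial sketch in Python using only the standard library, and
the function names are invented for the example. It covers the ASCII rule
and requirement 3 (general category). It does not implement requirement 2
(PVALID) or full A-Label validation, because the standard library carries
no IDNA2008 tables; a real implementation would need those.

```python
import unicodedata

ACE_PREFIX = "xn--"
ALLOWED_CATEGORIES = {"Ll", "Lo", "Lm", "Mn"}


def is_ascii_tld(label: str) -> bool:
    """TLD = 1*63(ALPHA): one to 63 ASCII letters and nothing else."""
    return 1 <= len(label) <= 63 and label.isascii() and label.isalpha()


def idna_tld_categories_ok(a_label: str) -> bool:
    """Decode an A-Label and check the general category of each code point.

    Partial check only: PVALID and full A-Label validity are NOT verified.
    """
    label = a_label.lower()
    if not label.isascii() or len(label) > 63:
        return False
    if not label.startswith(ACE_PREFIX):
        return False
    try:
        # Python's built-in "punycode" codec decodes the part after "xn--".
        u_label = label[len(ACE_PREFIX):].encode("ascii").decode("punycode")
    except ValueError:  # UnicodeError is a subclass of ValueError
        return False
    # A genuine U-Label contains at least one non-ASCII code point.
    if u_label.isascii():
        return False
    return all(unicodedata.category(ch) in ALLOWED_CATEGORIES for ch in u_label)


if __name__ == "__main__":
    print(is_ascii_tld("museum"))              # ASCII rule
    print(idna_tld_categories_ok("xn--p1ai"))  # A-Label of a Cyrillic TLD
```

Under these rules an A-Label is never an ASCII TLD, since it always
contains hyphens, so the two checks are alternatives rather than stages.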


[... etc etc ...]

Tony.
-- 
f.anthony.n.finch  <dot@dotat.at>  http://dotat.at/
HUMBER THAMES DOVER WIGHT PORTLAND: NORTH BACKING WEST OR NORTHWEST, 5 TO 7,
DECREASING 4 OR 5, OCCASIONALLY 6 LATER IN HUMBER AND THAMES. MODERATE OR
ROUGH. RAIN THEN FAIR. GOOD.