Re: [Inip-discuss] Domain Names

Lyman Chapin <lyman@interisle.net> Sat, 23 January 2016 20:49 UTC

From: Lyman Chapin <lyman@interisle.net>
In-Reply-To: <D2C7C367.12CF3%edward.lewis@icann.org>
Date: Sat, 23 Jan 2016 15:49:17 -0500
Message-Id: <DE0BA0C3-D623-4B3A-A343-6829F74444CC@interisle.net>
References: <D285CCDC.11B63%edward.lewis@icann.org> <A3306B3F-2C01-4236-8A5F-119C1669425B@isoc.org> <D2A15E6C.124B4%edward.lewis@icann.org> <7047EC59-873A-4A76-80EF-3F2899A9052A@interisle.net> <D2C7C367.12CF3%edward.lewis@icann.org>
To: Edward Lewis <edward.lewis@icann.org>
Archived-At: <http://mailarchive.ietf.org/arch/msg/inip-discuss/9QoSazr7c1BsSnHKdOuE6MdUvZk>
Cc: "inip-discuss@iab.org" <inip-discuss@iab.org>
Subject: Re: [Inip-discuss] Domain Names
List-Id: IAB Internet Names and Identifiers Discussion List <inip-discuss.iab.org>

On Jan 22, 2016, at 11:40 AM, Edward Lewis wrote:

> My in-box is a mess[0], I just found this looking for another message.  So
> my response to this is terribly delayed.
> 
> [0] - I'm using an up-to-date commercial email reader that apparently
> cannot correctly sort by date. ;)
> 
> On 1/7/16, 16:57, "Lyman Chapin" <lyman@interisle.net> wrote:
> 
>> In graph-theoretic terms, the domain name space constitutes a labelled
>> directed rooted tree in which the syntax of the label associated with
>> each vertex other than the unlabelled root is defined by RFCs 1035, 1123,
>> and 2181. The term "nth level domain name label" refers to a member of
>> the set of all vertices for which the path to the root contains n edges.
>> For n=1 the term most often used is "top level domain name label" or
>> simply "top level domain" (TLD). A fully qualified domain name is a
>> sequence of labels that represents a path from the root to a leaf vertex
>> of the domain name space. The shorter term "domain name" is not formally
>> defined; in common usage it may be the shorthand equivalent of "fully
>> qualified domain name" (FQDN) or refer to any non-empty subset of the
>> sequence of labels formally identified by a fully qualified domain name.
> 
> A clarifying question - (is) this definition specific to names in the DNS?
> 
> The reason I ask is that I believe that only within the DNS is the name
> space finite and that is because the DNS is defined for a 1980's-era code
> base.  (I.e., fixed widths, static limits on space.)

If we maintain (at least for the sake of argument) a distinction between the "domain name space" and the "domain name system", then the domain name space is defined by (a) its mathematical structure as a labelled directed rooted tree (LDRT), and (b) the generation rules for syntactically valid labels. The generation rules may be the product of 1980s-era constraints, but they apply today as much as they did then - no one has suggested that we change the rules to allow longer labels, or longer sequences of labels (domain names). So the domain name space is finite, independent of the way in which the protocols of the domain name system associate semantics with domain names.

Obviously we wouldn't have these specific generation rules for domain name labels if we hadn't developed them in the context of the domain name system. The labels have a maximum length of 63 LDH (letter-digit-hyphen) characters because that maximum length, with an assumed ASCII-octet encoding, was determined to be "just right" for all of the reasons described in RFC 1034, and the early specs routinely refer to the domain name space as a component of the DNS (see for example Section 2.4 of RFC 1034). But even the early specs distinguish the domain name space ("specifications for a tree structured name space") from another component of the DNS - resource records ("data associated with the names"). The domain name space is just names (identifiers); the data associated with the names are not themselves part of the domain name space.
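
To make the "generation rules" concrete, here is a minimal sketch in Python of a membership test for the domain name space as I have described it. The function names are my own invention for illustration. It assumes the LDH rule for labels (the DNS protocol itself, per RFC 2181, is more permissive about label contents) and the RFC 1035 limits of 63 octets per label and 255 octets for the whole name in its length-prefixed form. Nothing in it consults a zone or a server, which is the point: membership in the name space depends only on syntax.

```python
import re

MAX_LABEL_OCTETS = 63   # per-label limit (RFC 1035)
MAX_NAME_OCTETS = 255   # limit on the whole name in length-prefixed form (RFC 1035)

# LDH rule: letters, digits, hyphen; first and last character are a letter or digit.
_LDH_LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?")


def is_valid_label(label: str) -> bool:
    """Return True if the label satisfies the LDH rule and the length limit."""
    return (
        len(label) <= MAX_LABEL_OCTETS
        and _LDH_LABEL.fullmatch(label) is not None
    )


def is_in_name_space(labels: list[str]) -> bool:
    """Return True if this sequence of labels is a vertex of the name space.

    The empty sequence stands for the unlabelled root.
    """
    if not all(is_valid_label(label) for label in labels):
        return False
    # One length octet per label, the label octets, and a final zero octet
    # for the root.
    encoded_length = sum(1 + len(label) for label in labels) + 1
    return encoded_length <= MAX_NAME_OCTETS
```

Because both limits are fixed, the set of sequences that pass this test is finite, whether or not any of them has ever been allocated.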

> 
>> In this formulation, the term "domain name space" refers to the complete
>> graph consisting of all possible vertices and edges - not just those with
>> which a specific meaning has been associated (what we might call
>> "allocated" labels). It is a finite graph because the length of the
>> longest possible FQDN is finite. At any point in time, there is another
>> labelled directed rooted tree - a sub-graph of the domain name space -
>> containing only vertices that represent allocated labels.
>> 
>> -------------------------------
>> 
>> So mathematically, a domain name is simply a sequence of labels. In most
>> of the contexts in which we talk about, write about, or use domain names
>> we add representational elements like "top to bottom" or "separating
>> character," but those are not properties of a domain name.
>> 
>> Not terribly pragmatic, I know; but it might have a place in your draft -
> 
> I completely agree.  With all of that - sequence/separating character/not
> pragmatic.
> 
> A question rolling in my mind is - do we want to offer something a few
> steps away from mathematic towards pragmatic?  Like - are IETF(tm) Domain
> Names things that have ... dots between labels and other constraining
> elements?  (Using "dots between labels" as an example.)

That question is on my mind too!

(1) We can talk about a domain name without specifying where and how the data associated with that name are stored, or served in response to queries. The domain name is clearly an identifier; it is not equivalent to a zone file, which is an implementation-specific (RFC 1035) way to store the data associated with a name. The domain name may appear in an $ORIGIN directive or SOA RR within the zone file, but the domain name is not the zone file.

(2) How should we think about the use of the "dot" character to mark the boundary between labels in a domain name when the name is represented as text? (We certainly don't store the "dot" in the internal data structures that hold domain names, other than those that are used explicitly for the purpose of displaying domain names to human users.) It's amusing to see the "dot" in written representations of IDN domain names that use scripts in which there is no "dot" character (which doesn't matter after the Punycode conversion, of course).

(3) It seems to me that what is particular to the DNS is therefore not domain names but zone files - the data (meaning) associated with domain names. RFC 1034 itself notes that domain names are not particular to the DNS: "The terms 'domain' or 'domain name' are used in many contexts beyond the DNS described here. Very often, the term domain name is used to refer to a name with structure indicated by dots, but no relation to the DNS." We might postulate (I don't assert this): For a name that is intended to be both an identifier and a directive - local, for example, or onion - the domain name satisfies only the "identifier" part; the directive part ("do this instead of that" in the course of name resolution) should be encoded in the zone file. Another way to put it would be to say that we should use the zone file to encode the protocol switch semantics that we want to associate with a name rather than putting the name into a special names registry.
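
To illustrate point (2) above, here is a small Python sketch of the length-prefixed encoding that RFC 1035 uses for names in messages. The function name to_wire is mine, and the sketch assumes ASCII labels and ignores message compression. The thing to notice is that the "dot" is consumed while parsing the text and never appears in the result.

```python
def to_wire(text_name: str) -> bytes:
    """Encode a dotted text name as a sequence of length-prefixed labels."""
    # A single trailing dot marks the root in text form; it carries no label.
    if text_name.endswith("."):
        text_name = text_name[:-1]
    labels = text_name.split(".") if text_name else []

    out = bytearray()
    for label in labels:
        octets = label.encode("ascii")
        if not 1 <= len(octets) <= 63:
            raise ValueError(f"label length out of range: {label!r}")
        out.append(len(octets))  # length octet takes the place of the separator
        out += octets
    out.append(0)  # zero-length label for the root
    return bytes(out)
```

For example, to_wire("example.com") yields b"\x07example\x03com\x00": two labels, each preceded by its length, followed by the root. The separating character is purely a property of the text representation.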

Such a postulate might or might not be consistent with our search for something "pragmatic" - I'm not sure yet where to take it. It might mean that for a name like "local" the switch to mDNS would be encoded as a zone file directive rather than embedded in the resolver code. One consequence of going down that path would be the opportunity to make the simplifying assertion that every well-formed (syntactically correct) domain name can be processed by the resolution protocol of the DNS. A name might be kicked out to some other resolution system when the DNS resolver hits a switch directive, but there would be no need to have special-case code in the resolver to deal with every possible type of switch.
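
Purely as a thought experiment, the idea might be sketched in Python as follows. Everything here is hypothetical: no "switch" directive exists in the DNS today, and the Switch type, the ZONE_DATA table, the target names "mdns" and "tor", and the resolve function are all invented for illustration. The sketch only shows that a resolver could treat the switch as data it encounters while walking the tree, rather than as a list of special names compiled into its code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Switch:
    """Hypothetical directive: hand the query to another resolution system."""
    target: str


# Hypothetical stand-in for zone data, keyed by label sequence written
# root-first. 192.0.2.1 is a documentation address.
ZONE_DATA = {
    ("local",): Switch("mdns"),
    ("onion",): Switch("tor"),
    ("com", "example"): {"A": "192.0.2.1"},
}


def resolve(name: str):
    """Walk from the root toward the leaf; stop at the first switch found."""
    text = name[:-1] if name.endswith(".") else name
    labels = tuple(reversed(text.lower().split(".")))
    for depth in range(1, len(labels) + 1):
        entry = ZONE_DATA.get(labels[:depth])
        if isinstance(entry, Switch):
            # The resolver needs no knowledge of which names are "special";
            # it simply obeys the directive it found in the data.
            return ("switch", entry.target)
    data = ZONE_DATA.get(labels)
    if data is None:
        return ("nxdomain", None)
    return ("answer", data)
```

With this table, resolve("printer.local") returns ("switch", "mdns") and resolve("example.com") returns ("answer", {"A": "192.0.2.1"}). Adding a new kind of switch would mean adding data, not changing the resolver.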

- Lyman