Re: [dnsext] I-D Action:draft-ietf-dnsext-aliasing-requirements-00.txt

Alex Bligh <alex@alex.org.uk> Sat, 26 February 2011 13:07 UTC

Date: Sat, 26 Feb 2011 13:07:54 +0000
From: Alex Bligh <alex@alex.org.uk>
To: Suzanne Woolf <woolf@isc.org>, dnsext@ietf.org

Suzanne,

--On 23 February 2011 11:47:20 +0000 Suzanne Woolf <woolf@isc.org> wrote:

> Your comments are sought. It was pushed out this week to keep the
> discussion here going.

I think this is a very useful draft.

My main comment is that I had expected the draft to concentrate more
on what types of behaviour were requirements from a functionality point
of view (rather than a not breaking stuff point of view). We know how
to not break stuff (don't fix things). But we don't actually yet know
what the proposed required behaviour is. Some requirements conflict.
I'd sort of presumed we'd have a table of requirements (both sorts)
vs. potential solutions.

See below for detailed comments.

-- 
Alex Bligh


--- draft-ietf-dnsext-aliasing-requirements-00.txt	2011-02-23 00:08:14.000000000 +0000
+++ draft-ietf-dnsext-aliasing-requirements-00-amb-comments.txt	2011-02-26 13:01:45.000000000 +0000
@@ -22,6 +22,16 @@
    potential use cases, two or more names that users will regard as
    having identical meaning may sometimes require corresponding behavior
    in the underlying infrastructure, possibly in the DNS itself.  It's
+
+AB: Is the desired property really 'corresponding behaviour in the
+AB: underlying *infrastructure*' or 'corresponding behaviour in the
+AB: applications used by those users'? I suspect it is the latter and
+AB: this is a critical distinction. I don't think your average user
+AB: can see how the underlying infrastructure works. If the result at
+AB: the user experience level is the same, that should be good enough.
+AB: Indeed BNAME type approaches seek to replicate the behaviour at
+AB: user level using a different infrastructure.
+
    not clear how to accommodate this required behavior of such names in
    DNS resolution; in particular, it's not clear when they are best
    accommodated in registry practices for generating names for lookup in
@@ -177,13 +187,28 @@
    assumptions about how "names" or "words" work.  One aspect of this is
    the notion or expectation that multiple sets of names may be similar
    to a human user, and expected to behave "the same" or as "aliases" of
-   one another, across multiple services and interactions.  The DNS was
+   one another, across multiple services and interactions.  The DNS was originally
    designed with the implicit expectation that names would be based on
-   ASCII characters, and the "similarity" or "sameness" property doesn't
+   ASCII characters, and the "similarity" or "sameness" property did not
    seem to arise terribly often in the names people originally wanted to
    use in the DNS; thus the requirements of identical resolution of
    "aliased" or "bundled" names hasn't figured prominently as an
    attribute that needed to be accommodated in the generation or lookup
+
+AB: Perhaps add something like: "The DNS was originally designed from the
+AB: perspective of those using ASCII characters only, and (to a great
+AB: extent) labels consisting only of characters 0-9, A-Z, a-z and -.
+AB: An early decision was made to treat character case as insignificant
+AB: in lookups (one form of bundling), but to treat the presence of
+AB: hyphens as significant (declining another form of bundling). This currently
+AB: leads to situations where some names users might consider should
+AB: perform identically do not perform identically (e.g. a-b.example.com
+AB: and ab.example.com), whereas other pairs do perform identically
+AB: (e.g. AB.EXAMPLE.COM and ab.example.com). A user who does not
+AB: recognise the significance of dots in names, or of TLDs, may have
+AB: yet greater expectations of the DNS to resolve seemingly similar
+AB: names, none of which are ever likely to be met."
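
The asymmetry described in the comment above can be illustrated with a minimal Python sketch. It assumes plain ASCII names and models only the case-folding rule; the helper name same_dns_name is hypothetical and not part of any DNS library.

```python
import string

# Fold only ASCII A-Z to a-z. Every other character, including '-',
# stays significant, which mirrors the two design decisions above.
_ASCII_FOLD = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def same_dns_name(a: str, b: str) -> bool:
    """Return True if two ASCII names match under case-insensitive comparison."""
    return a.translate(_ASCII_FOLD) == b.translate(_ASCII_FOLD)


print(same_dns_name("AB.EXAMPLE.COM", "ab.example.com"))   # True
print(same_dns_name("a-b.example.com", "ab.example.com"))  # False
```

Case is bundled by the comparison itself; hyphenated and unhyphenated spellings remain distinct names unless someone provisions both.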
+
    of DNS names.  However, with the standardization of internationalized
    domain names protocols (ref: IDNA and IDNAbis), more and more
    internationalized domain name labels [RFC3490] are appearing in DNS
@@ -193,17 +218,27 @@
    users consider "the same" in some languages.  Accordingly, Internet
    users hope for such labels to behave in DNS contexts as they expect
    the corresponding human constructs to behave, regardless of the
-   specific service (smtp, http, etc.) involved..
+   specific service (smtp, http, etc.) involved.
+
+AB: Perhaps it is necessary to distinguish between these expectations
+AB: and the expectations of users we already fail to meet with
+AB: regard to case, hyphens, dots etc. as set out above.

    The general issues of what "the same" means, or of defining
    "variants" in human scripts as codified in Unicode (or anywhere else)
    are well outside the scope of the DNS or the expertise of most of the
    people who work on it.  They are matters for philosophers and
+
+AB: This is actually a key difference between the original design
+AB: decision re case insensitivity etc. and IDNA.
+
    applications developers, respectively.  However, to the extent that
    these issues can be specified as involving the resolution of names in
    the DNS, it's reasonable to describe those expectations and attempt
    to accommodate them.

+AB: Section break here on existing methods?
+
    There is some existing technology defined in the DNS for behavior
    that can be described as one name behaving "the same" as another.
    For a single node in the DNS tree, CNAME can be used to map one name
@@ -215,7 +250,7 @@
    that combines the characteristics of CNAME and DNAME is not currently
    defined in the DNS.

-   If existing protocol does not meet the zone administrator's need to
+   If the existing protocol does not meet the zone administrator's need to
    be able to treat one label, name, or zone as "the same" as another,


@@ -400,6 +435,30 @@
 2.2.  Identical DNS Resolution for Bundled DNS Names

    To some extent, the desired behavior can be described: "identical DNS
+
+AB: Are you trying to describe one particular form of desired behaviour
+AB: (i.e. do you mean "this form of desired behaviour"), or stating
+AB: that this is *the* desired behaviour? If the latter, I think this
+AB: is 'yet to be proven'. We have the example of SSL web sites
+AB: where it's (probably) precisely the opposite of the desired behaviour,
+AB: as currently, protocol developments aside, a different certificate is
+AB: needed for each variant, and each of those needs a different IP
+AB: address. Enforcing identical resolution (particularly at the TLD
+AB: level) between (say) TLDX and TLDY is likely to mean that the
+AB: operator of secure.example.TLDX has no choice but for
+AB: secure.example.TLDY to resolve to the same IP address, which will
+AB: inevitably lead to a security warning on the browser of the human
+AB: who thinks these two are identical - this is almost certainly
+AB: worse than (say) a redirect.
+AB:
+AB: [Later]: I see some of this is handled under 2.4 and 3.3, but reading
+AB: this in order doesn't make much sense. There is a bit of putting the
+AB: cart before the horse here, in that the draft seems to start by
+AB: saying "identical DNS resolution is useful" and then quite a long
+AB: while later (3.3) says "actually, it may not be". Is the better
+AB: approach not to say "similar end user behaviour is the goal", so
+AB: that the question then becomes "how can that be achieved?"
+
    resolution" means that the process of resolving two domain names will
    end with the same result, in most cases the same IP address.  In the
    history of DNS protocol development, there have been two attempts to
@@ -423,7 +482,7 @@
    variants, and in fact their characteristics differ widely, but it's
    possible to define some.  For example, the definition of variant
    characters in the JET Guidelines [RFC3743], intended for use with the
-   CJK language/script communities, is roughly this: One conceptual
+   CJK language/script communities, is roughly this: one conceptual
    character can be identified with several different Code Points in
    character sets for computer use.  In UNICODE definitions of some
    scripts, including Han (chinese), some characters can be identified
@@ -434,6 +493,14 @@
    (similarity in appearance is not required by the definition but often
    occurs).

+AB: It is probably worth stating that "No character variant rule is
+AB: ever likely to encompass all characters of similar appearance:
+AB: doing so may be undesirable (consider lower case l, upper case I
+AB: and number 1 in some fonts), and is likely to be too complex in
+AB: a Unicode environment. No aliasing scheme will solve the problem
+AB: of protecting users and domain registrants from domain names that
+AB: are crafted to look the same as another, but are in fact different."
+
    With the introduction of IDNs in the DNS, perhaps most prominently in
    the root zone, decisions about how to deal with IDN variants is a
    significant challenge ahead of us.  We describe here a couple of
@@ -455,6 +522,17 @@
    (U+4E2D U+570B) are in the root today.  The first one (U+4E2D U+56FD)
    can be considered the "original" IDN TLD and the second one (U+4E2D
    U+570B) can be considered the IDN TLD "variant".  Ideally, it should
+
+AB: This is begging the question. Would it really be ideal? Assume
+AB: all the problems at the NS layer were solved and looking up
+AB: foo.bar.china1 gave the same result for foo.bar.china2, for
+AB: all values of foo.bar. This would then give the problem
+AB: with SSL certificates (and, no doubt other application layer
+AB: protocols). The problem is that applications using DNS
+AB: also need to recognise the fact that the two are variants.
+AB: So either you need to add this to the "However:" bit below,
+AB: or not assume that this is in fact ideal.
+
    be possible to treat the original IDN TLD and its IDN TLD variant as
    "identical" for purposes of DNS resolution, in a way similar to the
    case mapping most DNS users take for granted, in which the uppercase
@@ -475,6 +553,22 @@
    share some operational experience around implementation of registry
    policy regarding managing multiple DNS trees as "the same"

+AB: I think the Chinese problem is actually more complex than you
+AB: make out. I believe you have described the 'Chinese TLD'
+AB: problem, rather than the general problem of 'Simplified and
+AB: Traditional Chinese' which is how the section is headed. The
+AB: section could thus be renamed - in which case one should note
+AB: that the TLD problem alone could be solved by registry policy
+AB: (as you mention with Greek, below). However, whilst I may have
+AB: misunderstood, I thought the 'simplified and traditional
+AB: chinese' problem was that variant ideograms could occur anywhere
+AB: within the domain name, and the mapping between them was not
+AB: done on a per (unicode) character basis, i.e. AB might be
+AB: equivalent to PQ, and CD to RS, but that doesn't necessarily
+AB: mean AD is equivalent to PS. Similarly, EF might be equivalent
+AB: to U (one character) and GH to VWX (three characters). Saying that,
+AB: my limited understanding of this problem is second hand.
+
 2.3.2.  An example: Greek

    In Greek, almost every word has the "tonos" accent sign, but where it
@@ -494,8 +588,11 @@
    "the same word," in a sense very much like the case insensitivity
    that native users of Latin script take for granted in the DNS.

-
-
+AB: Perhaps mention that here or elsewhere it might be possible to
+AB: address these problems at the presentation layer, i.e. entirely
+AB: outside DNS (or arguably at nameprep stage). For instance, tonos
+AB: characters could be stripped and final sigmas rewritten as
+AB: normal sigmas. This is not ideal, but is one angle of attack.
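
A minimal Python sketch of the presentation-layer rewrite suggested above, using only the standard unicodedata module. The function name fold_greek_label is hypothetical, and the sketch assumes that tonos appears as the combining acute accent (U+0301) after NFD decomposition; it is an illustration, not a proposed nameprep profile.

```python
import unicodedata

TONOS = "\u0301"        # combining acute accent; NFD represents tonos this way
FINAL_SIGMA = "\u03c2"  # GREEK SMALL LETTER FINAL SIGMA
SIGMA = "\u03c3"        # GREEK SMALL LETTER SIGMA


def fold_greek_label(label: str) -> str:
    """Lowercase a label, strip tonos marks and rewrite final sigma as sigma."""
    decomposed = unicodedata.normalize("NFD", label.lower())
    stripped = "".join(ch for ch in decomposed if ch != TONOS)
    return unicodedata.normalize("NFC", stripped).replace(FINAL_SIGMA, SIGMA)


# Accented, unaccented and upper-case spellings fold to one lookup form.
print(fold_greek_label("οδός") == fold_greek_label("ΟΔΟΣ"))
```

The folded form is only a lookup key; what the user sees in the address bar would still be whatever they typed, which is part of why this is "not ideal".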



@@ -519,7 +616,7 @@
    for zones maintained in parallel but for less work.  However, we
    later assert a proposed requirement that synthesizing the same record
    as a query would have obtained from an enumerated parallel tree isn't
-   enough-- that the property of association or "sameness" we're
+   enough; the property of association or "sameness" we're
    creating with specific mechanisms needs to be useful in some specific
    way to the consumer of the data.

@@ -572,6 +669,18 @@
        the use of DNS records or other mechanisms not really intended
        for the purpose, leads to confusion and inconsistency.

+AB: I wonder if it is worth mentioning that canonicalisation of a name
+AB: may be either a useful, or an undesirable process. Taking the
+AB: simplified Chinese example, or Arabic, some brands may wish to
+AB: appear (including in the browser bar, or in email addresses) written
+AB: one way, rather than another. On a web site this could be achieved
+AB: through a meta reload. On email by accepting all inbound email
+AB: variants but only using one outbound. Here, a variant solution
+AB: which canonicalised a single variant to a canonical version would
+AB: be useful (like a CNAME that doesn't actually get looked up except
+AB: by the application). In other circumstances, there has been
+AB: a desire expressed for all variants to be treated as equal
+AB: citizens.

 3.  Operational Considerations

@@ -601,7 +710,21 @@
    technology; it can be done, and is done today, entirely with
    provisioning logic and registry policy.

+AB: Note this isn't only a registry solution. It could be done by
+AB: users too (e.g. through a zonefile preprocessor that was
+AB: run prior to signing of the zone).
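
As a rough illustration of such a preprocessor, the Python sketch below duplicates records for variant owner names before the zone is signed. It assumes a heavily simplified zone file (one record per line, explicit owner name in the first column); the function name expand_variants and the variant table are hypothetical.

```python
def expand_variants(zone_lines, variants):
    """Copy each record to every variant of its owner name.

    zone_lines: zone file lines, one record per line with an explicit owner.
    variants:   dict mapping an owner name to a list of variant owner names.
    """
    expanded = []
    for line in zone_lines:
        expanded.append(line)
        stripped = line.strip()
        # Pass blank lines, comments and $-directives through untouched.
        if not stripped or stripped[0] in ";$":
            continue
        # Lines starting with whitespace inherit the previous owner; this
        # simplified sketch does not track that and leaves them alone.
        if line[0].isspace():
            continue
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue
        owner, rest = parts
        for alias in variants.get(owner, []):
            expanded.append(f"{alias} {rest}")
    return expanded


zone = ["$ORIGIN example.", "www 3600 IN A 192.0.2.1", "mail 3600 IN A 192.0.2.2"]
for record in expand_variants(zone, {"www": ["w-w-w"]}):
    print(record)
```

Because the expansion happens before signing, resolvers see ordinary records and no protocol change is needed; the cost is the record growth discussed in the next comment.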
+
    However, it doubles the work and the number of records required.  If
+
+AB: I'm not sure it does double the work (presumably it's done
+AB: programmatically) but the effect on the number of records could
+AB: be far worse than doubling. Taking an example where each character
+AB: has 2 variant forms, the number of variants could be
+AB: as many as 2^((63-4)/(bytes per character)), i.e. huge. This
+AB: would therefore require limiting the number of supported
+AB: variants in the registry, either by policy or by economic
+AB: means.
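
The estimate above can be worked through numerically. This Python sketch takes the comment's own simplification at face value: a 63-octet label, 4 octets for the "xn--" prefix, and a fixed number of octets per character (real Punycode encoding is not fixed-width, so this is an upper-bound illustration only). The function name max_variants is hypothetical.

```python
MAX_LABEL_OCTETS = 63
ACE_PREFIX_OCTETS = len("xn--")  # 4


def max_variants(octets_per_char: int, forms_per_char: int = 2) -> int:
    """Upper bound on variant labels if every character has forms_per_char forms."""
    chars = (MAX_LABEL_OCTETS - ACE_PREFIX_OCTETS) // octets_per_char
    return forms_per_char ** chars


for octets in (1, 2, 3):
    print(octets, max_variants(octets))
# 1 576460752303423488
# 2 536870912
# 3 524288
```

Even at three octets per character, a single fully variant label implies over half a million parallel names, which is why some cap on supported variants seems unavoidable.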
+
    provisioning isn't done carefully, errors can arise, leaving
    inconsistencies.  And provisioning multiple trees does nothing to
    link the resulting names directly; there is no property of
@@ -626,6 +749,12 @@
    maintained in parallel and possibly available for audit by the
    authority over example.com, depending on its delegation policy.

+AB: That may be a desirable feature as opposed to a limitation. For
+AB: the reasons given above (SSL example) it may well be that the
+AB: need for ensuring sameness is only in the zones managed by the
+AB: registry and not for the child zones for which it is not
+AB: authoritative.
+
 3.1.2.  Impact of special mechanisms

    Once we begin to consider mechanisms for maintaining parallel zones
@@ -695,6 +824,10 @@
    An example used more than once in discussion is provided by SSL, as a
    protocol that uses domain names without necessarily using the DNS
    protocol per se.  SSL certificates are tied to one domain name.  It
+
+AB: My understanding (and I am not an expert) is that this is changing,
+AB: but certificates covering more than one name are not widely supported.
+
    would be helpful to applications to have a non-protocol-specific way
    to identify securely cases where multiple domain names can be
    canonicalized to the domain name used for an SSL certificate.
@@ -749,17 +882,67 @@
        as they may be, it's not clear where the incentives would lie to
        deploy it.  This is particularly a concern for implementors and
        application developers.
+
+AB: I am not sure this is a reasonable requirement. After all, IDN
+AB: itself would have failed this test, as it is 'more overhead'
+AB: than LDH zones. The 'incentive to deploy' is presumably that
+AB: customers want it. Maybe add the word 'disproportionate' before
+AB: 'overhead' within the first line.
+
    4.  Any mechanism proposed MAY require new RRtypes and special
        processing for them.
    5.  Any mechanism proposed MUST NOT only reduce costs of generating
        and providing authoritative service for DNS zones.  It would be
        too easy to reduce costs on the authority server provider while
        adding costs elsewhere, particularly in terms of complexity.
+
+AB: The first sentence here is ambiguous and says something different
+AB: from the second. A solution which increased costs for everyone
+AB: would satisfy this criterion on one reading. Proposed text for
+AB: first sentence:
+AB: "Any mechanism proposed MUST NOT, in order to reduce cost for
+AB: generating and providing authoritative service for DNS zones,
+AB: disproportionately increase costs elsewhere, particularly
+AB: in terms of complexity."
+
        Given the central importance of DNS service to Internet
        operations, any change undertaken to lower the cost to providers
        may be useful, but should not simply shift costs to DNS users,
        whether applications or end users.

+AB: These are all, I think, impact requirements, i.e. "don't break
+AB: things". I think we also need to set out requirements in terms
+AB: of positive functionality.
+AB:
+AB: Here are some things that have been suggested as requirements.
+AB: I don't know whether they actually are requirements.
+AB:
+AB: * Ability for any variant name to resolve to the same records
+AB:
+AB: * Ability to do the above automatically (i.e. without manually
+AB:   duplicating all records for variant names)
+AB:
+AB: * Ability to have exceptions to the above
+AB:
+AB: * Ability to work at TLD, registry and end registrant level
+AB:
+AB: * Ability to enforce sameness on child zones as well as parents
+AB:   (recursive sameness), i.e. more than that the child zones
+AB:   just have the same NS records
+AB:
+AB: * Ability for RRs within children of two "same" zones to differ
+AB:   (non-recursive sameness) - that's exactly the opposite to the
+AB:   above and is driven by the SSL/application layer point
+AB:
+AB: * Ability to support an arbitrarily large number of DNs per
+AB:   bundle
+AB:
+AB: * Ability for an application to be able to canonicalize a name
+AB:   i.e. for the application to be able to determine the canonical
+AB:   name not just get an IP address back.
+AB:
+AB: * Ability for all variant names to be treated as 'equal citizens'
+AB:   (no one canonical variant)

 5.  Possible Solutions

@@ -773,6 +956,20 @@
    other, or in preventing the use of such variants as might be
    considered confusing or dangerous.

+AB: At the risk of generating more work, I think it might be useful
+AB: to list some mechanisms which result in no protocol work
+AB: here, so that we have a proper taxonomy of all requirements.
+AB: I am sure there will be some need to produce a matrix of solutions
+AB: versus conformity with requirements, and for that we need to
+AB: have a common language of what potential solutions mean. The
+AB: ones not listed here that I can immediately think of are:
+AB: - online synthesis
+AB: - the "I think you mean" record, allowing a canonicalizing pointer
+AB:   at the application level, and returning NXDOMAIN to anyone
+AB:   not supporting it (I think someone else had a better name)
+AB: - manual provisioning / zonefile preprocessing
+AB: - rewrite DN at application / nameprep stage
+
    In addition, there are new proposals for DNS protocol to support
    "aliases" in the DNS as part of the desired behavior of "variant"
    names: Names direction[BNAME], and "Zone clone".
@@ -867,6 +1064,9 @@
    compatibility to avoid harm to implementations that expect, and use,
    the old behavior.

+AB: I think it would be fair to make CNAME+DNAME a separate proposal
+AB: as they have different characteristics (subtle breakage vs. new RRs)
+
 5.2.  Zone Clone

    The proposal of "zone clone" or "dns shadow", is an alternative