Re: [dnssd] Partial review of draft-ietf-dnssd-mdns-dns-interop-04

Juergen Schoenwaelder <j.schoenwaelder@jacobs-university.de> Sun, 12 March 2017 11:25 UTC

Date: Sun, 12 Mar 2017 12:25:31 +0100
From: Juergen Schoenwaelder <j.schoenwaelder@jacobs-university.de>
To: Andrew Sullivan <asullivan@dyn.com>
Message-ID: <20170312112531.GC50035@elstar.local>
Mail-Followup-To: Andrew Sullivan <asullivan@dyn.com>, ops-dir@ietf.org, dnssd@ietf.org, draft-ietf-dnssd-mdns-dns-interop.all@ietf.org
References: <148897586598.20191.8422735308130046248@ietfa.amsl.com> <20170308154458.GU9646@dyn.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Disposition: inline
X-Clacks-Overhead: GNU Terry Pratchett
Content-Transfer-Encoding: 8bit
In-Reply-To: <20170308154458.GU9646@dyn.com>
User-Agent: Mutt/1.6.0 (2016-04-01)
Archived-At: <https://mailarchive.ietf.org/arch/msg/dnssd/SSyTS1SLrB5V4j6O-UePFogFKS8>
Cc: ops-dir@ietf.org, dnssd@ietf.org, draft-ietf-dnssd-mdns-dns-interop.all@ietf.org
Subject: Re: [dnssd] Partial review of draft-ietf-dnssd-mdns-dns-interop-04

Andrew,

your response is helpful for me to understand things better. The
situation indeed seems complicated.

On Wed, Mar 08, 2017 at 10:44:59AM -0500, Andrew Sullivan wrote:
> Hi,
> 
> Thanks for the review.  I think there is an issue here that is
> apparently not clear enough to you as a reader, so it's really helpful
> to learn that.  I'd like to figure out how to make it clearer.
> 
> On Wed, Mar 08, 2017 at 04:24:25AM -0800, Jürgen Schönwälder wrote:
>  
> > Disclaimer: I regret that DNS does not use UTF-8 in its labels, to
> > me, as an outside observer, it seems IDNA was a big mistake. But given
> > the IETF standards we have, it seems one has to properly IDNA encode
> > UTF-8 in DNS labels. 
> 
> This isn't exactly correct, and I think that it may be the source of
> the problem you had with the document.
> 
> DNS as such makes literally no restrictions on what "characters" can
> appear in a label, except for length.  A label can have 63 octets.
> Any octet is permitted.  Perhaps unfortunately, however, in the
> original specification ASCII was treated specially in two ways.
> First, labels are matched "case insensitively" in ASCII.  Second, STD
> 13 recommended the letter,digit,hyphen rule for labels to maximise
> compatibility with deployed hostnames.  Because STD 13 was written
> before RFC 2119 keywords were in use, the distinction between good
> operational practice and protocol restrictions was not always clear to
> every reader, so many people understood LDH to be a DNS restriction,
> but it wasn't.

This was clear to me since I am well aware that the DNS protocol ships
octets.
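
To make the "DNS ships octets" point concrete, here is a minimal sketch
of label comparison as described above: any octet is allowed, a label is
at most 63 octets, and only the ASCII letters are matched case
insensitively. The names ascii_fold and labels_match are hypothetical
helpers made up for illustration, not taken from any DNS implementation.

```python
MAX_LABEL_OCTETS = 63  # length limit for a single label


def ascii_fold(label: bytes) -> bytes:
    """Lowercase only octets 0x41-0x5A (ASCII 'A'-'Z'); leave all others as-is."""
    return bytes(b + 0x20 if 0x41 <= b <= 0x5A else b for b in label)


def labels_match(a: bytes, b: bytes) -> bool:
    """Compare two non-root labels octet by octet after ASCII-only case folding."""
    for label in (a, b):
        if not 1 <= len(label) <= MAX_LABEL_OCTETS:
            raise ValueError("a non-root label must be 1 to 63 octets long")
    return ascii_fold(a) == ascii_fold(b)


# ASCII letters match regardless of case.
print(labels_match(b"Example", b"eXAMPLE"))  # True

# U+00C9 and U+00E9 (upper and lower case e-acute) encoded as UTF-8 are
# C3 89 and C3 A9. Neither octet pair is touched by ASCII folding, so the
# labels do not match.
print(labels_match("\u00c9".encode("utf-8"), "\u00e9".encode("utf-8")))  # False
```

The second comparison shows why "case insensitive" stops at ASCII: the
protocol has no notion of which characters the other octets represent.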

> Apart from the special treatment of ASCII, DNS also makes no rules
> about character sets.  So, there's no way to tell, for a given label,
> what its encoding scheme is -- UTF-8, JIS, ISO-8859-n, or what all.

Yes, but historically (if I understand things right), the character
set was effectively restricted to LDH (ASCII) - the restriction was
not part of the protocol but imposed by other 'recommendations'.
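
As a small illustration of the "no way to tell the encoding" point, the
sketch below decodes the same two octets under two of the encodings
mentioned above. The helper name interpretations is hypothetical; the
example only assumes that a receiver has nothing but the octets.

```python
def interpretations(label: bytes, encodings: tuple) -> dict:
    """Decode the same octets under several candidate encodings."""
    result = {}
    for name in encodings:
        try:
            result[name] = label.decode(name)
        except UnicodeDecodeError:
            result[name] = None  # octets are not valid in this encoding
    return result


# The two octets C3 A9, as they might appear in a label on the wire.
label = b"\xc3\xa9"

seen = interpretations(label, ("utf-8", "iso-8859-1"))
print(seen["utf-8"] == "\u00e9")             # True: one character, e-acute
print(seen["iso-8859-1"] == "\u00c3\u00a9")  # True: two characters, A-tilde and copyright sign
```

Both decodings succeed, so the octets alone do not reveal which one the
zone administrator intended.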

> It is because of the above that we ended up with IDNA.  Nothing about
> IDNA prevents other schemes for labels (including using characters
> outside the ASCII range).  The possibilities are discussed at some
> length in RFC 6055.

With a time machine available, we should probably have moved from LDH
(ASCII) to UTF-8 (or a subset of UTF-8). Multiple different character
_encodings_ in the label space can't really be a feature. Note that I
consider further restrictions (e.g., use of white space) something
separate from the encoding scheme used.

I think I recall that people said back in the day that moving from LDH
to UTF-8 might be risky since we did not know whether all DNS
implementations got the "DNS ships octets" thing right. And there may
be situations where entering a UTF-8 name correctly is not feasible
on certain devices. These may have been valid concerns back then, but
they are likely less so today. Instead of having a transition plan
towards UTF-8 encoded labels, we went for a solution that gives us
complexity forever. [And yes, I do not know all the details of the
discussions and so I am likely summarizing things quite wrongly.] But
from an outsider perspective, it seems that having (i) a transition
plan to UTF-8 encoding of labels and (ii) a clear separation of
further label restrictions from the encoding aspects would have been
desirable.

> At the same time, we know that the root zone and a very large number
> of TLDs (maybe all, but I bet not) restrict delegations to LDH, and
> that they use some variation of IDNA (including non-IETF IDNA variants
> based on UTS#46 -- this is how some TLDs allow registering emoji,
> since they weren't "emoji" when IDNA2003 came out and many of the code
> points were undefined under Unicode 3.2.  Emoji are DISALLOWED under
> IDNA2008 because they're not "letters or digits" in any writing
> system).  So, given the distributed administration of the DNS, there
> is no way to be sure how a given octet sequence ought to be
> interpreted, but we can be fairly sure that some administrators of
> some zones will definitely not permit UTF-8 labels in their zones.

Well, obviously, this is the clunky result we have now.

> This is the problem the I-D is intending to highlight, and it sounds
> like it wasn't clear to you.  Is any of the above clearer?  Would any
> more introductory material help to make this plain?

For me personally, it would have been good to know more about what is
out there. I understand traditional LDH labels and I understand
IDNA2003 and IDNA2008 labels. I understand that mDNS says UTF-8 and
imposes pretty much no restrictions on the instance portion of a
service name. Are there significant deployments out there that use
other encodings or restrictions?

> > would that be? I looked up RFC 6055 and I am surprised to read text
> > about 'intelligent (stub) resolver' there as this seems to be asking
> > for trouble. 
> 
> Yes :) That's the trouble I think we're in, and it's why I worked on
> this I-D in the first place.
> 
> > To me, it sounds like you are saying 'some portions of
> > the name hierarchy may use raw UTF-8 labels while other portions may
> > not and we do not tell you how to set them apart in a reliable
> > way'. If this is the approach, then the approach is in my personal
> > view broken. (And please recall the disclaimer above.)
> 
> It may be broken, but it is the reality in a distributed system with
> distributed administration.  One cannot make rules about what other
> people put in their zones, and the DNS is already clear that any octet
> at all is permitted in the DNS.

The protocol ships octets. The question is how to interpret them. If
there were agreement that the octets by default represent UTF-8
encoded characters, life would start to be somewhat simpler. This would
solve the encoding part; it does not solve the 'which restrictions
apply to this label' part. OK, I am ignorant of the complexity of
deploying this, but a decent transition plan to something simpler may
be better long term than more layers of things that do not play well
together.

> > In section 3, what does the term 'implicated' mean?
> 
> Would "involved" make it clearer to you?  The point is that different
> portions of a DNS-SD name have different properties, and only some of
> those are going to have potentially ambiguous i18n encodings in the
> DNS.  
> 
> > What I do not understand: Why is it desirable to treat DNS labels
> > that carry DNS-SD Service Instance Names different than any other
> > labels that may contain UTF-8 characters? 
> 
> Because DNS-SD says that the labels should be raw UTF-8 in the DNS,
> and because of IDNA some zones have a policy where raw UTF-8 is not
> permitted (and some resolvers may automatically convert U-labels to
> A-labels).

Yes, sounds like a decent conflict asking for surprises.
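
A minimal sketch of that conflict, assuming Python's built-in "idna"
codec, which implements the IDNA2003 ToASCII operation (not IDNA2008):
the same label ends up as two different octet strings depending on
whether it is shipped as raw UTF-8 or converted to an A-label first.
The helper name wire_forms is hypothetical.

```python
def wire_forms(label: str) -> dict:
    """Return the octets for one label as raw UTF-8 and as an IDNA2003 A-label."""
    return {
        "raw_utf8": label.encode("utf-8"),  # what DNS-SD expects to see
        "a_label": label.encode("idna"),    # what an IDNA-aware resolver may send
    }


forms = wire_forms("b\u00fccher")  # "bucher" with u-umlaut
print(forms["raw_utf8"])  # b'b\xc3\xbccher'
print(forms["a_label"])   # b'xn--bcher-kva'
```

A lookup for one form will not find a record stored under the other,
which is exactly the kind of surprise at issue here.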

/js

-- 
Juergen Schoenwaelder           Jacobs University Bremen gGmbH
Phone: +49 421 200 3587         Campus Ring 1 | 28759 Bremen | Germany
Fax:   +49 421 200 3103         <http://www.jacobs-university.de/>