Re: [idn] I fear I cannot use IDN in the next 10 years
"Eric A. Hall" <ehall@ehsco.com> Tue, 09 October 2001 01:57 UTC
Received: from psg.com (exim@psg.com [147.28.0.62]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id VAA06573 for <idn-archive@lists.ietf.org>; Mon, 8 Oct 2001 21:57:03 -0400 (EDT)
Received: from lserv by psg.com with local (Exim 3.33 #1) id 15qlrG-000Gss-00 for idn-data@psg.com; Mon, 08 Oct 2001 18:38:58 -0700
Received: from goose.ehsco.com ([207.65.203.98]) by psg.com with esmtp (Exim 3.33 #1) id 15qlrE-000Gsm-00 for idn@ops.ietf.org; Mon, 08 Oct 2001 18:38:57 -0700
Received: from [24.252.219.84] (account ehall HELO ehsco.com) by goose.ehsco.com (CommuniGate Pro SMTP 3.4.8) with ESMTP-TLS id 46312 for idn@ops.ietf.org; Sun, 07 Oct 2001 20:38:36 -0500
Message-ID: <3BC25528.FC918BB4@ehsco.com>
Date: Mon, 08 Oct 2001 20:38:48 -0500
From: "Eric A. Hall" <ehall@ehsco.com>
Organization: EHS Company
X-Mailer: Mozilla 4.78 [en] (Windows NT 5.0; U)
X-Accept-Language: en
MIME-Version: 1.0
To: idn@ops.ietf.org
Subject: Re: [idn] I fear I cannot use IDN in the next 10 years
References: <200110040700.f9470bj18628@valinor.malmo.trab.se> <p0510030eb7e24eb9f210@[165.227.249.20]> <5.1.0.14.2.20011004213749.02478cd0@dcrocker.net> <5.1.0.14.2.20011004235936.023e1500@dcrocker.net> <5.1.0.14.2.20011005141903.04649150@dcrocker.net> <3BBE467C.A9797B21@ehsco.com> <p0510033ab7e3fc24f02c@[165.227.249.20]>
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Sender: owner-idn@ops.ietf.org
Precedence: bulk

Paul,

First of all, I'd like to thank you for taking the time to enumerate your many perceived technical issues with the UDNS model. These points deserve to be fully debated and discussed.

However, I think that your comments also represent a larger issue which needs to be resolved first. We need to come to a group consensus on the question of whether a UTF-8 namespace is necessary, desirable, or neither, and this needs to happen before the technical points can be debated in appropriate context. Without consensus on this underlying point, debates over relative costs will be debates over half-full vs half-empty, and will get us nowhere. For example:

> UDNS gives us all of these problems for the limited benefit that some
> applications don't have to implement one additional encoding for
> sending host names on the wire. Does that seem like a good balance?

The above is a comparison of UDNS' perceived cost relative to a benefit, but we haven't fully discussed the benefits as of yet.

Clearly, ACE has many costs, some of which are quite high AND ongoing, although most of us agree that some form of ACE is necessary, and is therefore worth whatever cost may be required. What we do not agree on is the benefit that UDNS (or a similar mechanism) would provide. Making a decision based on cost alone misrepresents the true value of UDNS, just as a decision on ACE which was based solely on cost would have misrepresented the true value of its benefits (being necessary, it can have almost any cost).

Personally, I believe that UDNS is also necessary, and is therefore worth almost any cost (although as I will show, I think the cost is somewhat lower than you do). I base this on a few key arguments:

  * BCP18 is "Official Internet Policy" which requires support for UTF-8 in all new protocols, and in modifications to all existing protocols: "lack of an ability to use UTF-8 is a violation of this policy; such a violation would need a variance procedure".
    Simply put, policy requires this WG to devise some kind of support for UTF-8, unless it can be proven unreasonable. Nobody has proven it to be unreasonable.

  * Without a UTF-8 DNS interface, no new protocols or applications can be developed that are UTF-8 clean. Instead, they will be UTF-8 for everything EXCEPT domain names, and in some cases this will be fatal.

    One example we have already discussed for this is mapping between LDAPv3 distinguished names and DNS domain names (mapping dc= RDNs to DNS). Failure to support UTF-8 is a heavy blow to such efforts, requiring tremendous amounts of development effort and infrastructure oversight, likely hindering the use and development of LDAP in the Internet.

  * Global public networking is only in its infancy; there will likely be many thousands of new protocols and applications which are developed over the next few decades (some in the IETF WGs, most in business, educational or personal networks, and most will be developed outside of the US).

    If those applications follow the Official Internet Policy, they will be UTF-8 only or use it as the preferred encoding. However, those applications will also be saddled with ACE conversion wherever they have to interact with domain names, meaning they cannot be UTF-8 clean. This will be extremely acute in non-US development environments.

  * Towards the above point, UDNS provides an optional, user-driven transition path from ASCII to UTF-8. UDNS allows applications to be written so that they only function in UTF-8 environments.

    While this may not seem a reasonable objective of this WG, this should be the long-term vision of what we are trying to achieve. We should be enabling the development of a truly international Internet infrastructure which is UTF-8 clean throughout, and UDNS provides this transitional path. Conversely, ACE does not provide this transition. In fact, the goal of seamless backwards compatibility actively hinders migration.

    Let me restate this point, since it is somewhat complex. Although UDNS supports a dual-mode model, once UDNS (or something similar) were approved, developers could begin to work on applications which ONLY used UDNS, without any code for ASCII or ACE. At first this would be small private apps, but over the course of a couple of decades, it would likely be most of the new apps. Without UDNS (or something similar), it would still be ASCII at the end of that timeframe. In short, we need UDNS if we are ever to drop ASCII.

  * The modernization and internationalization of legacy protocols, applications and formats will almost certainly require a UTF-8 DNS eventually.

    It will not be possible to build and deploy UTF-8 extensions to SMTP without a UTF-8 DNS infrastructure. As with the above points, once a UTF-8 approach has gained some critical mass, these extensions become feasible. Without the infrastructure, SMTP will continue to be bound to ASCII.

  * UDNS is optional and transitional, on a per-domain basis. This allows a transition to occur as users see fit, without requiring a replacement of the existing DNS infrastructure with an alternate naming service.

    While this is technically not a "requirement" per se, the seamless user-driven transitional aspects should be requirements. Something very much like UDNS will be required for this. A "new naming service" is unlikely to reach critical mass without some kind of backwards compatibility, which UDNS provides.

    Furthermore, the cost of UDNS is incremental to that of ACE, since they share many common features and functions. If a developer is going to add ACE support to an application (which they must do in the short term), it is incremental to add UDNS support at the same time. On the other hand, adding ACE now and then going back to glue on something else is a second development effort, with greater costs.

  * UTF-8 is infinitely more manageable and serviceable than ACE.
    By being able to use UTF-8 tools and services, the Internet can be kept running much more easily than if users have to transliterate ACE operations. We should not be contributing to frailty.

    When a network is already broken, ACE obfuscates the problem by displaying aq--gobbledygook in traceroute, netstat, tcpdump and all of the other tools. ACE libs can certainly address this part of the issue, but the extent of the support for tasks like importing trace data into a spreadsheet or viewing it in an editor is not as compelling. The massive number of UTF-8 tools is extremely compelling in terms of general manageability and serviceability of the global DNS.

  * Finally, there are some problems that ACE cannot solve, which UDNS can.

    The clipboard problem practically goes away, if not directly then indirectly, simply because we can facilitate the use of UTF-8 everywhere, rather than having to transliterate between application- and protocol-specific encodings. E.g., if we ALLOW for the consistent use of UTF-8, then it is more likely to happen than if we actively PROHIBIT it by mandating yet another encoding which MUST be accounted for in every operation.

So that's the "benefit" side of the argument. If we can agree that these are important considerations, and that some kind of support for UTF-8 is a requirement (as per item #1), then we can debate costs and features as they compare to that benefit.

Here is my evaluation of your perceived costs.

> - UDNS causes some strings that are legal in one encoding to be
>   illegal in the other, and vice versa, meaning that some host names
>   will be illegal part of the time, unpredictably

This issue would have to be resolved if we were ever to move beyond ACE. It is beneficial to design for such limits up front, rather than later revoking names which turn out to be incompatible. This is only a cost if ACE and some other service are developed at different times, and is NOT a cost if they are developed together.
This is, in fact, a motivating factor for beginning work on UDNS *NOW*.

> - UDNS requires more work to be done by authoritative DNS servers

Yes, UDNS requires authoritative servers to maintain name mapping information. However, I see this as being a transitional cost. In the beginning, few clients will support UDNS, but over the course of several years, most clients will support UDNS. At some point (two decades out? shorter?) it should be possible to switch entirely to UDNS, meaning that there will no longer be a need for servers to store both versions.

The longer we delay beginning such a transition, the longer it will be before we can complete that transition. If we never start it, we will never complete it.

> - UDNS UTF-8 queries that fail will cause more load on the DNS

Yes, it will cause one additional lookup. This is comparable to requiring one extra delegation server in the path. This is not a great cost to begin with. Furthermore, seeing as how it is also transitional, it will only be a cost for the short term, and will not be a cost after a few years.

> - UDNS is not compatible with DNS security

I'm not sure I understand this point. Also, DNSSEC has many issues, and appears to be destined for a redesign at this point.

> - UDNS requires that applications have the logic to send a new DNS query
>   format

Yes, an alternative API is required to prevent clashes with legacy applications (this was proven by DJB's pi test). However, this is not a cost burden which is solely related to UDNS. For example, when hostnames are extended beyond the RFC952/1123 rules, an alternative wide() API will be required for several functions. Applications which transliterate ACE at relatively deep levels (such as for DHCP) will also require this. For these reasons, I consider this a cost of internationalization rather than a cost for UDNS.
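
To make the "one additional lookup" cost concrete, here is a minimal sketch of a dual-mode client that tries the UTF-8 form of a name first and retries in ACE only when that fails. Everything in it is an assumption for illustration: the two dictionaries stand in for UTF-8-aware and ACE-only zone data, lookup() is a hypothetical helper rather than any real resolver API, and Python's built-in "idna" codec (a later Punycode-based encoding) is used only as a stand-in for whichever ACE is eventually chosen.

```python
# Illustrative sketch only: not a real UDNS or DNS resolver API.

def to_ace(name: str) -> str:
    """Convert a Unicode domain name to an ASCII-compatible form.

    Assumption: Python's built-in "idna" codec stands in for the ACE.
    """
    return name.encode("idna").decode("ascii")


def lookup(name: str, zone_utf8: dict, zone_ace: dict):
    """Try the UTF-8 name first, then fall back to the ACE form.

    zone_utf8 and zone_ace are hypothetical stand-ins for servers that
    answer UTF-8 queries and ACE queries respectively.
    Returns (address_or_None, number_of_queries_sent).
    """
    queries = 1
    address = zone_utf8.get(name)          # first attempt: UTF-8 query
    if address is not None:
        return address, queries

    queries += 1                           # the one extra lookup
    address = zone_ace.get(to_ace(name))   # second attempt: ACE query
    return address, queries


if __name__ == "__main__":
    name = "b\u00fccher.example"           # "bücher.example"
    ace_only = {to_ace(name): "192.0.2.1"}
    both = {name: "192.0.2.1"}

    # Server knows only the ACE form: two queries are needed.
    print(lookup(name, {}, ace_only))
    # Server also knows the UTF-8 form: one query is enough.
    print(lookup(name, both, ace_only))
```

In this toy model the second query is sent only while the UTF-8 form is unknown to the server, which is the sense in which the extra load is transitional.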

> - Applications that implement the UTF-8 part of UDNS have to handle
>   the inevitable errors that come from queries that bounce, have to
>   recast those queries in ACE (assuming that the name is even legal in
>   ACE, which some won't be), and then have to emit those queries again,
>   causing more DNS traffic

As with the other issues, this is a transitional cost which dissipates as deployment goes up. It is a cost in the beginning, but it is not a significant cost after a few years.

> - The errors caused by UTF-8 queries are the same as other legitimate
>   DNS errors, meaning that applications will have to have their own
>   (probably-nonstandard) logic for differentiating between expected
>   errors and real errors

Because UDNS is an application of EDNS, it uses the EDNS error codes. My experience is that these codes are fairly simple to work with.

From my perspective, you have raised four transitional costs (one of which is nearly permanent but which does have an end), no perpetual costs, and one cost I don't grok. That is not very expensive considering the benefits and requirements, in my opinion.

Conversely, ACE-only does have at least one ongoing cost which is quite high (client-side name mutations), and several perpetual costs in terms of usability and internationalization of the global Internet.

--
Eric A. Hall                                        http://www.ehsco.com/
Internet Core Protocols          http://www.oreilly.com/catalog/coreprot/