Re: [idn] Reality Check
Martin Duerst <duerst@w3.org> Tue, 10 July 2001 15:38 UTC
Date: Wed, 11 Jul 2001 00:23:48 +0900
To: Dan Oscarsson <Dan.Oscarsson@trab.se>, idn@ops.ietf.org
From: Martin Duerst <duerst@w3.org>
Subject: Re: [idn] Reality Check
Cc: Marc Blanchet <Marc.Blanchet@viagenie.qc.ca>, James Seng/Personal <jseng@POBOX.ORG.SG>
In-Reply-To: <200107101101.f6AB1EP18634@malmo.trab.se>

Hello Dan, others,

At 13:04 01/07/10 +0200, Dan Oscarsson wrote:
>Has this working group lost the contact with the real world?
>
>All discussion is about ACE/nameprep/IDNA/backward compatibility!
>I have tried to think of how I as a programmer, developer, editor and
>user would handle DNS names.
>An ACE-solution will result in e-mails containing lines like:
>To: =?quoted-printable-encoded?=
><ACE-like-user-name@ACE-domain.ACE-domain.com>
>A line with a lot of different mixed encoded forms. To handle this
>as a programmer you have to identify the different parts of the line and then
>for each part identify what encoding is used, and then decode each before the
>line can be displayed to a user or handled in a program.
>As of today I have fewer programs than I want that handle e-mail - this
>is because of the difficulty in programmatically handling MIME.
>With ACE in domain names and something like it in user names it will be
>even worse. Why could we not use something simple like having the
>entire line encoded in UTF-8? Instead of having a difficult time
>parsing data and needing a lot of different decoders, I could just decode
>UTF-8 into my local character set.

This is a subject I have partially discussed with Keith earlier.
I think it's indeed very serious. There are thousands of different
file formats and occasions where domain names turn up. If ACE leaks
into only a small percentage of them, a lot of people will have a lot
of problems. They may complain about ACE more than Keith is
complaining about NAT.

Practical books about a programming language often contain examples
of how to process your mail, and so on. All these examples are based
on the fact that the characters can be parsed directly. Grepping for
a word in the subject headers in a mail spool file is easy as long
as the subject is in ASCII. For anything beyond ASCII, it just fails.
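
To make the grepping point concrete, here is a minimal sketch in Python.
It is an illustration only, under several assumptions: the subject text,
the search word and the domain "müller.example" are made up; Python's
built-in "idna" codec (the later-standardized "xn--" Punycode form) stands
in for whichever ACE the WG finally picks; and naive_grep and mime_grep
are hypothetical helper names. It compares a byte-wise search on a raw
UTF-8 header line with the same search on the encoded forms.

```python
from email.header import Header, decode_header, make_header

SUBJECT = "Viele Grüße aus Malmö"   # hypothetical subject text
NEEDLE = "Grüße"                     # the word we want to find

# (a) The whole header line sent as raw UTF-8.
utf8_line = f"Subject: {SUBJECT}".encode("utf-8")

# (b) The same subject as an RFC 2047 encoded-word (pure ASCII on the wire).
mime_line = ("Subject: " + Header(SUBJECT, "utf-8").encode()).encode("ascii")

# (c) A domain name in an ASCII-compatible encoding. Assumption: Python's
#     "idna" codec is used as a stand-in for the ACE under discussion.
ace_domain = "müller.example".encode("idna")
ace_line = b"To: <info@" + ace_domain + b">"


def naive_grep(line: bytes, needle: str) -> bool:
    """What 'grep word spoolfile' effectively does: a plain byte search."""
    return needle.encode("utf-8") in line


def mime_grep(line: bytes, needle: str) -> bool:
    """Search that first undoes the RFC 2047 encoding of the header value."""
    _name, _sep, value = line.decode("ascii").partition(": ")
    return needle in str(make_header(decode_header(value)))


print("raw UTF-8, naive search:   ", naive_grep(utf8_line, NEEDLE))
print("encoded-word, naive search:", naive_grep(mime_line, NEEDLE))
print("encoded-word, with decoder:", mime_grep(mime_line, NEEDLE))
print("ACE domain, naive search:  ", naive_grep(ace_line, "müller"))
```

The naive search only succeeds on the raw UTF-8 line. Each encoded form
needs its own decoder before the same search works, and a tool has to
know which decoder applies to which part of the line.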

I remember Ulrich Drepper, responsible for glibc, saying at a dinner
here in Japan that UTF-8 was the right way to go, because it would
allow people like him to provide the base for internationalization
(rather than just doing nothing at all), and would allow others,
more familiar with their own language, to build on it. In some sense,
RFC 2277 is based on a similar assumption. ACE severely breaks this
assumption.

Internationalization doesn't come for free. Most people who think
ACE is a good idea just don't see that work. In the short term,
having to update a DNS server (even if it's just with an 8-bit clean
version) seems a lot of work. But it's actually extremely easy,
compared to updating applications. Also, passing ACE in application
protocols seems extremely easy. But it just means that these
application protocols aren't really internationalized yet, and that
a lot of work is waiting out there.

>Many edit their HTML files with a text editor, or write documents
>with embedded DNS names and URLs. The only way you can expect people
>to enter DNS names and URLs in those files is by using the same character
>set as the rest of the text and they will not convert them into
>ACE, %-encoding or other unnatural form.

Yes indeed. Please see
http://www.ietf.org/internet-drafts/draft-masinter-url-i18n-07.txt
for URIs (working on updating it) and
http://www.ietf.org/internet-drafts/draft-ietf-idn-uri-00.txt
for the DNS part in URIs (will resubmit it to keep it alive;
I don't expect this to be discussed in London, but I haven't
seen any other proposals for how to solve this part of the
problem, and I guess the WG should deal with it once the
important issues are dealt with, and it would look silly to
resubmit as a personal contribution and later again as a WG
draft [sorry for this lengthy sentence]).

>Does the IDNA (ACE in application) solution that appears to be the only
>focus of this working group match the real needs of people in the
>current and future world?

No.
People will suffer from ACE for years to come. When MIME was
created, Unicode barely existed, so there was some excuse. But there
is no excuse for ACE. I wish I had patented it in December 1996;
all bad ideas should be patented ;-(.

Regards, Martin.