Re: [EAI] [IETF] Display of Email Addresses [was: Internationalized Email Internet Draft]
John C Klensin <klensin@jck.com> Fri, 14 October 2016 16:34 UTC
Date: Fri, 14 Oct 2016 12:34:33 -0400
From: John C Klensin <klensin@jck.com>
To: nalini.elkins@insidethestack.com, "HANSEN, TONY L" <tony@att.com>, ima@ietf.org
Message-ID: <DB82BB41C548C1D17CDA7BEE@JcK-HP8200>
In-Reply-To: <1600128477.303347.1476456552432@mail.yahoo.com>
References: <20161006055447.32573.qmail@pro-236-157.rediffmailpro.com> <9EC0EB65-9C58-43ED-9A80-1DA32C58E3E0@att.com> <E125B6AC26988823306936BF@JcK-HP5.jck.com> <1600128477.303347.1476456552432@mail.yahoo.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/ima/Y6W-VT33lGg5qhTlQ44iMNHF2_Q>
Cc: Harish Chowdhary <harish@nixi.in>
Subject: Re: [EAI] [IETF] Display of Email Addresses [was: Internationalized Email Internet Draft]
--On Friday, October 14, 2016 14:49 +0000
nalini.elkins@insidethestack.com wrote:

> John / Tony,
>
> Continuing my splitting of topics! Hope this makes some
> kind of sense to others.

>> By contrast, Section 1.1 talks about display of email
>> addresses, including the local part ("in Punycode" [2]).
>> While a mail delivery server is free to create whatever
>> aliases for a mailbox local part it likes, including
>> "xn-t2bmh3a" or "123456", "george" or "example", in general
>> converting a local part using the Punycode algorithm and
>> displaying the result is prohibited by the EAI standards
>> (and, incidentally, RFC5321). More important, it will
>> often lose information and is potentially very dangerous.

> This is a very interesting problem. We are hoping to do
> some kind of spreadsheet or other visual where we can show
> what happens with a number of mail servers. For example,
> what does Yahoo mail do, what does gmail do, why some clients
> fail, etc.

I am glad to see another one, but note that this was done with
what was then available before the Beijing workshop, that
something similar has been done (or claimed to be done) in the
"Universal Acceptance" group (see below), and maybe elsewhere.

You also need to be very careful about the tests you run. For
example, on delivery and when referring to its own (virtual)
mailbox names, gmail apparently discards some or all ASCII
delimiter characters in local parts, treating local parts
containing such characters as equivalent to ones with the
characters dropped. I have no idea what they do with delimiter
characters from other scripts, but some other systems would
consider that delimiter-dropping behavior pathological or worse.
A different way of looking at that is that they are preserving
the local parts but dropping certain characters in mapping from
mailbox name to actual storage. Either is fine with the
protocols as long as the computational aliasing is done only at
or beyond the delivery server.
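To make the "mapping from mailbox name to actual storage" reading concrete, here is a minimal sketch of that kind of delivery-side aliasing. It is an illustration only, not a description of what gmail actually does: the function names, the choice of "." as the only ignored delimiter, and the mailbox table are all assumptions made for the example.

```python
# Hypothetical sketch of delivery-side aliasing. None of this is taken from
# any real mail system; the ignored-delimiter set is an assumption.
from typing import Dict, Optional, Tuple

IGNORED_ASCII_DELIMITERS = "."  # assumption: only the ASCII full stop is dropped


def storage_key(local_part: str, ignored: str = IGNORED_ASCII_DELIMITERS) -> str:
    """Map a local part to a storage key by dropping the ignored delimiters.

    The local part itself is not rewritten anywhere; only the lookup key
    used by the delivery server is derived from it.
    """
    return "".join(ch for ch in local_part if ch not in ignored)


def deliver(address: str, mailboxes: Dict[Tuple[str, str], str]) -> Optional[str]:
    """Return the mailbox for an address, or None if there is no such mailbox."""
    local_part, _, domain = address.rpartition("@")
    return mailboxes.get((storage_key(local_part), domain))


# Example table, keyed by (storage key, domain); entirely made up.
mailboxes = {("johnsmith", "example.com"): "mbox-1"}
```

In this sketch `deliver("john.smith@example.com", mailboxes)` and `deliver("johnsmith@example.com", mailboxes)` find the same mailbox, while a delimiter from another script (say U+3002 IDEOGRAPHIC FULL STOP) is left alone and produces a different key. The point is that the equivalence lives only in the delivery server's lookup; nothing upstream of it gets to assume the two local parts are the same.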
There are also a large number of issues at the boundaries
between email and HTML/HTTP that affect many implementations
that fail to account for the differences. That problem, and to
some extent the gmail behavior above, are examples of another
situation we see quite often: the problems are there even in
all-ASCII strings and identifiers; i18n simply amplifies them
and makes them more obvious.

> I am in the process of setting up a demo system
> for all this. Let me tell you, I have learned quite a bit.

Sadly, I am not surprised. I am also not surprised if a good
deal of what you have learned has been painful. It certainly
has been for many of the rest of us. I think it is also
important to keep in mind that most of the underlying issues are
ultimately the result of variations and evolution in human
language and writing systems. Some of the decisions we have
made in the IETF, and that the Unicode Consortium has made, have
made some issues harder and some easier but, as long as we want
to deal with the full range of human languages and writing
systems, most of the problems are inherent and the choices end
up being about compromises (or, if you prefer, winners and
losers). Some of the relationships are rather old. For
example, rather explicit decisions were made many centuries ago
that Latin and Chinese characters should be easily
distinguishable by people with some familiarity with them but
without necessarily being literate. Latin ended up with a lower
familiarity requirement by having very few characters; Chinese
ended up with many characters and the advantages of a single
writing system that could express the meaning of many languages,
even ones that are mutually incomprehensible, while Rome took the
approach that everyone should simply learn Latin. That makes
the two scripts special even though some evolution to both in
more recent centuries has muddied the distinguishability
somewhat.

> Including about DNS queries which don't resolve properly.
> Sigh. That is still ANOTHER topic. As I say, I want to
> get organized and have a good way to show this. Not quite
> there yet!

See Andrew's comment (and my earlier one) about the Universal
Acceptance effort. But I suggest you need to go further than
reaching out to them. The sad reality is that there are very
few people in the Internet (not just IETF) community with a good
understanding of email protocols and operations, DNS and IDNA
protocols and operations, writing systems and character coding
including the many writing system issues that have nothing to do
with character coding, and Unicode operations. On many days,
I'm not even sure I'm part of that rather small group. Most of
us are willing to try to explain issues to others, but only to
those who want to put the energy into listening and learning,
not just saying things that amount to "as long as I think it
works for my language, everything else is fine" or "just tell me
what to do because I have no intention of learning or thinking
about the issues". That expertise is spread sufficiently thin
that answering a question or reviewing a document in one place
causes something else to be delayed or not happen (e.g., as
mentioned earlier, responding to your notes, which I considered
important, has retarded progress on both PRECIS and URNBIS work
for several days). So there are extremely pragmatic reasons for
you to try to either merge your work into the Universal
Acceptance effort (or vice versa) or to establish really clear
boundaries between the two.

And that is probably all the time I can put in on this today.

best,
john