Re: [EAI] [IETF] Content Issues [ was: Internationalized Email Internet Draft]
Franck Martin <fmartin@linkedin.com> Fri, 14 October 2016 16:53 UTC
In-Reply-To: <489025644.216489.1476451836537@mail.yahoo.com>
References: <20161006055447.32573.qmail@pro-236-157.rediffmailpro.com> <9EC0EB65-9C58-43ED-9A80-1DA32C58E3E0@att.com> <E125B6AC26988823306936BF@JcK-HP5.jck.com> <489025644.216489.1476451836537@mail.yahoo.com>
From: Franck Martin <fmartin@linkedin.com>
Date: Fri, 14 Oct 2016 09:53:17 -0700
Message-ID: <CANyRh9-dag0j4KjE8_h7KmH=chFGrbn24=9c6Hyw+1JdiN79Vg@mail.gmail.com>
To: nalini.elkins@insidethestack.com
Content-Type: multipart/alternative; boundary="94eb2c070a8453df7c053ed61105"
Archived-At: <https://mailarchive.ietf.org/arch/msg/ima/JHy1F7Jn0R7Bn49a39fEhoIBroA>
Cc: Harish Chowdhary <harish@nixi.in>, "ima@ietf.org" <ima@ietf.org>
Subject: Re: [EAI] [IETF] Content Issues [ was: Internationalized Email Internet Draft]
I tried to subscribe to this list using my email address
弗兰克@互联网.公司 but mailman replied:

    Your subscription is not allowed because the email address you gave
    is insecure.

On Fri, Oct 14, 2016 at 6:30 AM, <nalini.elkins@insidethestack.com> wrote:

> John / Tony,
>
> I am going to split your comments into separate threads so that I can
> keep track of each. The first is about co-mingling content vs. headers.
>
> > (1) The so-called EAI standards, as listed in the Introduction, are
> > about email envelope and header information presented directly
> > (e.g., in UTF-8) as non-ASCII characters. A good deal of the
> > document appears to address mail content information such as
> > textual message bodies, in other scripts. With the possible
> > exception of language selection when a message is sent with the
> > same basic text in several languages (multipart/alternative was
> > designed with that case in mind but has been used in other ways),
> > we thought we solved that content problem with MIME in 1992. If
> > MIME is inadequate, the authors or others should produce a document
> > explaining the issues and not confuse them with EAI / SMTPUTF8. If
> > it is adequate, then, like Tony although perhaps for different
> > reasons, I don't see what Section 1.2 is doing here, what the
> > relevance of Section 3.2 is, and several other statements should be
> > examined carefully to be sure they are talking about addresses
> > and/or headers and not content.
>
> Yes. I see your point. Let me say first the basic thing that we are
> trying to do is to discuss the holistic user experience of
> internationalized emails from an operational point of view. In so
> doing, the co-mingling happened. We could do a second draft for
> content issues or change the abstract of this one to better state what
> our real goal is.
>
> Secondly, as you guys know well, there are lots of other issues with
> IDN, browser support, etc. What we were actually hoping is that we
> could have a forum (perhaps like DNSOps or v6Ops) where we could come
> together to define and discuss such problems, move towards best
> practices (or workarounds! Not that I like that, but it happens.)
> Because we have not even started on problems that we see such as
> search algorithm ranking of IDNs and so on. We were hoping that others
> would step up to author such other drafts.
>
> Thanks,
>
> Nalini Elkins
> Inside Products, Inc.
> www.insidethestack.com
> (831) 659-8360
>
> ------------------------------
> *From:* John C Klensin <klensin@jck.com>
> *To:* "HANSEN, TONY L" <tony@att.com>; ima@ietf.org
> *Sent:* Sunday, October 9, 2016 7:57 PM
> *Subject:* Re: [EAI] [IETF] Internationalized Email Internet Draft
>
> --On Thursday, October 06, 2016 4:39 PM +0000 "HANSEN, TONY L"
> <tony@att.com> wrote:
>
> > I think getting deployment feedback from EAI is important, and
> > this draft is an excellent start.
> >
> > I'm not convinced that section 1.2 describes a real problem.
> > People do this all the time today with various combinations of
> > languages. Why is the combination of Russian and Chinese any
> > different? If you think it is, then please expand on the
> > aspect that does make it more difficult.
> >
> > I forwarded a number of nits to the authors.
>
> Hi. I was going to hold off until some later and more mature
> version of this draft, but since Tony has commented, while I
> believe the issues with EAI deployment are important, I see
> several problems with this draft, some of which were actually
> discussed in the WG but appear to be ignored here. Perhaps more
> important, it is seriously incomplete relative to issues that
> have been discussed at great length in the EAI WG, at the APEC
> meeting on internationalized email in Beijing in October 2014,
> the May 2015 workshop in Thailand, and elsewhere. I strongly
> suggest that, if there is going to be a discussion in Seoul,
> this document is in need of a great deal of work first. Some of
> those issues are:
>
> (1) The so-called EAI standards, as listed in the Introduction,
> are about email envelope and header information presented
> directly (e.g., in UTF-8) as non-ASCII characters. A good deal
> of the document appears to address mail content information such
> as textual message bodies, in other scripts. With the possible
> exception of language selection when a message is sent with the
> same basic text in several languages (multipart/alternative was
> designed with that case in mind but has been used in other
> ways), we thought we solved that content problem with MIME in
> 1992. If MIME is inadequate, the authors or others should
> produce a document explaining the issues and not confuse them
> with EAI / SMTPUTF8. If it is adequate, then, like Tony
> although perhaps for different reasons, I don't see what Section
> 1.2 is doing here, what the relevance of Section 3.2 is, and
> several other statements should be examined carefully to be sure
> they are talking about addresses and/or headers and not content.
>
> (2) Within an address, there is, as the I-D points out and
> consistent with RFC 5321, a local part and a domain part. RFCs
> 6530 and 6531 make it quite clear (at least we thought they did)
> that they are handled differently. For the domain part, the
> rules are laid out in the IDNA2008 specs (RFC 5890ff). Issues
> about look-alike characters have been extensively discussed and
> written about (even though some of us have questioned the
> quality of some of that work). It does not seem useful to me to
> revisit those issues here, especially without reference to the
> prior work and discussions or if some of the discussion here is
> wrong or contains obvious omissions. As an example from the
> first paragraph of Section 6.1, Latin "c" (U+0063) and Cyrillic
> "c" (U+0441) are typically written with identical graphemes, but
> are not on the list. More important, while the "paypal"
> example with U+0430 substituted for "a" (U+0061) has been used
> repeatedly, including in a careful study in an article that is
> not cited in this draft, it is possible to write "раура1"
> with the first five characters in Cyrillic and the last one a
> digit (which is script independent)
> (\u'0440'\u'0430'\u'0443'\u'0440'\u'0430'\u'0031' [1]), therefore
> not even violating conventions prohibiting mixed-script labels.
> There is, of course, no ambiguity in the A-label form, although
> the authors quite properly point out that it is not
> user-friendly.
>
> By contrast, Section 1.1 talks about display of email addresses,
> including the local part ("in Punycode" [2]). While a mail
> delivery server is free to create whatever aliases for a mailbox
> local part it likes, including "xn-t2bmh3a" or "123456",
> "george" or "example", in general converting a local part using
> the Punycode algorithm and displaying the result is prohibited
> by the EAI standards (and, incidentally, RFC5321). More
> important, it will often lose information and is potentially
> very dangerous.
>
> (3) Arabic should not be confused with a strictly right-to-left
> writing system. I am not aware of any such systems in wide use
> for contemporary languages today. The problem is that numerals,
> whether written in European digits, Arabic or Arabic-Indic
> digits, Chinese (Han) digits, or many others, have been written
> left to right since that type of positional notation was
> invented and became widely used. As a result, the scripts are
> referred to (in Unicode-speak) as "bidirectional" or "bidi" [3].
> Their implications for domain names and IDNA are the subject of
> RFC 5893.
>
> (4) Multiple addresses for one user (and Section 4). Keeping in
> mind that many people maintain a number of identities, and even
> multiple email addresses, for different purposes, I don't
> understand what point you are trying to make with this section.
> Many of us believe that users who have mailboxes whose names
> involve non-ASCII local parts and who engage in communications
> outside their primary language group will find it necessary to
> maintain either separate all-ASCII mailboxes or all-ASCII
> aliases to their primary mailboxes and to do so for a very long
> time. That issue has been extensively analyzed and discussed
> but this document avoids that work, which is both a problem and
> an opportunity.
>
> (5) Section 2.1 asserts that email servers, implying all of
> them, store data (messages?) in relational databases. That is
> simply false. Some do; others don't. Even for those that do,
> there may be a difference between Unicode-capable data storage
> and Unicode-capable keys or indexes. There is also absolutely
> no requirement that any such system store Unicode strings
> encoded in UTF-8; many do not.
>
> (6) There is a necessary difficulty with SMTPUTF8, which is that
> one cannot transmit a message with non-ASCII characters in
> addresses or headers to a system that does not support them.
> Final delivery systems should probably not accept messages
> unless they have reason to predict that the mail store will
> handle them _and_ that the user associated with the target
> mailbox will be able to retrieve them. Since a user with an
> all-ASCII mailbox name might still receive a message with, e.g.,
> a non-ASCII backward-pointing address in the envelope or
> headers, making that decision is not straightforward. That
> leads to a strong case that, if one wants broad deployment of
> SMTPUTF8, the place to start is with the MUAs (including the
> Webmail systems) and associated POP and IMAP servers and
> clients. The "to various extents" list in the first part of
> Section 3 is not particularly helpful in that regard.
>
> (7) Finally, this is an internationalization (i18n) problem as
> much as it is an email problem. Terminology (and, where
> characters or code points are referred to, their precise
> identification) is very important because the alternative is
> typically a good deal of user confusion about what you are
> talking about and other impediments to making progress. Saying
> "English" where you mean "Basic Latin Script" or "ASCII" is not
> helpful, especially given that 5321 local parts can include any
> ASCII character and that ASCII is not sufficient to write
> English. Conversely, it appears that there are a few places
> where, correctly or incorrectly, you really do mean "English"
> when you say that. Similarly, talking about one particular
> encoding when you mean "Unicode" is confusing and may be
> misleading. RFC 6365 may give you a start on some of the issues.
>
> regards,
> john
>
> -------------
> [1] I recommend the authors have a look at RFC 5137.
>
> [2] Punycode is an encoding method, not a display format. See
> RFC 5890, Section 2.3.4.
>
> [3] http://unicode.org/reports/tr9/
>
> _______________________________________________
> IMA mailing list
> IMA@ietf.org
> https://www.ietf.org/mailman/listinfo/ima
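
For readers who want to check the look-alike label in point (2) of the quoted message, here is a minimal Python sketch. It is an illustration only and was not part of the thread. It builds the label from the six code points listed there, prints the Unicode name of each character, and derives an A-label by prefixing "xn--" to the output of Python's built-in "punycode" codec. The helper names (describe, to_a_label, from_a_label) are hypothetical, and the conversion assumes the label is already lowercase, normalized, and valid under IDNA2008; no validation is performed.

```python
import unicodedata

# Code points from point (2): five Cyrillic letters followed by DIGIT ONE.
CODE_POINTS = [0x0440, 0x0430, 0x0443, 0x0440, 0x0430, 0x0031]
lookalike = "".join(chr(cp) for cp in CODE_POINTS)


def describe(label: str) -> list:
    """Return the Unicode character name of every code point in the label."""
    return [unicodedata.name(ch) for ch in label]


def to_a_label(u_label: str) -> str:
    """Sketch of a U-label to A-label conversion ('xn--' + RFC 3492 Punycode).

    Assumption: the input is already a valid, lowercase, normalized label.
    All-ASCII labels are returned unchanged.
    """
    if u_label.isascii():
        return u_label
    return "xn--" + u_label.encode("punycode").decode("ascii")


def from_a_label(a_label: str) -> str:
    """Inverse of to_a_label for labels carrying the 'xn--' prefix."""
    if a_label.startswith("xn--"):
        return a_label[4:].encode("ascii").decode("punycode")
    return a_label


if __name__ == "__main__":
    for ch, name in zip(lookalike, describe(lookalike)):
        print(f"U+{ord(ch):04X}  {name}")
    print("A-label:", to_a_label(lookalike))
```

The printed names make the script of each character explicit even though the rendered label resembles an all-Latin string, and the A-label round-trips back to the same code points. This conversion applies to domain labels only; as the quoted message notes, applying the Punycode algorithm to a local part is a different matter.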