Re: [EAI] [eai] #11: Capture the scenario of "SHOULD not use UTF-8 in Message-ID" for future advice draft

ned+ima@mrochek.com Mon, 19 September 2011 18:19 UTC

From: ned+ima@mrochek.com
Message-id: <01O68HLMXW6O014O5Z@mauve.mrochek.com>
Date: Mon, 19 Sep 2011 11:06:52 -0700
In-reply-to: "Your message dated Mon, 19 Sep 2011 10:25:01 -0700" <4E777AED.5010906@dcrocker.net>
MIME-version: 1.0
Content-type: TEXT/PLAIN; Format="flowed"
References: <061.9721166081247d3e53392a010db4b79a@trac.tools.ietf.org> <01O68AN5NRSW014O5Z@mauve.mrochek.com> <4E777AED.5010906@dcrocker.net>
To: Dave CROCKER <dhc2@dcrocker.net>
Cc: ima@ietf.org
Subject: Re: [EAI] [eai] #11: Capture the scenario of "SHOULD not use UTF-8 in Message-ID" for future advice draft

> On 9/19/2011 7:58 AM, ned+ima@mrochek.com wrote:
> >>   "Implementers of message-id generation algorithms may prefer to restrain
> >>   their output to ASCII since that has a few minor advantages, such as when
> >>   constructing References fields in mailing-list threads where some senders
> >>   use EAI and others not."


> I looked over the draft, and this language made me very uncomfortable.

I don't see why. It's a fact that using 8-bit utf-8 in message-ids (1) isn't
really necessary and (2) creates a situation where replies are forced to be EAI
messages when otherwise they may not have needed to be. Saying that generators
of such fields need to be aware of the consequences seems completely reasonable
to me, and it also aligns quite well with the nearby sections on line limits,
encoded words, and encoding restrictions, all of which have very much the same
overall character as this.
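
A minimal sketch of that cascade, assuming the usual rule that a reply copies
the parent's message-id into In-Reply-To and appends it to References. The
message-ids and the helper names (`needs_eai`, `build_reply_headers`) are
hypothetical and only illustrate the point; they are not taken from any draft.

```python
def needs_eai(header_values):
    """Return True if any header value contains a non-ASCII character."""
    return any(not value.isascii() for value in header_values)


def build_reply_headers(parent_message_id, parent_references, own_message_id):
    """Build the threading fields of a reply from the parent's fields."""
    # References = parent's References followed by the parent's message-id.
    references = (parent_references + " " + parent_message_id).strip()
    return {
        "Message-ID": own_message_id,
        "In-Reply-To": parent_message_id,
        "References": references,
    }


# Hypothetical message-ids, for illustration only.
ascii_parent = "<01ABC@mauve.example.com>"
utf8_parent = "<01ABC@почта.example>"

reply_a = build_reply_headers(ascii_parent, "", "<reply1@example.org>")
reply_b = build_reply_headers(utf8_parent, "", "<reply2@example.org>")

print(needs_eai(reply_a.values()))  # False: the reply can stay pure ASCII
print(needs_eai(reply_b.values()))  # True: the reply now carries 8-bit utf-8
```

The second reply has an ASCII message-id of its own, yet it still contains
8-bit utf-8 because of what it inherited, and every later message in the
thread inherits the same References entry.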

> The reason is that -- once again -- it has this draft trying to juggle a hybrid
> (UTF-8/ASCII-only) world, rather than a much simpler UTF-8 only world.  Indeed,
> the language that results looks awkward.

No, it really isn't doing that.

> A simple question is why this juggling is appropriate here but not in the rest
> of the core drafts, including this one?

Since the juggling you refer to is in fact nonexistent, this question is
unanswerable.

> For the situations that need this special treatment, why do they /not/ also need
> it for all other occurrences of UTF-8?

Because they aren't comparable cases. Again, there is no real need to generate
utf-8 message-ids - these are fields for automatic processing, not presentation
to the user. The same cannot be said for addresses, subject fields, etc. etc.
And yes, there are other automatically processed fields where we're now
allowing 8-bit utf-8, but AFAIK none of them have the additional cascading
effect message-ids have. (And if someone can identify additional cases they
should also be noted in this specification.)

In short, this is a special case, one that arises from a combination of
factors, and for that reason it's worth noting.
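
For illustration, one way a generator could keep its output in ASCII. This is
a sketch under stated assumptions, not something the draft prescribes: the
function name `make_ascii_message_id` is hypothetical, the left-hand side is
just a timestamp plus random hex, and converting a non-ASCII domain to its
"xn--" form with Python's built-in `idna` codec is my own choice.

```python
import secrets
import time


def make_ascii_message_id(domain):
    """Build a message-id that contains ASCII characters only."""
    # Assumption: a non-ASCII domain is converted to its ASCII ("xn--") form.
    ascii_domain = domain.encode("idna").decode("ascii")
    # Timestamp plus random hex digits: unique enough for a sketch, all ASCII.
    unique = f"{int(time.time())}.{secrets.token_hex(8)}"
    return f"<{unique}@{ascii_domain}>"


# Hypothetical domains, for illustration only.
print(make_ascii_message_id("mauve.example.com"))
print(make_ascii_message_id("почта.example"))
```

Nothing is lost by doing this, since the field is only ever compared by
software and never shown to the user.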

> And in answering that question, consider
> whether those are edge conditions or likely to be typical.

The message-id issue is most certainly going to cause problems for clients.

				Ned