Re: [EAI] UTF-8 in Message-IDs Mon, 15 August 2011 15:35 UTC

Date: Mon, 15 Aug 2011 08:19:44 -0700 (PDT)
In-reply-to: "Your message dated Mon, 15 Aug 2011 10:17:45 +0100" <>
MIME-version: 1.0
Content-type: TEXT/PLAIN; charset=iso-8859-1; Format=flowed; DelSp=yes
References: <> <> <>
To: Charles Lindsey <>
Cc: IMA <>
Subject: Re: [EAI] UTF-8 in Message-IDs
List-Id: "EAI \(Email Address Internationalization\)" <>

> On Sat, 13 Aug 2011 23:08:48 +0100, <> wrote:

> >> And the "EAI experiment" phase did not test this plan,
> >> there's no evidence I'm aware of that UTF-8 in Message-IDs
> >> is harmless.
> >
> > I'm sure it is not, just as utf-8 in addresses is far from harmless and is
> > going to require all sorts of infrastructure changes.
> >
> > But I fail to see how this is in any way relevant. We're defining a new
> > message format here that *cannot* be downgraded to the old format and retain
> > all semantics. This is true irrespective of how message-ids are handled. As
> > such, utf-8 in message-ids is a small additional cost.

> Yes, but you have to consider all the other protocols using mail-like
> formats and their use of the Message-ID.

These protocols don't make use of email addresses? 

> For example, if EAI were to be
> carried over into Netnews (quite a likely development) it would NOT be
> regarded as a "new message format" since the transport paths for Netnews
> are already 8-bit clean and it would simply be necessary for those who
> wish to take advantage of the new facilities to ensure that their user
> agents were suitably upgraded. There will never be a need for
> "downgrading" except at gateways back into the email system.

And all netnews applications properly handle the construction and display of
utf-8 addresses in message headers? Just as one example, netnews already
supports the proper form of Unicode normalization on input for this. Right?
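
To make the normalization point concrete, here is a minimal Python sketch. It
assumes NFC is the form wanted on input; the address and the helper name
normalize_address are hypothetical and chosen only for illustration. It shows
two spellings of what a user would consider the same address: they differ as
UTF-8 byte strings, and only compare equal once both are normalized.

```python
import unicodedata


def normalize_address(addr: str) -> str:
    """Return addr in NFC (the target form assumed for this sketch)."""
    return unicodedata.normalize("NFC", addr)


# Two spellings of the same hypothetical address.
composed = "jos\u00e9@example.com"     # precomposed e-acute (U+00E9)
decomposed = "jose\u0301@example.com"  # "e" followed by combining acute (U+0301)

# An 8-bit clean transport carries either byte sequence unchanged,
# but the raw bytes are not the same.
raw_equal = composed.encode("utf-8") == decomposed.encode("utf-8")

# Only after normalization do the two spellings compare equal.
normalized_equal = normalize_address(composed) == normalize_address(decomposed)

print(raw_equal, normalized_equal)
```

An agent that skips this step on input can pass both forms through untouched
and still treat them as two different addresses.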

An 8-bit clean transport is nowhere near sufficient to accommodate the other
changes to the format we're making here. Message ids are actually the least of
it since they are machine generated.

> But within Netnews the Message-ID plays a crucial role, so it is
> reasonable to ask whether UTF-8 in it would cause problems. According to
> the current standards it would not be allowed, of course, but in practice
> the transport paths might or might not barf. So the question has to be
> asked (and I do not know the answer off the top of my head).

I'm sorry, but the entire approach of this WG is to define a format that is NOT
downgradeable to previous formats without substantial information loss. Given
that's the context here, how can the question of "how well does this downgrade
in this particular case" possibly be relevant?

Look, I'll be the first to state that the approach being taken here is risky.
In fact I would never have chosen this approach myself. But that doesn't
change the fact that there's a consensus to proceed in this fashion.