Re: [EAI] UTF-8 in Message-IDs

Dave CROCKER <> Mon, 22 August 2011 20:46 UTC

Date: Tue, 16 Aug 2011 07:47:50 -0700
From: Dave CROCKER <>
Organization: Brandenburg InternetWorking

On 8/15/2011 2:03 PM, wrote:
> The addition of utf-8 to various syntactic elements isn't convoluted. In fact,
> having looked at the impact on our code base, it's going to be pretty easy to
> add.

That mirrors the effect I saw when making generic changes to the ABNF, rather
than treating UTF-8 as an extended set of special cases around the syntax.

>> I don't see anything "compelling" in that line of reasoning, otherwise
>> we could also say that using 8bit in DNS queries works, and therefore
>> it's okay to use UTF-8 in host names (example).
> Except that it doesn't - the DNS protocols may support 8-bit but there are
> specific restrictions on host names that some deployed software honors. And if
> this wasn't the case you'd probably hear exactly this argument being made.

In practical terms (de facto group consensus), the recent version of the EAI
effort has chosen to design for a hypothetically pure 8-bit environment.
Dealing with the much messier hybrid world is deferred here.

The Unicode effort for the DNS chose to work within the hybrid world and, 
therefore, had to make fundamentally different design choices.

Design DNS support for a truly pure UTF-8 environment, end-to-end, and it might
well be reasonable to 'just do 8 bits'.  But they didn't have that luxury.



   Dave Crocker
   Brandenburg InternetWorking