Experiment with UTF-8 in message-IDs

Julien ÉLIE <julien@trigofacile.com> Sun, 09 October 2011 19:28 UTC

Message-ID: <4E91F594.40205@trigofacile.com>
Date: Sun, 09 Oct 2011 21:27:16 +0200
From: =?ISO-8859-1?Q?Julien_=C9LIE?= <julien@trigofacile.com>
Organization: TrigoFACILE -- http://www.trigofacile.com/
User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.7; fr; rv:1.9.2.23) Gecko/20110920 Thunderbird/3.1.15
MIME-Version: 1.0
To: ietf-usefor@imc.org
Subject: Experiment with UTF-8 in message-IDs
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit
Sender: owner-ietf-usefor@mail.imc.org
Precedence: bulk
List-Archive: <http://www.imc.org/ietf-usefor/mail-archive/>
List-Unsubscribe: <mailto:ietf-usefor-request@imc.org?body=unsubscribe>
List-ID: <ietf-usefor.imc.org>

Hi all,

In the IETF working group for IMA (Internationalized eMail Address),
there is a current thread about UTF-8 in message-IDs:
    http://www.ietf.org/mail-archive/web/ima/current/threads.html#04330

Quick references in the thread:

http://www.ietf.org/mail-archive/web/ima/current/msg04430.html
http://www.ietf.org/mail-archive/web/ima/current/msg04344.html
http://www.ietf.org/mail-archive/web/ima/current/msg04345.html
http://www.ietf.org/mail-archive/web/ima/current/msg04420.html
http://www.ietf.org/mail-archive/web/ima/current/msg04422.html



RFC 5536 (USEFOR) currently allows only ASCII characters in message-IDs.

INN 2.4 and INN 2.5 have always rejected message-IDs containing
non-ASCII characters.  (I have not looked at INN 2.3 and earlier.)  When
a message-ID is not valid per RFC 850/1036/... and now 5536, the
article is rejected.

200 news.trigofacile.com InterNetNews server INN 2.6.0 (20110908 prerelease) ready (transit mode)
IHAVE <©@fr>
435 Syntax error in message-ID
MODE READER
200 news.trigofacile.com InterNetNews NNRP server INN 2.6.0 (20111003 prerelease) ready (posting ok)
ARTICLE <©@test>
501 Syntax error in message-ID
QUIT
205 Bye!


(Note that 435 is returned in response to IHAVE for legacy reasons;
per RFC 3977, 501 should be the real response code.)
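
To make the current behaviour concrete, here is a minimal sketch of an ASCII-only message-ID check.  It is only an approximation of the RFC 5536 grammar written for illustration, not INN's actual code; the helper name is_valid_ascii_msgid is hypothetical.

```python
import re

# atext: ASCII letters, digits and a fixed set of punctuation characters.
ATEXT = r"[A-Za-z0-9!#$%&'*+\-/=?^_`{|}~]"
# dot-atom-text: one or more atext runs separated by single dots.
DOT_ATOM_TEXT = rf"{ATEXT}+(?:\.{ATEXT}+)*"
# mdtext: printable ASCII except ">", "[", "]" and "\".
MDTEXT = r"[\x21-\x3D\x3F-\x5A\x5E-\x7E]"
# "<" id-left "@" id-right ">", where id-right may be a no-fold-literal.
MSGID_RE = re.compile(rf"<{DOT_ATOM_TEXT}@(?:{DOT_ATOM_TEXT}|\[{MDTEXT}*\])>")

MAX_MSGID_OCTETS = 250  # length limit stated in RFC 5536


def is_valid_ascii_msgid(msgid: str) -> bool:
    """Hypothetical helper: True if msgid fits the ASCII-only grammar above."""
    if len(msgid.encode("utf-8")) > MAX_MSGID_OCTETS:
        return False
    return MSGID_RE.fullmatch(msgid) is not None


print(is_valid_ascii_msgid("<4E91F594.40205@trigofacile.com>"))  # True
print(is_valid_ascii_msgid("<©@fr>"))                            # False
```

With such a check, the message-ID of this very mail is accepted, whereas the <©@fr> used in the session above is refused because "©" is not an ASCII character.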




My question is:  should we relax the check right now so as to allow
UTF-8 in message-IDs?
If so, is there anything else to enforce?  (NFC normalization?)

Of course, other requirements from RFC 5536 will remain (that is to say
no comments in the Message-ID: header field, and no ">" or WSP).
U+00A0 (&nbsp; in HTML) and other spaces encoded in UTF-8 are allowed,
aren't they?
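
As a basis for discussion, a relaxed check could look like the sketch below.  Everything in it is an assumption on my side, not something a published specification mandates: any validly UTF-8-encoded non-ASCII character is accepted wherever an ASCII atext character is accepted today, the ASCII rules are otherwise kept (so no ">", space or TAB), the 250-octet limit is counted on the encoded form, NFC is an optional requirement, and the no-fold-literal form of the right-hand side is left out for brevity.  The name is_valid_relaxed_msgid is hypothetical.

```python
import string
import unicodedata

# ASCII characters currently allowed in each dot-separated atom.
ASCII_ATEXT = frozenset(string.ascii_letters + string.digits + "!#$%&'*+-/=?^_`{|}~")
MAX_MSGID_OCTETS = 250


def _is_relaxed_dot_atom(part: str) -> bool:
    """Each dot-separated atom is non-empty and made of atext or non-ASCII."""
    return all(
        atom and all(ch in ASCII_ATEXT or ord(ch) > 0x7F for ch in atom)
        for atom in part.split(".")
    )


def is_valid_relaxed_msgid(raw: bytes, require_nfc: bool = True) -> bool:
    """Hypothetical relaxed check on the raw octets of a message-ID."""
    if len(raw) > MAX_MSGID_OCTETS:
        return False
    try:
        text = raw.decode("utf-8")  # strict: invalid UTF-8 is rejected
    except UnicodeDecodeError:
        return False
    if not (text.startswith("<") and text.endswith(">")):
        return False
    core = text[1:-1]
    if core.count("@") != 1:
        return False
    if require_nfc and unicodedata.normalize("NFC", text) != text:
        return False
    left, right = core.split("@")
    return _is_relaxed_dot_atom(left) and _is_relaxed_dot_atom(right)


print(is_valid_relaxed_msgid("<©@fr>".encode("utf-8")))
print(is_valid_relaxed_msgid("<a\u00a0b@test>".encode("utf-8")))
print(is_valid_relaxed_msgid(b"<a b@test>"))
```

Under these assumptions, U+00A0 goes through because only the ASCII space and TAB are treated as WSP, while a message-ID sent in ISO-8859-1 (for instance the single octet 0xA9 for "©") is refused as invalid UTF-8.  Whether further classes of non-ASCII characters should be excluded is precisely the open question.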



We plan on releasing INN 2.5.3 soon, so perhaps we can relax the check
starting from INN 2.5.3.  I will ask on the inn-workers mailing list,
provided of course that there are no objections on this USEFOR mailing
list to going this way.

-- 
Julien ÉLIE

« I don't know if it's what you want, but it's what you get. »
  (Larry Wall)