Require guidance on Unicode in IETF formats

Stephane Bortzmeyer, Wed, 10 January 2007 20:32 UTC

To: "Lisa Dusseault - App. Area Director", "Ted Hardie - App. Area Director"
List-Id: Discussion on state machine specification in IETF protocols

We require some guidance from our Area Directors about the use of
Unicode in an IETF format. On the mailing list, a discussion was
raised on whether we should accept only ASCII in the language we
define (our work is to define a format, not a protocol) or the full
Unicode character set. (See the original thread and its follow-ups in
the list archive.)

Some people claimed that Unicode support was more or less mandatory at
the IETF and that a format without it had no chance of being
adopted. Besides, internationalization is a very good thing for the
world-wide Internet.

Some people feared that mandating Unicode would complicate the grammar
and would drastically reduce the number of tools available to write
parsers for this format. They think that Cosmogol, being intended
mostly for RFCs or other ultra-technical usages, does not have the
same requirements as a general protocol like HTTP or NNTP.

We identified the following RFCs as possibly relevant:

RFC 2277 / BCP 18 IETF Policy on Character Sets and Languages

RFC 2223 Instructions to RFC Authors

RFC 3536 Terminology Used in Internationalization in the IETF

But none seems to give a clear answer. Is Unicode support a MUST, a
SHOULD or a MAY in a new protocol?

How many *new* IETF formats are in Unicode? (Apart from those based
only on XML, like Atom in RFC 4287.) Old formats like ABNF do not
count because they derive from an older format.

Cosmogol mailing list