UTF-8 or US-ASCII in RFC 822bis?
Pete Loshin <pete@loshin.com> Thu, 13 May 1999 00:04 UTC
Received: from CS.UTK.EDU (CS.UTK.EDU [128.169.94.1]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id UAA11442 for <drums-archive@odin.ietf.org>; Wed, 12 May 1999 20:04:23 -0400 (EDT)
Received: from localhost by CS.UTK.EDU with SMTP (cf v2.9s-UTK) id UAA10265; Wed, 12 May 1999 20:03:57 -0400 (EDT)
Received: by cs.cs.utk.edu (bulk_mailer v1.12); Wed, 12 May 1999 20:03:42 -0400
Received: by CS.UTK.EDU (cf v2.9s-UTK) id UAA10216; Wed, 12 May 1999 20:03:40 -0400 (EDT)
Received: from chmls06.mediaone.net (LOCALHOST.cs.utk.edu [127.0.0.1]) by CS.UTK.EDU with ESMTP (cf v2.9s-UTK) id UAA10172; Wed, 12 May 1999 20:03:28 -0400 (EDT)
Received: from chmls06.mediaone.net (24.128.1.71 -> chmls06.mediaone.net) by CS.UTK.EDU (smtpshim v1.0); Wed, 12 May 1999 20:03:28 -0400
Received: from loshin.com (loshin.ne.mediaone.net [24.128.200.158]) by chmls06.mediaone.net (8.8.7/8.8.7) with ESMTP id UAA22160 for <drums@cs.utk.edu>; Wed, 12 May 1999 20:03:25 -0400 (EDT)
Message-ID: <373A1566.16B1D365@loshin.com>
Date: Wed, 12 May 1999 19:57:26 -0400
From: Pete Loshin <pete@loshin.com>
X-Mailer: Mozilla 4.51 [en] (Win98; U)
X-Accept-Language: en
MIME-Version: 1.0
CC: drums@cs.utk.edu
Subject: UTF-8 or US-ASCII in RFC 822bis?
References: <372711F8.F1021116@ecal.com>
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
List-Unsubscribe: <mailto:drums-request@cs.utk.edu?Subject=unsubscribe>
OK. I've just been poring over some RFCs and I-Ds, and was wondering.

Here's what "Internet Message Format Standard"
(http://www.ietf.org/internet-drafts/draft-ietf-drums-msg-fmt-07.txt)
says about character sets:

   2.1. General Description

   At the most basic level, a message is a series of characters. A
   message that is conformant with this standard is comprised of
   characters with values in the range 1 through 127 and interpreted
   as US-ASCII characters [ASCII]. For brevity, this document
   sometimes refers to this range of characters as simply "US-ASCII
   characters".

And here's what RFC 2277, "IETF Policy on Character Sets and
Languages" (BCP-18) says about character sets:

   3.1. What charset to use

   All protocols MUST identify, for all character data, which charset
   is in use.

   Protocols MUST be able to use the UTF-8 charset, which consists of
   the ISO 10646 coded character set combined with the UTF-8
   character encoding scheme, as defined in [10646] Annex R
   (published in Amendment 2), for all text.

So, my question is, what's going on? Is 822bis going to support UTF-8,
as it is supposed to do (at least, according to the way I read RFC
2277)? Or is there some reason not to support UTF-8 that I don't know
about? Or am I missing something somewhere, and it really does support
UTF-8?

Any explanations or clarifications would be most welcome.

Thanks!

-pl

+-------------------------------------------------------------+
| Pete Loshin <pete@loshin.com>               +1 781/646-6318 |
|                                                             |
| _IPv6 Clearly Explained_       Morgan Kaufmann January 1999 |
| _TCP/IP Clearly Explained_ 3rd ed Morgan Kaufmann June 1999 |
|                                                             |
+-------------------------------------------------------------+
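
To make the tension between the two quoted passages concrete, here is a minimal Python sketch. It assumes that "conformant" can be read as a simple per-octet check against the range 1 through 127 from the quoted section 2.1. The function name is_draft_conformant and the sample header strings are hypothetical, chosen only for illustration; they come from neither the draft nor RFC 2277.

```python
def is_draft_conformant(data: bytes) -> bool:
    """Return True if every octet lies in 1..127, the range quoted from section 2.1.

    Assumption: conformance is modelled here as a plain per-octet range check.
    """
    return all(1 <= octet <= 127 for octet in data)


ascii_text = "Subject: hello"
non_ascii_text = "Subject: caf\u00e9"  # ends in U+00E9, which is outside US-ASCII

# Text made only of US-ASCII characters has identical octets in both encodings.
assert ascii_text.encode("utf-8") == ascii_text.encode("ascii")

print(is_draft_conformant(ascii_text.encode("utf-8")))      # True
print(is_draft_conformant(non_ascii_text.encode("utf-8")))  # False
print(is_draft_conformant(b"nul\x00octet"))                 # False
```

Under that reading, UTF-8 text passes the check only when it consists solely of US-ASCII characters other than NUL; any other character encodes to octets above 127 and falls outside the quoted range.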
- Heads-up: base64 John Stracke
- UTF-8 or US-ASCII in RFC 822bis? Pete Loshin
- Re: UTF-8 or US-ASCII in RFC 822bis? Robert Elz
- Re: UTF-8 or US-ASCII in RFC 822bis? Perry E. Metzger
- Re: UTF-8 or US-ASCII in RFC 822bis? D. J. Bernstein