Re: Requesting a revision of RFC3023

MURATA Makoto <murata@hokkaido.email.ne.jp> Sun, 21 September 2003 15:39 UTC

Received: from above.proper.com (localhost [127.0.0.1]) by above.proper.com (8.12.9/8.12.8) with ESMTP id h8LFdvKP080064 for <ietf-xml-mime-bks@above.proper.com>; Sun, 21 Sep 2003 08:39:57 -0700 (PDT) (envelope-from owner-ietf-xml-mime@mail.imc.org)
Received: (from majordom@localhost) by above.proper.com (8.12.9/8.12.9/Submit) id h8LFdvCV080063 for ietf-xml-mime-bks; Sun, 21 Sep 2003 08:39:57 -0700 (PDT)
X-Authentication-Warning: above.proper.com: majordom set sender to owner-ietf-xml-mime@mail.imc.org using -f
Received: from mail.asahi-net.or.jp (mail1.asahi-net.or.jp [202.224.39.197]) by above.proper.com (8.12.9/8.12.8) with ESMTP id h8LFdtKP080058 for <ietf-xml-mime@imc.org>; Sun, 21 Sep 2003 08:39:56 -0700 (PDT) (envelope-from murata@hokkaido.email.ne.jp)
Received: from [127.0.0.1] (i217217.ppp.asahi-net.or.jp [61.125.217.217]) by mail.asahi-net.or.jp (Postfix) with ESMTP id 751F376C6; Mon, 22 Sep 2003 00:39:56 +0900 (JST)
Date: Mon, 22 Sep 2003 00:37:18 +0900
From: MURATA Makoto <murata@hokkaido.email.ne.jp>
To: Elliotte Rusty Harold <elharo@metalab.unc.edu>
Subject: Re: Requesting a revision of RFC3023
Cc: ietf-xml-mime@imc.org, WWW-Tag <www-tag@w3.org>
In-Reply-To: <p06002001bb935fe56feb@[192.168.254.4]>
References: <20030921205754.506D.MURATA@hokkaido.email.ne.jp> <p06002001bb935fe56feb@[192.168.254.4]>
Message-Id: <20030922002118.5076.MURATA@hokkaido.email.ne.jp>
MIME-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit
X-Mailer: Becky! ver. 2.06.02
Sender: owner-ietf-xml-mime@mail.imc.org
Precedence: bulk
List-Archive: <http://www.imc.org/ietf-xml-mime/mail-archive/>
List-ID: <ietf-xml-mime.imc.org>
List-Unsubscribe: <mailto:ietf-xml-mime-request@imc.org?body=unsubscribe>

> By Unicode signature, I'm guessing you mean the BOM? That problem 
> seems to have been easily dealt with by simply deciding to allow it 
> in UTF-8. It doesn't appear to have caused any problems in practice 
> today.

In the case of XML, I think you are right.  In general, however, see

http://www.ietf.org/internet-drafts/draft-yergeau-rfc2279bis-05.txt

> I don't know what problems you refer to with "representation of 
> non-BMP characters". UTF-8 precisely specifies how these characters 
> are represented. There's no issue here. Did you mean something else?

Quite a few implementations use 6 bytes (rather than 4 bytes) to represent 
non-BMP characters: they encode each of the two UTF-16 surrogates as a 
separate 3-byte sequence.  See

http://www.unicode.org/reports/tr26/
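
To make the difference concrete, here is a minimal Python sketch of the 
two byte layouts.  It is only an illustration under my own assumptions: 
the function names (utf8_encode, cesu8_encode, three_byte) are 
hypothetical, and the 6-byte form is assumed to be the surrogate-pair 
encoding that TR26 calls CESU-8.

```python
def three_byte(unit: int) -> bytes:
    """Encode one 16-bit value with the 3-byte UTF-8 bit pattern."""
    return bytes([
        0xE0 | (unit >> 12),          # 1110xxxx
        0x80 | ((unit >> 6) & 0x3F),  # 10xxxxxx
        0x80 | (unit & 0x3F),         # 10xxxxxx
    ])


def utf8_encode(cp: int) -> bytes:
    """Standard UTF-8: a non-BMP code point becomes 4 bytes."""
    return chr(cp).encode("utf-8")


def cesu8_encode(cp: int) -> bytes:
    """6-byte form: split into a UTF-16 surrogate pair, then encode
    each surrogate separately as a 3-byte sequence."""
    if cp < 0x10000:
        # BMP scalar values are encoded the same way in both schemes.
        return chr(cp).encode("utf-8")
    v = cp - 0x10000
    high = 0xD800 | (v >> 10)    # high (leading) surrogate
    low = 0xDC00 | (v & 0x3FF)   # low (trailing) surrogate
    return three_byte(high) + three_byte(low)


if __name__ == "__main__":
    cp = 0x10400  # a non-BMP character
    print(utf8_encode(cp).hex(" "))   # f0 90 90 80
    print(cesu8_encode(cp).hex(" "))  # ed a0 81 ed b0 80
```

A strict UTF-8 decoder should reject the second byte sequence, because 
encoded surrogates are not valid in UTF-8.  That is why the two forms do 
not interoperate.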

-- 
MURATA Makoto <murata@hokkaido.email.ne.jp>