Re: comment for draft-moore-mime-cdisp-00

Keith Moore <moore@cs.utk.edu> Wed, 06 August 1997 20:46 UTC

Received: (from majordomo@localhost) by mail.proper.com (8.8.5/8.7.3) id NAA27006 for ietf-822-bks; Wed, 6 Aug 1997 13:46:08 -0700 (PDT)
Received: from spot.cs.utk.edu (SPOT.CS.UTK.EDU [128.169.92.189]) by mail.proper.com (8.8.5/8.7.3) with ESMTP id NAA27001 for <ietf-822@imc.org>; Wed, 6 Aug 1997 13:46:04 -0700 (PDT)
Received: from cs.utk.edu by spot.cs.utk.edu with ESMTP (cf v2.11c-UTK) id QAA26350; Wed, 6 Aug 1997 16:50:07 -0400 (EDT)
Message-Id: <199708062050.QAA26350@spot.cs.utk.edu>
X-Mailer: exmh version 2.0gamma 1/27/96
X-URI: http://www.cs.utk.edu/~moore/
From: Keith Moore <moore@cs.utk.edu>
To: "Kenzaburou Tamaru (Exchange)" <kenzat@EXCHANGE.MICROSOFT.com>
cc: ietf-822@imc.org, Mark Crispin <MRC@cac.washington.edu>, moore@cs.utk.edu
Subject: Re: comment for draft-moore-mime-cdisp-00
In-reply-to: Your message of "Tue, 05 Aug 1997 21:06:19 PDT." <2FBF98FC7852CF11912A00000000000105ED95ED@DINO>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Date: Wed, 06 Aug 1997 16:50:06 -0400
Sender: owner-ietf-822@imc.org
Precedence: bulk

> Regarding the filename parameter, no Asian languages are taken into
> consideration. As a result, it is something of a nightmare, especially
> in Japan. Many companies use different encoding schemes. This is because
> no RFC currently allows Asian characters in file names; only US-ASCII
> is allowed.

I believe this is solved with draft-moore-mime-cdisp-01.txt, which
references draft-freed-pvcsc-03.txt.

In particular, a content-disposition header could look like

content-disposition: attachment; filename*=iso-2022-jp'xx'yyyyyyyy

where xx is the ISO 639 language code for Japanese (sorry, I don't
have a copy of ISO 639) and yyyyyyyy is Japanese text in the iso-2022-jp
charset, encoded using URL-style "%XX" encoding for each octet.
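
As an illustration only, here is a minimal Python sketch of how such a
parameter value could be built. The function names encode_ext_value and
content_disposition are hypothetical, not taken from either draft. The
language tag "ja" is the ISO 639 two-letter code for Japanese. The sketch
percent-encodes every octet, as described above, even where an octet is
plain ASCII.

```python
def encode_ext_value(text, charset, language):
    """Return charset'language'%XX... with every octet percent-encoded.

    Hypothetical helper: the name and signature are assumptions made
    for this sketch, not part of any draft or library.
    """
    octets = text.encode(charset)
    escaped = "".join("%{:02X}".format(b) for b in octets)
    return "{}'{}'{}".format(charset, language, escaped)


def content_disposition(filename, charset="iso-2022-jp", language="ja"):
    """Build an attachment header line using the filename* form."""
    value = encode_ext_value(filename, charset, language)
    return "content-disposition: attachment; filename*=" + value


if __name__ == "__main__":
    print(content_disposition("日本語.txt"))
```

A receiver would reverse the steps: split the value on the two
apostrophes, turn each "%XX" back into an octet, and decode the octets
using the named charset.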

Unfortunately, the encoding is very inefficient.  Each Japanese
two-octet character will require six ASCII characters to encode (for
example, ABCD hex gets encoded as "%AB%CD").  Text in the UTF-8 charset
would require even more encoding overhead using this method.
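
The overhead can be checked with a short Python sketch. The helper name
percent_encoded_length is hypothetical, and euc-jp is included here only
as an assumed example of a plain two-octet Japanese encoding for
comparison. Every octet becomes three ASCII characters, so the encoded
length is three times the octet count in the chosen charset.

```python
def percent_encoded_length(text, charset):
    """Number of ASCII characters needed when every octet becomes %XX.

    Hypothetical helper written for this sketch.
    """
    return 3 * len(text.encode(charset))


if __name__ == "__main__":
    sample = "日本語"  # three Japanese characters
    for charset in ("euc-jp", "iso-2022-jp", "utf-8"):
        total = percent_encoded_length(sample, charset)
        print("{:12s} {:3d} ASCII characters".format(charset, total))
```

In UTF-8 these characters take three octets each, so each one costs nine
ASCII characters rather than six. For iso-2022-jp the count also includes
the escape sequences that switch into and out of the two-octet character
set, so short strings cost more than six characters per Japanese
character.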

So while it is possible to encode text in any language using this
method, I could certainly understand if users of Asian languages would
prefer a more efficient encoding for those languages.

Keith