Re: multiple character sets in GeneralText

John Haxby <J.Haxby@isode.com> Wed, 08 June 1994 09:13 UTC

Received: from ietf.nri.reston.va.us by IETF.CNRI.Reston.VA.US id aa00716; 8 Jun 94 5:13 EDT
Received: from CNRI.RESTON.VA.US by IETF.CNRI.Reston.VA.US id aa00712; 8 Jun 94 5:13 EDT
Received: from survis.surfnet.nl by CNRI.Reston.VA.US id aa01848; 8 Jun 94 5:13 EDT
Received: from glengoyne.isode.com by survis.surfnet.nl with SMTP (PP) id <07414-0@survis.surfnet.nl>; Wed, 8 Jun 1994 10:59:14 +0200
To: "Kevin E. Jordan" <Kevin.E.Jordan@cdc.com>
cc: mime-mhs@surfnet.nl
Subject: Re: multiple character sets in GeneralText
In-reply-to: Your message of "Tue, 07 Jun 1994 10:36:09 CDT." <2df493ea3898002@mercury.udev.cdc.com>
Date: Wed, 08 Jun 1994 09:58:51 +0100
Sender: ietf-archive-request@IETF.CNRI.Reston.VA.US
From: John Haxby <J.Haxby@isode.com>
Message-ID: <"survis.sur.420:08.05.94.08.59.24"@surfnet.nl>

> Has this question been asked before...  RFC1494 gives no guidance in the case
> where an X.400 GeneralText body part contains multiple ISO-8859-x character
> sets, e.g. ISO-8859-1 and ISO-8859-7.  This body part will contain multiple
> integers in its character set SET, and it will also contain escape sequences
> in its text that define designators, character set shifts, etc.  How should
> this be mapped to MIME?  The following possibilities seem like options:
> 
>    1. Generate a text/plain part with multiple "charset=ISO-8859-x" parameters
>       and leave the ISO-2022 escape sequences in.  Are multiple charset
>       parameters legal in text/plain parts?
> 
>    2. Generate a default application/octet-stream part which preserves the
>       whole X.400 body part intact.

There is a third option (which is actually described by RFC1494): use the
application/x400-bp encoding.  This means that there will be some binary
goop on the front of the text (the externally defined wrapper) but the bulk
of the text will be readable.  Hopefully the converter to
application/x400-bp will have the sense to use quoted-printable encoding
rather than base64.
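
To illustrate why the choice of transfer encoding matters here, the
following is a minimal Python sketch, not anything taken from RFC1494.
The WRAPPER bytes are a made-up stand-in for the binary wrapper (they are
not a real ASN.1 encoding) and the name encode_both is hypothetical.  It
only shows that quoted-printable leaves the text legible while base64
does not.

```python
import base64
import quopri

# Hypothetical placeholder for the binary wrapper; NOT a real ASN.1 encoding.
WRAPPER = bytes([0xA0, 0x82, 0x01, 0x0F])
TEXT = b"Hello, this is the readable part."


def encode_both(payload):
    """Return the (quoted-printable, base64) encodings of payload."""
    qp = quopri.encodestring(payload)
    b64 = base64.encodebytes(payload)
    return qp, b64


qp, b64 = encode_both(WRAPPER + TEXT)

# Quoted-printable escapes only the wrapper bytes as =XX; the text survives.
print(qp.decode("ascii"))
# Base64 re-encodes everything, so none of the text is legible.
print(b64.decode("ascii"))
```

Both encodings decode back to the same bytes; the difference is only in
what a person sees when the part is displayed undecoded.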


Section 7.1.2 suggests that only one character set can be used; there is
certainly no mention of multiple character sets, nor of whether (encoded)
escape sequences or some other notation should be used to switch between
them.
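
For concreteness, here is a rough Python sketch of what switching between
two character sets with escape sequences could look like to a reader of
the text.  It assumes that the ISO 2022 sequences ESC - A and ESC - F
designate the right-hand parts of ISO-8859-1 and ISO-8859-7 as G1, and
that G1 is what bytes above 0x7F refer to.  The function name decode_mixed
and the sample bytes are made up for illustration, and RFC1494 specifies
nothing of the sort.

```python
ESC = 0x1B

# Assumed mapping: final byte of "ESC - <final>" to a Python codec name.
G1_FINALS = {ord("A"): "iso-8859-1", ord("F"): "iso-8859-7"}


def decode_mixed(data, initial="iso-8859-1"):
    """Decode bytes, switching codec whenever a known ESC - F sequence appears."""
    out = []
    charset = initial
    i = 0
    while i < len(data):
        is_designator = (
            data[i] == ESC
            and i + 2 < len(data)
            and data[i + 1] == ord("-")
            and data[i + 2] in G1_FINALS
        )
        if is_designator:
            # Switch the active character set and skip the 3-byte sequence.
            charset = G1_FINALS[data[i + 2]]
            i += 3
            continue
        out.append(bytes([data[i]]).decode(charset))
        i += 1
    return "".join(out)


# Latin-1 text, then a switch to Greek for three lowercase letters.
sample = b"caf\xe9 " + b"\x1b-F" + b"\xe1\xe2\xe3"
print(decode_mixed(sample))
```

A single charset= parameter cannot describe such a body, which is the gap
being discussed.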

RFC1494 should, perhaps, define a `text/general' MIME content type that can
take lots of charset= values.

--
John Haxby			J.Haxby@isode.com
ISODE Consortium		+44 81 332 9091