RE: Lastest drfat of RFC-XXXX

Greg Vaudreuil <gvaudre@NRI.Reston.VA.US> Wed, 20 March 1991 20:01 UTC

Received: from dimacs.rutgers.edu by NRI.NRI.Reston.VA.US id aa17266; 20 Mar 91 15:01 EST
Received: by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA06376; Wed, 20 Mar 91 14:09:57 EST
Received: from NRI.RESTON.VA.US by dimacs.rutgers.edu (5.59/SMI4.0/RU1.4/3.08) id AA06369; Wed, 20 Mar 91 14:09:49 EST
Received: from NRI by NRI.NRI.Reston.VA.US id aa15106; 20 Mar 91 14:04 EST
Org: Corp. for National Research Initiatives
Phone: (703) 620-8990 ; Fax: (703) 620-0913
To: Nathaniel Borenstein <nsb@thumper.bellcore.com>
Cc: ietf-smtp@dimacs.rutgers.edu, gvaudre@NRI.Reston.VA.US
Subject: RE: Lastest drfat of RFC-XXXX
Date: Wed, 20 Mar 1991 14:03:58 -0500
From: Greg Vaudreuil <gvaudre@NRI.Reston.VA.US>
Message-Id: <9103201404.aa15106@NRI.NRI.Reston.VA.US>

Nathaniel and Co-Authors, 

Below is an annotated copy of the proposed RFC.  Comments are
delineated by "|".  


Greg Vaudreuil
Internet Mail Extensions Working Group Chairman

------- Forwarded Message

Date:    Wed, 20 Mar 91 12:07:23 -0500
From:    Nathaniel Borenstein <nsb@thumper.bellcore.com>
To:      ietf-smtp@dimacs.rutgers.edu
Subject: Lastest drfat of RFC-XXXX

Request for Comments: XXXX                                          
March, 1991

                                                    N. Borenstein, Bellcore
                                                    N. Freed, Innosoft
                                                    S. Vance, TGV
                                                    K. Carosso, Innosoft

     A Multipart Content Type and Encoding Mechanism for RFC 822 Messages



STATUS OF THIS MEMO

This RFC suggests extensions to the RFC 822 message representation
protocol to allow multi-part textual and non-textual messages to be
represented and exchanged without loss of information. Discussion and
suggestions for improvements are welcome.  This memo does not specify an
Internet standard.  Distribution of this memo is unlimited.

If this RFC becomes a standard, it would affect the following other RFCs:

WOULD OBSOLETE:  RFC 1154
WOULD UPDATE:    RFC 822, RFC 934, RFC 1049
WOULD AFFECT:    RFC 1148

| This specification should aim to be a complete specification, and be
| an end to the document spaghetti.  I hope this will obsolete RFC 934,
| and RFC 1049 as well as RFC 1154.  All relevant sections of RFC 934
| and RFC 1049 should be incorporated into this document before it is
| published as a proposed standard.

INTRODUCTION

One of the limitations of RFC 821/822 based mail systems is the fact
that they limit the contents of electronic mail messages to relatively
short lines of seven-bit ASCII.  This forces a user to convert any
non-textual data that she may wish to send to another user into a
seven-bit ASCII representation before invoking her local mail UA. 
Examples of encodings currently used in the Internet include pure
hexadecimal, uuencode, the 3-in-4 base 64 scheme specified in RFC 1113,
and many others.

This limitation becomes even more apparent as gateways are designed to
allow for the exchange of mail messages between RFC 822 hosts and X.400
hosts.  X.400 specifies mechanisms for the inclusion of non-textual body
parts within electronic mail messages.  The current standards for the
mapping of X.400 messages to RFC 822 messages specify that either X.400
non-textual body parts should be converted to (not encoded in) an ASCII
format, or that they should be discarded, notifying the RFC 822 user
that discarding has occurred.  This is clearly undesirable, as
information that a user may wish to receive is lost.  Even though a
user's UA may not have the capability of dealing with the non-textual
body part, the user might have some mechanism external to the UA that
can extract useful information from the body part.  Moreover, it does
not allow for the fact that the message may eventually be gatewayed back
into an X.400 MHS, where the non-textual information would definitely
become useful again.

This memo describes an encapsulation mechanism that may be used to
describe multiple part messages. The parts themselves may contain
textual or nontextual data; non-textual data is encoded in a form that
can survive mailers unaware of this specification.


ENCAPSULATION

In devising an encapsulation scheme, two things must be considered: how
to convert the non-textual data to a representation which may be
transmitted over a seven-bit SMTP connection without loss of data, and
how to preserve information about the structure of the data itself. 
This "structural" information must include, at a minimum, the type of
data involved. This type information may be something recognized by many
systems or it may be some type of data specific to a single operating
system. 

This memo proposes that two RFC 822 headers be used to indicate the
inclusion of non-textual information in a mail message: Content-Type and
Content-Encoding.


Content-Type

| A full, complete, and implementable description of current content
| types should be included below.  See the previous note.

The Content-Type header is already defined in RFC1049. This RFC adds an
additional Content-Type header value of "text". (An additional value
"multipart" is also added below.) The meaning of this Content-Type is
that the message body consists of normal text in the canonical SMTP
format (US ASCII character set, generally unchanged by MTA's).   A
certain amount of freedom is allowed in the processing and handling of
text:

    (1) Delimiters other than CR-LF pairs may be used in the local
    representation of a message on some systems.  When transferred via
    SMTP, however, this local representation must be converted to proper
    SMTP format; the use of other delimiters during SMTP transmission is
    not allowed.

| This is a continuing problem, but is not within the scope of this
| document. This is a message format extensions document, not an SMTP 
| extensions document.  If RFC 821 needs work, it should be done
| separately.  Insofar as this information is relevant to implementors
| of User Agents, it should be included as "Implementors' advice".

    (2) Isolated CR and LF characters are not well tolerated in general;
    they may be lost or converted to delimiters on some systems.  The
    use of CR or LF characters that are not part of a CR/LF sequence is
    NOT PERMITTED in multipart messages.  Sequences such as CR LF LF are
    also invalid; the correct sequence is CR LF CR LF.  The effect in a
    multi-part message of a CR without a following LF, or an LF without
    a preceding CR, is undefined.  Although RFC-822 defines these as
    ordinary characters when used outside of the CR/LF sequence, some
    implementations treat one (or both) as equivalent to newline or as
    error characters that are discarded.  Messages which contain
    embedded bare CR or LF characters must use one or another of the
    defined encoding formats to encode these characters "safely". 
    (Discussion: Some environments use a bare CR or bare LF as the local
    newline convention.  If a message contains embedded bare CR or LF
    characters, it is impossible to transform it from Internet to local
    conventions without interfering with this local convention.)

| See the previous note.  Confusion between RFC 821 and RFC 822 issues
| may need to be clarified.  

    (3) TAB characters may be misinterpreted or may be automatically
    converted to variable numbers of spaces.  This is unavoidable in
    some environments, notably those not based on the ASCII character
    set. Such conversion is STRONGLY DISCOURAGED, but it may occur, and
    users of TEXT format must be prepared to tolerate it.

| Same comment....

    (4) Lines longer than 80 characters may be wrapped in some
    environments. Line wrapping is STRONGLY DISCOURAGED, but unavoidable
    in some cases. Applications which depend on lines not being wrapped
    should use mechanisms other than unencoded text bodyparts to
    transmit messages. 

| Same comment....

See RFC 821 and RFC1113 for additional information about canonical SMTP
formats.

| This document does not, and should not update 821. "Canonical" SMTP
| formats are used by the RFC 822 Message agents only.

**** NOTE:  Stef, at least, is still unhappy with using the keyword
"text" here, preferring "ASCII", but it corresponds to some established
practice, so I've left it in.

If the Content-Type header is missing from the message's RFC822 header,
a Content-Type of "text" is assumed.  This is consistent with previous
behavior.

***** THIS WOULD BE A GOOD PLACE TO DEFINE A BUNCH OF STANDARD
CONTENT-TYPES FOR X.400.  I don't know anything about X.400, really, but
one way to simplify the Content-type issue might be to define a standard
"ASN.1" Content-type, and then allow the actual X.400 ASN.1-form content
type to be specified in the resource, e.g. "Content-type: ASN.1;
some-version-number; G3Fax".  This way, we don't need a new RFC for each
X.400 body part that can be encoded in ASN.1.  But an X.400 expert will
have to comment on the reasonableness of this scheme, and on what should
go in the RFC 1049 version-number.  In general, there should be a full
specification of the types and encodings to be used for representing
X.400 body parts.

**** This would also be a good place to officially list the
"Content-type" fields for lots of character sets.  Can anyone suggest a
complete list?

Content-Encoding

| This is a complex and "new" feature that needs full treatment.  See
| the minutes of the last meeting for some suggested text.  A full
| discussion is needed both in this document, and on the mailing list
| about the content-encoding option.

| As I understood the explanation of this option at the IETF meeting,
| this document specifies several "Standard" encodings which a sending
| UA could use to send any particular content type.  This, and the
| requirement that all user agents must be able to receive messages in
| all standard content-encodings to be conformant, needs to be explained.
| The efficiency trade-off between a single encoding type per
| content-type and multiple encoding-types needs to be made explicit.

The Content-Encoding field is designed to specify a two-way mapping
between the "native" representation of a type of data and a
representation that can be readily exchanged using 7 bit mail transport
protocols as defined by RFC 821 (SMTP). This field has not been defined
by any previous RFC. The field's value should be specified as a
comma-separated ordered list of encoding types, in the event that a user
wishes to use more than one type of encoding on the message or message
part. The specified encodings are applied in left-to-right order.  When
the message is decoded at the destination, the decoding should occur in
the reverse order of the encoding (right to left, as specified by the
order of the encoding types).

| This feature of nested encodings is new to me.  Can someone explain
| the purpose of this?  If an encoding has the effect of converting from
| 8 bit or binary to 7 bit ASCII text, there is no need to do the nested
| encoding. 

| The standard example of a nested encoding illustrates my point.
| "Content-type: tar, compress, uuencode" is a standard example of this
| feature.  Neither tar nor compress satisfies the requirements of a
| "content-encoding".  This seems to be much more a description of the
| content-type than the encoding as defined in the previous paragraph. 

Implementors are free to define new content encoding types, but should
prefix them with "x-" to indicate their non-standard status, e.g.
"Content-Encoding:  x-my-new-encoding". 

| Discuss the implications of using non-standard encodings to exchange
| mail.  This on face value is a mechanism to promote pools of
| non-interoperable mail users.

Encoding types may be divided into two classes: those that create
seven-bit ASCII output and those that create non-seven-bit ASCII output.
The valid encoding types that generate seven-bit ASCII output are:

  o BASE64
  o HEXADECIMAL
  o QUOTED-PRINTABLE
  o NONE

***** In St. Louis, many people seemed to want to get rid of the
HEXADECIMAL choice.  However, the best reason for keeping HEX never came
up, which is that there are lots of utilities around that already can
generate it, so that it would be the easiest to generate from, say, a
shell script on UNIX.   The "X-<atom>" choice was eliminated by
consensus in St. Louis, on the grounds that we did not want to encourage
the proliferation of encoding types.

An encoding type of NONE implies that the message is already in
seven-bit ASCII format. This value is assumed if the Content-Encoding
header is not present.

These values are not case sensitive.  That is, Hexadecimal and
HEXADECIMAL and heXadeCimAl are all equivalent.

Any number of encodings may be applied to the data and specified in the
Content-Encoding header, but the last encoding performed and specified
must be from the list of ASCII output generating encodings.
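
The ordering and case rules above can be sketched in a few lines of
Python.  This is only an illustration under stated assumptions: the
function names are made up for this note, "x-compress" stands in for
some hypothetical non-seven-bit encoding, and an empty value is treated
like a missing header (NONE).

```python
# Encoding names from the list above, lower-cased for comparison.
SEVEN_BIT_ENCODINGS = {"base64", "hexadecimal", "quoted-printable", "none"}


def parse_content_encoding(value):
    """Split a Content-Encoding value into an ordered, lower-cased list."""
    names = [name.strip().lower() for name in value.split(",") if name.strip()]
    if not names:
        # Assumption: an empty value behaves like an absent header (NONE).
        names = ["none"]
    if names[-1] not in SEVEN_BIT_ENCODINGS:
        raise ValueError("last encoding must produce seven-bit ASCII: " + names[-1])
    return names


def decoding_order(value):
    """Encodings are applied left to right, so they are undone right to left."""
    return list(reversed(parse_content_encoding(value)))
```

For a value such as "x-compress, BASE64" the decoder would undo base64
first and the hypothetical x-compress second.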

| Ah, a partial explanation.  Please specify a usable list of non-8 to
| 7 bit encodings so my local implementor has a clue as to what to
| expect.  This seems nice for hand-decoding, but very complex for
| machine parsing and decoding unless the options are defined and bounded.

| One class of encoding I can think of would be compression.  Is this
| the place where such options should be placed?

The following sections will define the three initial standard encoding
mechanisms.

Quoted-Printable Content-Encoding

**** This section has changed radically since previous drafts.  In
particular, note that the rule for quoting newlines (special case #2) is
completely new, inspired by some concerns expressed in St. Louis.  

The Quoted-Printable encoding is intended to represent data that is
largely, but not entirely, printable ASCII.  Printable ASCII portions of
body parts encoded in this way should be recognizable by humans, if
necessary, without translation.

In this encoding, ASCII characters 9 (tab), 10 (nl), 13 (cr), 32 through
37, inclusive, 39 through 91, and 93 through 127, inclusive, are
unchanged.  All other characters, including characters 38 and 92, are to
be represented in either of two quotation styles:

Style #1:  Any 8 bit value may be represented by a "\" followed by a two
digit hexadecimal representation of the character's ASCII value.  Thus,
for example, character 12 (control-L, or formfeed) can be represented by
"\0C", the backquote can be represented by "\60", and the backslash
character (92) itself can be represented by "\5C".

Style #2:  An 8 bit value from 128 through 255 may, alternately, be
represented by an ampersand character followed by the character obtained
by the removal of the high order bit, i.e. by subtracting 128 from the
value.  Thus the 8 bit value 193 may be represented as "&A".  

Style #1 is generally preferred, in that style #2 might include control
characters (e.g. TAB) that are altered by some MTA (see NOTES TO
IMPLEMENTERS, below).  Style #2 is provided for improved readability of
some 8-bit character sets in which turning on the 8th bit produces a
character similar to the corresponding 7 bit character, e.g. the 8th bit
simply adds an umlaut.  In such cases, style #2 is somewhat more
readable, but should be used carefully, as explained in the NOTES TO
IMPLEMENTERS.
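
As a rough sketch of the two quotation styles, the following Python
encodes one octet at a time.  It assumes the unchanged control
characters are tab (9), line feed (10) and carriage return (13), and it
uses style #2 only when the low seven bits are a printable character
that would itself pass through unchanged; that restriction and the
function names are choices made for this illustration, not part of the
proposal.

```python
# Octets that pass through unchanged: tab, LF, CR, and printable ASCII
# other than "&" (38) and "\" (92).
UNCHANGED = ({9, 10, 13} | set(range(32, 38))
             | set(range(39, 92)) | set(range(93, 128)))


def quote_octet(value, prefer_style_2=False):
    """Return the quoted-printable text for one octet (0..255)."""
    if value in UNCHANGED:
        return chr(value)
    low = value - 128
    # Style #2: "&" plus the character with the high-order bit removed.
    if prefer_style_2 and value >= 128 and 32 <= low < 127 and low in UNCHANGED:
        return "&" + chr(low)
    # Style #1: "\" plus two hexadecimal digits.
    return "\\%02X" % value


def quote_octets(data, prefer_style_2=False):
    """Quote every octet of a bytes object."""
    return "".join(quote_octet(b, prefer_style_2) for b in data)
```

With these definitions, quote_octet(12) gives "\0C" and
quote_octet(193, prefer_style_2=True) gives "&A", matching the examples
in the text.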

Additionally, there are three special cases that may be represented otherwise:

Special case #1:  The literal ampersand and backslash characters may
themselves be quoted by backslashes.  Thus, the backslash may be
represented as "\\" and the ampersand as "\&".  Note that this is not
ambiguous with regard to the first clause, because neither "\" nor "&"
is part of the hexadecimal alphabet.

Special case #2:  A backslash at the end of a line may be used to
indicate a non-significant line break.  That is, if one needs to include
a long line without line breaks, but is concerned that MTA's will break
the line into multiple lines, a message encoded with the
quoted-printable encoding may include "soft" line breaks by preceding
the line break with a backslash.  Thus if the "raw" form of the line is
a single line that says:

Now is the time for all good men to come to the aid of their country. 
Now is the time for all good men to come to the aid of their country. 
Now is the time for all good men to come to the aid of their country.

This could be represented, in the quoted-printable encoding, as

Now is the time for all good men to come to the aid of their country.  \
Now is the time for all good men to come to the aid of their country.  \
Now is the time for all good men to come to the aid of their country.  

This provides a mechanism with which long lines can be encoded in such a
way as to be restored by the user agent.  
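
A decoder for the quoting rules and soft line breaks might look like
the sketch below.  It assumes the input is seven-bit text whose lines
end in a bare LF (local convention), and it treats a backslash at the
very end of the input as a soft break; both are assumptions of this
illustration, as is the function name.

```python
HEX_DIGITS = "0123456789ABCDEFabcdef"


def unquote(text):
    """Decode quoted-printable text (LF line ends) into bytes."""
    out = bytearray()
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\":
            nxt = text[i + 1:i + 2]
            if nxt in ("\n", ""):
                # Soft line break: drop the backslash and the newline.
                i += 2
            elif nxt in ("\\", "&"):
                # Special case #1: "\\" and "\&" are literal characters.
                out.append(ord(nxt))
                i += 2
            elif (i + 2 < len(text) and text[i + 1] in HEX_DIGITS
                  and text[i + 2] in HEX_DIGITS):
                # Style #1: two hexadecimal digits.
                out.append(int(text[i + 1:i + 3], 16))
                i += 3
            else:
                raise ValueError("bad backslash sequence at offset %d" % i)
        elif ch == "&":
            if i + 1 >= len(text):
                raise ValueError("dangling ampersand at end of input")
            # Style #2: restore the high-order bit.
            out.append(ord(text[i + 1]) + 128)
            i += 2
        else:
            out.append(ord(ch))
            i += 1
    return bytes(out)
```

Because the soft break removes both the backslash and the newline, a
long raw line split this way is restored as a single line.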

NOTES TO IMPLEMENTERS OF ENCODING AGENTS:  for maximum portability
across MTA's, it is recommended that any long lines be represented using
"soft" line breaks which are inserted before any line reaches the 80th
character.  It is also recommended that trailing white space (white
space at the end of a line) not be relied upon, as some MTA's freely
delete such trailing white space.  It is also recommended that the
persistence of character codes less than 32 should not be relied on,
particularly the TAB, CR, and LF characters.  Where such characters
would be required for representation in style #2, it is recommended that
style #1 be used.  

Since the dash character ("-") is represented as itself in the
Quoted-Printable encoding, care must be taken to quote lines within
Quoted-Printable encoded body parts that start with "-", as described in
RFC 934 (see the section on multiple part messages below).

Hexadecimal Content-Encoding

The Hexadecimal Content-Encoding is intended to represent arbitrary data
that is not humanly-readable in a form that can be passed through 7 bit
mail transport agents.  It transforms a byte stream into a series of
two-digit hexadecimal values.  Thus, the sequence of the five 8-bit
values "ABC control-L newline" would be represented by "4142430C0A". 
Since newlines are themselves encoded as 0A, non-data newlines may be
scattered freely to break the stream into multiple lines.  In fact, it
is recommended that newlines be included at least every 60 characters
(30 encoded octets).  Such newlines will be discarded by the decoder.
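
The hexadecimal encoding is simple enough to sketch directly.  The
Python below is an illustration only; the function names and the choice
of upper-case digits are assumptions, and the line length defaults to
the recommended 60 characters.

```python
def hex_encode(data, line_length=60):
    """Encode bytes as hexadecimal digits, broken into short lines."""
    digits = "".join("%02X" % b for b in data)
    lines = [digits[i:i + line_length]
             for i in range(0, len(digits), line_length)]
    return "\n".join(lines)


def hex_decode(text):
    """Decode hexadecimal text, discarding newlines and other white space."""
    digits = "".join(text.split())
    return bytes.fromhex(digits)
```

For the five-octet example in the text, hex_encode(b"ABC\x0c\n")
returns "4142430C0A", and hex_decode accepts that string with newlines
inserted anywhere between digit pairs.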

The hexadecimal encoding is a simple way to represent arbitrary 8 bit
data in 7 bit mail, but not a very efficient one, as it doubles the size
of the data.  The Base64 encoding, to be described below, is a
reasonably simple alternative that only increases the size of the data
by 33 percent.

Since the dash character ("-") is not used in hexadecimal encodings,
there is no need to worry about quoting apparent encapsulation
boundaries within hexadecimal-encoded body parts.


Base64 Content-Encoding

The Base64 Content-Encoding is designed to represent arbitrary 8 bit
data in a form that is not humanly readable.  The encoding and decoding
algorithms are simple, but the encoded data is only about 33 percent
larger than the unencoded data.  This encoding is also used in Privacy
Enhanced Mail applications; it is described in RFC1113. The ability in
RFC1113 to imbed clear text within such an encoding is not allowed in
this context, however. The following description of the encoding is
adapted from RFC 1113; apart from the exclusion of the "*" mechanism for
imbedded clear text there are no significant technical changes.

A 64-character subset of International Alphabet IA5 is used, enabling 6
bits to be represented per printable character.  (The proposed subset of
characters is represented identically in IA5 and ASCII.) One additional
character, "=", is used to signify special processing functions.  The
character "=" is used for padding within the printable encoding
procedure. The encoding function's output is delimited into text lines
(using local conventions), with each line except the last containing
exactly 64 printable characters and the final line containing 64 or
fewer printable characters.  (This line length is easily printable and
is guaranteed to satisfy SMTP's 1000 character transmitted line length
limit.)

The encoding process represents 24-bit groups of input bits as output
strings of 4 encoded characters. Proceeding from left to right, a
24-bit input group is formed by concatenating 3 8-bit input groups; this
is then treated as 4 concatenated 6-bit groups.

Each 6-bit group is used as an index into an array of 64 printable
characters. The character referenced by the index is placed in the
output string. These characters, identified in Table 1 below, are
selected so as to be universally representable, and the set excludes
characters with particular significance to SMTP (e.g., ".", "<CR>",
"<LF>").

                             Table 1

   Value Encoding  Value Encoding  Value Encoding  Value Encoding
       0 A            17 R            34 i            51 z
       1 B            18 S            35 j            52 0
       2 C            19 T            36 k            53 1
       3 D            20 U            37 l            54 2
       4 E            21 V            38 m            55 3
       5 F            22 W            39 n            56 4
       6 G            23 X            40 o            57 5
       7 H            24 Y            41 p            58 6
       8 I            25 Z            42 q            59 7
       9 J            26 a            43 r            60 8
      10 K            27 b            44 s            61 9
      11 L            28 c            45 t            62 +
      12 M            29 d            46 u            63 /
      13 N            30 e            47 v
      14 O            31 f            48 w         (pad) =
      15 P            32 g            49 x
      16 Q            33 h            50 y

Special processing is performed if fewer than 24 bits are available in
an input group at the end of a message or part of a message.  A full encoding
quantum is always completed at the end of a message. When fewer than 24
input bits are available in an input group, zero bits are added (on the
right) to form an integral number of 6-bit groups.  Output character
positions which are not required to represent actual input data are set
to the character "=".  Since all canonically encoded output is an
integral number of octets, only the following cases can arise: (1) the
final quantum of encoding input is an integral multiple of 24 bits;
here, the final unit of encoded output will be an integral multiple of 4
characters with no "=" padding, (2) the final quantum of encoding input
is exactly 8 bits; here, the final unit of encoded output will be two
characters followed by two "=" padding characters, or (3) the final
quantum of encoding input is exactly 16 bits; here, the final unit of
encoded output will be three characters followed by one "=" padding
character.
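
The grouping and padding rules can be checked with a short Python
sketch.  It is an illustration, not a reference implementation: the
function names are invented for this note, lines are joined with a bare
LF as the local convention, and Python's standard base64 module is used
only as a cross-check since it uses the same alphabet as Table 1.

```python
import base64  # standard library, used only for the cross-check below

# The 64 characters of Table 1, indexed by 6-bit value.
ALPHABET = ("ABCDEFGHIJKLMNOPQRSTUVWXYZ"
            "abcdefghijklmnopqrstuvwxyz"
            "0123456789+/")


def base64_encode(data, line_length=64):
    """Encode bytes in 24-bit groups, padding the final group with "="."""
    chars = []
    for i in range(0, len(data), 3):
        group = data[i:i + 3]
        # Add zero bits on the right to complete the 24-bit quantum.
        bits = int.from_bytes(group + b"\x00" * (3 - len(group)), "big")
        quad = [ALPHABET[(bits >> shift) & 0x3F] for shift in (18, 12, 6, 0)]
        # One input octet needs two characters, two need three, three need four.
        keep = len(group) + 1
        chars.extend(quad[:keep] + ["="] * (4 - keep))
    text = "".join(chars)
    return "\n".join(text[i:i + line_length]
                     for i in range(0, len(text), line_length))


def agrees_with_stdlib(data):
    """Compare against base64.b64encode, ignoring the line breaks."""
    expected = base64.b64encode(data).decode("ascii")
    return base64_encode(data).replace("\n", "") == expected
```

The three padding cases appear with inputs of one, two and three
octets: b"A" encodes as "QQ==", b"AB" as "QUI=", and b"ABC" as "QUJD".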

SINGLE AND MULTIPLE PART MESSAGES

In the case of single part messages, the aforementioned headers should
all appear as part of the RFC 822 message headers.

In the case of multiple part messages, a "Content-type: Multipart"
header should appear in the RFC 822 message header. The message body is
now assumed to contain multiple parts separated by encapsulation
boundaries.

The Content-type header, as defined by RFC 1049, has two optional fields
that may follow the type name. These fields are for a version number and
a resource specification.  In the case of the "multipart" content-type,
this document defines version number 1; if the version number is
omitted, it is to be assumed to be version 1, where other versions might
be defined later.  The resource specification, if included, is used to
specify the format of the encapsulation boundary, as described below.


Encapsulation Boundary Mechanism

Most crucial to a multipart message format is the mechanism for marking
boundaries between the different parts of a message.  For this purpose,
this memo specifies the use of RFC 934 encapsulation boundaries, and
refers the reader to RFC 934 for a complete specification.  In essence,
a new body part is initiated by a line that is recognizable as an
"encapsulation boundary" by the fact that it starts with a dash (decimal
code 45, "-") not followed by a space (decimal code 32, " ").  Lines
which are part of a body-part but begin with a dash must be prefixed by
"- " to avoid being recognized as encapsulation boundaries.
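
A minimal sketch of the boundary rule and the "- " quoting follows, in
Python.  The names are invented for this note, and the handling of text
after the last boundary (kept as a final part only when non-empty) is an
assumption of the sketch; RFC 934 remains the authority on the details.

```python
def is_boundary(line):
    """A boundary starts with a dash that is not followed by a space."""
    return line.startswith("-") and not line.startswith("- ")


def stuff(line):
    """Quote a body line that would otherwise look like a boundary."""
    return "- " + line if line.startswith("-") else line


def unstuff(line):
    """Remove the "- " prefix added by stuff()."""
    return line[2:] if line.startswith("- ") else line


def split_parts(lines):
    """Split body lines into (prefix_lines, list_of_parts)."""
    prefix, parts, current = [], [], None
    for line in lines:
        if is_boundary(line):
            if current is not None:
                parts.append(current)
            current = []
        elif current is None:
            prefix.append(unstuff(line))
        else:
            current.append(unstuff(line))
    if current:
        parts.append(current)
    return prefix, parts
```

Each element of the returned parts list still contains that part's own
header lines; separating headers from body is left out of the sketch.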

RFC 934 describes what follows the encapsulation boundary as, in effect,
an RFC 822 message in miniature.  It further specifies that the only
required header fields for each encapsulated part are "Date" and "From".
This memo alters these requirements as applied to body-parts in
multipart messages.  The "Date" and "From" header fields are made
optional, but a "Content-Type" header field is recommended. The default
text content type should be assumed if no "Content-Type" header appears.
Note that a "Content-Encoding" header can and often should appear among
the message part headers.

It should be noted that any lines preceding the first encapsulation
boundary are regarded by RFC 934 as undifferentiated textual
information.  This is unchanged by the current memo, implying that
multipart messages may begin with a text prefix.  It is recommended that
mechanisms which compose multipart messages, in particular those using
formats that are not easily readable by human beings, include a brief
explanatory message in this prefix area, for the benefit of people who
might read the message with older user agents that do not properly
interpret multipart messages.

The use of "Content-Type: Multipart" as a message part within another
"Content-Type: Multipart" is explicitly allowed. RFC 934's mechanism for
quoting lines that begin with "-" should be used to "stuff" such
multipart messages inside of other multipart messages. At least 10
levels of nesting should be supported by software that processes the
contents of such messages, and any hard limit on nesting levels should
be avoided by implementors.

Alternatively, if the Content-type header specifies a resource field,
this resource is used to further constrain the recognized encapsulation
boundaries.  In such cases, the only encapsulation boundaries to be
recognized are lines that begin with two dashes followed by the resource
specification.  Other lines that begin with dashes do NOT need to be
quoted.  Thus, if the Content-type header is

Content-type: multipart; 1; foobar

then the only recognized encapsulation boundaries will be lines that
begin with the sequence "--foobar".  This approach is particularly
useful for deeply nested encapsulations.  The resource specification
(encapsulation boundary specification) should be no more than 40
characters long.
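
The resource-based boundary can be sketched as follows, in Python.
The function names are hypothetical, the version default of "1" is
applied to every type for simplicity even though the text defines it
only for "multipart", and any prefix text before the first boundary is
simply dropped by this sketch.

```python
def parse_content_type(value):
    """Split "type; version; resource" into its three fields."""
    fields = [field.strip() for field in value.split(";")]
    type_name = fields[0].lower()
    version = fields[1] if len(fields) > 1 and fields[1] else "1"
    resource = fields[2] if len(fields) > 2 and fields[2] else None
    return type_name, version, resource


def split_on_cookie(lines, resource):
    """Split body lines at lines beginning with "--" plus the resource."""
    marker = "--" + resource
    parts, current = [], None
    for line in lines:
        if line.startswith(marker):
            if current is not None:
                parts.append(current)
            current = []
        elif current is not None:
            # Other lines starting with dashes are ordinary content here.
            current.append(line)
    if current:
        parts.append(current)
    return parts
```

With "Content-type: multipart; 1; foobar", a body line such as
"--------" is kept as content and only lines beginning with "--foobar"
separate the parts.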

**** It has been suggested that we might want to specify a recommended
length for the boundary specifications and/or an algorithm for
generating them to be world-unique.  Personally, I'm happy to leave that
open, though.

****NEW PROPOSAL:  In private email, Vincent Lau made a proposal for a
"Dash-Count" mechanism which appears to me to be workable.  Though it
makes the job of separating encapsulated body parts slightly harder, by
providing another option, it offers some benefits in certain situations.
I've tentatively added it to the proposal, and I encourage comments. 
Here it is:

Optional Dash-Count header

The key question with multipart messages is locating the boundaries
between the parts.  Many mechanisms have been proposed, none of which
are satisfactory in all situations.  In particular, some people have
proposed mechanisms that depended on line counts or character counts,
which have been rejected because these counts are not stable across all
MTA's.  Instead, the primary scheme selected was based on encapsulation
boundaries, as described above.  However, the encapsulation boundary
mechanism requires either the selection of a boundary that does not
appear in the body part, or the quotation of the boundary when it does
appear in the body part.  Instead, another alternative is provided by
the optional "Dash-Count" header field.

If an encapsulated body part includes a "Dash-Count" header field, this
is used to specify the number of dash characters (decimal 45) that
appear in the body (but not the headers) of the encapsulated message. 
When separating a multipart message into its component body parts, the
encapsulation boundary is not searched for until after the specified
number of dash characters have appeared.  This allows the use of
extremely simple encapsulation boundaries without any worry about
quoting within the body part itself, but at the possible cost of an
extra pass through the message body.

For example, the following is a multipart message with two parts, each
of them text:

From: Someone
To: Someone Else
Subject:  Two text objects
Content-type: multipart

- ---
Content-type: text
Dash-Count:  10

This piece of text contains 10 "-" characters, some of which really look
like an 
encapsulation boundary, but aren't considered as such because of the 
Dash-Count header above.
- --------
Amazingly, this is still part of the first text object!
- ---
Content-type: text
Dash-Count:  13

This piece of text contains 13 "-" characters, some of which really look
like an 
encapsulation boundary, but aren't considered as such because of the 
Dash-Count header above.
- --------
Amazingly, this is still part of the second text object!
- ---
And so is this!
- ---
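
A rough Python sketch of how a reader might honor Dash-Count is given
below.  It assumes the body lines are unquoted (no "- " prefixes, since
Dash-Count replaces quoting), and the function names and return
convention are made up for this illustration.

```python
def dash_count(body_lines):
    """Count the dash characters in the body lines of one part."""
    return sum(line.count("-") for line in body_lines)


def read_part_body(lines, start, expected_dashes):
    """Collect body lines from lines[start:] up to the next real boundary.

    A boundary is only looked for once expected_dashes dashes have been
    seen.  Returns (body_lines, index_of_boundary_line).
    """
    seen = 0
    body = []
    for index in range(start, len(lines)):
        line = lines[index]
        looks_like_boundary = line.startswith("-") and not line.startswith("- ")
        if seen >= expected_dashes and looks_like_boundary:
            return body, index
        seen += line.count("-")
        body.append(line)
    return body, len(lines)
```

A sender would compute the header value with dash_count() over the
finished body; a line such as "--------" inside the body is then read
as content because the count has not yet been reached.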

****Question:  As currently written, the RFC really includes THREE
different mechanisms for marking the encapsulation boundaries:
	1.  Simple RFC-934 style boundaries, with "- " quoting of lines that
	    begin with "-"
	2.  Restricted RFC-934 boundaries, with "cookies" specifying the
	    encapsulation boundary.
	3.  Simple RFC-934 boundaries, with Dash-Count instead of quoting.
My personal opinion is that, with the addition of Dash-Count, the
cookies are no longer necessary, and should be eliminated to reduce
complexity.  But I'm reluctant to do so without hearing from more people
than Vincent and myself.  Other opinions, anyone?

RFC1154 ENCODING

Note that the encoding specified by RFC 1154 is not used.  This encoding
is inappropriate for real-world applications due to its dependence on
line counting.  Line counting cannot be guaranteed consistent across
gateways, since line wrapping may occur in some environments, thus
destroying the validity of the counting information.  The encodings
specified here are all resistant to all but the most extreme types of
line wrapping.

| This explanation needs to be included earlier with the more general
| explanation for the choice of the 934 multi-part mechanism.  This is a
| significant engineering decision that needs to be explained both in
| terms of efficiency of parsing, and robustness given a real world mail
| system.  Note: Mechanisms have been proposed to deal with the line
| wrapping problems of RFC 1154, so that cannot be the only reason to
| abandon line counts as a separation mechanism.

It is expected that current users of RFC1154 will either change to use
this encapsulation scheme or that translators to/from this encapsulation
to RFC1154 will be developed.

| Please delete this from this document. It is unnecessarily 
| antagonistic given the unpublished, and non-standard status of this
| document.

COMPLEX EXAMPLE

<to be written -- a fairly complex example to show most of the features
of this RFC>


CONCLUSION

<to be written>

| Please include contact info for the four authors at this time, to
| allow us mailing list readers to get in touch with one or more of the
| contributors directly.

------- End of Forwarded Message