Re: Line Wrapping Question

Jamie Zawinski <jwz@netscape.com> Thu, 08 February 1996 13:14 UTC

Received: from ietf.nri.reston.va.us by IETF.CNRI.Reston.VA.US id aa10756; 8 Feb 96 8:14 EST
Received: from CNRI.Reston.VA.US by IETF.CNRI.Reston.VA.US id aa10752; 8 Feb 96 8:14 EST
Received: from list.cren.net by CNRI.Reston.VA.US id aa05627; 8 Feb 96 8:14 EST
Received: from localhost (localhost [127.0.0.1]) by list.cren.net (8.6.12/8.6.12) with SMTP id HAA09462; Thu, 8 Feb 1996 07:47:50 -0500
Received: from urchin.netscape.com (unknown.netscape.com [198.95.250.59]) by list.cren.net (8.6.12/8.6.12) with ESMTP id HAA09444 for <ietf-822@list.cren.net>; Thu, 8 Feb 1996 07:47:41 -0500
Received: from gruntle (gruntle.mcom.com [205.217.230.10]) by urchin.netscape.com (8.6.12/8.6.9) with SMTP id EAA12781; Thu, 8 Feb 1996 04:46:42 -0800
Message-Id: <3119F0B3.494C2AB5@netscape.com>
Date: Thu, 08 Feb 1996 04:46:43 -0800
X-Orig-Sender: owner-ietf-822@list.cren.net
Sender: ietf-archive-request@IETF.CNRI.Reston.VA.US
From: Jamie Zawinski <jwz@netscape.com>
To: "John W. Noerenberg" <jwn2@qualcomm.com>
Cc: "Sukvinder Singh Gill (Exchange)" <sukvg@wspu.microsoft.com>, "ietf-822@list.cren.net" <ietf-822@list.cren.net>
Subject: Re: Line Wrapping Question
References: <v03004a01ad3f08e3033c@[129.46.54.66]>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-Sender: jwz@netscape.com
X-Mailer: Mozilla 2.1a0 (X11; U; Linux 1.2.13 i586)
X-Listprocessor-Version: 8.0(beta) -- ListProcessor by CREN

John W. Noerenberg wrote:
> 
> >Currently, when sending Internet Mail with long lines, we
> >default to using QP encoding so that MIME aware clients
> >can unwrap the lines, and display as required on their
> >viewers. (We use 140chars as the threshold)
> 
> I'm confused about your 140 character threshold.  140 is the threshold of
> what?  Your QP-encoder should be emitting lines of text no longer than 76
> characters.

There are two issues: how long QP-encoded lines should be (and that
should be 76) and *whether* to use QP at all.  Since SMTP says that
lines may be up to 1000 characters long, one needn't necessarily 
encode lines which aren't longer than that.  I presume that this
is the threshold he's talking about.

The MIME spec does not say that a legal MIME message must have lines
of 76 characters or fewer.  It says that *if* a part is QP-encoded,
*then* it must follow that line-length constraint.
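
To make the distinction concrete, here is a minimal sketch using Python's
standard quopri module (my choice for illustration; nothing in this thread
uses it).  The input line is far longer than 76 characters; the encoder
emits soft line breaks ("=" at end of line) so that no encoded line
exceeds 76, and decoding restores the original long line.  The sample
text and variable names are made up.

```python
import quopri

# One logical line of 454 bytes: longer than the 76-character QP limit,
# but well under the SMTP line-length limit.
long_line = b"The quick brown fox jumps over the lazy dog. " * 10 + b"end\n"

encoded = quopri.encodestring(long_line)

# The encoder inserts soft line breaks, so each encoded line stays short.
encoded_lines = encoded.split(b"\n")
longest_encoded = max(len(line) for line in encoded_lines)
print("longest encoded line:", longest_encoded)

# Decoding removes the soft breaks and yields the original single line.
decoded = quopri.decodestring(encoded)
print("round trip ok:", decoded == long_line)
```

The 76-character limit applies only to the encoded form; the decoded text
still has its one long line.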

> There is no perfect solution.  QP is intended to convey information to a
> MIME-aware UA.  Someone reading a message containing QP with a UA that
> doesn't follow MIME rules is gonna see stuff they think looks funny.
> There's no 2 ways about that.

Well that's true, but I would phrase it as, "QP is an alternate way of
representing data".

> We wrestled with this in Eudora's design literally for years.  The problem
> predates QP.   Without MIME a UA designer has to make assumptions about the
> nature of both sender and receiver's display.  But windowing systems
> invalidate the assumptions.

But QP has nothing to do with the sender's or receiver's display.
All it has to do with is whether both support MIME.  QP is about
*transport*, not about *presentation*.  QP is exactly like Base64
in this respect, or uuencode.  It is "ASCII armor" to prevent data
from being mangled along the way.  Nothing more.
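
As a rough illustration of the "armor" point, the sketch below (Python
standard library; the sample text and the Latin-1 charset are assumptions
chosen for the example) encodes the same 8-bit payload with both QP and
Base64.  Both encoded forms contain only 7-bit bytes, and both decode
back to exactly the same data.

```python
import base64
import quopri

# An 8-bit payload: Latin-1 text containing three high-bit characters.
payload = "Grüße aus Köln\n".encode("latin-1")

qp_form = quopri.encodestring(payload)
b64_form = base64.encodebytes(payload)

# Both representations are safe for a 7-bit transport.
qp_is_7bit = all(byte < 0x80 for byte in qp_form)
b64_is_7bit = all(byte < 0x80 for byte in b64_form)

# Both decode to the identical original bytes.
qp_back = quopri.decodestring(qp_form)
b64_back = base64.decodebytes(b64_form)

print(qp_form)
print(b64_form)
print(qp_is_7bit, b64_is_7bit, qp_back == payload, b64_back == payload)
```

The only practical difference is that the QP form stays mostly readable
to someone whose mail reader does not decode it.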

> The problem you have to solve is facilitating communication between people
> whose systems are built under conflicting assumptions about what can be
> displayed and how it is displayed.
> 
> What to do about those awful equal signs is putting the emphasis in the
> wrong place.

True.

>  QP line-wrapping permits the sender to signal where line breaks should
> occur and where line breaks can be ignored. 

Phrasing it this way bothers me, since it frames QP in terms of
presentation again, rather than properly disconnecting the
representation of the data from the raw data itself.

> QP also deals with 8-bit
> character codes which can be rendered as printable glyphs assuming your
> display is not limited to USASCII.  But only MIME-aware UAs are prepared to
> deal with the encoding.  Over the years there have been Luddites who have
> loudly proclaimed that QP is wrong because it makes the text worse.

I don't think this is a fair characterization of that side of the
argument; the argument is more along the lines of, "I'd rather take my
chances that my transport is 8-bit-clean than assume my recipients can
do MIME."  In some situations, that's completely the right thing to do
(depending on your audience, and what you know about the paths between
you and them.)

> As a consequence, Eudora goes to extraordinary means to estimate what is a
> reasonable presentation of a message without QP to permit those who are
> forced to live with the Luddites among them to be able to lead a reasonably
> tranquil life.  Users have to be able to choose whether or not to
> use the capability, because only they can decide what is required for the
> message they are trying to express.  And their UA must arm them with enough
> information so they can choose wisely.  But it cannot make the choice for
> them.

Can you tell us what Eudora's rules are?

> Eventually the benefits of richer character sets overwhelm even the most
> ardent Luddite.  Or they die.  One or the other is guaranteed.
> 
> Of your choices, we are closest to a).  However, Eudora encourages the use
> of QP.  If other MIME-compliant UAs do likewise, eventually d) will come to
> pass.

In Netscape, there is a preference toggle between "Allow 8-bit" and
"MIME Compliant (Quoted-Printable)".  It defaults to 8-bit now, since
there was great resistance to the use of QP when we made that the
default in an earlier beta.  And this resistance was, in fact, from
the very community that QP was designed to help: those with non-7bit
languages.  Many of our European users told us that the chance of their
recipients having only a 7-bit path was far, far lower than the chance
that their recipients were unable to decode MIME messages.  Sad
but true...

In Netscape, a MIME part (either the main text, or an "attachment") 
will be encoded if:

   - it is of a non-text type; or
   - the "use QP" option has been selected -and- high-bit characters
     exist in the document; or
   - any NULLs exist in the document; or
   - any line is longer than 900 bytes (SMTP allows up to 1000, but we
     picked 900 to allow some "slop".)

If we have decided that a document should be encoded, we must then
decide what style of encoding to use:

   - If more than 10% of the document consists of bytes outside of
     the printable-7bit-ASCII range, then we always use base64 instead
     of QP, under the assumption that this is a "binary" file of some
     kind.

   - Base64 is always used for non-text documents (image/*, audio/*, 
     etc) on the off chance that a GIF file (for example) might contain
     primarily bytes in the ASCII range, thus failing the 10% test.  In
     this case, using the quoted-printable representation might cause
     corruption due to the translation of CR or LF to CRLF.  So, when we
     don't know that the document is of a type that has "lines", we
     don't use QP.

So this means, roughly:

   - 7-bit text files with lines <900 will always be sent as-is;
   - text files with lines >900 will always be sent in QP;
   - 8-bit text files will be sent as-is or with QP, at the user's
     choice;
   - files which are clearly not text will be sent in base64.
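
The rules above can be sketched roughly as follows.  This is an
illustrative Python sketch, not the actual Netscape code: the function
name choose_encoding, its parameters, and the return strings are
hypothetical, and treating tab, CR and LF as "printable" in the 10%
test is my assumption.

```python
MAX_LINE = 900  # threshold described above, below the SMTP limit


def choose_encoding(data: bytes, content_type: str, use_qp: bool) -> str:
    """Return 'as-is', 'quoted-printable' or 'base64' for one MIME part."""
    is_text = content_type.lower().startswith("text/")
    has_high_bit = any(byte >= 0x80 for byte in data)
    has_nul = b"\x00" in data
    longest = max((len(line) for line in data.splitlines()), default=0)

    # Step 1: should the part be encoded at all?
    must_encode = (
        not is_text
        or (use_qp and has_high_bit)
        or has_nul
        or longest > MAX_LINE
    )
    if not must_encode:
        return "as-is"

    # Step 2: which encoding?  Non-text types always get base64, since
    # they have no "lines" and newline translation could corrupt them.
    if not is_text:
        return "base64"

    # More than 10% non-printable bytes: assume a binary file.
    # (Assumption: tab, CR and LF count as printable here.)
    unprintable = sum(
        1 for byte in data
        if not (0x20 <= byte <= 0x7E or byte in b"\t\r\n")
    )
    if unprintable * 10 > len(data):
        return "base64"
    return "quoted-printable"


french = "Un café, s'il vous plaît.\n".encode("latin-1")
print(choose_encoding(b"plain text\n", "text/plain", False))
print(choose_encoding(b"x" * 901 + b"\n", "text/plain", False))
print(choose_encoding(french, "text/plain", False))
print(choose_encoding(french, "text/plain", True))
print(choose_encoding(b"GIF89a mostly ascii", "image/gif", False))
```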

-- 
Jamie Zawinski    jwz@netscape.com   http://www.netscape.com/people/jwz/
``A signature isn't a return address, it is the ASCII equivalent of a
  black velvet clown painting; it's a rectangle of carets surrounding
  a quote from a literary giant of weeniedom like Heinlein or Dr. Who.''
                                                         -- Chris Maeda