Re: HTML for email

Keith Moore <moore@network-heretics.com> Mon, 01 March 2021 18:27 UTC

Subject: Re: HTML for email
To: Phillip Hallam-Baker <phill@hallambaker.com>
Cc: IETF Discussion Mailing List <ietf@ietf.org>
References: <20210227190200.06ED46F10439@ary.qy> <4064.1614454347@localhost> <s1f0vo$ejp$1@gal.iecc.com> <59240886-320d-fae3-6b98-7b83dacaf5e7@network-heretics.com> <CAMm+LwhWCsG68GOws-Zm9TDcEZ4trGBhq7Dm-_0Ci8Ri7kDK=Q@mail.gmail.com>
From: Keith Moore <moore@network-heretics.com>
Message-ID: <ecf84c57-97e4-e9ac-6836-4e61b654260c@network-heretics.com>
Date: Mon, 01 Mar 2021 13:27:07 -0500
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.10.0
MIME-Version: 1.0
In-Reply-To: <CAMm+LwhWCsG68GOws-Zm9TDcEZ4trGBhq7Dm-_0Ci8Ri7kDK=Q@mail.gmail.com>
Content-Type: multipart/alternative; boundary="------------C64B448CCC9F175E9E7023FF"
Content-Language: en-US
Archived-At: <https://mailarchive.ietf.org/arch/msg/ietf/mbfk250FNHCzYNC5kgkNgyF8Gpk>

On 3/1/21 9:22 AM, Phillip Hallam-Baker wrote:

> Yes HTML is a disaster for email. But so is plaintext wrapped at 66 
> characters by the server because people didn't know better.

Ok, first please understand that I'm not blaming anyone.   I'm also not 
making any proposal, at least not yet, and I am not currently sure about 
the best way forward.   I'm just making some observations based on 
long-term experience with email collaboration in IETF.

In some sense, plain text works better for IETF's needs than HTML, and 
maybe plain text works better for the needs of anyone doing technical 
discussion over email.  There is some virtue in simplicity.    One of 
the virtues of plain text email is that there's not (much of) a layer of 
interpretation between the text in the message and what the recipient 
sees.   (Of course there is a layer - the character encoding scheme - 
but not much more than that.)

That's not to say that there are no technical problems with plain text 
email or that it can't be improved.   But the biggest problem with 
improving either format is one of deployment.

>
> The reasons HTML is a disaster are
>
> 1) There is no standard for HTML in email.

While true, it's not immediately clear that HTML is readily extensible 
to fix these problems, because it needs to be defined in such a way that 
a variety of MUA implementations produce consistent behavior when 
multiple parties make successive edits/replies to a message, when 
portions of multiple messages are quoted in a message, and so on.    You 
might, for instance, need to specify the representation of text copied 
from one email and pasted to another.

One way to view the problem of HTML in email is that email requires 
that many different parties, using different implementations, be able 
to edit the same document over and over without producing a corrupted 
mess.   HTML is not designed for that.   But plain text email has a 
similar problem, and so does every "word processor" format I've seen 
that's more complex than, say, WordStar.   Anyone who has been around 
IETF for a while has seen the effect of multiple layers of line-wrapping 
and ">" (or similar marks) added at the beginning of lines.

(Actually the problem is even worse, because some MUAs used in a 
conversation will treat the quoted parts as plain text and others will 
try to make them into HTML.   So you get multiple incompatible layers.)

Still, humans can manually "clean up" text that has been subject to that 
kind of repeated alteration.  But cleaning up HTML that has suffered 
similar damage generally doesn't happen, partly because there's a layer 
between the actual HTML and the user interface that keeps users from 
doing exactly what they need.

(I'm not suggesting that we should instead edit the raw HTML in messages 
when composing replies.   In addition to requiring participants to be 
HTML experts, the HTML generated by most MUAs is far too messy for that.)

> 2) HTML has been turned into a presentation format.

I realize this is heresy, but a presentation format is what people 
actually need in the vast majority of cases.

Semantic markup has its place.  When you're writing a book or maybe even 
a long article, you need to focus on content, not layout.   The 
presentation needs to be fine-tuned after the content is written or 
mostly written, and often by different people than those who wrote the 
content.   (And sometimes the content is tweaked for the sake of 
presentation.)   Semantic markup makes good sense for that kind of 
application.

But for discussion, a semantic markup layer just gets in the way.   
That's also true for most web pages.   Web developers need to be able to 
dictate what the content looks like on the screen (while still being 
responsive to different kinds of displays), and they're forced to deal 
with a layer that tries to second-guess them.

> 3) Email messages used annotations for a decade before HTML which 
> doesn't support them

Right.   And it turns out that we need annotations in email.

> 4) The SMTP email infrastructure does not provide a viable means of 
> knowing what formats are accepted by a recipient so there is no way to 
> fix this.

I'm not sure that would solve the problem, at least for IETF's case or 
for any use case that involves large numbers of potential 
participants.   When you compose a message to send to a mailing list, 
should your user agent poll the capability of every recipient to find 
out what kind of message format each can accept?   Should it send out 
different formats to different recipients?   Should it try to identify a 
common subset so it only has to generate one message and so that 
recipients' experiences will be more consistent?   What if you have an 
email conversation between a small number of people, a new recipient is 
added, and everyone's messages change format because the new recipient's 
capabilities don't support the common subset of the other recipients?   
What about the very common case when a single recipient has multiple 
user agents with different capabilities?
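
The "common subset" question can be sketched in a few lines of Python.   
This is an assumption-laden illustration, not a proposal: the 
capabilities table, the PREFERENCE order, and the function names are all 
hypothetical, and no such discovery mechanism exists in SMTP today.

```python
# Assumed sender preference, richest format first (illustrative only).
PREFERENCE = ["text/html", "text/enriched", "text/plain"]

def common_formats(capabilities):
    """Return the formats that every recipient claims to accept."""
    accepted = list(capabilities.values())
    return set.intersection(*accepted) if accepted else set()

def choose_format(capabilities, fallback="text/plain"):
    """Pick the most preferred format that all recipients share."""
    common = common_formats(capabilities)
    for fmt in PREFERENCE:
        if fmt in common:
            return fmt
    return fallback

# Hypothetical discovered capabilities for a small conversation.
thread = {
    "alice@example.com": {"text/html", "text/plain"},
    "bob@example.com": {"text/html", "text/plain"},
}
before = choose_format(thread)

# A new recipient who only accepts plain text joins the conversation.
thread["carol@example.com"] = {"text/plain"}
after = choose_format(thread)

print(before, "->", after)
```

Adding one recipient changes the format for everyone already in the 
conversation, which is the scenario described above.   The sketch also 
ignores the case of one recipient with several user agents, where a 
single capability set per address is not even well defined.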

In other words, be careful what you wish for.   There's a lot of value 
in having a common format and minimal set of capabilities that everyone 
supports.

>
> One painful side effect of 1 and 2 is that messages come with embedded 
> font size specifiers which is beyond stupid. The sender has no idea 
> what device I am reading something on. But Gmail will happily choose 
> font size settings that are frequently stupid. I have no control over 
> that as a user.
>
> But the last point is the most important because the difficulty of 
> fixing the SMTP infrastructure has become greater than the difficulty 
> of replacing it with something fit for purpose.

SMTP has turned out to be surprisingly (to me at least) fit for 
purpose.   It was designed in an era when you couldn't expect complete 
and full-time connectivity between senders and receivers, and also 
couldn't expect everyone to have access to the same network (e.g. 
ARPAnet vs. X.25), so it used store-and-forward.   But it turned out 
that store-and-forward was useful even in environments that could 
provide complete connectivity.   And later on it turned out to be useful 
for getting mail through firewalls.   In many environments 
store-and-forward is used to implement spam filters, virus filters, 
etc., to hide internal enterprise network infrastructure from outside 
viewers, and several other purposes. And store-and-forward helps make 
email more reliable, because it separates the problem of persistent 
delivery from the sending user agent's responsibility.
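
The separation of responsibility can be shown with a toy Python model.   
It is a sketch only: Relay and NextHop are hypothetical classes, there 
is no real network or SMTP dialogue here, and a real MTA adds retry 
timers, persistent storage, and bounce handling that are left out.

```python
from collections import deque

class NextHop:
    """Hypothetical destination that may be unreachable for a while."""
    def __init__(self):
        self.up = False
        self.inbox = []

    def receive(self, message):
        # Refuse delivery while the link is down.
        if not self.up:
            return False
        self.inbox.append(message)
        return True

class Relay:
    """Hypothetical store-and-forward hop: accept now, deliver when possible."""
    def __init__(self, deliver):
        self.queue = deque()
        self.deliver = deliver  # callable returning True on success

    def accept(self, message):
        # The sending user agent's job ends here; the relay owns delivery.
        self.queue.append(message)

    def run_once(self):
        # Try each queued message once; keep failures for a later attempt.
        for _ in range(len(self.queue)):
            message = self.queue.popleft()
            if not self.deliver(message):
                self.queue.append(message)

hop = NextHop()
relay = Relay(hop.receive)
relay.accept("hello")    # sender is finished, even though the hop is down
relay.run_once()         # delivery fails, message stays queued
queued_while_down = len(relay.queue)
hop.up = True
relay.run_once()         # connectivity is back, message goes through
```

The sender hands the message off once and never needs to know that the 
destination was unreachable at the time.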

I do think some sort of recipient capability discovery could be useful 
for most messages that are sent to relatively few recipients (and I 
actually had a proposal for this a few years ago, specifically to 
discover recipients' public keys), but probably not for IETF-style email 
discussions.   And implementing capability discovery for email means 
basically having two kinds of services that need to stay in sync, which 
creates additional risks.

Keith