Re: [Emailcore] Ticket #14: G.7.8. Review different size limits

Steffen Nurpmeso <steffen@sdaoden.eu> Fri, 16 July 2021 16:47 UTC

Date: Fri, 16 Jul 2021 18:29:13 +0200
Author: Steffen Nurpmeso <steffen@sdaoden.eu>
From: Steffen Nurpmeso <steffen@sdaoden.eu>
To: John Levine <johnl@taugh.com>
Cc: emailcore@ietf.org
Message-ID: <20210716162913.mo0D2%steffen@sdaoden.eu>
In-Reply-To: <20210715201638.983BD238781A@ary.qy>
References: <20210715201638.983BD238781A@ary.qy>
Mail-Followup-To: "John Levine" <johnl@taugh.com>, emailcore@ietf.org
User-Agent: s-nail v14.9.22-170-g4fc3932ea4
OpenPGP: id=EE19E1C1F2F7054F8D3954D8308964B51883A0DD; url=https://ftp.sdaoden.eu/steffen.asc; preference=signencrypt
BlahBlahBlah: Any stupid boy can crush a beetle. But all the professors in the world can make no bugs.
Archived-At: <https://mailarchive.ietf.org/arch/msg/emailcore/2XsDIMXzBpCW1wo6Qq0UX0smy40>
Subject: Re: [Emailcore] Ticket #14: G.7.8. Review different size limits

John Levine wrote in
 <20210715201638.983BD238781A@ary.qy>:
 |It appears that Steffen Nurpmeso  <steffen@sdaoden.eu> said:
 |>|> Speaking of limits, how do we feel about the 1000 octet limit on line
 |>|> length.  It is my impression that it is often exceeded in HTML text.
 |
 |>There was that UTF-8 RFC from i think a Microsoft guy who soaped
 |>this up to say "in modern times a character is the new byte", so
 |>that this should now be 1000*4, at least for SMTP, ...
 |
 |That is just wrong.  RFC 6532 specifically says if you're sending unencoded
 |UTF-8, the limit is still 1000 octets, not 1000 characters.

Yes, "octet" was the term used, in the tradition of the preceding
RFCs.  I surely was too sloppy here.
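
To make the octet/character difference concrete, here is a minimal
sketch.  It assumes the line is sent as unencoded UTF-8 and uses the
998 octet figure of RFC 5322 / RFC 6532 (the SMTP figure of 1000
includes the trailing CRLF).  The names line_octets and within_limit
are made up for this illustration only.

```python
# Sketch: the line-length limit counts UTF-8 octets, not characters.
MAX_LINE_OCTETS = 998  # limit without the trailing CRLF (1000 with it)


def line_octets(line: str) -> int:
    """Return the number of octets the line occupies when sent as UTF-8."""
    return len(line.encode("utf-8"))


def within_limit(line: str, limit: int = MAX_LINE_OCTETS) -> bool:
    """True if the UTF-8 encoded line (without CRLF) fits into the limit."""
    return line_octets(line) <= limit


ascii_line = "a" * 500           # 500 characters, 1 octet each
emoji_line = "\U0001F600" * 500  # 500 characters, 4 octets each

print(line_octets(ascii_line), within_limit(ascii_line))
print(line_octets(emoji_line), within_limit(emoji_line))
```

Both lines have the same number of characters, but only the first
one stays below the limit once it is counted in octets.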

 |>Maybe that is a weakness in the original folding algorithm, that
 |>whitespace at the begin of follow lines is preserved.  Maybe the
 |>(now) usual "line-continuation by escaping the linefeed with
 |>reverse solidus" would have been the better approach, ...
 |
 |That only applies to headers.  The 1000 octet limit applies to the
 |whole message.  For text we have flowed text which uses a space at
 |the end of a line as a signal to rewrap.

Yes.  (But for headers there is a SHOULD for automatic wrapping
(at 78), whereas bodies i would expect to be left alone unless the
limit of "at least / no lower than 1000" is exceeded; for any
software that is not a pure MTA i would expect conversion to MIME
and use of a content-transfer-encoding instead.  You very often see
this even enforced, for example on Mailman-driven mailing-lists,
where messages are turned to base64 encoding whether useful/necessary
or not.  This started sometime before Python 3 was truly released,
and it was still like this when i stopped looking at email content
regularly not too far in the past, .. which must have been shortly
before Mailman 2 was deprecated because support for Python 2 ended,
i think.)
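
As a rough sketch of "leave the body alone unless the limit is
exceeded, otherwise use a content-transfer-encoding": the following
assumes a body with LF line endings and picks quoted-printable as the
fallback (base64 would work as well).  needs_encoding and encode_body
are hypothetical helper names, not taken from any MUA or from Mailman.

```python
import quopri

MAX_LINE_OCTETS = 998  # hard limit per line, without CRLF


def needs_encoding(body: bytes) -> bool:
    """True if the body has 8-bit octets or a line longer than the limit."""
    if any(octet > 127 for octet in body):
        return True
    return any(len(line) > MAX_LINE_OCTETS for line in body.split(b"\n"))


def encode_body(body: bytes):
    """Return (content-transfer-encoding, payload) for the given body."""
    if not needs_encoding(body):
        return "7bit", body  # short 7-bit lines are passed through as-is
    # quoted-printable inserts soft line breaks, so lines become short
    return "quoted-printable", quopri.encodestring(body)


long_body = b"word " * 400 + b"\n"  # a single line of 2000 octets
cte, payload = encode_body(long_body)
print(cte, max(len(line) for line in payload.split(b"\n")))
```

A body that already consists of short 7-bit lines comes back
unchanged, which is the "whether useful/necessary or not" distinction
that an unconditional base64 conversion does not make.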

 |>Especially since presence of non 7-bit clean data imposes MIME
 |>encoding, anyway?
 |
 |Hm.  You might want to investigate the 8BITMIME SMTP extension, defined in
 |1993 and supported by every MTA I've looked at in this millennium.

I have these RFCs (i have 6152 locally, it obsoleted 1652).  But
i do not think i have ever encountered usage of this myself.
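
For reference, a small sketch of how a client could notice the
extension: the server lists 8BITMIME in its EHLO reply, and the client
then adds BODY=8BITMIME to MAIL FROM.  The sample reply, the host
names and the helper names ehlo_keywords and mail_from are invented
for this illustration.

```python
def ehlo_keywords(response: str) -> set:
    """Collect the extension keywords from a multi-line EHLO reply."""
    keywords = set()
    lines = response.splitlines()
    for line in lines[1:]:  # the first line only carries the greeting
        if len(line) < 4 or not line.startswith("250"):
            continue
        rest = line[4:].strip()  # skip the "250-" or "250 " prefix
        if rest:
            keywords.add(rest.split()[0].upper())
    return keywords


def mail_from(sender: str, keywords: set) -> str:
    """Build the MAIL FROM command, announcing an 8-bit body if allowed."""
    command = f"MAIL FROM:<{sender}>"
    if "8BITMIME" in keywords:
        command += " BODY=8BITMIME"
    return command


# Hypothetical server reply, for illustration only.
SAMPLE_REPLY = (
    "250-mx.example.org greets client.example.org\r\n"
    "250-SIZE 52428800\r\n"
    "250-8BITMIME\r\n"
    "250 PIPELINING\r\n"
)

keywords = ehlo_keywords(SAMPLE_REPLY)
print(mail_from("user@example.org", keywords))
```

With a live connection Python's smtplib answers the same question via
SMTP.has_extn("8bitmime") after ehlo().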

 |R's,
 |John
 |
 |PS: I don't feel strongly about lifting the 1000 octet limit but I
 |would like to understand how widely it's enforced by mail receivers.

I would assume transmission is rejected for any such email?
About line reading via fgets(3) into a finite buffer .. this is
really more common than one would think, especially in older
software.  Ever since i "really" program this was a "oh no no no,
do not do that!", so i personally ((basic ->) perl -> Java ->
C++/C) never actively used this i think (when i could make
a decision, ie, for C++/C).  I surely have seen commits fly by in
*BSD base-system software turning such code over to dynamic line
readers over the last decade, where still necessary (OpenBSD
thus)!  The Unix console no. 1 MUA mutt turned many many code
places to a dynamic buffer type over the last hm maybe two years
for sure, but it surely did it right before already.

That is, from the MUA side i would not expect too many restrictions
on the consumption side, at least on Unix i would expect them all to
honour "be liberal in what you accept and strict in what you
produce".  I have collected some test mails that happened to arrive
over the last decade, and at least at the beginning i compared how
other Unix console mailers (and once even Apple Mail, of Snow
Leopard, but no) behave ... hm, the longest header line is 1009
bytes; there is a References: that is folded but spans more than
a full screen of a 212x58 screen, hmm, a (maybe manually made
worse) spam mail ("ANGLO AMERICA PLATINUM UK ONLINE
<*@outlook.com>") in 54.header-mem-booom.mbox.  This is from 2015,
i have forgotten whether i was still cross-testing by then.
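
To illustrate the fgets(3) problem mentioned above, here is a small
simulation.  It is a sketch in Python, not the C code of any real
mailer: readline() with a size argument stands in for fgets() with
a fixed buffer, and the buffer size of 1000 as well as the generated
1009 byte header line are assumptions made for the example.

```python
import io

BUF_SIZE = 1000  # hypothetical fixed buffer, as in fgets(buf, 1000, fp)


def read_fixed(stream, bufsize=BUF_SIZE):
    """Mimic fgets(3): each call returns at most bufsize - 1 octets."""
    chunks = []
    while True:
        chunk = stream.readline(bufsize - 1)
        if not chunk:
            break
        chunks.append(chunk)
    return chunks


def read_dynamic(stream):
    """Read whole lines regardless of length, as a growing buffer would."""
    return list(stream)


# A header line of 1009 octets plus newline, followed by a normal one.
header = b"References: " + b"x" * 997 + b"\n"
data = header + b"Subject: test\n"

fixed = read_fixed(io.BytesIO(data))
dynamic = read_dynamic(io.BytesIO(data))
print(len(fixed), len(dynamic))
```

The fixed-size reader hands the long header over in two pieces, the
first of which does not end in a newline.  Code that treats every
returned chunk as a complete line then sees a bogus extra "line",
whereas the dynamic reader returns the two real lines.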

 --End of <20210715201638.983BD238781A@ary.qy>

--steffen
|
|Der Kragenbaer,                The moon bear,
|der holt sich munter           he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)