Re: [ietf-smtp] Stray <LF> in the middle of messages

Hector Santos <hsantos@isdg.net> Mon, 08 June 2020 14:04 UTC


Hi Leo,

Here is a quick summary of how the three main operating systems handle
the EOL (End Of Line) terminator of stored text files or messages:

o The *nix (unix-based) OSes use the <LF> (Linefeed) as an EOL terminator,

o The classic Apple Mac OS (before OS X) used the <CR> (Carriage
Return) as an EOL terminator, and

o The DOS/Windows OSes use the <CR><LF> (Carriage Return, Linefeed)
as the EOL terminator.

The Internet Mail format, officially starting with RFC 822, selected
the <CR><LF> format used by DOS/Windows.

Why?

Maybe because at that point in time, Microsoft owned 90% of the
growing Personal Computer (PC) market.  The Mac was still considered
(legally) a luxury commodity (otherwise their anti-trust status would
no longer apply), and *nix was still mostly at the IT networking level.

But I think there may have been other technical reasons. Dave Crocker,
the editor of RFC 822, can maybe tell us if X.400 used the <CR><LF>
format. I did work on X.400 mail when I worked at Big Circle W
(Westinghouse), but I don't recall the format it used.

The point here is that you need to keep the above differences in mind
when exchanging files or data across a heterogeneous network with
three different text-based storage or transmission formats.

While a *nix or Mac system may store email in its native format, when
it comes to SMTP transmission of the DATA payload, it MUST translate
the text to the <CR><LF> format. In principle, all compliant SMTP
senders and receivers MUST conform to the <CR><LF> end of line
terminator.
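
To illustrate that translation step, here is a minimal Python sketch of
how a sender might normalize a stored message before transmitting it.
The function name to_crlf and the sample messages are hypothetical,
made up for this example; this is not taken from any particular SMTP
implementation, and it assumes the payload is text in which every bare
<CR> or <LF> is meant as a line ending.

```python
import re

# Try CRLF first so an existing pair is kept as one terminator,
# then fall back to a bare CR (classic Mac) or a bare LF (*nix).
_ANY_EOL = re.compile(rb"\r\n|\r|\n")


def to_crlf(payload: bytes) -> bytes:
    """Return payload with every line ending rewritten as CRLF.

    Hypothetical helper for illustration only.
    """
    return _ANY_EOL.sub(b"\r\n", payload)


# The same short message as stored on each of the three platforms.
unix_text = b"Subject: test\n\nHello\n"
mac_text = b"Subject: test\r\rHello\r"
dos_text = b"Subject: test\r\n\r\nHello\r\n"

# All three normalize to the same CRLF form for the wire.
assert to_crlf(unix_text) == dos_text
assert to_crlf(mac_text) == dos_text
assert to_crlf(dos_text) == dos_text
```

Matching <CR><LF> before the single characters makes the function safe
to run on text that is already in wire format, so it can be applied
without first detecting where the message came from.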

It is great to see developers do their own thing, even "reinvent the 
world" rather than be reliant and dependent on other popular systems. 
It is a good way to learn.

Good Luck with your SMTP server project!

-- 
Hector Santos,
https://secure.santronics.com
https://twitter.com/hectorsantos



On 6/6/2020 1:06 PM, Leo Gaspard wrote:
> Hello world,
>
> I am in the process of writing an SMTP server, which obviously is going
> to be the best of all SMTP servers ever written and that will ever be
> written in our eon.
>
> However, in the process of taking over the world, I am facing something
> that surprises me.
>
> I read, in RFC5321, §2.3.8, this paragraph:
>
>> Lines consist of zero or more data characters terminated by the
>> sequence ASCII character "CR" (hex value 0D) followed immediately by
>> ASCII character "LF" (hex value 0A).  This termination sequence is
>> denoted as <CRLF> in this document.  Conforming implementations MUST
>> NOT recognize or generate any other character or character sequence
>> as a line terminator.  Limits MAY be imposed on line lengths by
>> servers (see Section 4).
>
> Which appears to clearly indicate that <LF> is not a valid line
> terminator.
>
> However, I notice that every single time I have tried to use `netcat` to
> send emails for demo purposes, it succeeded *without* sending <CRLF> and
> by sending only <LF>. While `telnet` does appear to convert typed <LF>
> into <CRLF>, it looks like (my version of) `netcat` does not. So most of
> the SMTP servers I have met with appear to consider <LF> as a valid line
> ending.
>
> This, in most cases, is not a big deal, because <LF> is not a valid
> character in SMTP commands, so saying that receiving an <LF> is
> equivalent to receiving a <CRLF> is not that big a problem.
>
> However, there is one case where the semantics is important: should one
> escape the <LF>. sequence while in a DATA block?
>
> I would guess that the fact that other SMTP servers appear to usually
> accept <LF>.<LF> as a terminator indicates that <LF>. should be escaped
> even though it is not strictly conforming with the RFC, but… I wanted to
> have the opinion of other people on this, before diving too deep in the
> implementation?
>
> The following paragraph also makes me wonder:
>
>> In addition, the appearance of "bare" "CR" or "LF" characters in text
>> (i.e., either without the other) has a long history of causing
>> problems in mail implementations and applications that use the mail
>> system as a tool.  SMTP client implementations MUST NOT transmit
>> these characters except when they are intended as line terminators
>> and then MUST, as indicated above, transmit them only as a <CRLF>
>> sequence.
>
> Should I understand this paragraph as meaning that if I ever receive
> such an ill-formed message, I… can? should? must? accept it and… can?
> should? must? convert the <LF> into proper <CRLF>?
>
> Thank you in advance for any thoughts you may have!
>    Leo
