Re: [ietf-smtp] Proper definition of the term "email payload".

Mark Sapiro <mark@msapiro.net> Mon, 01 April 2019 00:51 UTC

To: Viruthagiri Thirumavalavan <giri@dombox.org>
References: <20190401000943.57BB2143638@mail.wooz.org> <FCBB7422-295F-4E58-AC4E-42A6894C6406@python.org>
Cc: Barry Warsaw <barry@python.org>, ietf-smtp@ietf.org, "R. David Murray" <rdmurray@bitdance.com>
From: Mark Sapiro <mark@msapiro.net>
Message-ID: <75720834-0bdc-deba-40fc-24f17d5a6752@msapiro.net>
Date: Sun, 31 Mar 2019 17:51:05 -0700
In-Reply-To: <FCBB7422-295F-4E58-AC4E-42A6894C6406@python.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/ietf-smtp/THbZlH4XxbBN8jkxscXDEFUlenE>

To elaborate just a bit on what Barry says: as far as the Python email
library is concerned, the data sent over the wire following the SMTP
DATA command (RFC 821 and successors) is what an email.message.Message
object represents. If you want to see the whole thing, use the
as_string() or as_bytes() methods on that object.
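
As a minimal sketch of that (the sample message and addresses are made
up for illustration, and the choice of policy.SMTP is an assumption, not
something the library requires), the data that would follow DATA can be
parsed into a message object and then rendered back as a whole, headers
and body together:

```python
import email
from email import policy

# Hypothetical message data, as it might arrive after the SMTP DATA command.
raw = (
    b"From: alice@example.com\r\n"
    b"To: bob@example.com\r\n"
    b"Subject: Box of beer\r\n"
    b"\r\n"
    b"Twelve cans, as ordered.\r\n"
)

# Parse the bytes into a message object.
msg = email.message_from_bytes(raw, policy=policy.SMTP)

# Both methods render the entire message: headers, blank line, body.
whole_bytes = msg.as_bytes()
whole_text = msg.as_string()

print(whole_text)
```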

That object consists of headers and body as described in RFC 822 and
successors. The Python email library refers to that body as the payload
of that message object.
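
A small sketch of that split, again with an invented sample message:
the headers are reached through the object's mapping-style interface,
while get_payload() returns only the body.

```python
from email import message_from_string

# Hypothetical RFC 822 style message: headers, a blank line, then the body.
text = (
    "From: alice@example.com\n"
    "To: bob@example.com\n"
    "Subject: Box of beer\n"
    "\n"
    "Twelve cans, as ordered.\n"
)

msg = message_from_string(text)

# Headers: (name, value) pairs, collected here into a dict for convenience.
headers = dict(msg.items())

# Payload: what the library calls the body of a non-multipart message.
payload = msg.get_payload()

print(headers["Subject"])
print(payload)
```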

I think this is all consistent and reasonable in terms of what the email
library is trying to do.

In the RFC 821 context, the metadata is the envelope which has a sender
and recipients and the entire message is the data, but in the RFC 822
context, the data is split into headers and body and we choose to call
the body the payload.
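
To make the two views concrete, here is a rough sketch. The Envelope
class is a hypothetical stand-in invented for this example (it is not
part of the email library), and the addresses are made up. Note that an
envelope recipient need not appear anywhere in the data.

```python
from dataclasses import dataclass
from email.message import EmailMessage


@dataclass
class Envelope:
    """Hypothetical stand-in for the RFC 821 envelope (MAIL FROM / RCPT TO)."""
    mail_from: str
    rcpt_to: list


msg = EmailMessage()
msg["From"] = "alice@example.com"
msg["To"] = "bob@example.com"
msg["Subject"] = "Box of beer"
msg.set_content("Twelve cans, as ordered.")

# RFC 821 view: the envelope is the metadata used by the transport...
envelope = Envelope(
    mail_from="bounces@example.com",
    rcpt_to=["bob@example.com", "carol@example.com"],
)

# ...and the entire message, headers included, is the data.
data = msg.as_bytes()

# RFC 822 view: the data splits into headers and a body (the payload).
header_names = msg.keys()
body = msg.get_content()

print(envelope)
print(header_names)
print(body)
```

This mirrors how smtplib's SMTP.sendmail() takes the envelope sender and
recipients as arguments separate from the message data.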

This is a semantic issue. In your "box of beer" example, the service
that delivers it considers the payload to be the box and contents, but
the consumer considers the payload to be only the contents (and maybe
just the beer and not the cans). Take your pick.

I.e., there is no one definitive answer to your question. You have
reasons for considering the RFC 821 DATA to be the payload, and you are
not wrong. We have reasons for considering the RFC 822 body to be the
payload, and we are not wrong either.

Forwarded message.
>  From: Barry Warsaw <barry@python.org>
>  Subject: Re: Proper definition of the term "email payload".
>  Date: March 31, 2019 at 17:09:30 PDT
>  To: Viruthagiri Thirumavalavan <giri@dombox.org>
>  Cc: ietf-smtp@ietf.org, "R. David Murray" <rdmurray@bitdance.com>,
>  Mark Sapiro <msapiro@value.net>
> 
> 
>  Hi, I hope you (and they!) don’t mind me CCing two other people who
>  have worked extensively on Python’s email library, and in fact much
>  more than I have in recent years.  RDM has done the bulk of the
>  work on the new-in-Python-3 APIs, and Mark is a longtime core
>  developer on GNU Mailman (the project that spawned Python’s email
>  library).
> 
>  There are two ways I think about this, and I’ll use the original RFC
>  numbers to clarify.  There’s RFC 821, which describes the on-the-wire
>  protocol for SMTP transfers, embodied in Python’s smtplib library.
>  Then there’s RFC 822, which describes the format of the content of
>  that SMTP transfer, but not the protocol itself.  Of course there are
>  lots of developments along the way, but that’s unimportant for the way
>  I think about these things.
> 
>  What I think you are describing, where the headers are part of the
>  payload, is more akin to RFC 821.  That’s the payload as far as the
>  actual bytes-on-the-wire are concerned.  Python’s email library is for
>  RFC 822 (and the many, many elaborations thereof), so in that case,
>  the payload is the body of the message.  On more practical terms, the
>  implementation makes this clear, and the APIs you use to change
>  headers are different in form and function than the ones you use to
>  change the body of the message.
> 
>  I think the Python documentation is fairly clear about this
>  distinction.  At least, I don’t remember seeing any feedback to the
>  contrary, although RDM may have a better sense of that.  Of course, we
>  are always open to improvements in Python’s documentation.
> 
>  Cheers,
>  -Barry
> 
>> On Mar 31, 2019, at 10:57, Viruthagiri Thirumavalavan
>> <giri@dombox.org> wrote:
>>
>> Hello IETF,
>>
>> I need some clarification about the term "email payload".
>>
>> Wikipedia says:
>>
>> In computing and telecommunications, the payload is the part of
>> transmitted data that is the actual intended message. Headers and
>> metadata are sent only to enable payload delivery.
>>
>> The Python email library documentation says this:
>>
>> An email message consists of headers and a payload (which is also
>> referred to as the content). Headers are RFC 5322 or RFC 6532 style
>> field names and values, where the field name and value are separated
>> by a colon. The colon is not part of either the field name or the
>> field value. The payload may be a simple text message, or a binary
>> object, or a structured sequence of sub-messages each with their own
>> set of headers and their own payload. The latter type of payload is
>> indicated by the message having a MIME type such as multipart/* or
>> message/rfc822.
>>
>> It looks like the Python email library's author, Barry Warsaw,
>> followed a definition similar to the one in Wikipedia when naming
>> his library functions. But I feel that calling ONLY the email "Body
>> Part" the "payload" is wrong. The term "payload" should refer to the
>> entire "Message Part" of an email, i.e., both headers and body.
>>
>> When you place an order for a "box of beer", you are not paying only
>> for the "beer cans", but also paying for the "container box". So the
>> payload here is the entire box.
>>
>> HTTP Example:
>>
>> HTTP/1.1 200 OK
>> Date: Sun, 10 Oct 2010 23:26:07 GMT
>> Content-Type: text/html
>> Content-Length: 1234
>>
>> <html>
>>
>> <head>
>> <title>Hello World!</title>
>> </head>
>>
>> <body>
>> (more contents)
>>  .
>>  .
>>  .
>> </body>
>> </html>
>>
>>
>> If you take a closer look at this HTTP example, the headers are
>> just instructions for the client. The end user doesn't need to worry
>> about any piece of information found in those headers. So the
>> Wikipedia definition is perfectly suited to applications like HTTP.
>>
>> But in email, when a message is transferred from Hop A to Hop C via
>> Hop B, the user at Hop A actually wants to deliver the whole "message
>> part" to Hop C. If Hop B strips the headers and transfers only the
>> "Body" part, it becomes an "anonymous" message. So the end user
>> requires the information found in the headers too, e.g. From,
>> Subject, Date, etc. [In HTTP, the title tag is the equivalent of
>> Subject, and it's found in the "head" markup, not in the HTTP headers.]
>>
>> As you can see, the user is interested in the "entire message". So
>> the term "actual intended message" should refer to the "whole
>> message" extracted from the DATA command. The "actual intended
>> message" should be pictured like this in email.
>>
>> Also note that when you migrate your mail to another mail service,
>> you need the whole message with headers, not just the body.
>>
>> Based on my points, I believe calling only the "Body" part the
>> "payload" is wrong. I would love to hear your thoughts on this. If
>> Barry Warsaw is here, I would love to know your opinion too.
>>
>> PS: I actually asked this question two years ago on a Stack Exchange
>> site. I wasn't satisfied with the answer I got there. I don't want
>> to use the term incorrectly in my application. That's why I'm posting
>> it here.
>>
>> Thanks
>> --
>> Best Regards,
>>
>> Viruthagiri Thirumavalavan
>> Dombox, Inc.


-- 
Mark Sapiro <mark@msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan