Re: [ietf-smtp] Proper definition of the term "email payload".

Viruthagiri Thirumavalavan <giri@dombox.org> Mon, 01 April 2019 01:04 UTC

MIME-Version: 1.0
References: <20190401000943.57BB2143638@mail.wooz.org> <FCBB7422-295F-4E58-AC4E-42A6894C6406@python.org> <75720834-0bdc-deba-40fc-24f17d5a6752@msapiro.net>
In-Reply-To: <75720834-0bdc-deba-40fc-24f17d5a6752@msapiro.net>
From: Viruthagiri Thirumavalavan <giri@dombox.org>
Date: Mon, 01 Apr 2019 06:34:03 +0530
Message-ID: <CAOEezJQo=TQcBqVGypW4YD4rLT0JNnfa1zZ9eh1gjQz9Cz8htA@mail.gmail.com>
To: Mark Sapiro <mark@msapiro.net>
Cc: Barry Warsaw <barry@python.org>, ietf-smtp@ietf.org, "R. David Murray" <rdmurray@bitdance.com>
Content-Type: multipart/alternative; boundary="0000000000005631fd05856d994e"
Archived-At: <https://mailarchive.ietf.org/arch/msg/ietf-smtp/iXbMaM8cE3YBSS31Rp79ha5yg_0>
Subject: Re: [ietf-smtp] Proper definition of the term "email payload".

Thanks, Mark. You have put it beautifully, and yes, your answer makes sense.

On Mon, Apr 1, 2019 at 6:21 AM Mark Sapiro <mark@msapiro.net> wrote:

> To elaborate just a bit on what Barry says, as far as the Python email
> library is concerned, the stuff that goes over the wire following the
> SMTP DATA command (RFC 821 and successors) is what an
> email.message.Message object represents. If you want to see the whole
> thing, you use the as_string() or as_bytes() methods on that object.
>
> That object consists of headers and body as described in RFC 822 and
> successors. The Python email library refers to that body as the payload
> of that message object.
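
To check my understanding of this, here is a minimal sketch using the standard library parser. The addresses and the raw text are made-up placeholders standing in for whatever follows the DATA command; they are not taken from a real message.

```python
from email import message_from_string

# Hypothetical raw message, standing in for the data sent after SMTP DATA.
raw = (
    "From: alice@example.com\n"
    "To: bob@example.com\n"
    "Subject: Hello\n"
    "\n"
    "This is the body.\n"
)

msg = message_from_string(raw)

# Headers are exposed as (name, value) pairs.
headers = msg.items()

# "Payload", in the library's terminology, is the body only.
payload = msg.get_payload()

# The whole thing (headers, blank line, body) comes back from as_string().
whole = msg.as_string()

print(headers)
print(repr(payload))
print(whole)
```

So get_payload() gives only the body, while as_string() gives what I had been calling the payload.
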
>
> I think this is all consistent and reasonable in terms of what the email
> library is trying to do.
>
> In the RFC 821 context, the metadata is the envelope which has a sender
> and recipients and the entire message is the data, but in the RFC 822
> context, the data is split into headers and body and we choose to call
> the body the payload.
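
The two views can be put side by side in a short sketch. The envelope dictionary below is purely my own illustration (it is not a library structure), and the addresses are placeholders. Nothing is sent over the network here.

```python
from email.message import EmailMessage

# Build a message in the RFC 822 sense: headers plus a body.
msg = EmailMessage()
msg["From"] = "alice@example.com"
msg["To"] = "bob@example.com"
msg["Subject"] = "Hello"
msg.set_content("This is the body.")

# RFC 821 view: the envelope is the metadata and the entire message is
# the data. This dict is a hypothetical stand-in for MAIL FROM / RCPT TO;
# smtplib.SMTP.sendmail() takes the same pieces as separate arguments.
envelope = {
    "mail_from": "alice@example.com",
    "rcpt_to": ["bob@example.com", "archive@example.com"],
}
data = msg.as_bytes()  # everything that would follow the DATA command

# RFC 822 view: the same data splits into a header block and a body
# at the first blank line.
header_block, _, body = data.partition(b"\n\n")

print(envelope)
print(header_block.decode())
print(body.decode())
```

Note that the second envelope recipient never appears in the data at all, which shows that the envelope really is separate metadata.
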
>
> This is a semantic issue. In your "box of beer" example, the service
> that delivers it considers the payload to be the box and contents, but
> the consumer considers the payload to be only the contents (and maybe
> just the beer and not the cans). Take your pick.
>
> I.e., there is no one definitive answer to your question. You have
> reasons for considering the RFC 821 DATA to be the payload, and you are
> not wrong, and we have reasons for considering the RFC 822 body to be
> the payload, and we are not wrong either.
>
> Forwarded message.
> >  From: Barry Warsaw <barry@python.org>
> >  Subject: Re: Proper definition of the term "email payload".
> >  Date: March 31, 2019 at 17:09:30 PDT
> >  To: Viruthagiri Thirumavalavan <giri@dombox.org>
> >  Cc: ietf-smtp@ietf.org, "R. David Murray" <rdmurray@bitdance.com>,
> >  Mark Sapiro <msapiro@value.net>
> >
> >
> >  Hi, I hope you (and they!) don’t mind me CCing two other people who
> >  have worked extensively on Python’s email library, and in fact much
> >  more than myself in the recent years.  RDM has done the bulk of the
> >  work on the new-in-Python-3 APIs, and Mark is a long time core
> >  developer on GNU Mailman (the project that spawned Python’s email
> >  library).
> >
> >  There are two ways I think about this, and I’ll use the original RFC
> >  numbers to clarify.  There’s RFC 821, which describes the on-the-wire
> >  protocol for SMTP transfers, embodied in Python’s smtplib library.
> >  Then there’s RFC 822, which describes the format of the content of
> >  that SMTP transfer, but not the protocol itself.  Of course there are
> >  lots of developments along the way, but that’s unimportant for the way
> >  I think about these things.
> >
> >  What I think you are describing, where the headers are part of the
> >  payload, is more akin to RFC 821.  That’s the payload as far as the
> >  actual bytes-on-the-wire are concerned.  Python’s email library is for
> >  RFC 822 (and the many, many elaborations thereof), so in that case,
> >  the payload is the body of the message.  On more practical terms, the
> >  implementation makes this clear, and the APIs you use to change
> >  headers are different in form and function than the ones you use to
> >  change the body of the message.
> >
> >  I think the Python documentation is fairly clear about this
> >  distinction.  At least, I don’t remember seeing any feedback to the
> >  contrary, although RDM may have a better sense of that.  Of course, we
> >  are always open to improvements in Python’s documentation.
> >
> >  Cheers,
> >  -Barry
> >
> >> On Mar 31, 2019, at 10:57, Viruthagiri Thirumavalavan
> >> <giri@dombox.org> wrote:
> >>
> >> Hello IETF,
> >>
> >> I need some clarification about the term "email payload".
> >>
> >> Wikipedia says:
> >>
> >> In computing and telecommunications, the payload is the part of
> >> transmitted data that is the actual intended message. Headers and
> >> metadata are sent only to enable payload delivery
> >>
> >> The Python email library documentation says this:
> >>
> >> An email message consists of headers and a payload (which is also
> >> referred to as the content). Headers are RFC 5322 or RFC 6532 style
> >> field names and values, where the field name and value are separated
> >> by a colon. The colon is not part of either the field name or the
> >> field value. The payload may be a simple text message, or a binary
> >> object, or a structured sequence of sub-messages each with their own
> >> set of headers and their own payload. The latter type of payload is
> >> indicated by the message having a MIME type such as multipart/* or
> >> message/rfc822.
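
The "structured sequence of sub-messages" case from that documentation can be sketched as follows. The subject and contents are placeholders of my own; the calls are from the standard email.message.EmailMessage API.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["Subject"] = "Two versions"
msg.set_content("Plain text version.")

# Adding an alternative turns the message into multipart/alternative.
msg.add_alternative("<p>HTML version.</p>", subtype="html")

top_type = msg.get_content_type()

# For a multipart message the payload is a list of sub-messages,
# each with its own headers and its own payload.
parts = msg.get_payload()
part_types = [part.get_content_type() for part in parts]

print(top_type)
print(part_types)
for part in parts:
    print(part.items())
```

Here the payload of the outer message is itself made of smaller messages that carry headers, which is part of why the terminology confused me.
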
> >>
> >> It looks like the Python email library author, Barry Warsaw, followed
> >> a definition similar to the one found in Wikipedia when defining his
> >> library functions. But I feel that calling ONLY the email "Body Part"
> >> the "payload" is wrong. The term "payload" should refer to the entire
> >> "Message Part" of an email, i.e. both headers and body.
> >>
> >> When you place an order for a "box of beer", you are not paying only
> >> for the "beer cans", but also paying for the "container box". So the
> >> payload here is the entire box.
> >>
> >> HTTP Example:
> >>
> >> HTTP/1.1 200 OK
> >> Date: Sun, 10 Oct 2010 23:26:07 GMT
> >> Content-Type: text/html
> >> Content-Length: 1234
> >>
> >> <html>
> >>
> >> <head>
> >> <title>Hello World!</title>
> >> </head>
> >>
> >> <body>
> >> (more contents)
> >>  .
> >>  .
> >>  .
> >> </body>
> >> </html>
> >>
> >>
> >> If you take a closer look at this HTTP example, the headers are
> >> just instructions for the client. The end user doesn't need to worry
> >> about any piece of information found in those headers. So the
> >> Wikipedia definition is perfectly suited to applications like HTTP.
> >>
> >> But in email, when a mail gets transferred from Hop A to Hop C via
> >> Hop B, the user at Hop A actually wants to deliver the whole "message
> >> part" to Hop C. If Hop B strips the headers and transfers only the
> >> "Body" part, then it becomes an "anonymous" message. So the end user
> >> requires the information found in the headers too, e.g. From,
> >> Subject, Date etc. [In HTTP, the title tag is equivalent to Subject,
> >> and it's found in the "head" markup, not in the HTTP headers.]
> >>
> >> As you can see, the user is interested in the "entire message". So
> >> the term "actual intended message" should refer to the "whole
> >> message" extracted from the DATA command. The "actual intended
> >> message" should be pictured like this in email.
> >>
> >> Also note that, when you migrate your mails to another mail service,
> >> you need the whole message with Headers, not just the body.
> >>
> >> Based on my points, I believe calling only the "Body" part the
> >> "Payload" is wrong. I would love to hear your thoughts on this. If
> >> Barry Warsaw is here, I would love to know your opinion too.
> >>
> >> PS: I actually asked this question two years ago on a Stack Exchange
> >> site. I wasn't satisfied with the answer I got there. I don't want
> >> to use the term incorrectly in my application. That's why I'm posting
> >> it here.
> >>
> >> Thanks
> >> --
> >> Best Regards,
> >>
> >> Viruthagiri Thirumavalavan
> >> Dombox, Inc.
>
>
> --
> Mark Sapiro <mark@msapiro.net>        The highway is for gamblers,
> San Francisco Bay Area, California    better use your sense - B. Dylan
>
>

-- 
Best Regards,

Viruthagiri Thirumavalavan
Dombox, Inc.