Re: [ietf-smtp] Proper definition of the term "email payload".

Viruthagiri Thirumavalavan <giri@dombox.org> Mon, 01 April 2019 03:04 UTC

MIME-Version: 1.0
References: <20190401000943.57BB2143638@mail.wooz.org> <FCBB7422-295F-4E58-AC4E-42A6894C6406@python.org> <75720834-0bdc-deba-40fc-24f17d5a6752@msapiro.net> <CAOEezJQo=TQcBqVGypW4YD4rLT0JNnfa1zZ9eh1gjQz9Cz8htA@mail.gmail.com> <221B317451ECA8F7674E1AE6@PSB>
In-Reply-To: <221B317451ECA8F7674E1AE6@PSB>
From: Viruthagiri Thirumavalavan <giri@dombox.org>
Date: Mon, 01 Apr 2019 08:33:38 +0530
Message-ID: <CAOEezJT+A+bFxK7WEixa3YCkQj0tkMxHzdTfL4Zar5LBQUMgaA@mail.gmail.com>
To: John C Klensin <john-ietf@jck.com>
Cc: ietf-smtp@ietf.org
Content-Type: multipart/alternative; boundary="0000000000008b5ad005856f463e"
Archived-At: <https://mailarchive.ietf.org/arch/msg/ietf-smtp/nKBZ9Mp_IUCR5NcDc-XNie-WI4w>
Subject: Re: [ietf-smtp] Proper definition of the term "email payload".

Got it, Mr. Klensin. Thanks for the input. Have a nice day :-)



On Mon, Apr 1, 2019 at 8:23 AM John C Klensin <john-ietf@jck.com> wrote:

> Hi.
>
> In the hope that we don't need to iterate further, let me try to
> express the problem in a different way, one that is entirely
> consistent with Dave Crocker's comments and a few others.
>
> The difficulty with a term like "payload" is that the definition
> depends on where one is looking from.  Internet email is a
> layered system and the perspective depends on the layer.  For
> SMTP (from RFC 821 through 5321 fairly consistently), what it is
> transferring ("the payload") would be the message content
> starting after the DATA command (or equivalent) and continuing
> to the end of data indication (normally CRLF.CRLF).  From a
> header specification standpoint pre-MIME (e.g., from the
> perspective of RFC 822), the payload would probably be the message
> body after the blank line that indicates the end of the headers
> although I suppose one could construct an argument that would
> distinguish between trace information and everything else.  When
> we get to MIME (and especially "content-type=multipart/"), one
> might claim that multipart messages have multiple payloads, one
> per body part after the message headers and MIME body part
> headers are excluded.
>
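
A minimal sketch of these three viewpoints, applied to one made-up
message. The WIRE bytes and the names smtp_view, rfc822_view and
mime_view are illustrative assumptions, not taken from any RFC or
library; only the email module calls are real standard-library APIs.

```python
import email

# Hypothetical bytes sent after the DATA command, ending with the
# end-of-data indicator (CRLF "." CRLF).
WIRE = (
    b"From: a@example.org\r\n"
    b"Subject: hi\r\n"
    b"MIME-Version: 1.0\r\n"
    b'Content-Type: multipart/alternative; boundary="b"\r\n'
    b"\r\n"
    b"--b\r\n"
    b"Content-Type: text/plain\r\n"
    b"\r\n"
    b"plain\r\n"
    b"--b\r\n"
    b"Content-Type: text/html\r\n"
    b"\r\n"
    b"<p>html</p>\r\n"
    b"--b--\r\n"
    b".\r\n"
)


def smtp_view(wire):
    """SMTP layer: everything after DATA, up to the terminating '.' line."""
    if not wire.endswith(b"\r\n.\r\n"):
        raise ValueError("missing end-of-data indicator")
    content = wire[:-3]  # drop ".\r\n", keep the CRLF ending the last line
    # Undo dot-stuffing: senders prepend "." to lines that start with ".".
    lines = content.split(b"\r\n")
    return b"\r\n".join(l[1:] if l.startswith(b".") else l for l in lines)


def rfc822_view(message):
    """Message-format layer: the body after the first blank line."""
    _headers, _sep, body = message.partition(b"\r\n\r\n")
    return body


def mime_view(message):
    """MIME layer: one decoded payload per leaf body part."""
    msg = email.message_from_bytes(message)
    return [
        part.get_payload(decode=True).decode()
        for part in msg.walk()
        if not part.is_multipart()
    ]


message = smtp_view(WIRE)       # headers + body, as SMTP carried them
body = rfc822_view(message)     # body only, MIME boundaries included
parts = mime_view(message)      # one text per body part
```

Each function answers "what is the payload?" for a different layer,
which is why no single answer fits all three.
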
> Dave noted that the term payload "does not appear at all in RFC
> 5321 or RFC 5322 or RFC 3501".  As the author of one of those
> documents, I can say that omission is no accident and is closely
> connected to the discussion above.
>
> best,
>    john
>
>
>
> --On Monday, April 1, 2019 06:34 +0530 Viruthagiri
> Thirumavalavan <giri@dombox.org> wrote:
>
> > Thanks Mark. You have written beautifully. And yes your answer
> > makes sense.
> >
> > On Mon, Apr 1, 2019 at 6:21 AM Mark Sapiro <mark@msapiro.net>
> > wrote:
> >
> >> To elaborate just a bit on what Barry says, as far as the
> >> Python email library is concerned, the stuff that comes over
> >> the wire after the SMTP DATA command (RFC 821 and
> >> successors) is the email.message.Message object. If you want
> >> to see the whole thing, you use the as_string() or as_bytes()
> >> methods on that object.
> >>
> >> That object consists of headers and body as described in RFC
> >> 822 and successors. The Python email library refers to that
> >> body as the payload of that message object.
> >>
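
A short sketch of that distinction, assuming a made-up two-header
message (the raw text and the variable names are illustrative only):

```python
from email import message_from_string

# Hypothetical message text: two headers, a blank line, then the body.
raw = "From: a@example.org\nSubject: hi\n\nHello\n"

msg = message_from_string(raw)

whole_text = msg.as_string()    # headers and body together, as str
whole_bytes = msg.as_bytes()    # the same thing, as bytes
payload = msg.get_payload()     # only the body: what the library calls the payload
```

Here as_string() and as_bytes() correspond to the RFC 821 view of the
data, while get_payload() corresponds to the RFC 822 body.
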
> >> I think this is all consistent and reasonable in terms of
> >> what the email library is trying to do.
> >>
> >> In the RFC 821 context, the metadata is the envelope which
> >> has a sender and recipients and the entire message is the
> >> data, but in the RFC 822 context, the data is split into
> >> headers and body and we choose to call the body the payload.
> >>
> >> This is a semantic issue. In your "box of beer" example, the
> >> service that delivers it considers the payload to be the box
> >> and contents, but the consumer considers the payload to be
> >> only the contents (and maybe just the beer and not the cans).
> >> Take your pick.
> >>
> >> I.e., there is no one definitive answer to your question. You
> >> have reasons for considering the RFC 821 DATA to be the
> >> payload, and you are not wrong, and we have reasons for
> >> considering the RFC 822 body to be the payload, and we are
> >> not wrong either.
> >>
> >> Forwarded message.
> >> >  From: Barry Warsaw <barry@python.org>
> >> >  Subject: Re: Proper definition of the term "email payload".
> >> >  Date: March 31, 2019 at 17:09:30 PDT
> >> >  To: Viruthagiri Thirumavalavan <giri@dombox.org>
> >> >  Cc: ietf-smtp@ietf.org, "R. David Murray"
> >> >  <rdmurray@bitdance.com>, Mark Sapiro <msapiro@value.net>
> >> >
> >> >
> >> >  Hi, I hope you (and they!) don't mind me CCing two other
> >> >  people who have worked extensively on Python's email
> >> >  library, and in fact much more than myself in the recent
> >> >  years.  RDM has done the bulk of the work on the
> >> >  new-in-Python-3 APIs, and Mark is a long time core
> >> >  developer on GNU Mailman (the project that spawned
> >> >  Python's email library).
> >> >
> >> >  There are two ways I think about this, and I'll use the
> >> >  original RFC numbers to clarify.  There's RFC 821, which
> >> >  describes the on-the-wire protocol for SMTP transfers,
> >> >  embodied in Python's smtplib library. Then there's RFC 822,
> >> >  which describes the format of the content of that
> >> >  SMTP transfer, but not the protocol itself.  Of course
> >> >  there are lots of developments along the way, but that's
> >> >  unimportant for the way I think about these things.
> >> >
> >> >  What I think you are describing, where the headers are
> >> >  part of the payload, is more akin to RFC 821.  That's
> >> >  the payload as far as the actual bytes-on-the-wire are
> >> >  concerned.  Python's email library is for RFC 822 (and
> >> >  the many, many elaborations thereof), so in that case, the
> >> >  payload is the body of the message.  In more practical
> >> >  terms, the implementation makes this clear, and the APIs
> >> >  you use to change headers are different in form and
> >> >  function than the ones you use to change the body of the
> >> >  message.
> >> >
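
To illustrate how the header and body APIs differ in form, here is a
small sketch using the EmailMessage class from the newer Python 3 API.
The addresses and text are made up for the example.

```python
from email.message import EmailMessage

msg = EmailMessage()

# Headers are handled through a mapping-style interface.
msg["From"] = "a@example.org"
msg["Subject"] = "first draft"
msg.replace_header("Subject", "final version")

# The body (the payload) is handled through separate content methods.
msg.set_content("Hello\n")
msg.set_content("Goodbye\n")    # replaces the previous body

subject = msg["Subject"]
body = msg.get_content()
```

Changing the Subject never touches the body, and replacing the body
leaves From and Subject alone.
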
> >> >  I think the Python documentation is fairly clear about this
> >> >  distinction.  At least, I don't remember seeing any
> >> >  feedback to the contrary, although RDM may have a better
> >> >  sense of that.  Of course, we are always open to
> >> >  improvements in Python's documentation.
> >> >
> >> >  Cheers,
> >> >  -Barry
> >> >
> >> >> On Mar 31, 2019, at 10:57, Viruthagiri Thirumavalavan
> >> >> <giri@dombox.org> wrote:
> >> >>
> >> >> Hello IETF,
> >> >>
> >> >> I need some clarification about the term "email payload".
> >> >>
> >> >> Wikipedia says:
> >> >>
> >> >> In computing and telecommunications, the payload is the
> >> >> part of transmitted data that is the actual intended
> >> >> message. Headers and metadata are sent only to enable
> >> >> payload delivery.
> >> >>
> >> >> The Python email library documentation says:
> >> >>
> >> >> An email message consists of headers and a payload (which
> >> >> is also referred to as the content). Headers are RFC 5322
> >> >> or RFC 6532 style field names and values, where the field
> >> >> name and value are separated by a colon. The colon is not
> >> >> part of either the field name or the field value. The
> >> >> payload may be a simple text message, or a binary object,
> >> >> or a structured sequence of sub-messages each with their
> >> >> own set of headers and their own payload. The latter type
> >> >> of payload is indicated by the message having a MIME type
> >> >> such as multipart/* or message/rfc822.
> >> >>
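
The "structured sequence of sub-messages" case can be sketched as
follows, assuming a made-up message with a plain-text and an HTML
alternative (the content and variable names are illustrative):

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["Subject"] = "demo"
msg.set_content("plain text\n")
# Adding an alternative turns the message into multipart/alternative.
msg.add_alternative("<p>html</p>\n", subtype="html")

outer_type = msg.get_content_type()
# For a multipart message, the payload is a list of sub-messages.
sub_messages = msg.get_payload()
part_types = [part.get_content_type() for part in sub_messages]
# Each sub-message carries its own headers and its own payload.
part_headers = [part.keys() for part in sub_messages]
```

The outer message keeps headers such as Subject, while each part gets
its own Content-Type header and its own payload.
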
> >> >> It looks like the Python email library's author, Barry
> >> >> Warsaw, followed a definition similar to Wikipedia's when
> >> >> defining his library functions. But I feel that calling
> >> >> ONLY the email "Body Part" the "payload" is wrong. The term
> >> >> "payload" should refer to the entire "Message Part" of an
> >> >> email, i.e. both headers and body.
> >> >>
> >> >> When you place an order for a "box of beer", you are not
> >> >> paying only for the "beer cans", but also paying for the
> >> >> "container box". So the payload here is the entire box.
> >> >>
> >> >> HTTP Example:
> >> >>
> >> >> HTTP/1.1 200 OK
> >> >> Date: Sun, 10 Oct 2010 23:26:07 GMT
> >> >> Content-Type: text/html
> >> >> Content-Length: 1234
> >> >>
> >> >> <html>
> >> >>
> >> >> <head>
> >> >> <title>Hello World!</title>
> >> >> </head>
> >> >>
> >> >> <body>
> >> >> (more contents)
> >> >>  .
> >> >>  .
> >> >>  .
> >> >> </body>
> >> >> </html>
> >> >>
> >> >>
> >> >> If you take a closer look at this HTTP example, the
> >> >> headers are just instructions for the client. The end
> >> >> user doesn't need to worry about any piece of information
> >> >> found in those headers. So the Wikipedia definition is
> >> >> perfectly suited to applications like HTTP.
> >> >>
> >> >> But in email, when a message is transferred from Hop A to
> >> >> Hop C via Hop B, the user at Hop A actually wants to
> >> >> deliver the whole "message part" to Hop C. If Hop B
> >> >> strips the headers and transfers only the "Body" part, then
> >> >> it becomes an "anonymous" message. So the end user
> >> >> requires the information found in the "Headers" too, e.g.
> >> >> From, Subject, Date, etc. [In HTTP, the title tag is the
> >> >> equivalent of Subject, and it's found in the "head" markup,
> >> >> not in the HTTP headers.]
> >> >>
> >> >> As you can see, the user is interested in the "entire
> >> >> message". So the term "actual intended message" should
> >> >> refer to the "whole message" extracted from the DATA
> >> >> command. The "actual intended message" should be pictured
> >> >> like this in email.
> >> >>
> >> >> Also note that when you migrate your mail to another
> >> >> mail service, you need the whole message with headers, not
> >> >> just the body.
> >> >>
> >> >> Based on these points, I believe calling only the "Body"
> >> >> part the "Payload" is wrong. I would love to hear your
> >> >> thoughts on this. If Barry Warsaw is here, I would love to
> >> >> know your opinion too.
> >> >>
> >> >> PS: I asked this question two years ago on a Stack
> >> >> Exchange site. I wasn't satisfied with the answer
> >> >> I got there. I don't want to use the term incorrectly in
> >> >> my application. That's why I'm posting it here.
> >> >>
> >> >> Thanks
> >> >> --
> >> >> Best Regards,
> >> >>
> >> >> Viruthagiri Thirumavalavan
> >> >> Dombox, Inc.
> >>
> >>
> >> --
> >> Mark Sapiro <mark@msapiro.net>
> >> San Francisco Bay Area, California
> >> The highway is for gamblers, better use your sense - B. Dylan
> >>
>
>
>
>
>

-- 
Best Regards,

Viruthagiri Thirumavalavan
Dombox, Inc.