Re: Multiple "To:" and "Cc:" header lines in SMTP messages

Pete Resnick <presnick@qualcomm.com> Tue, 01 October 1996 06:02 UTC

Received: from cnri by ietf.org id aa09403; 1 Oct 96 2:02 EDT
Received: from list.cren.net by CNRI.Reston.VA.US id aa15275; 1 Oct 96 2:02 EDT
Received: from localhost (localhost.0.0.127.in-addr.arpa [127.0.0.1]) by list.cren.net (8.7.6/8.6.12) with SMTP id BAA09033; Tue, 1 Oct 1996 01:24:27 -0400 (EDT)
Received: from glaucus.cso.uiuc.edu (glaucus.cso.uiuc.edu [128.174.81.2]) by list.cren.net (8.7.6/8.6.12) with SMTP id BAA09012 for <ietf-smtp@list.cren.net>; Tue, 1 Oct 1996 01:24:13 -0400 (EDT)
Received: from resnick1.isdn.uiuc.edu by glaucus.cso.uiuc.edu (AIX 3.2/UCB 5.64/4.03) id AA07370; Tue, 1 Oct 1996 00:25:52 -0500
Message-Id: <v03010436ae7634c72e45@resnick1.isdn.uiuc.edu>
Date: Tue, 1 Oct 1996 00:23:58 -0500
Sender: owner-ietf-smtp@list.cren.net
Precedence: bulk
From: Pete Resnick <presnick@qualcomm.com>
To: Ned Freed <Ned.Freed@innosoft.com>
Cc: ietf-smtp@list.cren.net
Subject: Re: Multiple "To:" and "Cc:" header lines in SMTP messages
In-Reply-To: <01IA3FD2A8LI9OCV1R@INNOSOFT.COM>
References: "Your message dated Mon, 30 Sep 1996 15:36:46 -0500" <v03010432ae75d89baa6b@resnick1.isdn.uiuc.edu> <14386.843812494@domen.uninett.no> <c=US%a=telemail%p=dg%l=GROUCHO-960926092252Z-413@groucho.webo.dg.com> <01IA1WOQ216Q8Y55C6@INNOSOFT.COM>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
X-Sender: resnick@glaucus.cso.uiuc.edu
X-Mailer: Eudora [Macintosh version 3.0.1b4-10.96]
X-Listprocessor-Version: 8.1 -- ListProcessor(tm) by CREN

To note off the top, the questions at hand are (a) whether this is a
substantial problem that needs to be addressed by the standards and if so
(b) whether multiple occurrences of fields are the way to address the
problem.

On 9/30/96 at 3:52 PM -0500, Ned Freed wrote:

>> Unfortunately, Ned seems to be dealing with
>> some set of customers who require it out of his product.
>
>I already explained why this is the case, and it should have been obvious that
>it has absolutely nothing to do with our product. Repeating: A significant
>number of sites are forced by law to expand lists into message headers so that
>the recipients of a given message can be determined. This is not limited to US
>government sites; it also extends to organizations with contractual
>relationships with the US government.

Just to be clear, I understood the reason that the situation came up. The
emphasis in the above sentence should have been read on "customers", not
"product". It is a certain set of customers who need this functionality and
that is the cause of the problem. I didn't mean to imply that it was only
your product's problem, Ned.

But now comes the question which I posed at the bottom which did not get
answered: Are you saying that there are sites that are forced by law to
expand lists (note the emphasis here) into the *recipient headers* of
messages? Are you saying that laws are specific enough to require that SMTP
recipient header fields are used as opposed to putting the expansion into
the body of the message? Why can't you just expand into the body and leave
"reasonable" headers on the messages? Or put them in some other header
(multiple "Expanded-List:" headers) such that mailers on the other end
don't have to deal with hundreds or thousands of destination addresses to
parse and deliver to? The current state of affairs just seems like a poor
solution.

(Note that this is not equivalent to truncating the header and putting it
into the body as "Overflow headers". What I am suggesting is not generating
the long recipient headers in the first place.)
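
For illustration only, here is a minimal Python sketch of that alternative: a
short To: field using RFC 822 group syntax, with the expansion carried in
separate fields. "Expanded-List" is just the hypothetical field name floated
above, not a standard field, and build_message and per_header are names
invented for this sketch.

```python
from email.message import Message


def build_message(list_name, recipients, body, per_header=10):
    """Build a message whose To: field stays short.

    The full expansion goes into repeated "Expanded-List" fields
    (a hypothetical, non-standard field name) instead of To:/Cc:.
    """
    msg = Message()
    # RFC 822 group syntax with an empty member list, e.g. "staff:;"
    msg["To"] = f"{list_name}:;"
    # Assigning the same field name again appends another field
    # on email.message.Message; it does not overwrite the first.
    for start in range(0, len(recipients), per_header):
        chunk = recipients[start:start + per_header]
        msg["Expanded-List"] = ", ".join(chunk)
    msg.set_payload(body)
    return msg
```

The recipient header a receiving mailer has to parse stays one short line no
matter how large the expansion is; whether that satisfies the legal
requirement is exactly the open question.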

>As such, we have customers who encounter messages with *hundreds* or even
>*thousands* of recipients listed in the header. Our product doesn't do this
>expansion, it is already a done deal by the time the message reaches us.

Are you here saying that you are gatewaying messages which come in with
these large numbers of recipients to SMTP and therefore split up the
recipient lists into multiple fields yourself to deal with the broken
sendmails? Or are you getting SMTP messages which you are then fixing for
broken sendmails?

>(0) The user agents that do this sort of expansion typically don't stop doing
>    it at sites where the requirements don't exist. ALL-IN-1, for example,
>    always expands its lists into headers no matter what.[...]
>
>(1) Some LAN email systems, most notably cc:Mail, lack the concept of a clean
>    header/envelope separation. This leaves them with no choice but to
>    effectively expand lists into message headers.

Now this is a different issue than the one cited above. Before we were
talking about places where it was considered desirable to expand lists
into headers. Here we're talking about places where there is no list name
which could replace the many recipients. Is your claim that the number
of times that such expanded messages hit old broken sendmails (sendmails
which will not be fixed) is high?

>(2) Insertion of lots of addresses into message header fields can also
>    occur as a result of improper bcc handling. Specifically, suppose someone
>    has a list with lots of recipients as their bcc address. The message
>    then passes through a system that summarily strips the bcc fields from
>    the messages (this is very common behavior). The message is now illegal
>    according to RFC822, which the next system down the line detects and
>    proceeds to copy the entire envelope into header fields. Sometimes this
>    is done with equally illegal Apparently-to fields, but other times it is
>    done using To: or Cc: headers.

The thought of changing the standard to accommodate the combination of
behavior of three broken SMTP acts (the act of removing headers, the act of
looking in the body, and the act of adding headers back on) is pretty
disheartening.

>(4) In quite a few cases this problem has been "botched away" before you
>    even see it, and thus ends up being counted as an entirely different
>    sort of problem. I routinely see messages with truncated header fields
>    where the truncated content of those fields got tacked on to the message
>    *body*, typically prefixed with a tag such as "overflow headers". This
>    happens because some other vendors don't take the multiple field
>    approach and instead prefer to make it impossible for any agent to
>    do reply-to-all properly.

I understand that experiences may differ, but though I used to see lots of
these, I haven't seen one in quite a long time.

>In other words, because I'm the only one who happens to be following this
>discussion on this particular list that has reported this particular problem,
>it must not be that widespread and hence can be ignored.

No, but neither is the converse true. Just because some particular group of
people run into a particular problem, even if it is with a large percentage
of their e-mail traffic, it does not make it a widespread pervasive
problem. I trust you are seeing lots of these. I am not yet inclined to
make that the sole criterion for "a problem which must be addressed."

>> We have had servers bounce messages for broken reasons in the past. We have
>> not in general rewritten standards to make all sorts of other behavior
>> non-conformant to solve the problems of a particular broken server.
>
>But that is what you are advocating doing, not me. Multiple To: and Cc: fields
>are discouraged but perfectly legal according to RFC 822. Their semantics are
>undefined but there is an obvious interpretation that makes sense. Yet you now
>want to change things so that use of multiple fields would be categorically
>illegal.

Not at all. We now have a situation where the standard can reasonably
be stated as saying "SHOULD NOT generate" (though it uses different words).
Further, we have a situation where the standard *specifically states* that
the semantics are undefined, IMO effectively saying "DO NOT interpret" and
implementations have been written to ignore such constructs. The leap from
"SHOULD NOT generate" and "DO NOT interpret" to "MUST NOT generate" because
they cause problems and "SHOULD interpret in such and so way" because they
do exist now does not seem so big. (I could, BTW, live with "SHOULD NOT
generate". But see below.)

>All I'm suggesting is that we attempt to restrict the use of multiple fields to
>cases where they are really needed and that we lay the groundwork for getting
>implementations to actually handle multiple fields when they have to be used.
>This is entirely compatible with existing standards; it is your suggested
>approach that isn't.

It seems to me that this is saying "MAY generate" and "MUST interpret".
That seems to be much more incompatible with current practice.

And all of this to accommodate an as yet unknown number of older broken
versions of software out there on the net for which there is a readily
available fix. I find it somewhat disconcerting that if Eudora had decided
to bounce messages with multiple To: lines rather than simply ignore them
(an overtly illegal move), you would now theoretically be in favor of my
language because it would promote more deliverability to disallow such
things.

>The approach I believe is correct is to only generate multiple fields when
>they are absolutely necessary to get mail delivered.

They are never "absolutely necessary", at least in the primary case you
have cited with regard to government agencies. There is no requirement that
such things be generated for delivery to succeed. There are other methods
which are more in line with the current state of the standards for
achieving reliable delivery and preserving all of the information. Now, if
we're talking about the gateways from LAN systems as well, we can certainly
guarantee delivery, though there is some question about preserving
information.

>I also never said that these things abound throughout Internet mail. They have
>been and will continue to be a problem in some segments of the community, but
>that's all.

Which is why I'm inclined to not prop up what is currently broken behavior
if we don't have to.

>> The better way to deal with this situation seems to me to (a) make the
>> standard say "MUST NOT generate multiple recipient lines"; (b) make
>> interpreting multiple occurrences as a single concatenated line a "SHOULD";
>> (c) if you need to generate ultra long recipient lists and have to deal
>> with a site that is broken, put something short (group syntax?) in your To:
>> line and put the address list in the body of the message.
>
>So now you want to ban multiple fields but say that agents should support them
>if they are used? This makes far less sense than attempting to restrict the
>use of multiple fields to cases where there is presently no reasonable
>alternative, which is what I'm proposing.

(1) SHOULD interpret is desirable because the things are out there now.
(2) I am not committed to "MUST NOT generate", but would settle for "SHOULD
NOT generate". But...
(3) I have proposed what I think are reasonable alternatives, in which case
"MUST NOT generate" would not be a problem and would avoid ambiguity which
caused people to unnecessarily generate them.
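
As a rough illustration of what "interpreting multiple occurrences as a single
concatenated line" could look like on the receiving side, here is a minimal
Python sketch using the standard library's email package. The function name
combined_recipients is invented for this sketch; it is one possible reading of
the "SHOULD interpret" behavior, not something RFC 822 specifies.

```python
from email import message_from_string
from email.utils import getaddresses


def combined_recipients(raw_message, field="To"):
    """Return every address found in all occurrences of `field`.

    Multiple occurrences are treated as if they had been written
    as one concatenated field.
    """
    msg = message_from_string(raw_message)
    # get_all returns one string per occurrence of the field,
    # or the supplied default when the field is absent.
    values = msg.get_all(field, [])
    # getaddresses parses the occurrences as one combined address list.
    return [addr for _name, addr in getaddresses(values)]
```

A reply-to-all built on such a function would see the same recipients whether
the sender generated one long To: field or several shorter ones.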

>It is also totally incompatible with what RFC 822 now says is acceptable
>usage. If you object to changing things to say that agents must support
>presently legal but optional usage, I cannot see how you can justify changing
>things to say that agents must not use constructs that have been legal for
>years.

I think "acceptable" overstates the case by a long shot. But again, I could
live with "SHOULD NOT generate" if we need to.

>It also gives us the worst of all worlds, in that it breaks existing reply
>mechanisms that work. It also effectively mandates that agents indulge in what
>amounts to a gross layering violation.

It *only* breaks the reply mechanism for certain things on the bad side of
a broken sendmail (which may not include those government systems). And my
feeling is that you are already involved in the layer violation by
effectively gatewaying to the broken sendmails in the first place.

pr

--
Pete Resnick <mailto:presnick@qualcomm.com>
QUALCOMM Incorporated
Work: (217)337-6377 / Fax: (217)337-1980