Re: Multiple "To:" and "Cc:" header lines in SMTP messages

Ned Freed <Ned.Freed@innosoft.com> Mon, 30 September 1996 23:08 UTC

Received: from cnri by ietf.org id aa25305; 30 Sep 96 19:08 EDT
Received: from list.cren.net by CNRI.Reston.VA.US id aa18151; 30 Sep 96 19:08 EDT
Received: from localhost (localhost.0.0.127.in-addr.arpa [127.0.0.1]) by list.cren.net (8.7.6/8.6.12) with SMTP id SAA29045; Mon, 30 Sep 1996 18:37:33 -0400 (EDT)
Received: from THOR.INNOSOFT.COM (THOR.INNOSOFT.COM [192.160.253.66]) by list.cren.net (8.7.6/8.6.12) with ESMTP id SAA28986 for <ietf-smtp@list.cren.net>; Mon, 30 Sep 1996 18:37:02 -0400 (EDT)
Received: from INNOSOFT.COM by INNOSOFT.COM (PMDF V5.0-7 #8694) id <01IA366UGQBK9OCV1R@INNOSOFT.COM>; Mon, 30 Sep 1996 15:35:52 -0700 (PDT)
Message-Id: <01IA3FD2A8LI9OCV1R@INNOSOFT.COM>
Date: Mon, 30 Sep 1996 13:52:05 -0700
Sender: owner-ietf-smtp@list.cren.net
Precedence: bulk
From: Ned Freed <Ned.Freed@innosoft.com>
To: Pete Resnick <presnick@qualcomm.com>
Cc: Ned Freed <Ned.Freed@innosoft.com>, ietf-smtp@list.cren.net
Subject: Re: Multiple "To:" and "Cc:" header lines in SMTP messages
In-Reply-To: "Your message dated Mon, 30 Sep 1996 15:36:46 -0500" <v03010432ae75d89baa6b@resnick1.isdn.uiuc.edu>
References: <14386.843812494@domen.uninett.no> <c=US%a=telemail%p=dg%l=GROUCHO-960926092252Z-413@groucho.webo.dg.com> <01IA1WOQ216Q8Y55C6@INNOSOFT.COM>
MIME-version: 1.0
Content-type: text/plain; charset="us-ascii"
Content-transfer-encoding: 7bit
X-Listprocessor-Version: 8.1 -- ListProcessor(tm) by CREN

> We have a situation now where some older versions of sendmail (numbers as
> yet unspecified) bounce messages when they receive messages with large
> numbers of addresses in single headers.  This situation shouldn't be
> occurring with high frequency anyway since people tend to use either group
> syntax or mailing list addresses. Unfortunately, Ned seems to be dealing with
> some set of customers who require it out of his product.

I already explained why this is the case, and it should have been obvious that
it has absolutely nothing to do with our product. Repeating: A significant
number of sites are forced by law to expand lists into message headers so that
the recipients of a given message can be determined. This is not limited to US
government sites; it also extends to organizations with contractual
relationships with the US government.

As such, we have customers who encounter messages with *hundreds* or even
*thousands* of recipients listed in the header. Our product doesn't do this
expansion; it is already a done deal by the time the message reaches us.

This is real and it is a fairly commonplace occurrence for us. You may not have
seen this, but the reality of the modern Internet is that it is quite possible
for a single vendor, no matter how large their market share in a given area, to
never encounter behavior that another vendor in another area sees on a daily
basis. (Although I happen to know that this particular issue has in fact been
pointed out to Qualcomm because I was cc'ed on at least one problem report sent
in by a customer.)

There are also several other factors entering into this which I have
not yet bothered to describe. They are:

(0) The user agents that do this sort of expansion typically don't stop doing
    it at sites where the requirements don't exist. ALL-IN-1, for example,
    always expands its lists into headers no matter what. And according to
    reports I've seen there are on the order of seven million ALL-IN-1 users
    still out there. We even provide facilities to expand ALL-IN-1 lists
    "properly" in our product but sites don't seem especially interested in
    using these facilities, and of course not all ALL-IN-1 sites use our
    products.

(1) Some LAN email systems, most notably cc:Mail, lack the concept of a clean
    header/envelope separation. This leaves them with no choice but to
    effectively expand lists into message headers.

(2) Insertion of lots of addresses into message header fields can also
    occur as a result of improper bcc handling. Specifically, suppose someone
    has a list with lots of recipients as their bcc address. The message
    then passes through a system that summarily strips the bcc fields from the
    messages (this is very common behavior). The message is now illegal
    according to RFC 822, which the next system down the line detects and
    proceeds to copy the entire envelope into header fields. Sometimes this
    is done with equally illegal Apparently-to fields, but other times it is
    done using To: or Cc: headers.

(3) In quite a few cases this problem has been "botched away" before you
    even see it, and thus ends up being counted as an entirely different
    sort of problem. I routinely see messages with truncated header fields
    where the truncated content of those fields got tacked on to the message
    *body*, typically prefixed with a tag such as "overflow headers". This
    happens because some other vendors don't take the multiple field
    approach and instead prefer to make it impossible for any agent to
    do reply-to-all properly.

> Contrast that with a demonstrably large number of client mailers who (a)
> generate messages with large headers (and hence would be causing these
> bounces if they were prevalent) and (b) are unable to "properly" parse
> multiple occurrences of headers. Given that except for Ned, we have only
> heard complaints about (b) and not heard complaints of any magnitude about
> (a), I don't see the impetus for now backing off of the old standard and
> increasing the burden on mail clients. (I can go through why it's a pain in
> the rear for us [and perhaps others] to parse and deal with multiple, and
> perhaps discontiguous, recipient headers, but I'll leave that for another
> time.)

In other words, because I'm the only one who happens to be following this
discussion on this particular list that has reported this particular problem,
it must not be that widespread and hence can be ignored.

I'm sorry, but I do not accept this line of reasoning as valid. Just because
the Internet offers a high degree of connectivity doesn't mean that everyone on
it talks to everyone else all the time. In fact just the opposite is true --
the Internet carries a large number of different sorts of traffic on the
network that intersect infrequently if at all, and the degree of intersection
is dropping dramatically as the network grows. This is why I always take
reports other people make regarding problems seriously even when I've never
heard of or encountered the problem personally.

> We have had servers bounce messages for broken reasons in the past. We have
> not in general rewritten standards to make all sorts of other behavior
> non-conformant to solve the problems of a particular broken server.

But that is what you are advocating doing, not me. Multiple To: and Cc: fields
are discouraged but perfectly legal according to RFC 822. Their semantics are
undefined but there is an obvious interpretation that makes sense. Yet you now
want to change things so that use of multiple fields would be categorically
illegal.
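
As a rough illustration of that "obvious interpretation", the sketch below
treats every occurrence of a recipient field as contributing to one combined
address list, in order of appearance. It uses Python's standard email package;
the sample message, the example.com addresses, and the helper name
all_recipients are assumptions made up for this illustration, not anything
taken from RFC 822 or from a particular mailer.

```python
from email import message_from_string
from email.utils import getaddresses

# Hypothetical message with two discontiguous To: fields.
raw = (
    "From: sender@example.com\n"
    "To: a@example.com, b@example.com\n"
    "Subject: test\n"
    "To: c@example.com\n"
    "\n"
    "body\n"
)

def all_recipients(msg, field="To"):
    """Return the addresses from every occurrence of `field`, in order."""
    # get_all() returns one string per occurrence of the field.
    values = msg.get_all(field, [])
    # getaddresses() parses each string and flattens the result into
    # (display-name, address) pairs.
    return [addr for _name, addr in getaddresses(values) if addr]

msg = message_from_string(raw)
recipients = all_recipients(msg)
print(recipients)
```

A reply-to-all built on such a combined list would behave the same whether the
sender used one field or several.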

All I'm suggesting is that we attempt to restrict the use of multiple fields to
cases where they are really needed and that we lay the groundwork for getting
implementations to actually handle multiple fields when they have to be used.
This is entirely compatible with existing standards; it is your suggested
approach that isn't.

> And then on the issue of what the standard should say:

> On 9/29/96 at 3:24 PM -0500, Ned Freed wrote:

> > I disagree; the proper recommendation is "SHOULD NOT generate a single field
> > for each separate address". The question of whether or not to generate
> > multiple fields for lots of addresses should be left open, as it is too
> > environment-specific to recommend anything at this time.

> I don't see any logic in this position at all, Ned. You are saying that we
> should require people to handle multiple To: lines on incoming messages. Why
> not then make the standard for a single address on every To: line?

> According to you, everyone's going to have to parse the blasted things
> anyway. I see no justification in making this a "SHOULD NOT generate".

Because this also causes problems with the installed base. The approach I
believe is correct is to only generate multiple fields when they are absolutely
necessary to get mail delivered.
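
One way to picture "only when absolutely necessary" is a generator that emits
a single field unless the address list would exceed some size limit, and
splits only then. The sketch below is an assumption-laden illustration: the
function name pack_recipient_fields and the max_len default of 900 characters
are hypothetical, and the actual limit at which a downstream system bounces
mail is not specified anywhere in this discussion.

```python
def pack_recipient_fields(addresses, max_len=900):
    """Group addresses into as few field bodies as possible.

    Each returned string is a comma-separated field body of at most
    max_len characters, except that a single address longer than
    max_len still gets a field of its own. max_len is a hypothetical
    limit chosen for illustration only.
    """
    fields = []
    current = ""
    for addr in addresses:
        candidate = addr if not current else current + ", " + addr
        if current and len(candidate) > max_len:
            # Adding this address would overflow: close the current
            # field and start a new one.
            fields.append(current)
            current = addr
        else:
            current = candidate
    if current:
        fields.append(current)
    return fields

# A short list stays in one field; only an oversized list is split.
short = pack_recipient_fields(["a@x", "b@x"], max_len=100)
split = pack_recipient_fields(["a@x", "b@x", "c@x"], max_len=8)
print(short)
print(split)
```

With ordinary recipient lists this produces exactly one field, so the
installed base never sees multiple fields unless the list is very large.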

I also never said that these things abound throughout Internet mail. They have
been and will continue to be a problem in some segments of the community, but
that's all.

> The better way to deal with this situation seems to me to (a) make the
> standard say "MUST NOT generate multiple recipient lines"; (b) make
> interpreting multiple occurrences as a single concatenated line a "SHOULD";
> (c) if you need to generate ultra long recipient lists and have to deal
> with a site that is broken, put something short (group syntax?) in your To:
> line and put the address list in the body of the message.

So now you want to ban multiple fields but say that agents should support them
if they are used? This makes far less sense than attempting to restrict the
use of multiple fields to cases where there is presently no reasonable
alternative, which is what I'm proposing.

It is also totally incompatible with what RFC 822 now says is acceptable usage.
If you object to changing things to say that agents must support presently
legal but optional usage, I cannot see how you can justify changing things
to say that agents must not use constructs that have been legal for
years.

It also gives us the worst of all worlds, in that it breaks existing reply
mechanisms that work. It also effectively mandates that agents indulge in what
amounts to a gross layering violation.

As such, I oppose any such recommendation in the strongest possible terms.

				Ned