Re: header-munging

Keith Moore <moore@cs.utk.edu> Tue, 24 September 1996 19:54 UTC

Received: from cnri by ietf.org id aa24919; 24 Sep 96 15:54 EDT
Received: from list.cren.net by CNRI.Reston.VA.US id aa11000; 24 Sep 96 15:54 EDT
Received: from localhost (localhost.0.0.127.in-addr.arpa [127.0.0.1]) by list.cren.net (8.7.6/8.6.12) with SMTP id PAA28529; Tue, 24 Sep 1996 15:13:16 -0400 (EDT)
Received: from ig.cs.utk.edu (IG.CS.UTK.EDU [128.169.94.149]) by list.cren.net (8.7.6/8.6.12) with SMTP id PAA28516 for <ietf-smtp@list.cren.net>; Tue, 24 Sep 1996 15:12:59 -0400 (EDT)
Received: from localhost by ig.cs.utk.edu with SMTP (8.6.10/2.8c-UTK) id PAA01115; Tue, 24 Sep 1996 15:12:33 -0400
Message-Id: <199609241912.PAA01115@ig.cs.utk.edu>
Date: Tue, 24 Sep 1996 15:12:33 -0400
Sender: owner-ietf-smtp@list.cren.net
Precedence: bulk
From: Keith Moore <moore@cs.utk.edu>
To: "D. J. Bernstein" <djb@koobera.math.uic.edu>
Cc: ietf-smtp@list.cren.net, moore@cs.utk.edu
Subject: Re: header-munging
In-Reply-To: Your message of "24 Sep 1996 00:36:27 -0000." <19960924003627.11749.qmail@koobera.math.uic.edu>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
X-Sender: moore@cs.utk.edu
X-Mailer: exmh version 1.6.7 5/3/96
X-URI: http://www.cs.utk.edu/~moore/
X-Listprocessor-Version: 8.1 -- ListProcessor(tm) by CREN

> > But as Keith has already pointed out, the RFC822 'Date' field is 
> > the creation date for the message, not the posting date,

[ everyone agrees that Date is the time when the message was sent ]
> as the moment when the user hit the send button.

Actually, that's how I define Date as well...I just think in terms of
the message being "created" when the send button is pushed, so to me
they're the same thing.

But the time when the send button is pushed is NOT necessarily the
same as the time that the message is posted to an SMTP server, and the
difference can easily be a matter of hours or even days.
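
To put a number on that gap, here is a rough Python sketch that
compares a Date field with the timestamp a server might stamp in its
Received line.  Both header values are made up for illustration, and
submission_delay is a name invented here, not part of any mail
software.

```python
from email.utils import parsedate_to_datetime

def submission_delay(date_field, received_stamp):
    """Return how long after the Date value the server stamped the message."""
    composed = parsedate_to_datetime(date_field)
    posted = parsedate_to_datetime(received_stamp)
    return posted - composed

# Hypothetical values: a message written offline on Friday evening
# and posted when the machine was reconnected on Monday morning.
delay = submission_delay(
    "Fri, 20 Sep 1996 18:05:00 -0400",
    "Mon, 23 Sep 1996 09:30:00 -0400",
)
print(delay)  # 2 days, 15:25:00
```

A server that overwrote Date with its own clock would shift this
message by more than two days.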

> > It also leaves us with a 
> > non-null class of problems a submission server isn't likely to be able
> > to solve with sufficient degree of generality to be called a real solution
> 
> Uh, did you have any examples other than Sender and Date?

Here are the ones I can think of:

1. Sender's return addresses (From, Reply-To, Sender, and SMTP
MAIL FROM).

The SMTP server is NOT generally able to determine the domain of the
sender.  This is true regardless of whether the sender's MUA says
'From: user', 'From: user@host', or 'From: user@bogus.domain'.
(In ALL of these cases, the problem is mis-configuration of the MUA.)

a. To fix up 'From: user', the SMTP server has to assume that it knows
the sender's domain name.  In general, an SMTP server cannot assume
this.  In general, it's not possible for an ISP to provide an SMTP
server to its customers that can assume the sender's domain name.

b. To fix up 'From: user@host', the SMTP server has to assume that it
knows the sender's domain suffix (to append to user@host) and/or the
sender's DNS domain (to look up 'host.domain' and see if there's a
CNAME record that matches it).  In general, an SMTP server cannot
assume this.  Many hostnames are quite popular.  Even the existence of
an A or MX record for host.server.domain does not reliably indicate
that the sender's user@host really means user@host.server.domain.

c. To fix up 'From: user@bogus.domain', the SMTP server has to not
only know when the bogus.domain is bogus (again, the presence of an A
or MX record for that domain does not imply that the domain is correct
for that sender), it has to know the correct domain for the sender.

In other words, the SMTP server doing the fixup has to know when it
has better information than the sender himself.  There's no way that
it can know this.  Even if the SMTP server only changes things when
the sender-supplied address is known to be incorrect, it can easily
make things worse and this happens often in practice.  

A valid but incorrect From address is worse than an invalid From
address.  The latter will simply cause replies to fail.  The former
can cause replies (perhaps of sensitive material) to go to the wrong
person.
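
The ambiguity in cases a and b can be sketched in a few lines of
Python.  This is only an illustration: candidate_addresses is a
made-up helper, the suffixes are placeholder example domains, and
treating any dotted host as fully qualified is a simplification.

```python
def candidate_addresses(addr, known_suffixes):
    """List every fully qualified address that 'addr' might have meant."""
    local, sep, host = addr.partition("@")
    if not sep:
        # 'user': any domain the server handles is a possible answer.
        return [f"{local}@{suffix}" for suffix in known_suffixes]
    if "." in host:
        # Contains a dot, so this sketch treats it as already qualified.
        return [addr]
    # 'user@host': each suffix yields a different, equally plausible host.
    return [f"{local}@{host}.{suffix}" for suffix in known_suffixes]

# Hypothetical suffixes served by one submission server.
suffixes = ["cs.example.edu", "math.example.edu", "customer.example.net"]
print(candidate_addresses("user@mail", suffixes))
```

Nothing in the message tells the server which of the three results
the sender meant, and even a list with one entry would not show that
the entry is the sender's real address.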


2. Recipient addresses (To, Cc, Bcc) with incomplete domain
information.

Similar to #1.  The server can't reliably do fixup because it doesn't
really know what domain was intended, or whether the sender simply
tried and failed to match one of his own recipient aliases (i.e. one
from his "address book").  Incorrect guesses on the part of the SMTP
server can cause sensitive mail to go to the wrong recipient.


3. Date field.

The date that the user sent the message is not necessarily close to the
date of posting to the SMTP server, and the SMTP server doesn't know
the correct time zone of the sender.  This isn't usually as serious a
problem as the substitution of incorrect domain names, but it's still
not correct.


4. Message-ID field.

Like Ned, I don't see a big drawback with this kind of fixup.


5. Content-MD5 field.

An SMTP server has absolutely no business adding a Content-MD5 field,
because it cannot be certain that the message body has not been
altered since it was composed.
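
A small Python sketch of why this matters.  The digest is computed
the usual way (base64 of the MD5 of the body), but the two bodies are
invented, and canonicalization and transfer encoding are ignored to
keep the example short.

```python
import base64
import hashlib

def content_md5(body: bytes) -> str:
    """Return the base64-encoded MD5 digest of a message body."""
    return base64.b64encode(hashlib.md5(body).digest()).decode("ascii")

# Hypothetical bodies: what the sender wrote, and what reached the
# server after one character was damaged on the way.
composed = b"Meet at 10:00 in room 3.\r\n"
seen_by_server = b"Meet at 10:00 in room 8.\r\n"

digest_from_sender = content_md5(composed)
digest_from_server = content_md5(seen_by_server)
print(digest_from_sender == digest_from_server)  # False
```

A digest added by the server matches the damaged body, so the
recipient's check would pass even though the text is not what the
sender wrote.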


6.  Fixup of bogus/broken message body format.

This isn't (usually) a matter of incorrect MUA configuration (modulo
users who generate 8bitMIME even though they have a 7bit path, and
Eudora clients configured to generate BinHex).  Instead, it's usually
due to an obsolete or broken MUA.  There are lots of these out there.
If I had to support such legacy systems, I'd use whatever tools were
at my disposal -- including possibly an SMTP server that did fixup.

There are still risks of doing such fixup, because sometimes the
"brain damage" (say, gratuitous typing of all attachments as
application/octet-stream) is due to MUA bogosity, and sometimes it is
really what the sender intended -- and the SMTP server has no idea
which it is.

				  --

I realize that people use SMTP fixups all the time, and often
successfully.  The reason I keep harping on this is that I see a
significant number of failures which appear to be caused by SMTP
servers that rewrite addresses.  The failures usually get detected
only by an offsite recipient, who is not able to do much about the
problem.

There's a general principle here: don't add information to a document
unless you're an authoritative source of that information.  The
recipient of a message has every right to expect that the sender and
recipient addresses, the Date field, etc., are those supplied by the
sender.  

If it's not easier for the UA to get those things right in the first
place than to have an SMTP server fix them up later, we need to
work on getting them right in the first place.  Otherwise we end up
with a situation where the recipient can't trust the message header
fields, and we start needing another message header where the sender
can put things that won't be altered by the mail transport...


Here's an attempt to summarize the situation:

1. Poor MUA configuration is an Internet-wide problem.  It exists 
   for individual net users, small workgroups, and large
   organizations.  A solution which works for one of these scenarios
   may not work well for another.

2. SMTP fixups, while useful in some cases, are not a general solution
   to this problem, may not even work for the majority of cases, and
   have significant risks with undesirable consequences.  Widespread
   use of this practice is known to cause a significant number of 
   operational problems, including delivery failure for messages, 
   delivery failure for nondelivery notifications,  inability to 
   reply to messages, delivery of messages to the wrong recipient, and
   delivery of replies to the wrong recipient.

   SMTP fixups are sometimes useful as a workaround for incorrect
   or obsolete MUA implementations.  

   It's probably worthwhile to attempt to document when this technique
   works well and when it does not.

3. An SMTP submission server that bounces messages that are known to
   be incorrect can help detect MUA configuration problems at the
   source (where they can be fixed!), works in a wide variety of
   usage scenarios, and has few drawbacks.  This practice
   should be documented and encouraged.

4. It would appear to be useful to have better ways for a site to
   supply configuration information to MUAs.  This doesn't solve the
   configuration problem for everybody, but it would solve it for
   those sites where most everyone's configurations are similar.  
   If done right (and careful design is crucial), it should work better
   than SMTP fixups because the MUA is in a better position than the
   SMTP server to know whether the information is correct (i.e. the
   sender can presumably view the site-supplied configuration
   information, report inaccuracies to people who can fix them, and
   still override the site-supplied defaults if needed for her
   particular situation).  This approach needs further study.
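
As a rough illustration of point 3, here is a Python sketch of a
check that reports problems instead of rewriting anything.  The
function name, the choice of fields, and the "dotted domain" rule are
assumptions made for this example; a real submission server would
need a more careful definition of "known to be incorrect".

```python
from email import message_from_string
from email.utils import parseaddr

def submission_problems(raw_message):
    """Return a list of reasons to reject; an empty list means accept."""
    msg = message_from_string(raw_message)
    problems = []
    if msg.get("From") is None:
        problems.append("From: field is missing")
    for field in ("From", "Sender"):
        value = msg.get(field)
        if value is None:
            continue
        # Simplification: only the first address in the field is checked.
        _, addr = parseaddr(value)
        local, sep, domain = addr.partition("@")
        if not (local and sep and "." in domain):
            problems.append(
                f"{field}: {value!r} is not a fully qualified address"
            )
    return problems

# Hypothetical message from a misconfigured MUA.
bad = "From: user@host\nTo: someone@example.org\nSubject: test\n\nbody\n"
print(submission_problems(bad))
```

The server would hand the list back to the sender in its rejection,
so the misconfiguration gets fixed at the MUA rather than being
papered over with a guess.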

-Keith