Re: header-munging

"D. J. Bernstein" <djb@koobera.math.uic.edu> Sun, 11 August 1996 12:40 UTC

Received: from ietf.org by ietf.org id aa29847; 11 Aug 96 8:40 EDT
Received: from cnri by ietf.org id aa29843; 11 Aug 96 8:40 EDT
Received: from list.cren.net by CNRI.Reston.VA.US id aa05297; 11 Aug 96 8:40 EDT
Received: from localhost (localhost [127.0.0.1]) by list.cren.net (8.6.12/8.6.12) with SMTP id HAA06981; Sun, 11 Aug 1996 07:25:57 -0400
Received: from koobera.math.uic.edu (qmailr@KOOBERA.MATH.UIC.EDU [128.248.178.247]) by list.cren.net (8.6.12/8.6.12) with SMTP id HAA06968 for <ietf-smtp@list.cren.net>; Sun, 11 Aug 1996 07:25:53 -0400
Received: (qmail-queue invoked by uid 666); 11 Aug 1996 11:30:01 -0000
Message-Id: <19960811113001.3802.qmail@koobera.math.uic.edu>
Date: Sun, 11 Aug 1996 11:30:01 -0000
X-Orig-Sender: owner-ietf-smtp@list.cren.net
Precedence: bulk
Sender: ietf-archive-request@ietf.org
From: "D. J. Bernstein" <djb@koobera.math.uic.edu>
To: ietf-smtp@list.cren.net
Subject: Re: header-munging
X-Listprocessor-Version: 8.0 -- ListProcessor(tm) by CREN

> Why can't the MTA do the fixup anyway?

Let's go back to the picture of how mail works:

               +----------+             +----------+
  -------      |          |    FedEx    |RCVD:FEDEX|      -------
  message ---> | envelope | ----------> | envelope | ---> message
  -------      |          |             |          |      -------
               +----------+             +----------+

Why should FedEx care what's in the message? The delivery information is
all on the envelope. Looking through messages would take some manpower,
and on occasion they'd carelessly tear or smudge some important letter.

E-mail works the same way. Why should an MTA look at the contents of the
message? The delivery information is all on the envelope. (Exception: a
Received line doesn't fit onto the outside of today's standard envelope,
so you're forced to stick it onto the top of the message instead.)
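
To make the envelope/message split concrete, here is a minimal Python sketch. The names Envelope, destinations, and relay are made up for illustration, and the Received line is cut down to a bare hostname (a real one carries more, such as a timestamp). The only point is that routing reads the envelope, while the message is opaque bytes with one line stuck on top.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Envelope:
    """Delivery information only (hypothetical type for this sketch)."""
    sender: str
    recipients: Tuple[str, ...]


def destinations(envelope: Envelope) -> List[str]:
    """Pick the hosts to deliver to by looking at the envelope alone."""
    return sorted({rcpt.rsplit("@", 1)[1].lower() for rcpt in envelope.recipients})


def relay(message: bytes, hop: str) -> bytes:
    """Prepend a simplified Received line; never parse or alter the message."""
    received = ("Received: by %s\r\n" % hop).encode("ascii")
    return received + message
```

A message with an odd header, say one beginning with a question mark, comes out of relay() byte-for-byte intact below the new Received line, because nothing ever looked at it.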

Admittedly, there _was_ once a mail service that opened every letter.
They claimed---correctly, I'm sure---that people were sending illegal
messages through the mail. So they read everything. They didn't allow
any ``bad'' messages to be sent. Of course, they were far too eager to
declare that something was ``bad,'' so they often screwed up somebody's
mail for no good reason. This was the USPS, circa 1944.

> sendmail has been doing this
> for years, and I'm not aware of any problems it's caused.

sendmail's rewriting has been a disaster. ``Apparently-To'' reveals
blind recipients to each other. ``Cc: recipient list not shown: ;'' is
corrupted by all but the latest versions of sendmail. Any header field
beginning with a question mark is destroyed. And so on.

An even more serious problem is speed. See, sendmail does between five
and fifty gazillion DNS lookups for every incoming message, for
_absolutely no reason_ other than to rewrite the message header. Sending
fifty gazillion requests to the DNS server takes a very long time, even
if all the answers are in cache.

Rewriting causes these problems. To improve the situation, we have to
reduce the amount of rewriting. How could we do this with ESMTP options?

One possibility is to give the client a way to say ``Hey, don't rewrite
this!'' If sendmail sees this, it can skip rewriting. Good.

But that's not what SUBMIT does. SUBMIT gives the client a way to say
``Hey, _do_ rewrite this!'' How is sendmail supposed to take advantage
of that? Will it skip rewriting if the client _doesn't_ say this? No%---
that would cause trouble for older clients that don't understand the
option but that do need rewriting. So how is this supposed to help?
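
The difference between the two designs can be written out as a pair of decision functions. This is only a sketch: each option is modelled as a boolean, the function and parameter names are invented here, and nothing below reflects the actual syntax of SUBMIT or of any real ESMTP extension.

```python
def opt_out_server_rewrites(sent_dont_rewrite: bool) -> bool:
    """Opt-out design: rewrite unless the client explicitly declines."""
    return not sent_dont_rewrite


def submit_server_rewrites(sent_submit: bool, serves_old_clients: bool = True) -> bool:
    """Opt-in design: the client asks for rewriting by sending SUBMIT."""
    if sent_submit:
        return True
    # No SUBMIT seen. An old client that needs rewriting looks exactly
    # like this, so a server that must keep serving such clients
    # has to rewrite anyway.
    return serves_old_clients


if __name__ == "__main__":
    for flag in (False, True):
        print(flag, opt_out_server_rewrites(flag), submit_server_rewrites(flag))
```

With the opt-out design there is one input for which the server gets to skip the work. With SUBMIT, a server that still has to serve old clients rewrites for both inputs, so it saves nothing. Passing serves_old_clients=False models the footnoted case of a mailer that refuses to rewrite at all.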

---Dan

% For some mailers, such as qmail, the answer is yes. It would be a
violation of qmail's fundamental responsibility as an MTA, not to
mention a waste of time and memory, for qmail to rewrite messages that
don't conform to RFC 822. Of course, many clients do violate RFC 822;
this is the interoperability problem I described in a previous message.
Right now, people using qmail on mail hubs normally divert messages from
those clients (identified by IP address) to a rewriting program.
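
As a rough illustration of that diversion, here is a Python sketch that decides by client IP address whether a message should go through a rewriting program first. It is not qmail's actual mechanism or configuration; the network list and the function name are assumptions made up for this example, and the addresses come from the documentation range.

```python
import ipaddress

# Hypothetical list of networks whose clients are known to send
# messages that need fixing up before they enter the mail system.
REWRITE_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
]


def needs_rewriting_program(client_ip: str) -> bool:
    """Return True if mail from this client should be diverted to a rewriter."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in REWRITE_NETWORKS)
```

Everything from other addresses skips the rewriter entirely, so the cost of rewriting is paid only for the clients that need it.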