Re: duplicate messages on IETF mailing lists

Vernon Schryver <vjs@calcite.rhyolite.com> Thu, 26 July 2001 18:20 UTC

Received: by ietf.org (8.9.1a/8.9.1a) id OAA00542 for ietf-outbound.10@ietf.org; Thu, 26 Jul 2001 14:20:03 -0400 (EDT)
Received: from calcite.rhyolite.com (calcite.rhyolite.com [192.188.61.3]) by ietf.org (8.9.1a/8.9.1a) with SMTP id NAA24894 for <ietf@ietf.org>; Thu, 26 Jul 2001 13:41:02 -0400 (EDT)
Received: (from vjs@localhost) by calcite.rhyolite.com (8.12.0.Beta16/8.12.0.Beta16) id f6QDspSq005353 for ietf@ietf.org env-from <vjs>; Thu, 26 Jul 2001 07:54:51 -0600 (MDT)
Date: Thu, 26 Jul 2001 07:54:51 -0600
From: Vernon Schryver <vjs@calcite.rhyolite.com>
Message-Id: <200107261354.f6QDspSq005353@calcite.rhyolite.com>
To: ietf@ietf.org
Subject: Re: duplicate messages on IETF mailing lists
X-Loop: ietf@ietf.org

> From: stanislav shalunov <shalunov@internet2.edu>

> ...
> It appears that astro.cs.utk.edu has transmitted the message at least
> five times, with probably sendmail running with `-q30m' option, the
> message was transmitted successfully, but astro.cs.utk.edu didn't
> learn about the success at least the first four times.
>
> The only reasonable explanation for this behaviour would be that
> odin.ietf.org is in violation of RFC1047 (or that astro is broken,
> which is ruled out by the fact that there are numerous messages from
> various places sent to various ietf.org lists that arrived in
> duplicates and triplicates). ...

Please consider the rest of RFC 1047, and note that while other messages
have been duplicated, only that one was multiplied a dozen times.  That
suggests that the problem with that message lies more with astro.cs.utk.edu,
perhaps because astro.cs.utk.edu is running a short timeout and ietf.org
was so busy pumping viruses and warnings that ietf.org was repeatedly
exceeding that short timeout.  Of course it could have been some other
problem at astro.cs.utk.edu, or perhaps something in the message
triggered a long delay at ietf.org.
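
To make the suspected failure mode concrete, here is a minimal sketch
in Python.  It is only a model, not anything sendmail does internally:
the function name and the delay values are hypothetical, and it assumes
the receiver finishes delivering a copy even when the sender has already
given up waiting for the reply to the final dot.

```python
def copies_delivered(ack_delays, sender_timeout):
    """Count the copies the receiver ends up delivering.

    ack_delays: seconds the receiver takes to answer the final dot on
    each successive attempt (hypothetical values).
    sender_timeout: seconds the sender waits for that answer.
    """
    delivered = 0
    for delay in ack_delays:
        # Assumption: the receiver completes delivery on every attempt,
        # whether or not the sender is still connected.
        delivered += 1
        if delay <= sender_timeout:
            # The sender saw the reply in time and stops retrying.
            break
    return delivered
```

With a 300 second sender timeout and a receiver that needs 400 seconds
on the first three attempts, copies_delivered([400, 400, 400, 10], 300)
counts four copies of one message.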

Since at least about 8.11.1, sendmail has had a default timeout of an hour
or two for the DATA command.  The fact that the messages from
astro.cs.utk.edu were coming more frequently than once an hour is evidence
that it has a shorter timeout than the sendmail default, which might be
related to those duplicates.  As I recall, sendmail 5 in the mid-1980's had
what comments in sendmail.cf called a non-standard long timeout.  I don't
have convenient access to an 8.9.3 sendmail such as appears to be running on
astro.cs.utk.edu, but I bet the default timeout for 8.9.3 was longer than
the delays between some of its duplicate messages.
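
That comparison can be sketched in a few lines of Python.  The helper
names are hypothetical, and the arrival times have to be taken from the
Received: headers of the duplicates; none are supplied here.  A gap
between consecutive copies that is shorter than the assumed default
timeout suggests the sender gave up sooner than the default would allow.

```python
from datetime import datetime, timedelta

def gaps(arrivals):
    """Return the intervals between consecutive arrival times."""
    ordered = sorted(arrivals)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]

def gaps_shorter_than(arrivals, timeout):
    """Return the gaps that are shorter than the assumed timeout."""
    return [gap for gap in gaps(arrivals) if gap < timeout]
```

For example, copies arriving at 10:00, 10:30, 11:00 and 12:30 checked
against timedelta(hours=1) yield two gaps of 30 minutes each.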


A secondary use of a bulk mail detecting checksumming system like that
described at http://www.rhyolite.com/dcc/ is watching for such problems.
This is the DCC header from what is so far the 11th and last duplicate
copy of the message (not counting my directly addressed copy and one
of the copies via ietf.org):

} X-DCC-RHYOLITE-Metrics: calcite.rhyolite.com 101; IP=249 env_From=ok From=23
}         Subject=many Message-ID=13 Received=13 Body=13 Fuz1=13

The count for Received: applies to the textually last Received: header,
which is usually the earliest one in time.
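
For anyone who wants to watch these counts mechanically, a minimal
sketch of pulling them out of the header value follows.  It is based
only on the layout of the header quoted above, not on any DCC
specification; the function name is hypothetical, and the text before
the semicolon is kept as an opaque string.

```python
import re

def parse_dcc_metrics(value):
    """Split an X-DCC-*-Metrics value into (prefix, counts).

    Numeric counts become ints; words such as "ok" or "many" are kept
    as strings.
    """
    # Undo header folding: collapse newlines and runs of whitespace.
    unfolded = " ".join(value.split())
    prefix, _, rest = unfolded.partition(";")
    counts = {}
    for name, count in re.findall(r"(\S+)=(\S+)", rest):
        counts[name] = int(count) if count.isdigit() else count
    return prefix.strip(), counts
```

Applied to the header above, the counts for Message-ID, Received, Body
and Fuz1 all come out as 13, while Subject stays the string "many".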


This morning (7/27) ietf.org or optimus.ietf.org claims via the HELP
command to be running Sendmail 8.9.1a.  There are reasons that many
people consider sufficient to switch from ancient 8.9 to current 8.11.4
or even 8.12.0.


Vernon Schryver    vjs@rhyolite.com