Re: [apps-discuss] FW: Comments on Malformed Message BCP draft

Barry Leiba <barryleiba@computer.org> Tue, 19 April 2011 18:40 UTC

From: Barry Leiba <barryleiba@computer.org>
To: "Murray S. Kucherawy" <msk@cloudmark.com>
Content-Type: text/plain; charset="ISO-8859-1"
Cc: ietf-822@imc.org, apps-discuss@ietf.org
Subject: Re: [apps-discuss] FW: Comments on Malformed Message BCP draft

On reading all the comments about this, and thinking about it myself,
I'm of a very mixed mind.

First: I have no sympathy for the comments that we should fix this
stuff in 5322, and not in some "add-on".  This is *not* "fixing"
anything.  This is *not* saying that any of the "malformed" messages
are now valid.  This is not changing anything at all in 5322.  What
this is doing is acknowledging that senders often violate 5322, and
that those violations are *wrong*.  What it adds is that it also
acknowledges the reality that, as Nathaniel and others have said, we
can't just throw those wrong messages away, and there's some value in
agreeing how to handle them.  This document -- or its final version --
is an attempt to document that agreement.

Agents along the way -- MSAs, MTAs, MDAs, and MUAs -- will make their
guesses and fix-ups, and I do think it's in the best interest of
everyone for us to document less-harmful avenues to take, as well as
roads to hell.  So I support this document for that reason.

On the other hand, Ned's right that the "best" (or least bad, or
whatever) way to handle each situation *is* likely to change over
time, so locking the advice into a BCP might not work well.  It's also
clear that some malformations will do best with complicated heuristics
beyond what'll be recommended here.  The appearance of a non-header
line in a message header is a perfect example of that.  Consider these
two fragments:

1:
   Date: xxx
   Subject: this is a badly
   continued header line
   MIME-Version: 1.0

   This is the body of the message.

2:
   Date: xxx
   MIME-Version: 1.0
   Subject: this is the subject
   I've improperly terminated the header here.
   This is the rest of the body.

The right answer for the two is different.  In (1), we don't want to
assume the "continued header line" is the beginning of the body, and
in (2) we don't want to try to treat the "I've improperly" line as a
continuation of the subject.  A better answer will be to look ahead a
bit to try to re-establish context and make a better guess than can be
done simply by applying a rule.

And yet number three here will break that too:

3:
   Date: xxx
   MIME-Version: 1.0
   Subject: this is the subject
   I've improperly terminated the header here.
   Look: You know it's complicated.

   This is the rest of the body.
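
To make the look-ahead idea concrete, here is a rough sketch in Python.
It is only an illustration, not anything proposed in the draft: the
names (FIELD_RE, find_stray_line, guess_stray_line), the three-line
window, and the rule "a field-like line soon after means continuation"
are all assumptions of mine. The fragments are written without the
display indentation used above, since leading whitespace would make a
line a legitimate folded continuation.

```python
import re

# Assumption: a line "looks like" a header field if it starts with an
# RFC 5322 field name (printable ASCII except ":") followed by a colon.
FIELD_RE = re.compile(r"^[!-9;-~]+:")


def find_stray_line(lines):
    """Index of the first header-block line that is neither a field
    nor a folded continuation, or None if the header ends cleanly."""
    for i, line in enumerate(lines):
        if line == "":
            return None  # blank line: proper end of the header
        if FIELD_RE.match(line) or line[0] in " \t":
            continue
        return i
    return None


def guess_stray_line(lines, window=3):
    """Guess whether a stray line continues the previous field or
    starts the body, by peeking at up to `window` following lines."""
    i = find_stray_line(lines)
    if i is None:
        return "well-formed"
    for line in lines[i + 1 : i + 1 + window]:
        if line == "":
            break  # reached a separator without seeing another field
        if FIELD_RE.match(line):
            return "continuation"  # header seems to resume afterwards
    return "body"


FRAGMENT_1 = """Date: xxx
Subject: this is a badly
continued header line
MIME-Version: 1.0

This is the body of the message.""".splitlines()

FRAGMENT_2 = """Date: xxx
MIME-Version: 1.0
Subject: this is the subject
I've improperly terminated the header here.
This is the rest of the body.""".splitlines()

FRAGMENT_3 = """Date: xxx
MIME-Version: 1.0
Subject: this is the subject
I've improperly terminated the header here.
Look: You know it's complicated.

This is the rest of the body.""".splitlines()

for name, frag in [("1", FRAGMENT_1), ("2", FRAGMENT_2), ("3", FRAGMENT_3)]:
    print(name, guess_stray_line(frag))
```

With these assumptions the sketch guesses "continuation" for (1) and
"body" for (2), which is what we want. For (3) it also guesses
"continuation", because "Look:" looks like a field name, and that is
the wrong answer.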


Even so, we know that some of these issues are
straightforward.  Why make everyone figure it all out from scratch?

In the end, the best we can do is to make recommendations to try to
get some consistency.  I think it's worth doing a document like this,
but it's not at all straightforward, and we'll have to be very careful
at every step to make a few things clear:

1. The appearance of these broken messages is BAD, and the BEST thing
is to fix the software that's generating them.

2. Sometimes, it really IS the right thing to
reject/refuse/bounce/whatever-you-want-to-call-it the message.

3. Whether or not we have a reasonably safe guess to make for each
particular malformation.

4. When we do have a reasonably safe guess, here's what it is.

Barry