[apps-discuss] Comments on Malformed Message BCP draft

Simon Tyler <styler@mimecast.com> Thu, 14 April 2011 09:58 UTC

Hi,

Having read the Malformed Message BCP draft, I would like to get some discussion going on its content and find the best way forward.

For those who missed it, the draft is at:

https://www1.tools.ietf.org/html/draft-kucherawy-mta-malformed-00

I have a few comments on it.

One thing that concerns me is the proposal that processing should stop when certain malformations are identified.

For example, the draft proposes that if a badly wrapped header field is found (quite how we define this is left open; I presume it means a line that does not start with a valid header field name followed by a colon), the processing agent should treat it as the end of the header. My feeling is that this opens up a greater hole than the one it closes, and one that can be exploited.

An example of the type of issue this could cause: should such a malformation occur before the MIME header fields, the rest of the header and the message body would be treated as plain text. This could cause content analysis systems to fail, as they would not interpret the MIME content in the way that was presumably intended.

Given that these recommendations are unlikely to be followed by all clients and servers, I feel that this suggestion will allow content through an agent without suitable processing. My preference for this particular malformation would be to continue processing the header fields. This is based on the assumption that what follows the malformed header field is more likely to be further header fields than body content. What we do with the malformed header field itself I am less certain about. We could just ignore it, or we could treat it as part of the previous header field - both will be right as often as they are wrong. I would welcome some additional thoughts on this.
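
To make the trade-off concrete, here is a minimal Python sketch of the three ways of handling a malformed header line discussed above. It is only an illustration, not anything the draft specifies: the function name parse_header, the policy names "stop", "skip" and "fold", and the regular expression used to recognise a field name are all assumptions made for this example.

```python
import re

# Assumed definition of a well-formed field line: a field name made of
# printable ASCII characters other than colon, followed by a colon.
FIELD_RE = re.compile(r'^([\x21-\x39\x3b-\x7e]+)[ \t]*:(.*)$')

def parse_header(lines, policy="skip"):
    """Parse header lines into (name, value) pairs.

    policy "stop": a malformed line ends the header (the draft's proposal)
    policy "skip": a malformed line is ignored and parsing continues
    policy "fold": a malformed line is appended to the previous field
    Returns (fields, index of the first line not consumed as header).
    """
    fields = []
    for i, line in enumerate(lines):
        if line == "":
            return fields, i + 1          # blank line: body starts after it
        if line[0] in " \t" and fields:   # properly folded continuation line
            name, value = fields[-1]
            fields[-1] = (name, value + " " + line.strip())
            continue
        m = FIELD_RE.match(line)
        if m:
            fields.append((m.group(1), m.group(2).strip()))
        elif policy == "stop":
            return fields, i              # everything from here on is "body"
        elif policy == "fold" and fields:
            name, value = fields[-1]
            fields[-1] = (name, value + " " + line.strip())
        # policy "skip" (or nothing to fold into): drop the line
    return fields, len(lines)

raw = [
    "From: a@example.com",
    "Subject: a long subject that was",
    "wrapped without leading whitespace",   # the malformed line
    "MIME-Version: 1.0",
    "Content-Type: text/html",
    "",
    "<p>body</p>",
]

for policy in ("stop", "skip", "fold"):
    fields, body_start = parse_header(raw, policy)
    print(policy, [name for name, _ in fields], body_start)
```

With "stop", the MIME-Version and Content-Type fields are never seen, so the HTML part would be handed on as plain text. With "skip" or "fold", both fields are still parsed and the body starts where the sender intended.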

I have similar feelings about some of the other suggestions, including the handling of a missing MIME-Version header field. We cannot ignore intended meaning just because a non-compliant application made a small error in the header fields. Everyone will be best served if we subject such messages to more analysis, not less.
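
As a rough sketch of what "more analysis, not less" could mean for a missing MIME-Version field, an agent might still treat a message as MIME when other MIME fields are present. The helper name looks_like_mime and the rule it applies are assumptions for illustration only; the draft does not define them.

```python
def looks_like_mime(fields):
    """Decide whether to parse a message as MIME.

    fields is a list of (name, value) pairs from the header.
    Assumed rule: MIME-Version is sufficient but not required; any
    Content-* field is taken as a sign of the sender's intent.
    """
    names = {name.lower() for name, _ in fields}
    if "mime-version" in names:
        return True
    return any(n.startswith("content-") for n in names)

# A message whose generator forgot MIME-Version but set Content-Type.
fields = [
    ("From", "a@example.com"),
    ("Content-Type", 'multipart/mixed; boundary="x"'),
]
print(looks_like_mime(fields))
```

Under this rule the message above would still go through MIME parsing and content analysis rather than being treated as a single plain-text body.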

On the whole, I think a set of guidelines in this area is good, but it will be hard to reach consensus without agreement on some basic underlying principles. I would suggest that one such principle is to try to do what the intended recipient would most likely prefer, which is generally to fix and deliver non-spam messages.

I would also propose some additions to the draft. At Mimecast we see a lot of messages generated by all sorts of systems and amongst these we see a lot of different kinds of message malformations.

I'll suggest more as I think of them but for starters here are a few:

1. Excessively long lines in both headers and body. I commonly see lines that are several hundred KB in length. These are often valid messages in the sense that the content is desired by the receiver and, in all respects other than line length, they are well formed. Obviously a limit has to be enforced, and I would like to find a consensus on what sort of limit is reasonable. Initially I felt 8K was a good limit - it is, after all, 8 times the limit in RFC 5321 - but it appears that this is too small in real situations. When the limit is exceeded, what is the best option: a rejection or a forced line wrap? I am open to both.

2. Invalid characters in headers. I often see headers with unencoded 8bit characters. These are often displayed correctly to the recipient only because the client happens to guess the correct character set.

3. 8bit characters in MIME sections with a content-transfer-encoding of 7bit.
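
The three checks in the list above can be sketched in a few lines of Python operating on raw bytes. The function names and the 8192-octet default are assumptions for this example; the right limit is exactly the open question raised in item 1.

```python
MAX_LINE = 8192  # assumed limit, for illustration only

def overlong_lines(raw: bytes, limit: int = MAX_LINE):
    """Return 1-based numbers of lines longer than limit octets (CRLF excluded)."""
    return [n for n, line in enumerate(raw.split(b"\r\n"), 1) if len(line) > limit]

def force_wrap(line: bytes, limit: int = MAX_LINE):
    """Split one overlong body line into chunks of at most limit octets.

    This is a blunt split: it may cut through a multi-byte character or an
    encoded word, and a header line would need folding whitespace instead.
    """
    return [line[i:i + limit] for i in range(0, len(line), limit)] or [b""]

def has_8bit(data: bytes) -> bool:
    """True if any octet has the high bit set (items 2 and 3)."""
    return any(b > 127 for b in data)

message = b"short\r\n" + b"x" * 10 + b"\r\n"
print(overlong_lines(message, limit=8))
print(force_wrap(b"abcdefgh", limit=3))
print(has_8bit("caf\u00e9".encode("latin-1")), has_8bit(b"cafe"))
```

An agent could run has_8bit over each header field for item 2, and over the body of any part declared as 7bit for item 3, then decide whether to reject, re-encode, or relabel the content.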

If you have read this far then I think you will agree with me that Murray has made a good start on a much needed set of guidelines. Now let's see if we can support him to expand on the work he has done and reach a consensus on the best approaches.

Simon