[apps-discuss] draft-ietf-appsawg-malformed-mail-03 review

Timo Sirainen <tss@iki.fi> Mon, 13 May 2013 10:04 UTC

From: Timo Sirainen <tss@iki.fi>
Message-Id: <EB834F1D-495A-4354-BC65-2E8BB965D5F0@iki.fi>
Date: Mon, 13 May 2013 13:04:03 +0300
To: apps-discuss@ietf.org
Subject: [apps-discuss] draft-ietf-appsawg-malformed-mail-03 review

A quick review from an IMAP server developer's point of view:

> 7.  Line Termination
> 
>    Thus, handling agents MUST treat naked CRs and LFs as CRLFs when
>    interpreting the message.

Agreed for LFs, but doing this for CRs would often make the code much more complex: a simple strchr(str, '\n') would no longer be enough to find the end of a line. I've also never seen naked CRs that were intended as CRLFs, although I have sometimes seen them added because of a bug. I think naked CRs are a similar problem to NULs, and maybe they should just be deleted or replaced by a space.
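
A minimal sketch of the "replace naked CRs with a space" alternative, assuming the whole message is already in one buffer (so a CR in the last byte cannot be the first half of a split CRLF). The function names are hypothetical and this is not Dovecot's actual code. After the replacement pass, a plain search for '\n' still finds every line end:

```c
#include <stddef.h>
#include <string.h>

/* Replace every CR that is not followed by LF with a space, in place.
 * CRLF pairs and bare LFs are left untouched.
 * Assumption: buf holds the complete message, so a CR in the last
 * byte is treated as naked. */
static void replace_bare_cr(char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == '\r' && (i + 1 == len || buf[i + 1] != '\n'))
            buf[i] = ' ';
    }
}

/* Return the length of the first line, excluding its LF or CRLF
 * terminator. If no LF is present, the whole buffer is the line. */
static size_t first_line_len(const char *buf, size_t len)
{
    const char *nl = memchr(buf, '\n', len);
    size_t n = nl != NULL ? (size_t)(nl - buf) : len;

    if (nl != NULL && n > 0 && buf[n - 1] == '\r')
        n--; /* drop the CR of a CRLF pair */
    return n;
}
```

memchr() is used instead of strchr() here only so that the buffer does not have to be NUL-terminated.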

> 8.2.  Non-Header Lines
>        From: user@example.com {1}
>        To: userpal@example.net {2}
>        Subject: This is your reminder {3}
>        about the football game tonight {4}
>        Date: Wed, 20 Oct 2010 20:53:35 -0400 {5}

My code currently handles this by treating "about the football game tonight" as a header key that has no value.
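
A rough sketch of that behavior, assuming the line has already been unfolded and stripped of its terminator. The struct and function names are made up for illustration and are not taken from any real parser:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical view of one parsed header line; both pointers refer
 * into the caller's buffer and are not NUL-terminated. */
struct header_field {
    const char *name;
    size_t name_len;
    const char *value;
    size_t value_len;
};

/* Split a header line at its first colon. A line without any colon
 * is returned as a name covering the whole line, with an empty value. */
static struct header_field parse_header_line(const char *line, size_t len)
{
    struct header_field f = { line, len, line + len, 0 };
    const char *colon = memchr(line, ':', len);

    if (colon != NULL) {
        f.name_len = (size_t)(colon - line);
        f.value = colon + 1;
        f.value_len = len - f.name_len - 1;
        /* skip whitespace between the colon and the value */
        while (f.value_len > 0 && (*f.value == ' ' || *f.value == '\t')) {
            f.value++;
            f.value_len--;
        }
    }
    return f;
}
```

With this approach line {4} of the example needs no special case and no forward search: it simply comes out as a field whose name is the whole line.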

>    The preferred implementation if option 4 above is not employed is to
>    apply the following heuristic when this malformation is detected:
>
>    1.  Search forward for an empty line.  If one is found, then apply
>        option 3 above to the anomalous line, and continue.
> 
>    2.  Search forward for another line that appears to be a new header
>        field, i.e., a name followed by a colon.  If one is found, then
>        apply option 3 above to the anomalous line, and continue.

Having to search forward is always a pretty annoying way to deal with things. The text being searched for could even be megabytes past the current position. My code usually parses by looking ahead at most about one character.
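
To illustrate what one-character lookahead can decide, here is a sketch that classifies what follows a line end by inspecting only the next byte. The enum and function are hypothetical, and the code assumes that a CR at the start of a line can only be the beginning of an empty CRLF line (for example because naked CRs were replaced earlier):

```c
#include <stddef.h>

enum next_line {
    NEXT_CONTINUATION,   /* folded continuation of the previous field */
    NEXT_END_OF_HEADERS, /* empty line: the body starts after it */
    NEXT_NEW_LINE,       /* a new header line (well-formed or not) */
    NEXT_NEED_MORE       /* no byte available yet to decide */
};

/* pos is the index of the first byte after an LF. Only buf[pos] is
 * examined, so the decision never depends on data further ahead. */
static enum next_line classify_after_lf(const char *buf, size_t len,
                                        size_t pos)
{
    if (pos >= len)
        return NEXT_NEED_MORE;
    switch (buf[pos]) {
    case ' ':
    case '\t':
        return NEXT_CONTINUATION;
    case '\n':
    case '\r': /* assumed to start an empty CRLF line */
        return NEXT_END_OF_HEADERS;
    default:
        return NEXT_NEW_LINE;
    }
}
```

A parser built this way can work on a stream in fixed-size chunks, which the search-forward heuristic makes difficult.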

> 8.7.  Eight-Bit Data
> 
>    Handling agents MUST reject messages containing null bytes that are
>    not encoded in some standard way, and SHOULD reject other non-ASCII
>    bytes that are similarly not encoded.  If rejection is not done, an
>    ASCII-compatible encoding such as those defined in [MIME] SHOULD be
>    used.

I agree with others that messages shouldn't be rejected because there are NULs or 8-bit data in the message body. Hotmail used to send 8-bit data directly even in headers, but hopefully that's not as common anymore. My code replaces NULs with 0x80 characters (as does UW-IMAP) when sending them to IMAP clients.
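
The NUL replacement can be sketched as below. This is an illustration under the assumption that the data is rewritten in an outgoing buffer; the function name is hypothetical and the code is not copied from Dovecot or UW-IMAP:

```c
#include <stddef.h>

/* Replace each NUL byte in buf with 0x80, in place, and return the
 * number of bytes that were replaced. The length of the data does
 * not change, so sizes announced earlier stay valid. */
static size_t replace_nuls(unsigned char *buf, size_t len)
{
    size_t replaced = 0;

    for (size_t i = 0; i < len; i++) {
        if (buf[i] == 0x00) {
            buf[i] = 0x80;
            replaced++;
        }
    }
    return replaced;
}
```

A one-for-one byte substitution keeps the message size unchanged, unlike deleting the NULs.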

> 9.2.  Missing MIME-Version Field

I agree with Ned that a missing MIME-Version should be allowed (i.e. treated as if it existed with the value 1.0). Dovecot used to strictly require it, but after enough complaints from users I changed it so that it's no longer required.
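
In code, the lenient behavior amounts to a default. A tiny sketch, assuming the parser hands over the MIME-Version value as a string, or NULL when the field is absent (the function name is hypothetical):

```c
#include <stddef.h>

/* Return the MIME version to assume for a message: the value of its
 * MIME-Version field if one was present, otherwise "1.0". */
static const char *effective_mime_version(const char *header_value)
{
    return header_value != NULL ? header_value : "1.0";
}
```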