Re: [apps-discuss] Review of: draft-ietf-appsawg-malformed-mail-03
"Murray S. Kucherawy" <superuser@gmail.com> Mon, 06 May 2013 06:42 UTC
In-Reply-To: <51657E80.8070208@bbiw.net>
References: <51657E80.8070208@bbiw.net>
Date: Sun, 05 May 2013 23:42:16 -0700
Message-ID: <CAL0qLwb-Aj+Te2uYJZo8g5LR4B6JREPFATTPSLGf_L4LvgMrZQ@mail.gmail.com>
From: "Murray S. Kucherawy" <superuser@gmail.com>
To: Dave Crocker <dcrocker@bbiw.net>
Content-Type: multipart/alternative; boundary="047d7b86de326670fd04dc07005f"
Cc: Apps Discuss <apps-discuss@ietf.org>
Subject: Re: [apps-discuss] Review of: draft-ietf-appsawg-malformed-mail-03
Thanks, Dave. I'm waiting for my co-author to get back to me on two points in your review that I can't really answer, and then we'll at long last post a new version for the WG to review. I'll try to solicit a couple more reviews and then suggest Salvatore start WGLC on it.

-MSK, hatless this time

On Wed, Apr 10, 2013 at 8:00 AM, Dave Crocker <dcrocker@bbiw.net> wrote:

> Review of:    Advice for Safe Handling of Malformed Messages
> I-D:          draft-ietf-appsawg-malformed-mail-03
> Reviewer:     D. Crocker
> Review Date:  10 April 2013
>
> Summary:
>
> Internet Mail has always been marked by an unfortunate degree of
> regular and permitted non-conformance to its formal specifications. The
> current draft seeks to categorize and discuss common types of
> non-conformance and to provide some guidance for how it should be handled.
> The document is explicit in stating that it does not have the goal of
> standardizing this guidance.
>
> The document is reasonably clear and complete. I believe a document
> like this can provide very helpful guidance for email developers and
> operators. It would be useful in its current form, but could greatly
> benefit from some modification.
>
> One major concern, which is easily remedied, is the draft's use of
> normative language. The document is often unusually careful to use
> qualifying language that precisely limits the scope of the normative text
> to "a module compliant with this memo". However, I think this is too subtle
> for most readers and that the use of normative language defeats the stated
> limitation of not wanting to define a standard. Hence I suggest changing
> all such language and, instead, using language that is clearly only modest
> "advice", such as with:
>
>    * a common handling is...
>    * it is best to...
>    * it will typically be safe and helpful to...
>
> and so on.
> > > > Detailed Comments: > > > Abstract >> >> The email ecosystem has long had a very permissive set of common >> processing rules in place, despite increasingly rigid standards >> governing its components, ostensibly to improve the user experience. >> > > Although Internet mail formats have been precisely defined since the > 1970s, authoring and handling software often show only mild conformance to > the specifications. The distributed and non-interactive nature of email > has often prompted adjustments to receiving software, to handle these > variations, rather than trying to gain better conformance by senders, since > the receiving operator is primarily driven by complaining recipient users > and has no authority over the sending side of the system. > > > The handling of these come at some cost, and various components are >> > > Processing with such flexibility comes at some cost, since mail > software is faced with... > > > faced with decisions about whether or not to permit non-conforming >> messages to continue toward their destinations unaltered, adjust them >> to conform (possibly at the cost of losing some of the original >> message), or outright rejecting them. >> > > A core requirement for interoperability is that both sides to an > exchange work from the same details and semantics. By having receivers be > flexible, beyond the specifications, there can be -- and often has been -- a > good chance that a message will not be fully interoperable. Worse, a > well-established pattern of tolerance for variations can sometimes be used > as an attack vector. > > > This document includes a collection of the best advice available >> regarding a variety of common malformed mail situations, to be used >> as implementation guidance. It must be emphasized, however, that the >> intent of this document is not to standardize malformations or >> otherwise encourage their proliferation. 
The messages are manifestly >> malformed, and the code and culture that generates them needs to be >> fixed. Therefore, these messages should be rejected outright if at >> all possible. Nevertheless, many malformed messages from otherwise >> legitimate senders are in circulation and will be for some time, and, >> unfortunately, commercial reality shows that we cannot always simply >> reject or discard them. Accordingly, this document presents >> alternatives for dealing with them in ways that seem to do the least >> additional harm until the infrastructure is tightened up to match the >> standards. >> > > > >> 1. Introduction >> >> 1.1. The Purpose Of This Work >> >> The history of email standards, going back to [RFC822] and beyond, >> > > { here I actually suggest citing RFC 733, since it managed to establish > the solid foundation, with 822 being a relatively small modification. 733 > was not the first formal standard, but the first had poor adoption. /d} > > > contains a fairly rigid evolution of specifications. But >> implementations within that culture have also long had an >> undercurrent known formally as the robustness principle, but also >> known informally as Postel's Law: "Be conservative in what you do, be >> liberal in what you accept from others." >> > > Jon Postel's directive is often misinterpreted to mean that any > deviance from a specification is acceptable. Rather, it was intended only > to account for legitimate variations in interpretation /within > specifications, as well as basic transit errors, like bit errors. Taken to > its unintended extreme, excessive tolerance would imply that there are no > limits to the liberties that a sender might take, while presuming a burden > on a receiver to "correctly" guess at the meaning of any such variation. > > {BTW, I believe Postel's Law was not the motivating reason for email > format deviations. 
Rather, I think that receivers were accountable to > their users -- the recipients -- while having no control over the > misbehaving senders. So they/we hacked receiving code when necessary, to > appease the users. /d } > > > In general, this served the email ecosystem well by allowing a few >> errors in implementations without obstructing participation in the >> game. The proverbial bar was set low. However, as we have evolved >> into the current era, some of these lenient stances have begun to >> expose opportunities that can be exploited by malefactors. Various >> email-based applications rely on strong application of these >> standards for simple security checks, while the very basic building >> blocks of that infrastructure, intending to be robust, fail utterly >> to assert those standards. >> >> This document presents some areas in which the more lenient stances >> can provide vectors for attack, and then presents the collected >> wisdom of numerous applications in and around the email ecosystem for >> dealing with them to mitigate their impact. >> >> 1.2. Not The Purpose Of This Work >> >> It is important to understand that this work is not an effort to >> endorse or standardize certain common malformations. The code and >> culture that introduces such messages into the mail stream needs to >> be repaired, as the security penalty now being paid for this lax >> processing arguably outweighs the reduction in support costs to end >> users who are not expected to understand the standards. However, the >> reality is that this will not be fixed quickly. >> >> Given this, it is beneficial to provide implementers with guidance >> about the safest or most effective way to handle malformed messages >> when they arrive, taking into consideration the tradeoffs of the >> choices available especially with respect to how various actors in >> the email ecosystem respond to such messages in terms of handling, >> parsing, or rendering to end users. >> >> 1.3. 
General Considerations >> >> Many deviations from message format standards are considered by some >> receivers to be strong indications that the message is undesirable, >> i.e., is spam or contains malware. Such receivers quickly decide >> >> >> >> Kucherawy & Shapiro Expires April 12, 2013 [Page 4] >> >> Internet-Draft Safe Mail Handling October 2012 >> >> >> that the best handling choice is simply to reject or discard the >> message. This means malformations caused by innocent >> misunderstandings or ignorance of proper syntax can cause messages >> with no ill intent also to fail to be delivered. >> >> Senders that want to ensure message delivery are best advised to >> adhere strictly to the relevant standards (including, but not limited >> to, [MAIL], [MIME], and [DKIM]), as well as observe other industry >> best practices such as may be published from time to time either by >> the IETF or independently. >> >> 2. Document Conventions >> >> 2.1. Key Words >> >> The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", >> "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this >> document are to be interpreted as described in [KEYWORDS]. However, >> they only have that meaning in this document when they are presented >> entirely in upper case. >> > > > { While the document is clear that its use of normative language is meant > to apply only to those implementations choosing to conform to this > document, the document itself says -- appropriately, IMO -- that it is not > trying to standardize these behaviors. It's therefore confusing and > probably counter-productive to use normative language. I strongly urge > dropping all such language and, instead, only offering modest "advice" with > language like: > > * a common handling is... > * it is best to... > * it will typically be safe and helpful to... > > and so on. /d} > > > 2.2. Examples >> >> Examples of message content include a number within braces at the end >> of each line. 
These are line numbers for use in subsequent >> discussion, and are not actually part of the message content >> presented in the example. >> >> Blank lines are not numbered in the examples. >> >> 3. Background >> >> The reader would benefit from reading [EMAIL-ARCH] for some general >> background about the overall email architecture. Of particular >> interest is the Internet Message Format, detailed in [MAIL]. >> Throughout this document, the use of the term "messsage" should be >> > > { Freud possibly at work for this missspellling? /d} > > > assumed to mean a block of text conforming to the Internet Message >> Format. >> >> 4. Internal Representations >> >> Any agent handling a message could have one or two (or more) distinct >> > > As an agent parses and processes a message, it might create a number > of distinct representations for the message. > > > representations of a message it is handling. One is an internal >> representation, such as a block of storage used for the header and a >> block for the body. These may be sorted, encoded, decoded, etc., as >> per the needs of that particular module. The other is the >> representation that is output to the next agent in the handling >> chain. This might be identical to the version that is input to the >> >> >> >> Kucherawy & Shapiro Expires April 12, 2013 [Page 5] >> >> Internet-Draft Safe Mail Handling October 2012 >> >> >> module, or it might have some changes such as added or reordered >> header fields, body modifications to remove malicious content, etc. >> >> In some cases, advice is provided only for internal representations. >> However, there is often occasion to mandate changes to the output as >> well. >> > > { What does this last sentence mean? "Mandate"? Perhaps what is meant > is: /d} > > However it is sometimes necessary to make changes between the input > and output versions, as well. > > > >> 5. 
Invariate Content >> > > Invariant {?} > > > >> Experience has shown that it is beneficial to ensure that, from the >> first analysis agent at ingress into the destination Administrative >> Management Domain (ADMD; see [EMAIL-ARCH]) to the agent that actually >> affects delivery to the end user, the message each agent sees is >> > > { This is an artfully-crafted sentence, but it would be easier to read if > broken into parts. Perhaps: /d} > > An especially interesting handling sequence occurs within the > destination Administrative Management Domain (ADMD; see [EMAIL-ARCH]). From > ingress to the ADMD, through the boundary agent, until delivery to the end > user, it is beneficial to ensure that each agent sees an identical form for > the message. > > > identical. Absent this, it can be impossible for different agents in >> the chain to make assertions about the content that correlate. >> > > > the chain to make consistent assertions about the content. > > > For example, suppose a handling agent records that a message had some >> specific set of properties at ingress to the ADMD, then permitted it >> to continue inbound. Some other agent alters the content for some >> reason. The user, on viewing the delivered content, reports the >> message as abusive. If the report is based on the set of properties >> > > message as abuse. However, report processing often takes place at, > or close to, the original point of ingress and is likely to be based on the > set of properties recorded there, rather than at the user's system. > > > recorded at ingress, then the complaint effectively references a >> message different from what the user saw, which could render the >> complaint inactionable. Similarly, a message with properties that a >> filtering agent might use to reject an abusive message could be >> allowed to reach the user if an intermediate agent altered the >> message in a manner that alters one of those properties, thwarting >> detection of the abuse. 
>> > > { awkward sentence structure. d/} > > > Therefore, agents comprising an inbound message processing >> > > comprising an inbound -> within an integrated message > > {or should this simply say 'within an ADMD'? /d} > > > environment SHOULD ensure that each agent sees the same content, and >> the message reaches the end user unmodified. An exception to this is >> content that is identitfied as certainly harmful, such as some kind >> of malicious executable software included in the message. >> > > {the 'exception' sentence is far too specific. There are, no doubt, many > reasons for deviating from this recommendation. Simpler, safer and > non-normative wording would be: /d} > > environment will simplify operational concerns by ensuring that each > agent receives the same content -- except for the usual handling agent > trace information additions -- and that this is what reaches the end user, > unmodified. Various policies, such as special handling for detected > message abuse, will make exceptions appropriate. > > > 6. Mail Submission Agents >> >> Within the email context, the single most influential component that >> can reduce the presence of malformed items in the email system is the >> Mail Submission Agent (MSA). This is the component that is >> essentially the interface between end users that create content and >> the mail stream. >> > > the Mail Handling Service (MHS) [EMAIL-ARCH] > > > The lax processing described earlier in the document creates a high >> > > {this first sentence is out of place. the earlier discussion in the > document established the need for better conformance; it doesn't need to be > sold here, again. /d} > > > support and security cost overall. Thus, MSAs MUST evolve to become >> more strict about enforcement of all relevant email standards, >> especially [MAIL] and the [MIME] family of documents. 
>> >> >> >> >> Kucherawy & Shapiro Expires April 12, 2013 [Page 6] >> >> Internet-Draft Safe Mail Handling October 2012 >> >> >> Relay Mail Transport Agents (MTAs) SHOULD also be more strict; >> > > Relay -> Relaying > > { This pseudo-normative phrasing does nothing helpful, since it isn't > actually specifying anything. Modify to something like: /d} > > More strict conformance by relaying MTAs also will be helpful. > Although > > > although preventing the dissemination of malformed messages is >> desirable, the rejection of such mail already in transit also has a >> support cost, namely the creation of a [DSN] that many end users >> might not understand. >> >> 7. Line Terminaton >> >> The only valid line separation sequence in messaging is ASCII 0x0D >> > > For interoperable Internet Mail messages, the only valid... > > > ("carriage return", or CR) followed by ASCII 0x0A ("line feed", or >> LF), commonly referred to as CRLF. Common UNIX user tools, however, >> typically only use LF for line termination. This means the protocol >> has to convert LF to CRLF before transporting a message. >> > > for internal line termination. This means that a protocol engine, > which converts between Unix and Internet Mail formats, has to convert > between these two end-of-line representations, before transmitting a > message or after receiving it. > > > Naive implementations can cause messages to be transmitted with a mix >> > > { These aren't "naive". They are quite simply broken! d/} > > Implementations that do not conform to Internet Mail standards > sometimes cause messages to be transmitted... > > > of line terminations, such as LF everywhere except CRLF only at the >> end of the message. According to [SMTP], this means the entire >> message actually exists on a single line. >> > > { also RFC 5322! /d} > > > A "naked" CR or LF in a message has no reasonable justification, and >> > > { this is wrong. they have legitimate presentation uses, albeit pretty > archaic at this point. 
Better: /d } > > Within modern Internet Mail it is highly unlikely that an isolated CR > or LF is valid, in common ASCII text. Furthermore [MIME]... > > > furthermore [MIME] presents mechanisms for encoding content that >> actually does need to contain such an unusual character sequence. >> >> Thus, handling agents MUST treat naked CRs and LFs as CRLFs when >> interpreting the message. >> > > Thus, it will typically be safe and helpful to treat a naked CR or LF > as equivalent to a CRLF, when parsing a message. > > > 8. Header Anomalies >> >> This section covers common syntactical and semantic anomalies found >> in headers of messages, and presents preferred mitigations. >> > > in a message header, and > > > 8.1. Converting Obsolete and Invalid Syntaxes >> >> There are numerous cases of obsolete header syntaxes that can be >> applied to confound agents with variable processing. This section >> > > { The phrasing of the first sentence sounds as if confounding is a goal. > If it's meant that way, say it more clearly. If it isn't, perhaps: /d } > > A message using an obsolete header syntax might confound an agent > that is attempting to be robust in its handling of syntax variations. > > > presents some examples of these. Messages including them SHOULD be >> > > { 'of these'? of which? /d} > > { Why reject this particular set? What about others, outside these > examples? Again, phrase this non-normatively. /d} > > > rejected; where this is not possible, RECOMMENDED internal >> interpretations are provided. >> >> 8.1.1. 
Host-Address Syntax >> >> The following obsolete syntax: >> > > The following obsolete syntax that attempts to specify source routing: > > { explain, or perhaps even cite the old ABNF rule for it /d} > > > >> To: <@example.net:fran@example.com> >> >> should be interpreted as: >> > > can safely be interpreted as: > > > To: <fran@example.com> >> >> 8.1.2. Excessive Angle Brackets >> >> The following over-use of angle brackets, e.g.: >> >> To: <<<user2@example.org>>> >> >> should be interpreted as: >> > > can safely be interpreted as: > > > To: <user2@example.org> >> >> 8.1.3. Unbalanced Angle Brackets >> >> The following use of unbalanced angle brackets: >> >> To: <another@example.net >> To: second@example.org> >> >> should be interpreted as: >> > > can usually be treated as: > > > To: <another@example.net> >> To: second@example.org >> >> 8.1.4. Unbalanced Parentheses >> >> The following use of unbalanced parentheses: >> >> To: (Testing <fran@example.com> >> To: Testing) <sam@example.com> >> >> should be interpreted as: >> >> To: (Testing) <fran@example.com> >> To: "Testing)" <sam@example.com> >> >> 8.1.5. Unbalanced Quotes >> >> The following use of unbalanced quotation marks: >> >> To: "Joe <joe@example.com> >> >> should be interpreted as: >> >> To: "Joe <joe@example.com>"@example.net >> > > { WTF??? And why is this a good fixup, especially given concerns about > display-name attack vectors? /d} > > > where "example.net" is the domain name or host name of the handling >> agent making the interpretation. >> >> 8.2. Non-Header Lines >> >> It has been observed that some messages contain a line of text in the >> > > Some messages contain a line of... > > > header that is not a valid message header field of any kind. 
For >> example: >> >> From: user@example.com {1} >> To: userpal@example.net {2} >> Subject: This is your reminder {3} >> about the football game tonight {4} >> Date: Wed, 20 Oct 2010 20:53:35 -0400 {5} >> >> Don't forget to meet us for the tailgate party! {7} >> >> The cause of this is typically a bug in a message generator of some >> kind. Line {4} was intended to be a continuation of line {3}; it >> should have been indented by whitespace as set out in Section 2.2.3 >> of [MAIL]. >> >> This anomaly has varying impacts on processing software, depending on >> the implementation: >> >> 1. some agents choose to separate the header of the message from the >> body only at the first empty line (i.e. a CRLF immediately >> followed by another CRLF); >> >> 2. some agents assume this anomaly should be interpreted to mean the >> body starts at line {4}, as the end of the header is assumed by >> encountering something that is not a valid header field or folded >> portion thereof; >> >> 3. some agents assume this should be interpreted as an intended >> header folding as described above and thus simply append a single >> space character (ASCII 0x20) and the content of line {4} to that >> of line {3}; >> >> 4. some agents reject this outright as line {4} is neither a valid >> header field nor a folded continuation of a header field prior to >> an empty line. >> >> This can be exploited if it is known that one message handling agent >> will take one action while the next agent in the handling chain will >> take another. Consider, for example, a message filter that searches >> message headers for properties indicative of abusive of malicious >> content that is attached to a Mail Transfer Agent (MTA) implementing >> option 2 above. 
An attacker could craft a message that includes this >> malformation at a position above the property of interest, knowing >> the MTA will not consider that content part of the header, and thus >> the MTA will not feed it to the filter, thus avoiding detection. >> Meanwhile, the Mail User Agent (MUA) which presents the content to an >> end user, implements option 1 or 3, which has some undesirable >> effect. >> >> It should be noted that a few implementations choose option 4 above >> since any reputable message generation program will get header >> folding right, and thus anything so blatant as this malformation is >> likely an error caused by a malefactor. >> >> The preferred implementation if option 4 above is not employed is to >> apply the following heuristic when this malformation is detected: >> >> 1. Search forward for an empty line. If one is found, then apply >> option 3 above to the anomalous line, and continue. >> >> 2. Search forward for another line that appears to be a new header >> field, i.e., a name followed by a colon. If one is found, then >> apply option 3 above to the anomalous line, and continue. >> >> 8.3. Unusual Spacing >> >> The following message is valid per [MAIL]: >> >> From: user@example.com {1} >> To: userpal@example.net {2} >> Subject: This is your reminder {3} >> {4} >> about the football game tonight {5} >> Date: Wed, 20 Oct 2010 20:53:35 -0400 {6} >> >> Don't forget to meet us for the tailgate party! {8} >> >> Line {4} contains a single whitespace. The intended result is that >> lines {3}, {4}, and {5} comprise a single continued header field. >> However, some agents are aggressive at stripping trailing whitespace, >> which will cause line {4} to be treated as an empty line, and thus >> the separator line between header and body. 
This can affect header-specific >> processing algorithms as described in the previous section. >> >> Ideally, this case simply ought not to be generated. >> > > {This sentence is entirely gratuitous. Replace it with: d/ } > > This example was legal in earlier versions of the Internet Mail > format standard. > > >> Message handling agents receiving a message bearing this anomaly MUST >> behave as if line {4} was not present on the message, and SHOULD emit >> a version in which line {4} has been removed. >> > > The best handling of this example is for a message parsing engine to > behave as if line {4} was not present in the message and for a message > creation engine to emit the message with line {4} removed. > > > >> >> 8.4. Header Malformations >> >> There are various malformations that exist. A common one is >> > > { The first sentence is pretty obvious: there are always lots of ways to > screw up. I suggest dropping it and beginning with something like: /d} > > Among the many possible malformations, a common one is... > > > insertion of whitespace at unusual locations, such as: >> >> From: user@example.com {1} >> To: userpal@example.net {2} >> Subject: This is your reminder {3} >> MIME-Version : 1.0 {4} >> Content-Type: text/plain {5} >> Date: Wed, 20 Oct 2010 20:53:35 -0400 {6} >> >> Don't forget to meet us for the tailgate party! {8} >> >> Note the addition of whitespace in line {4} after the header field >> name but before the colon that separates the name from the value. >> >> The acceptance grammar of [MAIL] permits that extra whitespace, so it >> cannot be considered invalid. However, a consensus of >> implementations prefers to remove that whitespace. There is no >> perceived change to the semantics of the header field being altered >> as the whitespace is itself semantically meaningless. 
Thus, a module >> compliant with this memo MUST remove all whitespace after the field >> name but before the colon, and MUST emit that version of that field >> on output. >> > > Therefore, it is best to remove all whitespace after the field name > but before the colon and to emit the field in this modified form. > > > 8.5. Header Field Counts >> >> Section 3.6 of [MAIL] prescribes specific header field counts for a >> valid message. Few agents actually enforce these in the sense that a >> message whose header contents exceed one or more limits set there are >> generally allowed to pass; they may add any required fields that are >> > > ; they typically add any... > > > missing, however. >> >> Also, few agents that use messages as input, including Mail User >> Agents (MUAs) that actually display messages to users, verify that >> the input is valid before proceeding. Two popular open source >> filtering programs and two popular Mailing List Management (MLM) >> > > { I suggest changing 'two' to 'some', since the number might change; > there's no reason to make this document get out of date for such a minor > issue. /d } > > > packages examined at the time this document was written select either >> > > { hence, remove "examined at the time this document was written" /d } > > > the first or last instance of a particular field name, such as From, >> to decide who sent a message. Absent enforcement of [MAIL], an >> > > Absent strict enforcement > > > attacker can craft a message with multiple fields if that attacker >> knows the filter will make a decision based on one but the user will >> be shown the other. >> >> This situation is exacerbated when a claim of message validity is >> inferred by something like a valid [DKIM] signature. 
Such a >> signature might cover one instance of a constrained field but not >> >> >> >> Kucherawy & Shapiro Expires April 12, 2013 [Page 11] >> >> Internet-Draft Safe Mail Handling October 2012 >> >> >> another, and a naive consumer of DKIM's output, not realizing which >> one was covered by a valid signature, could presume the wrong one was >> the "good" one. An MUA, for example could show the first of two From >> fields as "good" or "safe" while the DKIM signature actually only >> verified the second. >> > > { DKIM signatures do not verify addresses outside d=. While the problem > you are describing is, of course, real, it's far more complicated than > you've described here. Perhaps: /d } > > when message validity is assessed, such as through enhanced > authentication methods. Such methods might cover one instance of a > constrained field but not another, taking the wrong one as "good" or "safe". > > > Thus, an agent compliant with this specification MUST enact one of >> the following: >> > > In attempting to counter this exposure, one of the following can be > enacted: > > > 1. reject outright or refuse to process further any input message >> that does not conform to Section 3.6 of [MAIL]; >> >> 2. remove or, in the case of an MUA, refuse to render any instances >> of a header field whose presence exceeds a limit prescribed in >> Section 3.6 of [MAIL] when generating its output; >> >> 3. alter the name of any header field whose presence exceeds a limit >> prescribed in Section 3.6 of [MAIL] when generating its output so >> that later agents can produce a consistent result. Any >> alteration likely to cause the field to be ignored by downstream >> agents is acceptable. A common approach is to prefix the field >> names with a string such as "BAD-". >> > > { it would help if there were some rationales or analyses of the tradeoffs > amongst these kinds of choices, to help the implementer/operator decide > when to use which. /d } > > > > 8.6. 
>> Missing Header Fields
>>
>> Similar to the previous section, there are messages seen in the wild
>> that lack certain required header fields. For example, [MAIL]
>> requires that a From and Date field be present in all messages.
>
> { I think these aren't 'examples' but constitute the entire list. If
> there are other required fields that can be classed as 'missing', this
> section should list them. Also, since Message-ID isn't 'required', the
> phrasing here doesn't quite match what's discussed in the section. Might be
> worth distinguishing "required but missing" vs. "optional but really useful
> and worth synthesizing". Synthesizing the latter probably isn't dangerous.
> Synthesizing the former always is... d/}
>
>> When presented with a message lacking these fields, the MTA might
>> perform one of the following:
>>
>> 1. Make no changes
>>
>> 2. Add an instance of the missing field(s) using synthesized content
>
> 3. Reject the message
>
>> Option 2 is RECOMMENDED for handling this case. Handling agents
>
> { Wow! Synthesizing a From: field strikes me as especially dangerous, in
> all cases. The rationale provided, below, needs to state this and, I
> believe, explain how and why it is worth incurring. The explanation that is
> provided essentially defines this hack as an attack vector... /d}
>
>> SHOULD add these for internal handling if they are missing, but MUST
>> NOT add them to the external representation. The reason for this
>> requirement is that there are some filter modules that would consider
>> the absence of such fields to be a condition warranting special
>> treatment (e.g., rejection), and thus the effectiveness of such
>> modules would be stymied by an upstream filter adding them.
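
For illustration only, here is a minimal C sketch of the internal-only synthesis that the quoted paragraph describes. It is not taken from the draft or from the review. The names msg_fields, internal_view and make_internal_view are hypothetical, and the fallback values are simply supplied by the caller; how to pick them is a separate question. The point is that the external message is read but never modified, and that the "synthesized" flags let a later filter still see that a field was absent.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical view of the external message; NULL means the field is absent. */
struct msg_fields {
    const char *from;
    const char *date;
};

/* Internal working copy used by the handling agent only. */
struct internal_view {
    const char *from;
    const char *date;
    bool from_synthesized;  /* true if From was missing on input */
    bool date_synthesized;  /* true if Date was missing on input */
};

/*
 * Build the internal view. Missing fields are filled from the
 * caller-supplied fallbacks (assumed values, not mandated by anything);
 * the external structure is only read, never written.
 */
static struct internal_view
make_internal_view(const struct msg_fields *external,
                   const char *fallback_from,
                   const char *fallback_date)
{
    struct internal_view v;

    v.from_synthesized = (external->from == NULL);
    v.date_synthesized = (external->date == NULL);
    v.from = v.from_synthesized ? fallback_from : external->from;
    v.date = v.date_synthesized ? fallback_date : external->date;
    return v;
}
```

When the message is emitted, only the untouched external fields would be written out, so a downstream filter that treats a missing From or Date as suspicious still gets to make that call.
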
>>
>> The synthesized fields SHOULD contain a best guess as to what should
>> have been there; for From, the SMTP MAIL command's address can be
>> used (if not null) or a placeholder address followed by an address
>> literal (e.g., unknown@[192.0.2.1]); for Date, a date extracted from
>> a Received field is a reasonable choice.
>>
>> One other important case to consider is a missing Message-Id field.
>> An MTA that encounters a message missing this field SHOULD synthesize
>> a valid one using techniques described above and add it to the
>> external representation, since many deployed tools use the content of
>> that field as a common unique message reference, so its absence
>> inhibits correlation of message processing. One possible synthesis
>> would be based on an encoding of the current date/time and
>> an internal MTA ID (e.g., queue ID) followed by @ and the fully
>> qualified hostname of the machine synthesizing the header value. For
>> example:
>>
>> tm = gmtime(&now);
>> (void) snprintf(buf, sizeof(buf), "%04d%02d%02d%02d%02d.%s@%s",
>>                 tm->tm_year + 1900, tm->tm_mon + 1, tm->tm_mday,
>>                 tm->tm_hour, tm->tm_min, queueID, fqhn);
>>
>> 8.7. Eight-Bit Data
>>
>> Standards-compliant mail messages do not contain any non-ASCII data
>> without indicating that such content is present by means of published
>> [SMTP] extensions. Absent that, [MIME] encodings are typically used
>
> Overall, the document sometimes mixes transfer issues with data
> representation (object) issues, in ways that can be confusing. This
> paragraph is one of those. It's worth the extra verbosity to label each
> clearly and separately. So, for example:
>
> Standards-compliant mail messages that contain non-ASCII data are
> required to self-label this through the use of [MIME].
> If the representation of the non-ASCII data is in an 8-bit mode (rather
> than special encoding so that it retains a 7-bit base), then this must be
> signaled through the use of [SMTP] extensions.
>
>> without indicating that such content is present by means of published
>> [SMTP] extensions.
>>
>> to convert non-ASCII data to ASCII in a way that can be reversed by
>> other handling agents or end users.
>>
>> Non-ASCII data otherwise found in messages can confound code that is
>> used to analyze content. For example, a null (ASCII 0x00) byte
>> inside a message can cause typical string processing functions to
>> mis-identify the end of a string, which can be exploited to hide
>> malicious content from analysis processes.
>>
>> Handling agents MUST reject messages containing null bytes that are
>> not encoded in some standard way, and SHOULD reject other non-ASCII
>> bytes that are similarly not encoded. If rejection is not done, an
>> ASCII-compatible encoding such as those defined in [MIME] SHOULD be
>> used.
>
> { Hmmm. It occurs to me that the document might be helped by an early
> discussion about a/the 'philosophy' that guides choosing whether to reject
> a message versus repair it. But I don't have any clever text to suggest
> for doing this... /d}
>
>> 9. MIME Anomalies
>>
>> [MIME], et seq, define a mechanism of message extensions for
>
> { perhaps quibbling, but since MIME does a variety of things, including
> this one, I suggest:
>
> define -> includes
>
>> providing text in character sets other than ASCII, non-text
>> attachments to messages, multi-part message bodies, and similar
>> facilities.
>>
>> Some anomalies with MIME-compliant generation are also common. This
>> section discusses some of those and presents preferred mitigations.
>>
>> 9.1.
>> Header Field Names
>>
>> [MAIL] permits header field names to begin with "--". This means
>> that a header field name can look like a [MIME] multipart boundary.
>> For example:
>>
>> --foo:bar
>>
>> This is a legal header field, whose name is "--foo" and whose value
>> is "bar". Thus, consider this header:
>>
>> From: user@example.com                             {1}
>> To: userpal@example.net                            {2}
>> Subject: This is your reminder                     {3}
>> Date: Wed, 20 Oct 2010 20:53:35 -0400              {4}
>> MIME-Version: 1.0                                  {5}
>> Content-Type: multipart/mixed; boundary="foo:bar"  {6}
>> --foo:bar                                          {7}
>> Malicious-Content: muahaha                         {8}
>>
>> One implementation could observe that line {7} announces the
>> beginning of the first MIME part while another considers it a part of
>> the message's header.
>>
>> If rejection of such messages cannot be done, agents MUST treat line
>> {7} as part of the message's header block and not a MIME boundary.
>
> { Under what circumstances can rejection /not/ be done??? And what is
> involved in even detecting that it isn't a mime boundary? d/ }
>
>> 9.2. Missing MIME-Version Field
>>
>> Any message that uses [MIME] constructs is required to have a
>> MIME-Version header field. Without them, the Content-Type and
>> associated fields have no semantic meaning.
>
> them -> it
>
>> It is often observed that a message has complete MIME structure, yet
>> lacks this header field.
>>
>> As described at the end of Section 8.2, this is not expected from a
>
> this -> this omission
>
>> reputable content generator and is often an indication of
>> mass-produced spam or other undesirable messages.
>>
>> Therefore, an agent compliant with this specification MUST internally
>> enact one or more of the following in the absence of a MIME-Version
>> header field:
>>
>> 1.
>> Ignore all other MIME-specific fields, even if they are
>> syntactically valid, thus treating the entire message as a
>> single-part message of type text/plain;
>
> { Offhand, this sounds like a potentially-interesting attack vector. /d}
>
>> 2. Remove all other MIME-specific fields, even if they are
>>    syntactically valid, both internally and when emitting the output
>>    version of the message;
>>
>> 10. Body Anomalies
>>
>> 10.1. Oversized Lines
>>
>> A message containing a line of content that exceeds 998 characters
>> plus the line terminator (1000 total) violates Section 2.1.1 of
>> [MAIL]. Some handling agents may not look at content in a single
>> line past the first 998 bytes, providing bad actors an opportunity to
>> hide malicious content.
>>
>> There is no specified way to handle such messages, other than to
>> observe that they are non-compliant and reject them, or rewrite the
>> oversized line such that the message is compliant.
>>
>> Handling agents MUST take one of the following actions:
>>
>> 1. Break such lines into multiple lines at a position that does not
>>    change the semantics of the text being thus altered. For
>>    example, breaking an oversized line such that a [URI] then spans
>>    two lines could inhibit the proper identification of that URI.
>>
>> 2. Rewrite the MIME part (or the entire message if not MIME) that
>>    contains the excessively long line using a content encoding that
>>    breaks the line in the transmission but would still result in the
>>    line being intact on decoding for presentation to the user. Both
>>    of the encodings declared in [MIME] can accomplish this.
>>
>> 11. Security Considerations
>>
>> The discussions of the anomalies above and their prescribed solutions
>> are themselves security considerations.
>> The practises enumerated in this memo are generally perceived as
>> attempts to resolve security considerations that already exist rather
>> than introducing new ones.
>
> { Hmmm. Whereas I think the document introduces quite a few attack
> vectors that probably aren't discussed in other email specifications. /d}
>
>> 12. IANA Considerations
>>
>> This memo contains no actions for IANA.
>>
>> [RFC Editor: Please remove this section prior to publication.]
>>
>> 13. References
>>
>> 13.1. Normative References
>>
>> [KEYWORDS]   Bradner, S., "Key words for use in RFCs to Indicate
>>              Requirement Levels", BCP 14, RFC 2119, March 1997.
>>
>> [MAIL]       Resnick, P., "Internet Message Format", RFC 5322,
>>              October 2008.
>>
>> 13.2. Informative References
>>
>> [DKIM]       Allman, E., Callas, J., Delany, M., Libbey, M., Fenton,
>>              J., and M. Thomas, "DomainKeys Identified Mail (DKIM)
>>              Signatures", RFC 4871, May 2007.
>>
>> [DSN]        Moore, K. and G. Vaudreuil, "An Extensible Message
>>              Format for Delivery Status Notifications", RFC 3464,
>>              January 2003.
>>
>> [EMAIL-ARCH] Crocker, D., "Internet Mail Architecture", RFC 5598,
>>              July 2009.
>>
>> [MIME]       Freed, N. and N. Borenstein, "Multipurpose Internet
>>              Mail Extensions (MIME) Part One: Format of Internet
>>              Message Bodies", RFC 2045, November 1996.
>>
>> [RFC822]     Crocker, D., "Standard for the Format of Internet Text
>>              Messages", RFC 822, August 1982.
>>
>> [SMTP]       Klensin, J., "Simple Mail Transfer Protocol", RFC 5321,
>>              October 2008.
>>
>> [URI]        Berners-Lee, T., Fielding, R., and L. Masinter,
>>              "Uniform Resource Identifier (URI): Generic Syntax",
>>              RFC 3986, January 2005.
>>
>> Appendix A. Acknowledgements
>>
>> The author wishes to acknowledge the following for their review and
>> constructive criticism of this proposal: Tony Hansen, and Franck
>> Martin
>>
>> Authors' Addresses
>>
>> Murray S.
>> Kucherawy
>>
>> EMail: superuser@gmail.com
>>
>> Gregory N. Shapiro
>>
>> EMail: gshapiro@sendmail.com
>
> --
> Dave Crocker
> Brandenburg InternetWorking
> bbiw.net