Re: [ietf-822] utf8 messages

Daniel Vargha <dvargha@mimecast.com> Wed, 13 August 2014 19:08 UTC

From: Daniel Vargha <dvargha@mimecast.com>
To: Mark Martinec <Mark.Martinec+ietf@ijs.si>, "ietf-822@ietf.org" <ietf-822@ietf.org>
Date: Wed, 13 Aug 2014 19:07:09 +0000
Message-ID: <D011732C.19684%dvargha@mimecast.com>
References: <CABa8R6tWEhjjZSvq6NbM7EimokOms3suZufn0-6N1SB_fzGM8Q@mail.gmail.com> <01PB9FABWA4E0000SM@mauve.mrochek.com> <CABa8R6tns-idiZTj=+vb9fVNyH-nNYT+w9oNMb80XbCs5osvFw@mail.gmail.com> <01PBABOOL4QO0000SM@mauve.mrochek.com> <CABa8R6vBqS1ewmTtHh8tTOdzobsWpvSEokRxOqpj1Oq3hA+vsw@mail.gmail.com> <D0111ECB.195FD%dvargha@mimecast.com> <38adf1fa5904098dd896002fec51583d@mailbox.ijs.si>
In-Reply-To: <38adf1fa5904098dd896002fec51583d@mailbox.ijs.si>
Archived-At: http://mailarchive.ietf.org/arch/msg/ietf-822/3qL940PnKR4mItwlgg7SKOd4YZ4
Subject: Re: [ietf-822] utf8 messages

Daniel Vargha wrote:
> I fully agree with Brandon, the standard SHOULD consider the use case
> when a message is transferred from one system to another as a blob
> (e.g. flat file) and the only available "metadata" is that the message
> is in MIME format. Having some sort of well defined UTF8 indicator in
> the header section of the message would make it much simpler to adopt
> the new standard as it would require substantially less development
> effort in most cases.

It is not possible (even in the absence of SMTPUTF8 support) to be
able to transfer e-mail messages with no out-of-band ("metadata" /
envelope) information. The most obvious reason is the list of
recipient addresses, which is not present in a message itself.
Envelope sender may or may not be present in a mail header
(as a Return-Path header field). Other examples are RFC 3461 data
(RET, ENVID, NOTIFY, ORCPT). The SMTPUTF8 flag is just one more of
such out-of-band pieces of information necessary for mail transmission.

It is possible and it is happening to millions of messages every day.
A message is sitting in an archive store, the user exports it to an .EML
file and then imports the file into another archive system. The
destination system needs to parse the message for indexing in order
to make it searchable. Quite often a message goes through this
export/import process several times during its lifetime.
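
To make the import side concrete, here is a minimal sketch in Python of
what a destination system has to do when all it receives is the .EML
blob. The function names (header_section_has_utf8, index_fields) are
made up for illustration, and sniffing the header section for non-ASCII
bytes is only a heuristic assumed here; it is not an indicator defined
by any standard, which is exactly the gap being discussed.

```python
import re
from email import message_from_bytes
from email.policy import default


def header_section_has_utf8(raw: bytes) -> bool:
    """Heuristic: does the header section hold non-ASCII bytes that are valid UTF-8?"""
    # The header section ends at the first empty line (CRLF or bare LF endings).
    head = re.split(rb"\r?\n\r?\n", raw, maxsplit=1)[0]
    if head.isascii():
        return False
    try:
        head.decode("utf-8")
    except UnicodeDecodeError:
        # Non-ASCII bytes in some other (unknown) encoding.
        return False
    return True


def index_fields(raw: bytes) -> dict:
    """Extract a few searchable fields from the message blob alone."""
    msg = message_from_bytes(raw, policy=default)
    return {
        "subject": str(msg.get("Subject", "")),
        "from": str(msg.get("From", "")),
        # No envelope is available, so this flag can only be guessed
        # from the bytes of the message itself.
        "utf8_headers": header_section_has_utf8(raw),
    }
```

The importer never sees an SMTPUTF8 flag or any other envelope data, so
whatever it needs to know has to be inferred from the file contents.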

Another example where it is assumed that a MIME message is
self-contained and allows parsing without additional metadata is the
MBOX format (http://en.wikipedia.org/wiki/Mbox).
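
As a rough illustration of that assumption, the sketch below reads an
mbox file with Python's standard mailbox module and pulls out the
Subject of each message. The helper name subjects_in_mbox is
hypothetical. Note that the reader is handed nothing but the stored
bytes: the only per-message extra in the file is the "From " separator
line.

```python
import mailbox


def subjects_in_mbox(path: str) -> list:
    """Return the Subject header of every message stored in an mbox file."""
    # create=False: fail instead of silently creating an empty mailbox.
    box = mailbox.mbox(path, create=False)
    try:
        return [msg.get("Subject", "") for msg in box]
    finally:
        box.close()
```

Each message is parsed purely from the file contents, with no recipient
list and no SMTP extension flags available to the parser.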

Daniel