Re: [ietf-822] utf8 messages

Mark Martinec <Mark.Martinec+ietf@ijs.si> Fri, 15 August 2014 22:23 UTC

Archived-At: http://mailarchive.ietf.org/arch/msg/ietf-822/Kac8iXuc_rm55BEIugjUqAoZicw

On 2014-08-15 17:15, Daniel Vargha wrote:
>
> On 15/08/2014 15:13, "Ned Freed" <ned.freed@mrochek.com> wrote:
>
>>> On 14/08/2014 01:56, "Ned Freed" <ned.freed@mrochek.com> wrote:
>>
>>>>> I fully agree with Brandon, the standard SHOULD consider the use case
>>>>> when a message is transferred from one system to another as a blob
>>>>> (e.g. flat file) and the only available "metadata" is that the message
>>>>> is in MIME format. Having some sort of well defined UTF8 indicator in
>>>>> the header section of the message would make it much simpler to adopt
>>>>> the new standard as it would require substantially less development
>>>>> effort in most cases.
>>>>
>>>> I'm skeptical of the claim, but if you absolutely have to have
>>>> something, why not add a Received: field containing a "with smtputf8"
>>>> clause, assuming one isn't there already?
>>
>>> Received: headers are not very reliable, and the syntax is not well
>>> defined.
>>
>> On the contrary, it's quite well defined. See RFC 5321. The issue isn't
>> that it's poorly defined, but rather that there are a lot of agents that
>> don't create it properly.
>
> Maybe because it was defined too late, and not in the right place? (RFC
> 5321 is about SMTP not about MIME) From the parser's point of view the
> reason is indifferent, the reality is that it is better not to rely on
> it. Even RFC 5321 says:
>
> "...receiving systems MUST NOT reject mail based on the format of a trace
> header field and SHOULD be extremely robust in the light of unexpected
> information or formats in those header fields."
>
>>
>>> Successfully parsing a Received: header itself requires a lot of
>>> heuristics.
>>
>> A full parse does, and so does looking for IP address information (which
>> doesn't appear directly as a clause value and whose position was only
>> standardized late in the game). Looking for a with clause with a
>> particular value does not.
>
> Looks like we have quite different ideas about reliability and parsing.
> I certainly would not consider the partial parsing approach you suggested
> as reliable.

It's only the topmost (last added) Received header field that is
to be considered here. It is added by your own mailer, and you
know exactly what software that is. The topmost Received header
field is always in the same format, and its contents are always
trustworthy. It is likely RFC 5321 compliant, but even if it is
not, you already know and can rely on its idiosyncrasies.
So there is no good excuse for not fetching the WITH clause
from this field, if other ways of accessing the EAI flag are
not available.
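
A minimal sketch of that lookup in Python, for illustration only. It
assumes the message is available as a string, that the topmost Received
field was written by your own MTA, and that an SMTPUTF8 transfer is
recorded with one of the "UTF8..." protocol keywords registered by
RFC 6531 (UTF8SMTP, UTF8SMTPS, UTF8LMTP and so on). The function names
are hypothetical, and the prefix test should be adapted to whatever
your own mailer actually writes.

```python
import re
from email import message_from_string
from typing import Optional

# "with <protocol>" clause; the protocol keyword is letters, digits, hyphens.
WITH_CLAUSE = re.compile(r"\bwith\s+([A-Za-z0-9][A-Za-z0-9-]*)", re.IGNORECASE)
# Innermost parenthesised comment, e.g. "(Postfix)" or "(128/128 bits)".
COMMENT = re.compile(r"\([^()]*\)")


def topmost_with_protocol(raw_message: str) -> Optional[str]:
    """Return the WITH protocol keyword of the topmost Received field."""
    msg = message_from_string(raw_message)
    received = msg.get_all("Received")
    if not received:
        return None
    value = received[0]  # header fields are returned top to bottom
    # Strip comments (possibly nested) so that text such as
    # "(using TLSv1 with cipher ...)" is not mistaken for the clause.
    previous = None
    while previous != value:
        previous = value
        value = COMMENT.sub(" ", value)
    # Ignore the date-time part that follows the semicolon.
    value = value.split(";", 1)[0]
    match = WITH_CLAUSE.search(value)
    return match.group(1).upper() if match else None


def arrived_as_smtputf8(raw_message: str) -> bool:
    """Assumption: the local MTA marks EAI transfers with a UTF8* keyword."""
    protocol = topmost_with_protocol(raw_message)
    return protocol is not None and protocol.startswith("UTF8")
```

Only the first Received field is inspected; older ones further down
were added by other systems and are deliberately ignored.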

   Mark