Re: [ietf-822] utf8 messages

Brandon Long <blong@google.com> Fri, 15 August 2014 19:45 UTC

In-Reply-To: <01PBDQ3P6A8Q0000SM@mauve.mrochek.com>
References: <CABa8R6tWEhjjZSvq6NbM7EimokOms3suZufn0-6N1SB_fzGM8Q@mail.gmail.com> <01PB9FABWA4E0000SM@mauve.mrochek.com> <CABa8R6tns-idiZTj=+vb9fVNyH-nNYT+w9oNMb80XbCs5osvFw@mail.gmail.com> <01PBABOOL4QO0000SM@mauve.mrochek.com> <CABa8R6vBqS1ewmTtHh8tTOdzobsWpvSEokRxOqpj1Oq3hA+vsw@mail.gmail.com> <01PBBWUH11D60000SM@mauve.mrochek.com> <CABa8R6uJ--4Fcntdgef+h6ZXjP_q0q7hZaBW-SOozMTtiE918g@mail.gmail.com> <01PBCGZERCU20000SM@mauve.mrochek.com> <CABa8R6sbFQHaP=YgrejJjUKJS+20BFP+kATZ+PrDnTgUhPMpHw@mail.gmail.com> <01PBDQ3P6A8Q0000SM@mauve.mrochek.com>
Date: Fri, 15 Aug 2014 12:44:58 -0700
Message-ID: <CABa8R6vB=HqU42w1nkyY0zouVupradeMaf+7tu4F-1eSmFmfQA@mail.gmail.com>
From: Brandon Long <blong@google.com>
To: Ned Freed <ned.freed@mrochek.com>
Content-Type: multipart/alternative; boundary="001a11c1ec9a9ec0850500b041f4"
Archived-At: http://mailarchive.ietf.org/arch/msg/ietf-822/rL0mHa9479Ifd3tefpFE4q3SRns
Cc: ietf-822@ietf.org
Subject: Re: [ietf-822] utf8 messages

On Thu, Aug 14, 2014 at 6:42 PM, Ned Freed <ned.freed@mrochek.com> wrote:

> > On Wed, Aug 13, 2014 at 6:34 PM, Ned Freed <ned.freed@mrochek.com> wrote:
> >
> > > > Let me try one more time, since something isn't making it through.
> > > >
> > > > I have three messages.  One message has an entirely 7bit header
> > > > with a 2047-encoded subject.  Another message is a 6532 message,
> > > > with the subject in utf8.  A third message has a cp-1250 8bit
> > > > subject.  There are two 8bit bytes in the subject in both of the
> > > > last two messages, and in the cp1250 case, those two bytes happen
> > > > to also be a valid utf8 character.
> > > >
> > > > We want to be able to parse all three of those and do so correctly.
> > > > We know the third type is technically invalid, but we see millions
> > > > of such messages every day; dropping all of those would be a
> > > > disservice to our users.  We currently see way more of such
> > > > messages than we do of 6532 messages... though in practice, the
> > > > most common charset now is utf-8, so I guess those are now the same
> > > > as 6532 messages that have leaked.
> > >
> > > I thought I understood the problem you were attempting to solve, but
> > > now I'm totally confused, because this seems to have nothing to do
> > > with additional labeling of legitimate EAI messages at all.
> >
> > My point is that without a label, I can't tell the difference between
> > the 6532 messages and the illegitimate messages, given just the message.
>
> Yes, that's exactly what I thought. But given that, surely it makes more
> sense to label the illegitimate messages as such, especially since you're
> going to want to set the EAI message bit for some of them, making it
> essentially an orthogonal setting?
>
> Did you even bother to read the rest of my response?
>
>

Yes, but I didn't understand it until this response.  It is an interesting
alternative.  You still seem to be assuming an external EAI bit, but even
without that, yes, theoretically we could mark any illegitimate message
at SMTP-in time.  It would be harder to do at IMAP APPEND time, though,
since I believe clients assume that uploaded messages are stored as is.
I'd have to test whether marking them there causes failures.  I would also
have to be careful with every other source of messages, which means we may
need to upgrade our other APIs and migration tools to carry an EAI bit.
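
To make the ambiguity concrete, here is a minimal Python sketch, not our
actual parser.  The function name classify_subject and the three sample
subjects are made up for illustration.  It shows that a UTF-8 subject and a
cp1250 subject can be byte-for-byte identical, so nothing in the message
itself can separate them.

```python
def classify_subject(raw: bytes) -> str:
    """Classify raw Subject bytes using only what the bytes reveal.

    Hypothetical helper for illustration; the labels are not from any spec.
    """
    if all(b < 0x80 for b in raw):
        # Pure ASCII; may or may not contain RFC 2047 encoded-words.
        return "7bit"
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        # Cannot be an RFC 6532 header; some unlabeled legacy charset.
        return "8bit-not-utf8"
    # Either a real RFC 6532 header or a legacy charset whose bytes
    # happen to form valid UTF-8.  The bytes alone cannot tell us which.
    return "8bit-valid-utf8"


# Message 1: 7bit header with an RFC 2047 encoded-word.
encoded_word_subject = b"=?utf-8?B?xaE=?="

# Message 2: RFC 6532 style, one character (U+0161) as raw UTF-8.
utf8_subject = "\u0161".encode("utf-8")

# Message 3: two cp1250 characters (U+0139, U+02C7) sent as raw 8bit.
cp1250_subject = "\u0139\u02c7".encode("cp1250")

for name, raw in [
    ("2047", encoded_word_subject),
    ("6532", utf8_subject),
    ("cp1250", cp1250_subject),
]:
    print(name, raw, classify_subject(raw))
```

Messages 2 and 3 carry the same two bytes and land in the same bucket,
which is why the marking has to come from outside the message, for example
a bit recorded at SMTP-in time.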

Brandon