Re: [EAI] UTF-8 in Message-IDs

ned+ima@mrochek.com Sat, 13 August 2011 22:15 UTC

From: ned+ima@mrochek.com
Message-id: <01O4T11O8X4M00VHKR@mauve.mrochek.com>
Date: Sat, 13 Aug 2011 15:08:48 -0700
In-reply-to: "Your message dated Sat, 13 Aug 2011 22:23:22 +0200" <CAHhFybo47--0YjCRcvSO4asoV_R89+ULDB3tyij+ba=O_6gKsQ@mail.gmail.com>
MIME-version: 1.0
Content-type: TEXT/PLAIN
References: <CAHhFybo47--0YjCRcvSO4asoV_R89+ULDB3tyij+ba=O_6gKsQ@mail.gmail.com>
To: Frank Ellermann <hmdmhdfmhdjmzdtjmzdtzktdkztdjz@gmail.com>
Cc: ima@ietf.org
Subject: Re: [EAI] UTF-8 in Message-IDs

> looking for something else (= state of the art wrt EAI
> and SPF) yesterday I stumbled over the "81 IETF EAI
> meeting minutes" in the mailing list archive, with a
> note about "UTF-8 in Message-IDs".

> That was certainly (bad) news from my POV, and whining
> about it off list I got the tip that waiting for an
> IETF Last Call is a bad plan when this could be also
> addressed before WG Last Call.

> From my POV the "globally unique forever Message-ID"
> in mail and news is in essence an early form of what
> is now a GUID/UUID in other protocols.  Maybe a bit
> more so, in theory there could be UUID collisions,
> and that's not possible for Message-IDs (in theory).

> This theory is relevant for NetNews, gateways from or
> to NetNews, mail and news threading, APOP, and likely
> other use cases I'm not aware of.

> Nobody would say replace hex. by UTF-8 in GUID/UUIDs,
> therefore I fail to see the point of trying something
> in this direction in Message-IDs.  Gateway operators
> stating that UTF-8 in Message-IDs would be no problem
> in practice, or that this would be only a minor point
> when they adopt EAI, might convince me that I'm only
> paranoid, but I doubt it.

> And the "EAI experiment" phase did not test this plan,
> there's no evidence I'm aware of that UTF-8 in Message-
> IDs is harmless.

I'm sure it is not, just as UTF-8 in addresses is far from harmless and is
going to require all sorts of infrastructure changes.

But I fail to see how this is in any way relevant. We're defining a new message
format here that *cannot* be downgraded to the old format and retain all
semantics. This is true irrespective of how message-ids are handled. As such,
UTF-8 in message-ids is a small additional cost.

It's also a cost that I doubt can be avoided. Once UTF-8 is in domains in
messages, it's going to leak into message-ids no matter what the
specifications say.

> Did you really check all RFCs using
> or mentioning Message-IDs for potential side-effects
> of introducing UTF-8 in the "local", "unique", "left"
> etc. part of a Message-ID?

Nope. Given the approach the WG is taking to this problem, such an analysis
would be a pointless waste of time.

Maybe, if we were taking the MIME approach, such a check would make sense.
But we're not doing that.

> E.g., does it fly with the
> Message-ID concept in APOP and CRAM-MD5, or do you
> intend something in the direction of "updates 2195"
> to make it work?

Why would that be coupled to this new message format?

				Ned