Re: [dane] email canonicalization for SMIMEA owner names

Nico Williams <> Thu, 11 December 2014 22:03 UTC


On Thu, Dec 11, 2014 at 08:50:53PM +0000, Viktor Dukhovni wrote:
> I have a proposal that solves the ASCII use-case.  Sadly, little
> can be done for non-ASCII Unicode, those names will just have to
> be used consistently by all parties.

Well, domains could publish the local-part canonicalization function
they use, or, rather, a small index of well-known canonicalization
functions.

This is just a tweak to your proposal.  You propose just two functions:
identity and ASCII-tolower, with the client trying all [two] of them.

If we add more functions we'll want to know which function the domain
uses, so we'll need that one more lookup.  We need just a handful of
functions that will work for most cases.

E.g., Gmail treats periods as if they weren't there.  That might need to
be part of one or more standard canon function(s).
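
As a rough sketch of the indexed-function idea, assuming a purely
hypothetical numbering (0 = identity, 1 = ASCII-tolower, 2 = ASCII-tolower
with periods removed) and a hypothetical helper named canonicalize(); how
a domain would actually publish its index is not specified here:

```python
from typing import Callable, Dict


def canon_identity(local_part: str) -> str:
    # Use the local-part exactly as given.
    return local_part


def canon_ascii_lower(local_part: str) -> str:
    # Lower-case only A-Z; every other character (including non-ASCII)
    # passes through unchanged.
    return "".join(
        chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in local_part
    )


def canon_ascii_lower_nodots(local_part: str) -> str:
    # ASCII-tolower, then drop periods (the Gmail-like behaviour).
    return canon_ascii_lower(local_part).replace(".", "")


# Hypothetical registry: the index values are illustrative only.
CANON_FUNCTIONS: Dict[int, Callable[[str], str]] = {
    0: canon_identity,
    1: canon_ascii_lower,
    2: canon_ascii_lower_nodots,
}


def canonicalize(local_part: str, published_index: int) -> str:
    # Apply the function the domain says it uses; refuse unknown indexes
    # rather than guessing.
    try:
        fn = CANON_FUNCTIONS[published_index]
    except KeyError:
        raise ValueError(f"unknown canonicalization index: {published_index}")
    return fn(local_part)
```

With a published index the client computes exactly one candidate name
instead of trying every function in turn, at the cost of the one extra
lookup.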

I realize that your proposal is simpler, and we might want to stop there.

> For all-ASCII addresses, (ignoring for the moment Turkish case-
> folding of "I" to a non-ASCII "dotless" "i"), the proposal is
> as follows:

What site would want to permit local-part names that are equivalent but
for an i/dotless-i?  I realize that the situation may already have come
up, but going forward a site might want to treat them as equivalent and,
really, to implement Unicode case-folding + some standard mappings as
the canonicalization, at least for SMIMEA purposes (the actual e-mail
addresses understood by users as canonical might bear a dotless i even
if for SMIMEA purposes it becomes a dotted i).
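
A minimal sketch of "case-folding + some standard mappings", assuming
Python's str.casefold() as the case-folding step and a hypothetical,
deliberately tiny mapping table that only handles the Turkish i forms;
the function name and the table contents are illustrative, not a
proposed standard:

```python
# Extra mappings applied after Unicode case-folding.  Default case-folding
# leaves dotless i (U+0131) alone and turns capital dotted I (U+0130) into
# "i" followed by a combining dot above (U+0307), so both need a nudge.
EXTRA_MAPPINGS = [
    ("i\u0307", "i"),  # folded form of U+0130 -> plain dotted i
    ("\u0131", "i"),   # dotless i -> plain dotted i
]


def smimea_fold(local_part: str) -> str:
    # Step 1: Unicode case-folding.
    folded = local_part.casefold()
    # Step 2: site-chosen equivalences, applied in order.
    for source, target in EXTRA_MAPPINGS:
        folded = folded.replace(source, target)
    return folded
```

The folded form would only be used to derive the SMIMEA lookup name; the
address shown to users could keep its dotless i.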

>     * Clients that encounter an ascii localpart that is not all lower-case
>       try both keys, first the localpart as-is, then case-folded with
>       the "@lower:" prefix.  

Almost there :)
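
For reference, the client-side rule quoted above can be sketched as
follows.  The function name candidate_keys() is hypothetical, and the
step that turns each key into an actual SMIMEA owner name is left out
because it is not described in this message:

```python
from typing import List


def candidate_keys(local_part: str) -> List[str]:
    # Always try the local-part exactly as given first.
    keys = [local_part]
    # Only all-ASCII local-parts that are not already lower-case get a
    # second, case-folded attempt marked with the "@lower:" prefix.
    if local_part.isascii() and local_part != local_part.lower():
        keys.append("@lower:" + local_part.lower())
    return keys
```

For example, candidate_keys("John.Doe") yields the as-is key followed by
"@lower:john.doe", while an already lower-case or non-ASCII local-part
yields a single key.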