Hashing local-parts of addresses (was: dane-openpgp 2nd LC resolution)
ned+ietf@mauve.mrochek.com Sun, 20 March 2016 16:55 UTC
From: ned+ietf@mauve.mrochek.com
Message-id: <01PY11JYGSLU00004Z@mauve.mrochek.com>
Date: Sun, 20 Mar 2016 09:48:42 -0700
Subject: Hashing local-parts of addresses (was: dane-openpgp 2nd LC resolution)
In-reply-to: "Your message dated Sat, 19 Mar 2016 10:26:24 -0400" <7B35165CF1E545B14FE01F7F@JcK-HP8200.jck.com>
References: <56DC484F.7010607@cs.tcd.ie> <3470AB158222ED0ECAF2CAEA@JcK-HP8200.jck.com> <56ED45A8.7060304@cs.tcd.ie> <7B35165CF1E545B14FE01F7F@JcK-HP8200.jck.com>
To: John C Klensin <john-ietf@jck.com>
Archived-At: <http://mailarchive.ietf.org/arch/msg/ietf/XXWuPVa82lmNDrduVlPgHbjMPQs>
Cc: iesg@ietf.org, IETF-Discussion <ietf@ietf.org>
(This is the first in what will hopefully be a series of review comments on the latest version of the dane-openpgp specification. I'm breaking this up into several different topics in hopes of keeping any resulting discussion focused on the particular set of issues I've brought up.)

This is the first specification I'm aware of that hashes the local-part of an address to produce a corresponding identifier. Not only have we never gone this far before, we've actually tried to stay away from operations like address comparisons that have similar, albeit more limited, semantics.

With regard to this operation, there has been extensive discussion of the longstanding requirement that only agents with administrative authority over the associated domain can "interpret" the local-part of an address. Unfortunately, AFAICT this discussion has completely missed two fundamental and vitally important points.

First, there's no way to define a mapping of local-parts to a new set of identifiers *without* effectively interpreting the local-part! If you define the mapping as the draft currently does, implicit in that definition is that local-parts are case-sensitive. And similarly, if you convert the local-part to lower (or upper) case, you're now assuming the local-part is case-insensitive. And in the case of EAI, without some sort of normalization you're assuming that different UTF-8 representations of the same string of characters correspond to different recipients. (Which, as Harald Alvestrand and I both pointed out on the IETF list, is technically untenable and needs to be addressed. My suggestion was and is to specify that the same case-folding and normalization algorithm used for IDNs also be employed here.)

But - and this is the second fundamental point that AFAICT has been missed - who is doing the interpreting? In one sense it's the consumer of the OPENPGPKEY records in the DNS, and the discussion so far has focused on how such consumers don't have the right to do that.

But who published those records? That would be the owner of the domain - you know, the folks who *are* entitled to interpret the local-part of addresses in whatever fashion they choose. So when a domain owner publishes such records in the DNS, a reasonable way to look at it is that they are effectively saying, "Everyone is allowed to interpret the local-parts of our addresses as specified in this document in this one narrow context." I'm pretty confident there's nothing in any standard that forbids such a delegation of authority.

And once you realize this is what is going on, not only does it become clear that this draft is *not* violating the longstanding rules about local-part interpretation, it casts the decision not to normalize the local-parts to lower (or upper) case in an entirely different light. By choosing not to normalize, this specification is effectively restricting its own applicability to domains with case-sensitive local-parts. That is, IMO, a highly suboptimal choice - the overwhelming majority of domains treat the local-part in a case-insensitive fashion, and so should the mechanism specified in this draft.

Or, to put this another way, the inherent limitations of using the DNS to provide the mapping from address to PGP key restrict the domain of applicability of this specification to domains with particular local-part policies, and the way in which the local-part to DNS mapping is specified determines which policies the specification supports. And while it seems logical to support a policy that's known to be in wide use, the specification also needs to be very clear that domains that employ case-sensitive local-parts MUST NOT avail themselves of this mechanism.

What needs to happen here is that the specification be revised to make it clear that this is what is going on: That by publishing such records a domain is granting a limited right to interpret the local-parts of its addresses.
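To make the point about implicit interpretation concrete, here is a minimal Python sketch. It is an illustration only, not the draft's algorithm: the hash construction (SHA-256 over the UTF-8 bytes, truncated to 28 octets and hex-encoded) is an assumption, the function names `label` and `folded_label` are hypothetical, and NFC plus `str.lower()` is only a rough stand-in for the IDN-style case folding and normalization suggested above.

```python
import hashlib
import unicodedata


def label(local_part: str) -> str:
    """Hash the local-part exactly as given (no interpretation beyond UTF-8).

    Assumed construction for illustration: SHA-256, truncated to 28 octets,
    hex-encoded.
    """
    return hashlib.sha256(local_part.encode("utf-8")).digest()[:28].hex()


def folded_label(local_part: str) -> str:
    """Hash after a simplified normalization (NFC, then lowercase).

    This is a stand-in for IDN-style folding, not an implementation of it.
    """
    normalized = unicodedata.normalize("NFC", local_part).lower()
    return label(normalized)


# Two UTF-8 representations of the same visible string.
composed = "jos\u00e9"      # 'e' with acute accent as one code point
decomposed = "jose\u0301"   # 'e' followed by a combining acute accent

# Without normalization, case and representation select different records.
print(label("First.Last") == label("first.last"))   # False
print(label(composed) == label(decomposed))         # False

# With normalization, they select the same record.
print(folded_label("First.Last") == folded_label("first.last"))  # True
print(folded_label(composed) == folded_label(decomposed))        # True
```

Either choice is an interpretation of the local-part: the unnormalized mapping only suits domains with case-sensitive local-parts, and the normalized one only suits domains with case-insensitive local-parts.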
(One can of course argue that a specification that fails to offer a solution to case-sensitive domains, or to domains that employ various forms of subaddressing semantics, is unacceptable. But I am emphatically not making that argument. I have a number of grave reservations about this draft that I am going to try to explain in subsequent messages, but this isn't one of them.)

There's also - as noted by Sean Leonard - a technical glitch in the current specification: The local-part is not the correct input to the hash function. A canonicalization step is needed because all of these addresses are equivalent:

   (1) first.last@example.com
   (2) first . last @example.com
   (3) "first.last"@example.com
   (4) "\f\i\r\s\t.last"@example.com

(2) is equivalent to (1) because CFWS has no semantics, (3) is equivalent to (1) because the enclosing quotes are not properly part of the address, and (4) is equivalent to (1) because quoted-pairs are semantically equivalent to just the quoted character.

I believe this is the entire list, so the obvious canonicalization to use on the local-part portion of an address prior to lowercasing and hashing is:

   (a) If the local-part is unquoted remove any whitespace around periods.
   (b) Remove any enclosing double quotes.
   (c) Remove any literal quoting.

I might be inclined to say that this rather technical matter can wait to be resolved in a future update, but (1) Implementations once deployed are difficult to change, and according to the draft there are already incompatible implementations out there and (2) Normalization needs to be revisited anyhow, so why not fix this as well?

Finally, a couple of observations about terminology are in order. The current text covering the hashing of local-parts begins with:

   The user name (the "left-hand side" of the email address, called the "local-part" in the mail message format definition [RFC5322] and the local-part in the specification for internationalized email [RFC6530]) is encoded in UTF-8 (or its subset ASCII). If the local-part is written in another encoding it MUST be converted to UTF-8.

First, the left hand side of an email address is not a "user name" and should not be referred to as such. (The entire address is in some cases a "user name" of sorts, and in some cases the local-part is identical to some kind of login credential. But neither of these is universally true, and more to the point, none of this is relevant to the matter at hand.)

Second, it probably makes sense to note that local-part is an ABNF production contained in a broader syntax, not just a name.

Third, the term "encoding" here is inaccurate; it should be charset.

That's all for now.

				Ned
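
As a rough illustration of canonicalization steps (a) through (c) listed earlier in this message, the following Python sketch reduces the four equivalent example local-parts to a single string. The function name `canonicalize_local_part` is hypothetical, and the sketch assumes the local-part has already been split off from the address and contains no RFC 5322 comments; it is not a full address parser.

```python
import re


def canonicalize_local_part(local_part: str) -> str:
    """Apply steps (a)-(c) to an already-extracted local-part string.

    Assumption: no parenthesized comments are present; only whitespace,
    enclosing double quotes, and quoted-pairs are handled.
    """
    s = local_part.strip()  # whitespace outside the local-part has no meaning
    if len(s) >= 2 and s.startswith('"') and s.endswith('"'):
        inner = s[1:-1]                       # (b) drop the enclosing quotes
        # (c) replace each quoted-pair (backslash + char) with the char itself
        return re.sub(r"\\(.)", r"\1", inner, flags=re.DOTALL)
    # (a) unquoted form: remove whitespace around periods
    return re.sub(r"\s*\.\s*", ".", s)


examples = [
    "first.last",                 # (1)
    "first . last ",              # (2)
    '"first.last"',               # (3)
    r'"\f\i\r\s\t.last"',         # (4)
]
for example in examples:
    print(repr(example), "->", canonicalize_local_part(example))
```

Any lowercasing and hashing would then be applied to the returned string rather than to the local-part as written.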