Re: dane-openpgp 2nd LC resolution

Doug Barton <dougb@dougbarton.us> Mon, 14 March 2016 03:58 UTC

Subject: Re: dane-openpgp 2nd LC resolution
To: ietf@ietf.org
References: <56DC484F.7010607@cs.tcd.ie>
From: Doug Barton <dougb@dougbarton.us>
Openpgp: id=E3520E149D053533C33A67DB5CC686F11A1ABC84
Message-ID: <56E636FD.9050902@dougbarton.us>
Date: Sun, 13 Mar 2016 20:58:53 -0700
In-Reply-To: <56DC484F.7010607@cs.tcd.ie>
Archived-At: <http://mailarchive.ietf.org/arch/msg/ietf/XG0uBNsKpdCmM5tB5ME3m6_jfRE>

In general I support the idea of an *experimental* draft on this topic. 
The idea has been kicked around in casual and serious conversation 
in both the PGP and DNSSEC communities since the earliest days of the 
latter, and having a canonical (pun intended) way to experiment with 
this concept would be a great way to move forward.

However, I have some serious concerns about the suggested approach. I 
realize that I'm late to the party on this, so I apologize in advance to 
Paul. I've subscribed to the DANE WG ML so hopefully I can make a more 
useful contribution going forward. (FWIW, the last time I checked, the 
suggested approach was one I agreed with, so I diverted my attention to 
more personally critical pursuits.)

Fundamentally I think that using this RR to return the cert itself is 
the wrong approach. IMO it should return the full fingerprint of the key 
and let the sending user retrieve the key itself, with all corresponding 
material. This overcomes several problems I see with the proposal:

1. We know from vast experience with PGP that users are very lazy about 
updating keys, and everything associated with them.

2. The draft recommends that the version of the key that is published be 
stripped of any "extraneous" material such as unrelated UIDs, photos, 
various certifications, etc. Given the desire to keep the packet size as 
small as possible these recommendations are sensible, but I question 
whether the resulting certificate has sufficient utility. One could also 
argue that most users think of "my PGP key" as the entire package of the 
main key, along with the various associated UIDs, subkeys, etc.

3. Encryption subkeys are not tied to specific UIDs. Also, it's not at 
all clear to me how the mechanism described in this draft would interact 
with a "key" that has multiple encryption subkeys.

4. The draft suggests (primarily in Section 7.1) that this mechanism be 
utilized in a side channel (my term) separate from the sending user's 
existing keyring(s). I have deep concerns about this approach, as it 
increases the risk that a message could be sent encrypted to the wrong 
key. The biggest risk in this case is that of a false positive, where 
the sending user believes that the message was correctly encrypted. 
That's arguably still better than the alternative of sending the same 
message unencrypted. However it's not better than the alternative of 
informing the user that the MTA was not able to send the message because 
it couldn't find a key that it could rely on.

5. Super large OPENPGPKEY RRs will undoubtedly be used as DDOS 
amplification fodder. For example, using Paul's software to generate the 
RR for my primary key/UID the resulting packet would be over 12k.

All 5 of these issues are resolved by having the OPENPGPKEY RR specify 
the fingerprint of the key, and having the sending user's software deal 
with the full key instead of only a truncated version.
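
For a sense of scale: a version 4 OpenPGP fingerprint is 20 octets (the
Openpgp: header on this message carries one as 40 hex digits), versus
the 12k figure above for a full cert. The sketch below shows how such a
fingerprint is derived per RFC 4880 section 12.2 (SHA-1 over 0x99, a
two-octet length, and the public key packet body). The function name
and the dummy packet bytes are illustrative assumptions; they are not a
real key and not anything defined by the draft.

```python
import hashlib
import struct


def v4_fingerprint(pubkey_packet_body: bytes) -> str:
    """Fingerprint of an OpenPGP v4 public key packet body (RFC 4880, 12.2)."""
    if len(pubkey_packet_body) > 0xFFFF:
        raise ValueError("packet body too long for a two-octet length")
    # 0x99, then the body length as a big-endian 16-bit integer, then the body.
    prefix = b"\x99" + struct.pack(">H", len(pubkey_packet_body))
    return hashlib.sha1(prefix + pubkey_packet_body).hexdigest().upper()


# Placeholder bytes, NOT a real key: version octet, a creation time,
# an algorithm octet, and 16 octets of dummy key material.
dummy_body = bytes([4]) + struct.pack(">I", 1457927933) + bytes([1]) + b"\x00" * 16
fpr = v4_fingerprint(dummy_body)

# Fingerprint taken from the Openpgp: header of this message.
header_fpr = "E3520E149D053533C33A67DB5CC686F11A1ABC84"

print(len(bytes.fromhex(fpr)), "octets in a computed fingerprint")
print(len(bytes.fromhex(header_fpr)), "octets in the header fingerprint")
```

Either way the RDATA would be a fixed 20 octets for a v4 key, however
many UIDs, subkeys, or certifications the full key carries.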

1. To the extent that users are likely to update anything with their 
key, they are likely to send a current version to the key servers. 
Further, in the case of a key revocation (particularly for a compromised 
key), it may not be possible to remove or modify the OPENPGPKEY RR in a 
timely manner, or at all. However the receiving user should be able to 
upload a revocation cert for the key to the key servers, which would be 
picked up by the sending user's software.

2. Having the sending user's software retrieve the full key gives them 
the most up to date version, and matches better with what I believe the 
expectations of most users would be for this RR.

3. Retrieving the whole key puts the problem of multiple encryption 
subkeys into the hands of the sending user's software, where there are 
already well known solutions.

4. The way that this mechanism interacts with the user's existing key 
ring still needs more thought.

Regarding 4, I see several likely scenarios:

1. The user already has exactly 1 key that contains exactly 1 UID which 
matches the e-mail address:
	- The OPENPGPKEY result matches: The user is notified, the user's 
software MAY make note of this match and suitably increase the "validity 
score" for the key (if the software has such a concept)
	- The OPENPGPKEY result does not match: The user is notified, MTAs/MUAs 
SHOULD halt processing at this point, and let the user manually rectify 
the error. User's software MAY make note of the mismatch and suitably 
decrease the "validity score" for the key (if the software has such a 
concept)

2. The user has more than one key, and/or more than one UID which 
matches the e-mail address:
	- There is an RR which corresponds to one of the known keys: See the 
first mechanism under 1 above.
	- Otherwise see the second mechanism under 1 above. Additionally, the 
software MAY ask the user if they would like to download the key that 
matches the RR for evaluation.

3. The user has no extant keys which match the e-mail address:
	- There is an RR which corresponds: The software MAY ask the user if 
they would like to download the key that matches the RR for evaluation.
	- There is not: The user is notified, and the software MAY suggest that 
the user search the key servers.
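
The scenarios above can be sketched as a small decision function. This
is an illustration only, under several assumptions: the RR yields
fingerprints (as proposed earlier in this message), "more than one key"
is modeled as more than one local fingerprint, and an empty DNS answer
for an address with a known key is treated like a mismatch, which the
scenarios do not spell out. All names are hypothetical.

```python
from enum import Enum


class Action(Enum):
    CONFIRM_MATCH = "notify user; MAY raise the key's validity score"
    HALT_MISMATCH = "notify user; SHOULD halt so the user can resolve it"
    OFFER_DOWNLOAD = "MAY offer to download the key named by the RR"
    SUGGEST_KEYSERVER = "notify user; MAY suggest searching the key servers"


def evaluate(local_fprs: set[str], rr_fprs: set[str]) -> list[Action]:
    """Map keyring state plus OPENPGPKEY results to the actions listed above.

    local_fprs: fingerprints of keyring keys with a UID matching the address.
    rr_fprs: fingerprints from DNSSEC-validated OPENPGPKEY answers (assumed).
    """
    local = {f.upper() for f in local_fprs}
    rr = {f.upper() for f in rr_fprs}

    if not local:
        # Scenario 3: no extant keys match the e-mail address.
        return [Action.OFFER_DOWNLOAD] if rr else [Action.SUGGEST_KEYSERVER]

    if local & rr:
        # Scenarios 1 and 2, first branch: an RR corresponds to a known key.
        return [Action.CONFIRM_MATCH]

    # Scenarios 1 and 2, second branch: nothing in DNS matches the keyring.
    actions = [Action.HALT_MISMATCH]
    if len(local) > 1 and rr:
        # Scenario 2 only: additionally offer the key that matches the RR.
        actions.append(Action.OFFER_DOWNLOAD)
    return actions


for act in evaluate({"AAAA"}, {"BBBB"}):
    print(act.name, "->", act.value)
```

Note that no branch proceeds to encrypt without telling the user, which
is the point of the scenarios.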

The draft currently suggests a mechanism for using the RR data in an 
unattended manner. I believe that this is a mistake for an experimental 
draft, and should be replaced with text which makes it clear that at 
this stage of the experiment the software SHOULD NOT proceed without 
user confirmation.

Some other thoughts:

The draft needs to be even more clear about its intentions. Specifically 
it needs to spell out how the mechanism should be used. I see two 
scenarios, but that doesn't mean there are only two:

1. This mechanism can be used as a data point in the WOT
2. This mechanism can be used for opportunistic encryption

Personally I don't see anything wrong with #2, but the sending user has 
to be involved in the decision-making process. Retrieving the full key 
will make that user's decision making process easier.

I would like to see more explanation of the mechanism for representing 
the local part. I think that the explanation of "Why do we need to hash 
it?" is good; however, it's not clear to me why SHA-256 was chosen, or 
why the hash value is truncated the way that it is. Obviously a full 64 
character hex encoding of a SHA-256 hash would overflow the 63 byte 
limit for a single DNS label. But why truncate the value specifically to 
28 octets? If the 63 byte limit is the concern, why not use SHA-224?
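
The label-length arithmetic behind that question can be checked
directly. A minimal sketch, assuming the local part is hashed as its
UTF-8 bytes with no other normalization (the draft's exact
canonicalization rules are not restated here); the variable names are
illustrative.

```python
import hashlib

local_part = "dougb"  # example local part, taken from the From: address
data = local_part.encode("utf-8")

full_sha256 = hashlib.sha256(data).hexdigest()        # 32 octets as hex
truncated = hashlib.sha256(data).digest()[:28].hex()  # first 28 octets as hex
sha224 = hashlib.sha224(data).hexdigest()             # 28 octets as hex

MAX_LABEL = 63  # maximum octets in a single DNS label

for name, label in [
    ("SHA-256, full", full_sha256),
    ("SHA-256, first 28 octets", truncated),
    ("SHA-224", sha224),
]:
    fits = len(label) <= MAX_LABEL
    print(f"{name}: {len(label)} hex chars, fits in one label: {fits}")
```

Truncated SHA-256 and SHA-224 both give 56 hex characters, so either
fits in one label; they produce different values for the same input,
but the resulting label is the same size.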

I'd also like to suggest an alternative mechanism for this problem. In 
cases where the entire local part of the address consists exclusively of 
1-63 Letter-Digit-Hyphen (LDH) characters, allow that representation to 
be used directly. That would greatly simplify provisioning in the common 
case, and also go part of the way towards addressing John Levine's 
concern about the poor DNS admin who needs to go in and clean up those 
records later. (That said, the records themselves refer to a key which 
will answer said admin's question, so while it's a bit of extra work the 
hash doesn't make answering the question impossible.)

This would also require a bit of text warning mail/DNS admins to mandate 
hashes in the event that local delivery is case sensitive, but I really 
think that this is such a corner case (again, pun intended) as to be 
nearly a non-issue.

In cases where the local part contains a non-LDH character, the hashing 
mechanism is mandatory. Obviously the system should fall back to trying 
the hash version of the RR for a "simple" local part as described above.
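
The lookup order suggested above can be sketched as follows. This is an
illustration under stated assumptions, not text from the draft: the
function names are hypothetical, the hash is SHA-256 truncated to 28
octets over the UTF-8 local part, "LDH" is taken literally as 1-63
letters, digits, and hyphens, and the intermediate label is a parameter
so that it can be left empty. example.com is a placeholder domain.

```python
import hashlib
import re

# 1-63 Letter-Digit-Hyphen characters, taken literally from the text above.
LDH_ONLY = re.compile(r"[A-Za-z0-9-]{1,63}")


def hashed_label(local_part: str) -> str:
    """SHA-256 of the local part, truncated to 28 octets, as 56 hex chars."""
    return hashlib.sha256(local_part.encode("utf-8")).digest()[:28].hex()


def candidate_owner_names(address: str, infix: str = "_openpgpkey") -> list[str]:
    """Owner names to query, in order: the direct form (if allowed), then the hash."""
    local_part, sep, domain = address.rpartition("@")
    if not sep or not local_part or not domain:
        raise ValueError(f"not an e-mail address: {address!r}")

    def owner(label: str) -> str:
        # Skip the intermediate label when it is empty.
        return ".".join(part for part in (label, infix, domain) if part)

    names = []
    if LDH_ONLY.fullmatch(local_part):
        names.append(owner(local_part))          # simple local part, used directly
    names.append(owner(hashed_label(local_part)))  # hashed form, always tried
    return names


print(candidate_owner_names("dougb@dougbarton.us"))
print(candidate_owner_names("joe+ietf@example.com"))
```

The sketch does no case folding; a site with case-sensitive local
delivery would, per the caveat above, publish only the hashed form.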

Given that the draft specifies a new RR with its own ID, there is no 
need for the _openpgpkey intermediate label. A query such as 'dig 
doug.barton.us openpgpkey' would do the right thing (once all the 
software catches up to the eventual RFC of course).

I would avoid using 'LHS' to refer to the local part of the e-mail 
address in order to avoid confusion with the way that acronym is 
commonly used in reference to DNS RRs.

The discussion about local part wildcards in 5.3 confuses me. How would 
the user know how to hash the wildcard to look up the RR, given a real 
e-mail address as the starting point?

The details on how the sending user's software should use the returned 
data should be moved out of Section 7 (Security Considerations) and into 
Section 5. Also, see above. :)

If we stick with sending the whole cert back in the RR the Security 
Considerations section should list use of such answers in DDOS 
amplification attacks as a risk.

Regarding 7.5, given that the hash is truncated, it's hard for me to see 
how rainbow tables would be effective for converting the results of zone 
walking into a list of valid e-mail addresses for the domain.

Minor nit, it's "GnuPG" :)

I hope that these suggestions are useful, and again, I apologize for not 
offering them sooner. If you are in agreement with them, and would like 
help with text, I'd be happy to do what I can.

best,

Doug

Some more comments below.

On 03/06/2016 07:10 AM, Stephen Farrell wrote:
>
> Hi all,
>
> The 2nd IETF last call for this started on Feb 8th.
> Thanks again for the lively discussion.
>
> The tl;dr version is: once a revision addresses the
> substantive issues raised as per below, taking into
> account reactions to this summary, and we have a chance to
> take a quick look at -08 (I'll extend the LC to the date
> of the -08 version plus one week), then if no new
> substantive issues arise, I think we have rough consensus
> to go forward with this experiment (and others to come)
> and let the IESG review the document.
>
> Cheers,
> S.
>
> The substantive issues that arose in the 2nd last call
> were:
>
> 1. The context of the experiment
> 2. Changes to the trust model
> 3. The local-part (lhs) issue and i18n
>
> For each, I'll say where I conclude we've ended up and
> recommend what to do for -08.
>
> 1. The context of the experiment
> --------------------------------
>
> I think part of the reason this one has been hard has been
> a perception that we're developing the one true way to
> do key retrieval for e2e email security.  The
> resolution here is to include text like that below in all
> similar experiments.
>
> "This specification is one experiment in improving access
> to public keys for end-to-end email security. There are a
> range of ways in which this can reasonably be done, for
> OpenPGP or S/MIME, for example using the DNS, or SMTP, or
> HTTP.  Proposals for each of these have been made with
> various levels of support in terms of implementation and
> deployment.  For each such experiment, specifications such
> as this will enable experiments to be carried out that may
> succeed or that may uncover technical or other impediments
> to large- or small-scale deployments. The IETF encourages
> those implementing and deploying such experiments to
> publicly document their experiences so that future
> specifications in this space can benefit."

This is a good change, and it was adopted in the -08 version.

> 2. Changes to trust model
> -------------------------
>
> John Levine noted a place where -07 seems to be saying a
> bit too much:
>
> " In sections 1 and 7, it claims that finding a key
> through DNS lookup is not a substitute for web-of-trust
> verification, which is fine.  But section 5.2 says that if
> a domain publishes a key for an address that's
> inconsistent with an existing key, verification of the key
> is "treated as a failure."  It's unclear what the effect
> is supposed to be, but considering the discussion of the
> lost key problem, it appears that the intent is that the
> sender would stop using the old key. "
>
> I think the text in 5.2 is a little ambiguous so suggest
> the change below, which clarifies that the failure is
> in the confirmation step and that the resulting action
> is dependent on local policy and is not being determined
> by taking part in the experiment.
>
> OLD:
>
> Locally stored OpenPGP public keys are not automatically
> refreshed.  If the owner of that key creates a new OpenPGP
> public key, that owner is unable to securely notify all
> users and applications that have its old OpenPGP public
> key.  Applications and users can perform an OPENPGPKEY
> lookup to confirm the locally stored OpenPGP public key is
> still the correct key to use.  If the locally stored
> OpenPGP public key is different from the DNSSEC validated
> OpenPGP public key currently published in DNS, the
> verification MUST be treated as a failure unless the
> locally stored OpenPGP key signed the newly published
> OpenPGP public key found in DNS.  An application that can
> interact with the user MAY ask the user for guidance.  For
> privacy reasons, an application MUST NOT attempt to lookup
> an OpenPGP key from DNSSEC at every use of that key.
>
> NEW:
>
> Locally stored OpenPGP public keys are not automatically
> refreshed.  If the owner of that key creates a new OpenPGP
> public key, that owner is unable to securely notify all
> users and applications that have its old OpenPGP public
> key.  Applications and users can perform an OPENPGPKEY
> lookup to confirm the locally stored OpenPGP public key is
> still the correct key to use.  If the locally stored
> OpenPGP public key is different from the DNSSEC validated
> OpenPGP public key currently published in DNS, the
> confirmation MUST be treated as a failure unless the
> locally stored OpenPGP key signed the newly published
> OpenPGP public key found in DNS.  An application that can
> interact with the user MAY ask the user for guidance,
> otherwise the application will have to apply local policy.
> For privacy reasons, an application MUST NOT attempt to
> lookup an OpenPGP key from DNSSEC at every use of that
> key.

This text is too confusing. Also, the issue of how to associate any keys 
that already exist on the sending user's keyring for the same e-mail 
address with any information retrieved from an OPENPGPKEY query needs 
more thought. (See above)

> 3. The local-part (lhs) issue and i18n
> ---------------------------------------
>
> This has always been and will always be an issue for any
> solution in this space. Absent changes to the mail
> architecture, or major changes to email protocols and
> deployment, it will always be an issue. And it's quite
> related to the  "joe+ietf" kind of lhs too, which'd need
> to be considered for a general solution to the problem.

I disagree that a solution for representing the local part of an e-mail 
address in DNS *for the purpose of looking up PGP keys* needs to take 
every possibly valid local part into account. The reason is that, in 
order to be useful for encrypting e-mail, the PGP key has to specify a 
UID that is an *exact* match for the receiving e-mail address. Thus it's 
trivial for the receiving user to create the precise number of canonical 
records and CNAMEs that are relevant for their key.

I think part of the communication problem here is that the mail folks 
have gotten caught up in trying to boil the ocean, and rightly conclude 
that Paul's hashing mechanism does not account for every possible 
combination. But that argument is specious *in this context*.

> I
> conclude that the right thing here is to do experiments,

Hooray :)

> but for those to not try to solve the general issue,

... and again! :)

> and
> instead to define the limits within which the experiment
> is to be done.

Agreed.

[big snip of more stuff I agree with]

Doug
