Re: [openpgp] Deprecating SHA1

Jon Callas <joncallas@icloud.com> Sun, 25 October 2020 01:01 UTC

From: Jon Callas <joncallas@icloud.com>
In-Reply-To: <20201024165354.GD860779@camp.crustytoothpaste.net>
Date: Sat, 24 Oct 2020 18:01:09 -0700
Message-Id: <B62148B4-41B6-49CC-ABA5-78852D76E51C@icloud.com>
References: <87sga5xg03.wl-neal@walfield.org> <20201024165354.GD860779@camp.crustytoothpaste.net>
To: "openpgp@ietf.org" <openpgp@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/openpgp/f3SRZgfFKXSY65-8SyUwmfj6zDI>
Subject: Re: [openpgp] Deprecating SHA1

I'm reading this with a good deal of exasperation. I apologize in advance for my tone, and I am not at all sorry about the content.

In 2004, the mathematician Wang Xiaoyun published a thing that possibly doesn't need mentioning. She broke, where "broke" means found collisions in, a mess of hash functions including MD5, SHA0, and a lot more. Her original publication is one of the most badass things I've ever seen in cryptography: she just published the collisions. H(X) == H(Y) for the suitable X and Y in enough number to show it's not a fluke (two is a good number for this) and across appropriate hash functions. This sent everyone into a tizzy. Did she know some underlying mathematical secret that we didn't?

Weeks and weeks went by, on to CRYPTO in Santa Barbara, and there was a lot of stress, speculating, and hand-wringing going on. At CRYPTO, Wang gave a talk on the work[1], and the rest of us found out that she didn't have any new mathematical insight; she was merely the best cryptanalyst on the planet, someone who had seen order in trails of flipping bits that no one else on the planet had seen. At dinner that evening, I was talking to Ron Rivest and he said, "I used to think that hash functions were the cryptographic primitive that we understand best, now I know that they're the primitive we understand least."

Between those times, coming out of a session where we all got to talk to Wang in person, I walked out into the UCSB quad, got out my phone and called my friend and colleague, Will Price, who was VPE at PGP; I was CTO. We talked as I explained what had gone on in the session. I remember hearing the clack of Will typing as we talked. Twenty minutes later, he had a download link for me of a version of PGP that didn't use MD5 at all, used SHA256 by default (much of our conversation was whether it should be that or SHA512) and didn't outright prevent using SHA1 (which wasn't broken yet, but the betting on when was pretty furious), but it was pretty much stuck in the "advanced" preference UI in a disused lavatory with a sign on the door reading "beware of the leopard." That software went through our normal release QA, and shipped right after that, total time less than a week.

I know that hindsight is 2020, but why is this being discussed?

I'm sure that some readers of that paragraph saw through the little magic trick I did. I implied it took twenty minutes to make changes. Those same readers will realize that in fact we'd been discussing what to do and had a number of contingency plans based on what we found out about what Wang had discovered. That phone call was us deciding what to write in icing on the cake that had been baked weeks before. We're not geniuses, we just thought ahead and made some bold contingency plans. You can think ahead. You can be bold. Come on, get yourself an outfit and be a cowboy, too[2].

I get the impression that most people here think that the map is the territory. That RFC4880 and anything else is the definition of what one does. Here are a number of options possible.

* You could stop creating signatures that use SHA1. Just stop. Ditto for any other compromised aspect of SHA1.

* You could resolve a signature with SHA-1 in a creative way. For example, signatures have a number of states. There's the obvious case of a signature not computing correctly, say if the message has been damaged. That one's easy. There's the case of a correct signature from a trusted key. That's also easy. There are other cases that you have to take care of that are in the middle. You have to deal with a correct signature from an untrusted key. You have to deal with a signature made by a key that you don't have and so you can't tell if it's correct or not. There are other edge cases, too.

Thus, you could consider a SHA1 signature to be incorrect and let the user know. You could consider it to be like a signature from a key you don't have. You could let the user know that it's mathematically correct but untrusted. (We did things like this in PGP; yes, this can be complex.)

You could consider a self-signature done with SHA1 to be non-existent, and handle it appropriately. You could even take special knowledge you have and do some reasonable thing. For example, let's suppose you know that this self-signature was created before the SHAttered attack, and so you'll let it slide by.

* You could also do helpful things for the user -- for example, when you have the key unlocked (by which I mean that you have the user's passphrase and thus the private key in your hot little RAM), you could go rewrite their self-signature with some other hash function and give them a new one just like the old one. (We did that in PGP, too.)

* You can do other helpful things for users like writing preferences in their self-signature that you, as implementor, think are helpful. You could do something like change key expiry automatically, as well. Just as text editors help users with things like autocorrect, you can help users in proper crypto management.

* You could compute primitives in any way that a naïve partner will handle correctly, as well. All the way back to when PGP started doing DSA (because in those days, there were patent issues with RSA), we were concerned with the issue with DSA that losing a single random number exposes the key. So when we computed the DSA random value, we took the raw nonce and then ran it through a keyed hash with the DSA private key. If you use N' = H(K+N) in DSA, you protect the nonce; learning the nonce requires either breaking the hash function or knowing the private key. (These days, you'd likely HMAC the nonce with the private key as that's today's idiom. In those days, there was no HMAC.) It's been that way for like ever, and no one ever noticed. We even discussed deterministic constructions using the private key with NIST and they said, no, we couldn't get that approved, but they liked the keyed hash.

* You can do plenty of other things. The standard is not a straitjacket. Implementors do not have to be fundamentalists nor textualists about it, either.
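
To make the signature-state idea from the list above concrete, here is a minimal sketch in Python. It is not how PGP or any other implementation does it. The Verdict names, the Signature fields, and the cutoff date are all hypothetical, chosen for illustration; the cutoff is the public announcement date of the SHAttered collision (23 February 2017), and an implementor might reasonably pick a different one. The creation time is whatever the signature itself claims.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum
from typing import Optional


class Verdict(Enum):
    BAD = "signature does not verify"
    GOOD_TRUSTED = "correct signature, trusted key"
    GOOD_UNTRUSTED = "correct signature, untrusted key"
    UNKNOWN_KEY = "cannot check, key not available"
    WEAK_HASH = "mathematically correct, but made with a weak hash"


# Assumption: policy knobs picked by the implementor, not by any standard.
WEAK_HASHES = {"MD5", "SHA1"}
SHA1_CUTOFF = datetime(2017, 2, 23, tzinfo=timezone.utc)


@dataclass
class Signature:
    hash_algo: str            # e.g. "SHA1", "SHA-1", "SHA256"
    created: datetime         # creation time as claimed in the signature
    math_ok: Optional[bool]   # None means we do not have the signing key
    key_trusted: bool
    is_self_sig: bool = False


def classify(sig: Signature) -> Verdict:
    """Map one signature onto the states an implementation reports."""
    if sig.math_ok is None:
        return Verdict.UNKNOWN_KEY
    if not sig.math_ok:
        return Verdict.BAD
    algo = sig.hash_algo.upper().replace("-", "")
    if algo in WEAK_HASHES:
        # Let old SHA1 self-signatures slide by; flag everything else.
        grandfathered = (
            algo == "SHA1" and sig.is_self_sig and sig.created < SHA1_CUTOFF
        )
        if not grandfathered:
            return Verdict.WEAK_HASH
    return Verdict.GOOD_TRUSTED if sig.key_trusted else Verdict.GOOD_UNTRUSTED
```

A caller could just as well map WEAK_HASH onto BAD or UNKNOWN_KEY before showing anything to the user; which of those to pick is a policy decision for the implementor.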

Particularly in the case where the OpenPGP support is built into some context, like an email client, there's nothing wrong with building something that only does a few things.
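
The keyed-hash treatment of the DSA random value mentioned in the list above, N' = H(K+N), can be sketched as follows, in the HMAC form the text calls today's idiom. This is an illustration under stated assumptions, not the original PGP code: the function name hardened_nonce, the choice of SHA-512, the 64-byte raw nonce, and the reduction into the range [1, q-1] are all assumptions made here. The raw_nonce parameter exists only so the behavior can be tested; normal callers would leave it out.

```python
import hashlib
import hmac
import secrets
from typing import Optional


def hardened_nonce(private_key: bytes, q: int,
                   raw_nonce: Optional[bytes] = None) -> int:
    """Derive the per-signature DSA value from a raw random nonce.

    The raw nonce N is passed through HMAC keyed with the private key K,
    so recovering the value actually used means either breaking the hash
    or already knowing K.
    """
    if raw_nonce is None:
        raw_nonce = secrets.token_bytes(64)  # N: fresh randomness per signature
    digest = hmac.new(private_key, raw_nonce, hashlib.sha512).digest()
    # Assumption: a 512-bit digest reduced into [1, q-1]; for a q of 256
    # bits or fewer the resulting bias is negligible.
    return int.from_bytes(digest, "big") % (q - 1) + 1
```

The verifier never sees any of this, which is why a naïve partner handles the resulting signatures correctly: the output is an ordinary DSA signature.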

I'm flabbergasted because it seems to me that people are waiting for the working group to give permission to use (or not use) features of the OpenPGP standard. On the other hand, it seems that there's this expectation that if the standard says (e.g.) "MUST NOT use SHA1" that that will somehow magically make all the software get updated. 

Yeah, sure, there's nothing wrong with the standard noting that SHA1 is broken for collisions, act accordingly. That's neither necessary nor sufficient. Implementors can all decide that they're not going to use SHA1, and that makes it *really* happen. I'm grouchy about guidance in standards because implementation guidance for cryptography changes over time and often has a lot of nuance in it. There's guidance in 4880 that is just flat wrong. It's there despite being wrong because it was the consensus of the working group and the area directors that these wrong things are there. The one I'm thinking of is also not bad advice, but it's still wrong. Sorry for getting carried away there. My point is that the standard is not gospel and neither is it a bottleneck. It is a description of how the bits are laid out for the purpose of interoperability. 

Summing up, yes, stop using SHA1! How many bits must a cryptographer write down before this is an issue? PGP, a reference implementation for OpenPGP, stopped in 2004. NIST said to stop using it by the end of 2010. Hindsight is 2020. Putting some appropriate text in OpenPGP about it is a fine thing. You don't have to wait for that. Be bold!

Thanks for reading.

	Jon


[1]: The Wikipedia article on Wang is dreadful. In it, it says, "At the rump session of CRYPTO 2004, she and co-authors demonstrated..." This statement is 100% true and yet paints a picture that is totally false. It implies that the penny was dropped in the rump session when that was the closest thing to an official coming-out. It had been discussed in mailing lists, chat rooms, and the like before.

[2]: https://www.youtube.com/watch?v=dCeelWFO56Y
