[Hash] BOF Goals

In my involvement with IETF groups, one thing that has always struck me  
as a good thing is its decisions to stick to well-defined, practical  
matters. Furthermore, these have been more in layout than anything  
else. We don't like APIs, we like message transactions. Even when we  
venture into advice, it's always actionable, meaning that there are  
specific things that a practitioner can do. These preferences are  
positively cliché; none of us needs to read the words "rough consensus
and working code" -- we've already been humming that tune in our heads.

It's therefore a bit unusual for me to ask one of my own cliché
questions, "What problem are we trying to solve?", and have it be both
genuine and backed by my own puzzlement. I don't know what we are
doing, or what we expect, plan, or even hope to do.

Yes, cryptographic engineering is in the uncomfortable situation of
staring at embarrassing surprises in our present suite of hash
functions. But the present-day workarounds are clear; we
have hash functions that are good enough for the short and medium-term  
future. We also know that some uses of the present hash functions work  
just fine, thank you. (HMAC-MD5 springs to mind.)

As it has been stated, there are two problems we're looking at:

(1) truncating existing wide hashes for use in systems like DSA.

(2) exploring "randomized" hashes.

The first one is pretty easy to deal with, in the general case. We  
already addressed this in OpenPGP.

FIPS-180, which describes all of the SHA-family hashes [1], answers
this question on page 73:

     "Some applications may require a hash function with an output size
     (i.e., message digest size) different than those provided by the
     hash functions in this Standard. In such cases, a truncated hash
     output may be used, whereby a hash function with a larger output
     size is applied to the data to be hashed, and the resulting output
     (i.e., message digest) is truncated by selecting an appropriate
     number of the leftmost bits. For example, if an output of 96 bits is
     desired, the SHA256 hash function could be used (e.g., because it is
     available to the application), and the leftmost 96 bits of the
     output are selected as the message digest, discarding the rightmost
     160 bits of the SHA-256 output."
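
The rule quoted above is mechanical enough to sketch in a few lines of
Python. This is an illustration only: the helper name truncated_sha256
is made up for this example (it is not from FIPS 180 or OpenPGP), and
it assumes the requested length is a whole number of bytes.

```python
import hashlib

def truncated_sha256(data: bytes, bits: int) -> bytes:
    """Return the leftmost `bits` bits of SHA-256(data).

    Hypothetical helper; only whole-byte lengths are handled here.
    """
    if bits % 8 != 0 or not 0 < bits <= 256:
        raise ValueError("bits must be a multiple of 8 between 8 and 256")
    # Leftmost bits of the digest are its leading bytes.
    return hashlib.sha256(data).digest()[: bits // 8]

digest96 = truncated_sha256(b"abc", 96)    # 12 bytes, as in the FIPS example
digest160 = truncated_sha256(b"abc", 160)  # 20 bytes, same width as SHA-1
```

The 160-bit case is the one that matters for DSA, since it is the same
width as a SHA-1 digest.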

Now, I should add that we decided not to do this in PGP's products, nor  
in OpenPGP. At PGP Corporation, we coded such a truncation into our DSA  
implementation for early releases of PGP 9. However, when I mentioned
this to a senior NIST person, I was asked not to. We accidentally let
the truncated-SHA-256 version of DSA escape in one beta test, but it is
not in there now.

The reason we decided to stick with DSA as-is is that DSA does not  
include hash information in the signature itself. Consequently, there's  
a chance of cross-hash collisions in addition to the normal chance of  
hash collisions. I think that the risk of this is low when the hash  
functions are strong, and higher if the hash functions are weak. Thus,  
truncating SHA-256 conceivably makes the overall system *less* secure  
by opening up the possibility of cross-hash collisions.
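
A toy sketch of what a cross-hash collision looks like, assuming a
verifier that would accept either SHA-1 or truncated SHA-256 as the
input to the signature. The digests are cut to 16 bits purely so the
search finishes instantly; the function names and message strings are
invented for the illustration, and nothing here says anything about the
cost of the same search against real 160-bit values.

```python
import hashlib

TOY_BITS = 16  # toy digest width, chosen only so the search is instant

def toy_sha1(msg: bytes) -> bytes:
    return hashlib.sha1(msg).digest()[: TOY_BITS // 8]

def toy_sha256(msg: bytes) -> bytes:
    return hashlib.sha256(msg).digest()[: TOY_BITS // 8]

def find_cross_hash_collision(table_size: int = 4096,
                              max_tries: int = 1_000_000):
    """Find (m1, m2) with toy_sha1(m1) == toy_sha256(m2)."""
    # Precompute toy SHA-1 digests for one set of messages.
    table = {}
    for i in range(table_size):
        m1 = b"sha1-side-%d" % i
        table.setdefault(toy_sha1(m1), m1)
    # Search the SHA-256 side for any digest already in the table.
    for j in range(max_tries):
        m2 = b"sha256-side-%d" % j
        d = toy_sha256(m2)
        if d in table:
            return table[d], m2
    raise RuntimeError("no cross-hash collision found within max_tries")

m1, m2 = find_cross_hash_collision()
```

Because the signature carries no hash identifier, a signature made over
the digest of m1 under one function would also verify for m2 under the
other. That is a second pool of collision targets on top of the
collisions within each function.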

NIST is supposed to release an addendum to DSA for wide hashes and keys  
bigger than 1024 bits. The *correct* thing for us to do is wait for  
that. That they haven't done so yet is disappointing. I was asking for  
it pretty-please with sugar on top in August '04 at Crypto, and we're  
still waiting. But -- they have their reasons, whatever they are. If  
someone finds the wait unacceptable, then there's an obvious workaround  
-- use RSA with SHA-256.

I don't think we need to do anything with the truncation issue. The
question of how to truncate the wide SHAs is answered for us in FIPS
180. The answer to what to do with DSA is "Don't." We can wait for
NIST. If we don't, we're going to have to do what they say anyway. Why
make more work?

The second issue, using salted hashes (I prefer this to "randomized"),  
is more interesting technologically, but there are still a number of  
extremely important issues open about them.

The first open issue is their expected security when compared both to
the unsalted version and to other comparable functions. For example, we
know that SHA-1 should have 80 bits of security and doesn't. Will a
salted SHA-1 have 80 bits of security? What makes us think so? SHA-256
should have 128 bits of security, and many of us (I know I am one)  
expect it to have flaws. However, do we expect it to have less than 80  
bits of security? I don't.

It is a fair criticism of our history of designing hash functions that
we've tweaked them rather than redesigned them. But salting the hash is
also a tweak. If SHA-256 is so broken that on its own it has less than
80 bits of security, what could possibly make us think that a tweak like
salting will actually increase security? I wrote a long note to the
CFRG list on this theme, which I will be happy to repost here. I  
believe that whatever flaws SHA-256 might have, it has enough security  
to last us for five to fifteen years at 80 to 100 bits of security. My  
rationale for my opinion is in that CFRG email.

The second issue is the performance of the salted hash functions. In  
discussions in the CFRG list, Dan Bernstein has suggested that salted  
MD5 has similar performance to unsalted SHA-256. If this is so, then  
whyever would you use salted MD5? Surely no one suggests that SHA-256  
has fewer than 64 bits of security, do they? If salted MD5 and SHA-256  
are similar in performance, then the *only* reason to consider using  
salted MD5 is that whatever protocol you're using can support a 128-bit  
hash, but not a 256-bit hash. And even in that case, the *real*  
question we should be asking is whether truncated SHA-256 has 64 bits  
of security!
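
This message does not pin down a salting construction, so the sketch
below assumes one: prepend a block-sized salt and XOR it into every
message block before hashing, in the spirit of the "randomized" hashing
proposals. The names salted_hash and BLOCK are invented for the
example, and this is not claimed to be the scheme anyone on CFRG
measured.

```python
import hashlib
import os

BLOCK = 64  # block size in bytes of MD5, SHA-1 and SHA-256

def salted_hash(name: str, salt: bytes, msg: bytes) -> bytes:
    """Assumed scheme: H(salt || (m_1 XOR salt) || (m_2 XOR salt) || ...)."""
    if len(salt) != BLOCK:
        raise ValueError("salt must be exactly one block long")
    n = len(msg)
    # Repeat the salt to cover the message, then XOR in one step.
    pad = (salt * (n // BLOCK + 1))[:n]
    mixed = (int.from_bytes(msg, "big")
             ^ int.from_bytes(pad, "big")).to_bytes(n, "big")
    return hashlib.new(name, salt + mixed).digest()

salt = os.urandom(BLOCK)
msg = b"x" * 1000
d_md5 = salted_hash("md5", salt, msg)    # still only 128 bits wide
d_sha256 = hashlib.sha256(msg).digest()  # 256 bits, can be truncated
```

Whatever the construction, the salted MD5 output is still 128 bits
wide, which is why the comparison that matters is against a truncated
SHA-256. Timing this in Python would mostly measure interpreter
overhead, so it is no substitute for measurements of optimized
implementations.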

It seems to me that if there are questions that a hashing working group  
should be asking, they are not the ones we're asking, but other ones  
that include:

* What is the security of a salted hash function? Does salted MD5 have  
64 bits of security? Does salted SHA-1 have 80 bits of security? Does  
salted SHA-256 have 128 bits of security? Why do we think this? I've
seen neither reasoning nor metrics.

* What is the security of a truncated hash function? Examples: Suppose  
SHA-256 has 110 bits of security (if it is as broken as SHA-1, this is  
my predicted security value). If you truncate SHA-256 to 128 bits, does  
that mean that it has 55 bits of security? Or does it mean that it has  
64 bits of security? Or something in the middle? Why do we think this?  
What do we expect its security to be if you truncate it to 160 bits and  
why?

* What is the performance of a salted hash function? Bernstein's claim  
that salted MD5 has similar performance to SHA-256 has gone  
unchallenged, and he himself has noted this silence.

* Are there other constructions we can perform that give us security  
over the naked hash function, but require less work than salting? For  
example, suppose we took SHA-256 and folded its halves, XORing them  
together. Does this give us more security than straight truncation? If  
not, why not (since it seems intuitive that folding would have at least  
an eensy bit more security)? What other constructions could we make  
that improve over raw truncation without the cost of salting?
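
To make the truncation and folding questions above concrete, here is a
minimal sketch of the two 128-bit constructions, together with the two
candidate estimates behind the 55 and 64 figures. All four function
names are invented for the example. The "proportional" estimate simply
scales the assumed security by the fraction of bits kept, and the
"birthday" estimate is half the output width; neither is offered as the
right answer.

```python
import hashlib

def truncate_128(data: bytes) -> bytes:
    """Leftmost 128 bits of SHA-256(data)."""
    return hashlib.sha256(data).digest()[:16]

def fold_128(data: bytes) -> bytes:
    """XOR the left and right halves of SHA-256(data) together."""
    d = hashlib.sha256(data).digest()
    return bytes(a ^ b for a, b in zip(d[:16], d[16:]))

def proportional_estimate(full_security: float, full_bits: int,
                          kept_bits: int) -> float:
    """Scale the assumed security by the fraction of output kept."""
    return full_security * kept_bits / full_bits

def birthday_estimate(kept_bits: int) -> float:
    """Generic collision bound for an ideal hash of this width."""
    return kept_bits / 2

# SHA-256 assumed (hypothetically) to have 110 bits of security:
low_128 = proportional_estimate(110, 256, 128)   # 55.0
high_128 = birthday_estimate(128)                # 64.0
low_160 = proportional_estimate(110, 256, 160)   # 68.75
high_160 = birthday_estimate(160)                # 80.0
```

Both constructions cost one SHA-256 computation; folding adds only a
16-byte XOR, so any security difference between them would come
essentially for free.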

Without answers to these questions, I don't see how this working group  
can give any meaningful results. Worse, I don't see how any  
recommendation of this group can improve over the simple advice of "use  
SHA-256." If this BOF results in a working group, I believe it *must*  
address the questions I outlined above, as well as the issues Bernstein
and I have brought up in CFRG.

If salted hashes are slower than wider hashes and the wider hash has  
enough security (enough being greater than 64 bits for an MD5  
replacement and greater than 80 bits for a SHA-1 replacement), then we  
should just use the wider hash and be done with it -- except when  
that's not reasonable.

It is only when the actual size of the hash function is an issue and  
you can't just switch to RSA that the simple advice isn't worth taking.  
I can think of one such instance, and it's a draft that I work on --  
syslog-sign. In syslog-sign, space is at a premium, and it uses DSA for  
this reason. (A DSA signature is twice the size of the hash function  
rather than proportional to the size of the key, and syslog packets are  
teeny.) Furthermore, the way that syslog-sign is used, it is far
better to have a 160-bit hash with 80 bits of security than it is to  
have anything else; space is at that much of a premium. In *this* case,  
I'd be happy to seemingly contradict my previous grouchiness and say,  
"Aw, just truncate SHA-256 to 160 bits and use that in DSA." There may  
be other cases where the use model of the protocol might warrant that.
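
The size argument is simple arithmetic, sketched here with made-up
helper names. The figures are raw signature sizes only; any encoding
overhead a protocol adds is ignored.

```python
def dsa_signature_bytes(hash_bits: int = 160) -> int:
    """Raw DSA signature (r, s): two integers, each as wide as the hash."""
    return 2 * hash_bits // 8

def rsa_signature_bytes(modulus_bits: int) -> int:
    """Raw RSA signature: one integer as wide as the modulus."""
    return modulus_bits // 8

dsa_size = dsa_signature_bytes(160)       # 40 bytes
rsa_1024_size = rsa_signature_bytes(1024) # 128 bytes
rsa_2048_size = rsa_signature_bytes(2048) # 256 bytes
```

With a 160-bit hash the DSA signature stays at 40 bytes however large
the key is, which is the property that matters when packets are this
small.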

But in the vast majority of cases, the simple advice:

(1) If you're using DSA, grin and bear it until NIST comes out with  
wide DSA
(2) If you're using RSA, switch to using SHA-256
(3) If you're using DSA and don't like (1), switch to (2)

is perfectly fine advice.

	Jon

[1] <http://csrc.nist.gov/publications/fips/fips180-2withchangenotice.pdf>

-- 
Jon Callas
CTO, CSO
PGP Corporation         Tel: +1 (650) 319-9016
3460 West Bayshore      Fax: +1 (650) 319-9001
Palo Alto, CA 94303     PGP: ed15 5bdf cd41 adfc 00f3
USA                          28b6 52bf 5a46 bc98 e63d


_______________________________________________
Hash mailing list
Hash@lists.ietf.org
https://www1.ietf.org/mailman/listinfo/hash