Re: [Cfrg] Fwd: KDF==MAC? and: how about HKDF-Poly1305? Re: Answers to HKDF questions

"Blumenthal, Uri - 0662 - MITLL" <uri@ll.mit.edu> Wed, 09 December 2009 01:03 UTC

Return-Path: <uri@ll.mit.edu>
X-Original-To: cfrg@core3.amsl.com
Delivered-To: cfrg@core3.amsl.com
Received: from localhost (localhost [127.0.0.1]) by core3.amsl.com (Postfix) with ESMTP id 225B73A69C3 for <cfrg@core3.amsl.com>; Tue, 8 Dec 2009 17:03:19 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -7.301
X-Spam-Level:
X-Spam-Status: No, score=-7.301 tagged_above=-999 required=5 tests=[AWL=0.796, BAYES_00=-2.599, GB_I_LETTER=-2, HTML_MESSAGE=0.001, MIME_BAD_LINEBREAK=0.5, RCVD_IN_DNSWL_MED=-4, UNPARSEABLE_RELAY=0.001]
Received: from mail.ietf.org ([64.170.98.32]) by localhost (core3.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id L8CGct2NPkYC for <cfrg@core3.amsl.com>; Tue, 8 Dec 2009 17:03:16 -0800 (PST)
Received: from ll.mit.edu (LLMAIL1.LL.MIT.EDU [129.55.12.41]) by core3.amsl.com (Postfix) with ESMTP id C5D6F3A69B6 for <cfrg@irtf.org>; Tue, 8 Dec 2009 17:03:15 -0800 (PST)
Received: (from smtp@localhost) by ll.mit.edu (8.12.10/8.8.8) id nB9132PP025026 for <cfrg@irtf.org>; Tue, 8 Dec 2009 20:03:02 -0500 (EST)
Received: from lle2k7-hub02.llan.ll.mit.edu( ), claiming to be "LLE2K7-HUB02.mitll.ad.local" via SMTP by llpost, id smtpdAAA0Aa4CS; Tue Dec 8 19:56:56 2009
Received: from LLE2K7-BE01.mitll.ad.local ([ ]) by LLE2K7-HUB02.mitll.ad.local ([ ]) with mapi; Tue, 8 Dec 2009 19:56:56 -0500
From: "Blumenthal, Uri - 0662 - MITLL" <uri@ll.mit.edu>
To: "'cfrg@irtf.org'" <cfrg@irtf.org>
Date: Tue, 08 Dec 2009 19:56:55 -0500
Thread-Topic: [Cfrg] Fwd: KDF==MAC? and: how about HKDF-Poly1305? Re: Answers to HKDF questions
Thread-Index: Acp4P5EiGp2C5XCPT6SOYHUrDnjOJwAKu9Ph
Message-ID: <90E934FC4BBC1946B3C27E673B4DB0E4A7EE853F9C@LLE2K7-BE01.mitll.ad.local>
Accept-Language: en-US
Content-Language: en-US
X-MS-Has-Attach:
X-MS-TNEF-Correlator:
acceptlanguage: en-US
Content-Type: multipart/alternative; boundary="_000_90E934FC4BBC1946B3C27E673B4DB0E4A7EE853F9CLLE2K7BE01mit_"
MIME-Version: 1.0
Subject: Re: [Cfrg] Fwd: KDF==MAC? and: how about HKDF-Poly1305? Re: Answers to HKDF questions
X-BeenThere: cfrg@irtf.org
X-Mailman-Version: 2.1.9
Precedence: list
List-Id: Crypto Forum Research Group <cfrg.irtf.org>
List-Unsubscribe: <http://www.irtf.org/mailman/listinfo/cfrg>, <mailto:cfrg-request@irtf.org?subject=unsubscribe>
List-Archive: <http://www.irtf.org/mail-archive/web/cfrg>
List-Post: <mailto:cfrg@irtf.org>
List-Help: <mailto:cfrg-request@irtf.org?subject=help>
List-Subscribe: <http://www.irtf.org/mailman/listinfo/cfrg>, <mailto:cfrg-request@irtf.org?subject=subscribe>
X-List-Received-Date: Wed, 09 Dec 2009 01:03:19 -0000

Please correct me if I'm wrong, but it seems that *in general* KDF based on a PRP primitive (such as AES) is clearly better than the one based on a weaker one (such as unkeyed hash, even extended to keyed MAC - where I have my own misgivings that are besides the point).

I fully agree with Hugo's point about dangers of bringing a crypto function from one context and applying in another. In fact that's what's been happening with SHA (and HMAC) being taken from digest and MAC generation to PRF and KDF.

I support Zooko's position.

________________________________
From: cfrg-bounces@irtf.org <cfrg-bounces@irtf.org>
To: cfrg@irtf.org <cfrg@irtf.org>
Sent: Tue Dec 08 14:32:06 2009
Subject: [Cfrg] Fwd: KDF==MAC? and: how about HKDF-Poly1305? Re: Answers to HKDF questions

Zooko,

Let me add to what David wrote.

One thing that you seem to be doing in the analysis in your message is what Bernstein said NOT to do (from your own quote of him):
"It's often a security disaster to allow functions in one context merely because they qualify in another."

In particular, the fact that CBC-MAC with AES can make a better MAC function than HMAC with SHA2 does not mean that AES is a better basis for a KDF than SHA2.

KDF is a very different beast than a MAC or a PRF especially as it does not take a secret key (if it takes any key it is a public salt value).
The use of HMAC in HKDF may give the impression that any MAC function or any PRF function will be do as well (or better), but this impression is wrong.

In spite of the simplicity and familiarity of the KDF functionality, it is a non-trivial primitive to formalize and instantiate, and the intuition that works for a MAC or a PRF does not necessarily work in the KDF case. Addressing the whole complexity of considerations in the design of HKDF is well beyond what can be written in an email message so please refer to my paper on the subject. You will find there definitions, rationale, engineering considerations and technical results regarding the instantiation of HKDF with HMAC. Most arguments are heuristic in nature but they are the best guidelines for design that we know.

We do have some results (see Dodis, Gennaro, Hastad, K, Rabin cited in my paper) about the applicability of block ciphers to building KDFs but these are more limited and restricted than those that hold for HMAC.

Regarding the applicability of Poly1305 or any other CW MAC to the HKDF setting: A CW MAC consists of two modules, a universal hash  functions (UHF) and a PRF. The UHF is applied to the message and the result is processed via the PRF. Both UHF and PRF take secret keys. Once you replace these keys with non-secret salt values (as in the KDF case) nothing from the proof of security of these MAC functions holds anymore. So you cannot say that these schemes are good KDFs with non-secret keys just because they were good MACs with secret keys (remember Bernstein's quote?)

The one thing that you can say (and David has repeatedly referred to this point) is that UHFs with non-secret keys can make a good basis as randomness extractors (and hence a basis for KDFs) but this is NOT true for ALL UHFs. In particular, if you want to extract k bits of randomness using a UHF, this function has to be epsilon-universal where epsilon must be of the form 2^{-k}+e with e<<2^{-k}, thus in particular epsilon<<2^{-k+1}. However the UHFs behind functions like Poly1305 or UMAC all have epsilon >> 2^{-k} (something in the order of log(n)/2^k where n is the length of the input). So to use these functions as KDFs you would need to use much larger parameters (e.g. instead of a working modulo 2^130-5 you would be working modulo a prime close to 2^800) and then you would still need to apply another extractor (e.g. a pairwise independent hash function). This means that the ultra-efficiency of Poly1305 is gone and the complexity of implementation is up (in particular, requiring a primitive you did not have already in your typical S/W libraries). And then there is also the issue, also pointed by David, that these more complex UHFs make for provable KDFs only when the input is significantly larger than the output, something particularly problematic in the DH case.

I apologize for getting a bit too technical but these issues are technical (and while the  above is a bit fuzzy and quite imprecise it should give you an idea of the type of considerations involved in using a UHF as a KDF).

Hugo


On Fri, Dec 4, 2009 at 6:20 PM, Zooko Wilcox-O'Hearn <zooko@zooko.com<mailto:zooko@zooko.com>> wrote:
On Monday, 2009-10-26, at 5:07 , David McGrew wrote:

In the context of the extraction stage, he seemed to say that a Carter-Wegman MAC such as Poly1305 should be analyzed merely as a statistical extractor, not as a computational extractor.  Is that what you meant to say, David?

Yes, that's right.

I don't see why that would be so.

Check out S2.1 of http://cs.haifa.ac.il/~ronen/online_papers/survey.ps<http://cs.haifa.ac.il/%7Eronen/online_papers/survey.ps>

Okay, I've looked at that section.  It seems to be talking about Carter-Wegman universal hash functions, not about secure MACs built on such functions.  According to my, admittedly limited, understanding, a Carter-Wegman hash function could never be a computational extractor because there is nothing secret from an attacker, so it is trivial for an attacker to distinguish the output of a Carter-Wegman hash function from a truly uniform random string.  However, a Carter-Wegman-based MAC such as Poly1305 does have a secret that can be withheld from the attacker in exactly the same way that HMAC has -- the MAC key.

Do I understand correctly, David -- were you thinking of a Carter-Wegman hash function rather than of a Carter-Wegman MAC?

So as far as I understand, an HKDF which uses Poly1305 internally should enjoy the same analysis based on the PRF serving as a computational extractor as an HKDF which uses HMAC internally.


What security property do we want of a KDF anyway?  Let's denote a KDF as a function f which is chosen from a family of functions by a key "s", and which takes an input "x" and outputs a result "r":

r = f_s(x)

I think we want the function f to have the property Unpredictable Function [1].  This means if I give you access to an oracle that will compute f(x) for you, then you can't guess what it will give you for any x that you didn't query the oracle for.

That is also exactly the definition of a secure MAC!  So, is the KDF problem solved, and the answer is "use a MAC"?


I suppose this is perhaps the inspiration behind HKDF.  It appears to me that Poly1305 has a better security guarantee than HMAC has for this purpose, and that the ciphers that can currently be used by Poly1305 (such as AES, Serpent, or Salsa20) are more likely to be secure than the hash functions that can currently be used by HMAC (such as SHA-256 and SHA-512).

The security guarantee of Poly1305 is simply that if the underlying cipher is indistinguishable from a PRP, then Poly1305-CIPHER is an Unpredictable Function [2].

The best security guarantee of HMAC is that if the underlying hash function is both a "Privacy-Preserving MAC" (which is a stronger property than a MAC) and a "computationally almost universal" hash function then HMAC has the Unpredictable Function property [3, section 4].

So already Poly1305 requires less of its underlying components than HMAC does.  But in addition to that, the underlying components that we have available to use with Poly1305 look stronger than the ones that we have available to use with HMAC.

Already there are demonstrations that HMAC instantiated with popular, real-world hash functions like MD5 and SHA-1 is not secure [4].  (By the way, I don't know how to reconcile the results of [4] with the results of [3].  I suppose this implies that SHA-1 is either not a Privacy-Preserving MAC or not a computationally almost universal hash function.)

There is a widespread concert that SHA-256 might turn out to be weak, as its older siblings MD5 and SHA-1 have.  (That's why I'm holding my breath waiting for SHA-3.)

If I want to implement and deploy HKDF today (and I do!), and I use HMAC, and I don't want to rely on the long-term safety of SHA-256, then I have few remaining options.  SHA-512 is prohibitively expensive on one of my primary target platforms -- 32-bit embedded ARM boxes [5].  On the other hand if I deploy HKDF today using Poly1305 instead of HMAC, then I can use AES, and even if I don't want to use AES then I have several good alternatives such as Serpent and Salsa20.

It seems much less likely that AES, Serpent, or Salsa20 will turn out to be distinguishable from a PRP than that SHA-256 will turn out to be not a Privacy-Preserving MAC or not a computationally almost universal hash function.

In addition to these security advantages, Poly1305-AES or Poly1305-Salsa20 enjoy a substantial performance advantage over HMAC-SHA-256.

So if I were forced to stop researching and start implementing today, I think I might implement HKDF-Poly1305 instead of HKDF-HMAC.

Regards,

Zooko Wilcox-O'Hearn

P.S.  Aha, and now I find this letter that Prof. Bernstein wrote to this list in 2006 which makes the same points plus some other good points: http://www.ietf.org/mail-archive/web/cfrg/current/msg01167.html .  I think I read that letter at the time, but it is worth re-reading.

[1] Moni Naor, Omer Reingold: "From unpredictability to indistinguishability: A simple construction of pseudo-random functions from MACs" http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.121.6517
[2] Daniel J. Bernstein: "The Poly1305-AES message-authentication code" http://cr.yp.to/mac/poly1305-20050329.pdf
[3] Mihir Bellare: "New Proofs for NMAC and HMAC: Security without Collision-Resistance (2006)" http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.60.7396
[4] Christian Rechberger, Vincent Rijmen: "New Results on NMAC/HMAC when Instantiated with Popular Hash Functions" http://www.jucs.org/jucs_14_3/new_results_on_nmac/jucs_14_3_0347_0376_rechberger.pdf
[5] http://bench.cr.yp.to/results-hash.html#arm-apollo
_______________________________________________
Cfrg mailing list
Cfrg@irtf.org<mailto:Cfrg@irtf.org>
http://www.irtf.org/mailman/listinfo/cfrg