[Cfrg] Fwd: KDF==MAC? and: how about HKDF-Poly1305? Re: Answers to HKDF questions

Hugo Krawczyk <hugo@ee.technion.ac.il> Tue, 08 December 2009 19:32 UTC

Return-Path: <hugokraw@gmail.com>
X-Original-To: cfrg@core3.amsl.com
Delivered-To: cfrg@core3.amsl.com
Received: from localhost (localhost [127.0.0.1]) by core3.amsl.com (Postfix) with ESMTP id 7379C3A68D0 for <cfrg@core3.amsl.com>; Tue, 8 Dec 2009 11:32:47 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -3.976
X-Spam-Level:
X-Spam-Status: No, score=-3.976 tagged_above=-999 required=5 tests=[BAYES_00=-2.599, FM_FORGED_GMAIL=0.622, GB_I_LETTER=-2, HTML_MESSAGE=0.001]
Received: from mail.ietf.org ([64.170.98.32]) by localhost (core3.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id IOdfb53NHaYD for <cfrg@core3.amsl.com>; Tue, 8 Dec 2009 11:32:41 -0800 (PST)
Received: from qw-out-2122.google.com (qw-out-2122.google.com [74.125.92.24]) by core3.amsl.com (Postfix) with ESMTP id B1C363A66B4 for <cfrg@irtf.org>; Tue, 8 Dec 2009 11:32:40 -0800 (PST)
Received: by qw-out-2122.google.com with SMTP id 9so992779qwb.27 for <cfrg@irtf.org>; Tue, 08 Dec 2009 11:32:27 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:sender:received:in-reply-to :references:from:date:x-google-sender-auth:message-id:subject:to :content-type; bh=AtpkFk9tiV2nyQYYqM76rV5SxIHtzepV3lyHZBj4cc4=; b=ZL8R+1K0LrPz76sifF9ubiUt32+EwMZtvWylk39sMXBnLVYy2CQxP4MbhXnFlvgGpa FQthB9IF351n0RBKHUxNpUao+OWxkiNDMlQ/SIQzpBKIDOyTcHcvY/OMAPXtIICuEJwQ Ix6rlRhGYpWaHJF6aeD3Fx0S5nqWiMnvLtX4I=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:from:date :x-google-sender-auth:message-id:subject:to:content-type; b=kmeHsPbePkuCgOH/xvJwi8733magdJH28m2r6fs0uFS26iggfr4EZbcO97dB0dHOqj neRX85NZOThqml6hbtEQGF0WWSxY36sEmMft1C0mx+mhlb0SJ3vdInrFM7aDy25zKmOa jgKN7il3U/La4SHiszpfCdniEvMlzNDeQOCGA=
MIME-Version: 1.0
Sender: hugokraw@gmail.com
Received: by 10.224.114.194 with SMTP id f2mr4822435qaq.68.1260300746484; Tue, 08 Dec 2009 11:32:26 -0800 (PST)
In-Reply-To: <e89b43830912071619r2a5ad533uf835656bc73bb0eb@mail.gmail.com>
References: <e89b43830910211838x2e1ca67cgaf48d02cd4008710@mail.gmail.com> <46DFA920-54BF-4567-90AF-6742C8FAA5F2@zooko.com> <e89b43830910221609y75514633m8064d5b19d8d54e4@mail.gmail.com> <823D9A4C-11AB-4CBE-A554-D8FF62AC6166@zooko.com> <1AA41B0A-F458-444A-BFC0-78A49FAB6F43@cisco.com> <D4BA0244-BDB0-4EE5-8018-E2E1A49578B9@zooko.com> <e89b43830912071619r2a5ad533uf835656bc73bb0eb@mail.gmail.com>
From: Hugo Krawczyk <hugo@ee.technion.ac.il>
Date: Tue, 08 Dec 2009 14:32:06 -0500
X-Google-Sender-Auth: 1b092b70ed3d6ef0
Message-ID: <e89b43830912081132x3d33c71n449c48b9a3b75348@mail.gmail.com>
To: cfrg@irtf.org
Content-Type: multipart/alternative; boundary="00c09f88d0ba4deedb047a3ca2c8"
Subject: [Cfrg] Fwd: KDF==MAC? and: how about HKDF-Poly1305? Re: Answers to HKDF questions
X-BeenThere: cfrg@irtf.org
X-Mailman-Version: 2.1.9
Precedence: list
List-Id: Crypto Forum Research Group <cfrg.irtf.org>
List-Unsubscribe: <http://www.irtf.org/mailman/listinfo/cfrg>, <mailto:cfrg-request@irtf.org?subject=unsubscribe>
List-Archive: <http://www.irtf.org/mail-archive/web/cfrg>
List-Post: <mailto:cfrg@irtf.org>
List-Help: <mailto:cfrg-request@irtf.org?subject=help>
List-Subscribe: <http://www.irtf.org/mailman/listinfo/cfrg>, <mailto:cfrg-request@irtf.org?subject=subscribe>
X-List-Received-Date: Tue, 08 Dec 2009 19:32:47 -0000

Zooko,

Let me add to what David wrote.

One thing that you seem to be doing in the analysis in your message is what
Bernstein said NOT to do (from your own quote of him):
"It's often a security disaster to allow functions in one context merely
because they qualify in another."

In particular, the fact that CBC-MAC with AES can make a better MAC function
than HMAC with SHA2 does not mean that AES is a better basis for a KDF than
SHA2.

KDF is a very different beast than a MAC or a PRF especially as it does not
take a secret key (if it takes any key it is a public salt value).
The use of HMAC in HKDF may give the impression that any MAC function or any
PRF function will be do as well (or better), but this impression is wrong.

In spite of the simplicity and familiarity of the KDF functionality, it is a
non-trivial primitive to formalize and instantiate, and the intuition that
works for a MAC or a PRF does not necessarily work in the KDF case.
Addressing the whole complexity of considerations in the design of HKDF is
well beyond what can be written in an email message so please refer to my
paper on the subject. You will find there definitions, rationale,
engineering considerations and technical results regarding the instantiation
of HKDF with HMAC. Most arguments are heuristic in nature but they are the
best guidelines for design that we know.

We do have some results (see Dodis, Gennaro, Hastad, K, Rabin cited in my
paper) about the applicability of block ciphers to building KDFs but these
are more limited and restricted than those that hold for HMAC.

Regarding the applicability of Poly1305 or any other CW MAC to the HKDF
setting: A CW MAC consists of two modules, a universal hash  functions (UHF)
and a PRF. The UHF is applied to the message and the result is processed via
the PRF. Both UHF and PRF take secret keys. Once you replace these keys with
non-secret salt values (as in the KDF case) nothing from the proof of
security of these MAC functions holds anymore. So you cannot say that these
schemes are good KDFs with non-secret keys just because they were good MACs
with secret keys (remember Bernstein's quote?)

The one thing that you can say (and David has repeatedly referred to this
point) is that UHFs with non-secret keys can make a good basis as randomness
extractors (and hence a basis for KDFs) but this is NOT true for ALL UHFs.
In particular, if you want to extract k bits of randomness using a UHF, this
function has to be epsilon-universal where epsilon must be of the form
2^{-k}+e with e<<2^{-k}, thus in particular epsilon<<2^{-k+1}. However the
UHFs behind functions like Poly1305 or UMAC all have epsilon >> 2^{-k}
(something in the order of log(n)/2^k where n is the length of the input).
So to use these functions as KDFs you would need to use much larger
parameters (e.g. instead of a working modulo 2^130-5 you would be working
modulo a prime close to 2^800) and then you would still need to apply
another extractor (e.g. a pairwise independent hash function). This means
that the ultra-efficiency of Poly1305 is gone and the complexity of
implementation is up (in particular, requiring a primitive you did not have
already in your typical S/W libraries). And then there is also the issue,
also pointed by David, that these more complex UHFs make for provable KDFs
only when the input is significantly larger than the output, something
particularly problematic in the DH case.

I apologize for getting a bit too technical but these issues are technical
(and while the  above is a bit fuzzy and quite imprecise it should give you
an idea of the type of considerations involved in using a UHF as a KDF).

Hugo


On Fri, Dec 4, 2009 at 6:20 PM, Zooko Wilcox-O'Hearn <zooko@zooko.com>wrote:

> On Monday, 2009-10-26, at 5:07 , David McGrew wrote:
>
>  In the context of the extraction stage, he seemed to say that a
>>> Carter-Wegman MAC such as Poly1305 should be analyzed merely as a
>>> statistical extractor, not as a computational extractor.  Is that what you
>>> meant to say, David?
>>>
>>
>> Yes, that's right.
>>
>> I don't see why that would be so.
>>>
>>
>> Check out S2.1 of http://cs.haifa.ac.il/~ronen/online_papers/survey.ps<http://cs.haifa.ac.il/%7Eronen/online_papers/survey.ps>
>>
>
> Okay, I've looked at that section.  It seems to be talking about
> Carter-Wegman universal hash functions, not about secure MACs built on such
> functions.  According to my, admittedly limited, understanding, a
> Carter-Wegman hash function could never be a computational extractor because
> there is nothing secret from an attacker, so it is trivial for an attacker
> to distinguish the output of a Carter-Wegman hash function from a truly
> uniform random string.  However, a Carter-Wegman-based MAC such as Poly1305
> does have a secret that can be withheld from the attacker in exactly the
> same way that HMAC has -- the MAC key.
>
> Do I understand correctly, David -- were you thinking of a Carter-Wegman
> hash function rather than of a Carter-Wegman MAC?
>
> So as far as I understand, an HKDF which uses Poly1305 internally should
> enjoy the same analysis based on the PRF serving as a computational
> extractor as an HKDF which uses HMAC internally.
>
>
> What security property do we want of a KDF anyway?  Let's denote a KDF as a
> function f which is chosen from a family of functions by a key "s", and
> which takes an input "x" and outputs a result "r":
>
> r = f_s(x)
>
> I think we want the function f to have the property Unpredictable Function
> [1].  This means if I give you access to an oracle that will compute f(x)
> for you, then you can't guess what it will give you for any x that you
> didn't query the oracle for.
>
> That is also exactly the definition of a secure MAC!  So, is the KDF
> problem solved, and the answer is "use a MAC"?
>
>
> I suppose this is perhaps the inspiration behind HKDF.  It appears to me
> that Poly1305 has a better security guarantee than HMAC has for this
> purpose, and that the ciphers that can currently be used by Poly1305 (such
> as AES, Serpent, or Salsa20) are more likely to be secure than the hash
> functions that can currently be used by HMAC (such as SHA-256 and SHA-512).
>
> The security guarantee of Poly1305 is simply that if the underlying cipher
> is indistinguishable from a PRP, then Poly1305-CIPHER is an Unpredictable
> Function [2].
>
> The best security guarantee of HMAC is that if the underlying hash function
> is both a "Privacy-Preserving MAC" (which is a stronger property than a MAC)
> and a "computationally almost universal" hash function then HMAC has the
> Unpredictable Function property [3, section 4].
>
> So already Poly1305 requires less of its underlying components than HMAC
> does.  But in addition to that, the underlying components that we have
> available to use with Poly1305 look stronger than the ones that we have
> available to use with HMAC.
>
> Already there are demonstrations that HMAC instantiated with popular,
> real-world hash functions like MD5 and SHA-1 is not secure [4].  (By the
> way, I don't know how to reconcile the results of [4] with the results of
> [3].  I suppose this implies that SHA-1 is either not a Privacy-Preserving
> MAC or not a computationally almost universal hash function.)
>
> There is a widespread concert that SHA-256 might turn out to be weak, as
> its older siblings MD5 and SHA-1 have.  (That's why I'm holding my breath
> waiting for SHA-3.)
>
> If I want to implement and deploy HKDF today (and I do!), and I use HMAC,
> and I don't want to rely on the long-term safety of SHA-256, then I have few
> remaining options.  SHA-512 is prohibitively expensive on one of my primary
> target platforms -- 32-bit embedded ARM boxes [5].  On the other hand if I
> deploy HKDF today using Poly1305 instead of HMAC, then I can use AES, and
> even if I don't want to use AES then I have several good alternatives such
> as Serpent and Salsa20.
>
> It seems much less likely that AES, Serpent, or Salsa20 will turn out to be
> distinguishable from a PRP than that SHA-256 will turn out to be not a
> Privacy-Preserving MAC or not a computationally almost universal hash
> function.
>
> In addition to these security advantages, Poly1305-AES or Poly1305-Salsa20
> enjoy a substantial performance advantage over HMAC-SHA-256.
>
> So if I were forced to stop researching and start implementing today, I
> think I might implement HKDF-Poly1305 instead of HKDF-HMAC.
>
> Regards,
>
> Zooko Wilcox-O'Hearn
>
> P.S.  Aha, and now I find this letter that Prof. Bernstein wrote to this
> list in 2006 which makes the same points plus some other good points:
> http://www.ietf.org/mail-archive/web/cfrg/current/msg01167.html .  I think
> I read that letter at the time, but it is worth re-reading.
>
> [1] Moni Naor, Omer Reingold: "From unpredictability to
> indistinguishability: A simple construction of pseudo-random functions from
> MACs" http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.121.6517
> [2] Daniel J. Bernstein: "The Poly1305-AES message-authentication code"
> http://cr.yp.to/mac/poly1305-20050329.pdf
> [3] Mihir Bellare: "New Proofs for NMAC and HMAC: Security without
> Collision-Resistance (2006)" http://citeseerx.ist.psu.edu/viewdoc/
> summary?doi=10.1.1.60.7396
> [4] Christian Rechberger, Vincent Rijmen: "New Results on NMAC/HMAC when
> Instantiated with Popular Hash Functions" http://www.jucs.org/
> jucs_14_3/new_results_on_nmac/jucs_14_3_0347_0376_rechberger.pdf
> [5] http://bench.cr.yp.to/results-hash.html#arm-apollo
> _______________________________________________
> Cfrg mailing list
> Cfrg@irtf.org
> http://www.irtf.org/mailman/listinfo/cfrg
>