Re: [Cfrg] KDF==MAC? and: how about HKDF-Poly1305? Re: Answers to HKDF questions

David McGrew <mcgrew@cisco.com> Mon, 07 December 2009 22:50 UTC

Return-Path: <mcgrew@cisco.com>
X-Original-To: cfrg@core3.amsl.com
Delivered-To: cfrg@core3.amsl.com
Received: from localhost (localhost [127.0.0.1]) by core3.amsl.com (Postfix) with ESMTP id 3A9963A6896 for <cfrg@core3.amsl.com>; Mon, 7 Dec 2009 14:50:01 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -8.599
X-Spam-Level:
X-Spam-Status: No, score=-8.599 tagged_above=-999 required=5 tests=[BAYES_00=-2.599, GB_I_LETTER=-2, RCVD_IN_DNSWL_MED=-4]
Received: from mail.ietf.org ([64.170.98.32]) by localhost (core3.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id sl1FfJKtvVRi for <cfrg@core3.amsl.com>; Mon, 7 Dec 2009 14:49:59 -0800 (PST)
Received: from rtp-iport-1.cisco.com (rtp-iport-1.cisco.com [64.102.122.148]) by core3.amsl.com (Postfix) with ESMTP id 7DBD83A683D for <cfrg@irtf.org>; Mon, 7 Dec 2009 14:49:56 -0800 (PST)
Authentication-Results: rtp-iport-1.cisco.com; dkim=neutral (message not signed) header.i=none
X-IronPort-Anti-Spam-Filtered: true
X-IronPort-Anti-Spam-Result: ApoEALMUHUurR7Hu/2dsb2JhbADDWpZBhDMEgWc
X-IronPort-AV: E=Sophos;i="4.47,357,1257120000"; d="scan'208";a="72591614"
Received: from sj-core-5.cisco.com ([171.71.177.238]) by rtp-iport-1.cisco.com with ESMTP; 07 Dec 2009 22:49:45 +0000
Received: from xbh-sjc-211.amer.cisco.com (xbh-sjc-211.cisco.com [171.70.151.144]) by sj-core-5.cisco.com (8.13.8/8.14.3) with ESMTP id nB7MnDI1024883; Mon, 7 Dec 2009 22:49:45 GMT
Received: from xfe-sjc-212.amer.cisco.com ([171.70.151.187]) by xbh-sjc-211.amer.cisco.com with Microsoft SMTPSVC(6.0.3790.3959); Mon, 7 Dec 2009 14:49:37 -0800
Received: from [10.32.254.210] ([10.32.254.210]) by xfe-sjc-212.amer.cisco.com with Microsoft SMTPSVC(6.0.3790.3959); Mon, 7 Dec 2009 14:49:36 -0800
Message-Id: <4375EE8D-94E3-4621-BDA5-EFA58FD0F937@cisco.com>
From: David McGrew <mcgrew@cisco.com>
To: Zooko Wilcox-O'Hearn <zooko@zooko.com>
In-Reply-To: <D4BA0244-BDB0-4EE5-8018-E2E1A49578B9@zooko.com>
Content-Type: text/plain; charset="US-ASCII"; format="flowed"; delsp="yes"
Content-Transfer-Encoding: 7bit
Mime-Version: 1.0 (Apple Message framework v936)
Date: Mon, 07 Dec 2009 14:49:35 -0800
References: <e89b43830910211838x2e1ca67cgaf48d02cd4008710@mail.gmail.com> <46DFA920-54BF-4567-90AF-6742C8FAA5F2@zooko.com> <e89b43830910221609y75514633m8064d5b19d8d54e4@mail.gmail.com> <823D9A4C-11AB-4CBE-A554-D8FF62AC6166@zooko.com> <1AA41B0A-F458-444A-BFC0-78A49FAB6F43@cisco.com> <D4BA0244-BDB0-4EE5-8018-E2E1A49578B9@zooko.com>
X-Mailer: Apple Mail (2.936)
X-OriginalArrivalTime: 07 Dec 2009 22:49:36.0926 (UTC) FILETIME=[8D5E0BE0:01CA778F]
Cc: cfrg@irtf.org
Subject: Re: [Cfrg] KDF==MAC? and: how about HKDF-Poly1305? Re: Answers to HKDF questions
X-BeenThere: cfrg@irtf.org
X-Mailman-Version: 2.1.9
Precedence: list
List-Id: Crypto Forum Research Group <cfrg.irtf.org>
List-Unsubscribe: <http://www.irtf.org/mailman/listinfo/cfrg>, <mailto:cfrg-request@irtf.org?subject=unsubscribe>
List-Archive: <http://www.irtf.org/mail-archive/web/cfrg>
List-Post: <mailto:cfrg@irtf.org>
List-Help: <mailto:cfrg-request@irtf.org?subject=help>
List-Subscribe: <http://www.irtf.org/mailman/listinfo/cfrg>, <mailto:cfrg-request@irtf.org?subject=subscribe>
X-List-Received-Date: Mon, 07 Dec 2009 22:50:01 -0000

Hi Zooko,

On Dec 4, 2009, at 3:20 PM, Zooko Wilcox-O'Hearn wrote:

> On Monday, 2009-10-26, at 5:07 , David McGrew wrote:
>
>>> In the context of the extraction stage, he seemed to say that a  
>>> Carter-Wegman MAC such as Poly1305 should be analyzed merely as a  
>>> statistical extractor, not as a computational extractor.  Is that  
>>> what you meant to say, David?
>>
>> Yes, that's right.
>>
>>> I don't see why that would be so.
>>
>> Check out S2.1 of http://cs.haifa.ac.il/~ronen/online_papers/ 
>> survey.ps
>
> Okay, I've looked at that section.  It seems to be talking about  
> Carter-Wegman universal hash functions, not about secure MACs built  
> on such functions.  According to my, admittedly limited,  
> understanding, a Carter-Wegman hash function could never be a  
> computational extractor because there is nothing secret from an  
> attacker, so it is trivial for an attacker to distinguish the output  
> of a Carter-Wegman hash function from a truly uniform random string.

I think what's confusing here is the terminology of CW hashing, in  
which there is a family of hash functions H, and there is a function h  
that is chosen uniformly at random from that set.  The choice of which  
h in H is used in a particular instance can be kept secret.   A more  
"cryptographic" way of thinking about this situation is to consider a  
single function f that accepts an "index" input K and another input X,  
which selects a function h_K from H, then returns f(K,X) = h_K(X).    
(Aside: the GCM/GMAC description uses the latter terminology to  
describe its use of a polynomial hash function.)

Poly1305, GMAC, UMAC, and VMAC use universal hash functions, and could  
be used in a KDF in such a way that the proof of security relied on  
the universal hash as a statistical extractor.  It might be  
interesting to define such a thing.  But it would not be efficient to  
use it to extract keys from a DH shared secret, because the DH shared  
secret would need to be extra large (due to the technical details of  
the statistical extractor - Section 6.9 of http://www.shoup.net/ntb/ntb-v1.pdf 
  provides most of the technical details).   It might be fine for  
other uses.

I think it would be more productive, in the long run, for us to focus  
on the design of a computational extractor.  It would be great if this  
goal was considered in the SHA-3 competition.

> However, a Carter-Wegman-based MAC such as Poly1305 does have a  
> secret that can be withheld from the attacker in exactly the same  
> way that HMAC has -- the MAC key.
>
> Do I understand correctly, David -- were you thinking of a Carter- 
> Wegman hash function rather than of a Carter-Wegman MAC?

What I meant is that a CW-MAC can be used as an extractor with a  
theoretical justification provided by the details of the underlying CW- 
hash.

>
> So as far as I understand, an HKDF which uses Poly1305 internally  
> should enjoy the same analysis based on the PRF serving as a  
> computational extractor as an HKDF which uses HMAC internally.

No, the analysis of the two different cases would be different.  What  
property of Poly1305 would you be tying into the HKDF proof of security?

>
>
> What security property do we want of a KDF anyway?  Let's denote a  
> KDF as a function f which is chosen from a family of functions by a  
> key "s", and which takes an input "x" and outputs a result "r":
>
> r = f_s(x)
>
> I think we want the function f to have the property Unpredictable  
> Function [1].  This means if I give you access to an oracle that  
> will compute f(x) for you, then you can't guess what it will give  
> you for any x that you didn't query the oracle for.
>
> That is also exactly the definition of a secure MAC!  So, is the KDF  
> problem solved, and the answer is "use a MAC"?
>

What you're saying makes intuitive sense, but I do not see how the  
details can possibly support the intuition.

Suppose that we're using a KDF to generate a key that will be used in  
a PRF, and the PRF is proven to be secure assuming that the key used  
as input is generated uniformly at random.   In this case, we want the  
output of the KDF to be indistinguishable from random, and not merely  
unpredictable.  The standard, somewhat contrived, example about why an  
unpredictable function isn't necessarily indistinguishable from random  
would be a KDF with a u+v bit output, where the first u bits are  
unpredictable, and the final v bits are a checksum on the first u  
bits.  Clearly we don't want to use keys like this for any real  
cipher, but we also have to admit that they're unpredictable, so any  
analysis that only required the KDF to be unpredictable would have to  
tolerate this function.   It is non-standard to assume that AES-128 is  
a good PRF if its keys are unpredictable, rather than random, for  
instance.

>
> I suppose this is perhaps the inspiration behind HKDF.  It appears  
> to me that Poly1305 has a better security guarantee than HMAC has  
> for this purpose, and that the ciphers that can currently be used by  
> Poly1305 (such as AES, Serpent, or Salsa20) are more likely to be  
> secure than the hash functions that can currently be used by HMAC  
> (such as SHA-256 and SHA-512).
>
> The security guarantee of Poly1305 is simply that if the underlying  
> cipher is indistinguishable from a PRP, then Poly1305-CIPHER is an  
> Unpredictable Function [2].
>
> The best security guarantee of HMAC is that if the underlying hash  
> function is both a "Privacy-Preserving MAC" (which is a stronger  
> property than a MAC) and a "computationally almost universal" hash  
> function then HMAC has the Unpredictable Function property [3,  
> section 4].
>
> So already Poly1305 requires less of its underlying components than  
> HMAC does.  But in addition to that, the underlying components that  
> we have available to use with Poly1305 look stronger than the ones  
> that we have available to use with HMAC.
>
> Already there are demonstrations that HMAC instantiated with  
> popular, real-world hash functions like MD5 and SHA-1 is not secure  
> [4].  (By the way, I don't know how to reconcile the results of [4]  
> with the results of [3].  I suppose this implies that SHA-1 is  
> either not a Privacy-Preserving MAC or not a computationally almost  
> universal hash function.)
>
> There is a widespread concert that SHA-256 might turn out to be  
> weak, as its older siblings MD5 and SHA-1 have.  (That's why I'm  
> holding my breath waiting for SHA-3.)
>
> If I want to implement and deploy HKDF today (and I do!), and I use  
> HMAC, and I don't want to rely on the long-term safety of SHA-256,  
> then I have few remaining options.  SHA-512 is prohibitively  
> expensive on one of my primary target platforms -- 32-bit embedded  
> ARM boxes [5].  On the other hand if I deploy HKDF today using  
> Poly1305 instead of HMAC, then I can use AES, and even if I don't  
> want to use AES then I have several good alternatives such as  
> Serpent and Salsa20.
>
> It seems much less likely that AES, Serpent, or Salsa20 will turn  
> out to be distinguishable from a PRP than that SHA-256 will turn out  
> to be not a Privacy-Preserving MAC or not a computationally almost  
> universal hash function.
>
> In addition to these security advantages, Poly1305-AES or Poly1305- 
> Salsa20 enjoy a substantial performance advantage over HMAC-SHA-256.
>
> So if I were forced to stop researching and start implementing  
> today, I think I might implement HKDF-Poly1305 instead of HKDF-HMAC.
>
> Regards,
>
> Zooko Wilcox-O'Hearn
>
> P.S.  Aha, and now I find this letter that Prof. Bernstein wrote to  
> this list in 2006 which makes the same points plus some other good  
> points: http://www.ietf.org/mail-archive/web/cfrg/current/msg01167.html 
>  .  I think I read that letter at the time, but it is worth re- 
> reading.

There are a couple of old threads on this topic too, including one  
with the subject "KDF definition and goal".

I sympathize with your desire to find a KDF that doesn't rely on an  
MD5-like function, but I don't think that Poly1305 or any similar  
function will get us a general-purpose KDF.  Maybe such a thing would  
be OK for uses other than DH; if you're still interested, let's work  
through the details of the proof as a way to start.

regards,

David

>
> [1] Moni Naor, Omer Reingold: "From unpredictability to  
> indistinguishability: A simple construction of pseudo-random  
> functions from MACs" http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.121.6517
> [2] Daniel J. Bernstein: "The Poly1305-AES message-authentication  
> code" http://cr.yp.to/mac/poly1305-20050329.pdf
> [3] Mihir Bellare: "New Proofs for NMAC and HMAC: Security without  
> Collision-Resistance (2006)" http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.60.7396
> [4] Christian Rechberger, Vincent Rijmen: "New Results on NMAC/HMAC  
> when Instantiated with Popular Hash Functions" http://www.jucs.org/jucs_14_3/new_results_on_nmac/jucs_14_3_0347_0376_rechberger.pdf
> [5] http://bench.cr.yp.to/results-hash.html#arm-apollo