[Cfrg] Answers to HKDF questions
Hugo Krawczyk <hugo@ee.technion.ac.il> Thu, 22 October 2009 01:38 UTC
Return-Path: <hugokraw@gmail.com>
X-Original-To: cfrg@core3.amsl.com
Delivered-To: cfrg@core3.amsl.com
Received: from localhost (localhost [127.0.0.1]) by core3.amsl.com (Postfix) with ESMTP id 547EC3A68D6 for <cfrg@core3.amsl.com>; Wed, 21 Oct 2009 18:38:30 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -1.676
X-Spam-Level:
X-Spam-Status: No, score=-1.676 tagged_above=-999 required=5 tests=[AWL=0.300, BAYES_00=-2.599, FM_FORGED_GMAIL=0.622, HTML_MESSAGE=0.001]
Received: from mail.ietf.org ([64.170.98.32]) by localhost (core3.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id fKXHkQSVl2i5 for <cfrg@core3.amsl.com>; Wed, 21 Oct 2009 18:38:28 -0700 (PDT)
Received: from qw-out-2122.google.com (qw-out-2122.google.com [74.125.92.24]) by core3.amsl.com (Postfix) with ESMTP id 2790F3A6876 for <cfrg@irtf.org>; Wed, 21 Oct 2009 18:38:28 -0700 (PDT)
Received: by qw-out-2122.google.com with SMTP id 5so992416qwd.7 for <cfrg@irtf.org>; Wed, 21 Oct 2009 18:38:36 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:sender:received:from:date :x-google-sender-auth:message-id:subject:to:content-type; bh=GpaKxlnbDPs5Dgp4xw7BNKi8FhEFFox8F+6+VyYz9AE=; b=KuKSYXETQTNbfMxNFa0jfCH0GxBveYcSPIjUhZRff6TOjZPMaa8IapBHQ6ETm9HjKt lltl/7BVDlLtlB8QUZ19n7cxs2W6FlGhgGLB4zfoJhogGRivUTJN/RIH9h83sn0QQHoD JKnz5m6Ts0MQKlAFwWqX3G0aCISLm0yGsW5XY=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:sender:from:date:x-google-sender-auth:message-id :subject:to:content-type; b=kbvnaD0KtPdSm920Vtt7gtGSgYTMKnKl3ysai8122PMifPtnwRdnPqiUrXb6VIKCg8 XVhDga1sXKTHZG6pg4nypMkLucsz/CrjS7RpAmy+UQBzW0oib8ObYKwt+S75Qm0beH77 yYZlq7HSag8AY/bivXvBaj8zlXjLGj5QKC4Fs=
MIME-Version: 1.0
Sender: hugokraw@gmail.com
Received: by 10.224.30.209 with SMTP id v17mr4336093qac.188.1256175516133; Wed, 21 Oct 2009 18:38:36 -0700 (PDT)
From: Hugo Krawczyk <hugo@ee.technion.ac.il>
Date: Wed, 21 Oct 2009 21:38:16 -0400
X-Google-Sender-Auth: 273d73088cf8ee45
Message-ID: <e89b43830910211838x2e1ca67cgaf48d02cd4008710@mail.gmail.com>
To: cfrg@irtf.org
Content-Type: multipart/alternative; boundary="00c09f89928f6a2be204767c2757"
Subject: [Cfrg] Answers to HKDF questions
X-BeenThere: cfrg@irtf.org
X-Mailman-Version: 2.1.9
Precedence: list
List-Id: Crypto Forum Research Group <cfrg.irtf.org>
List-Unsubscribe: <http://www.irtf.org/mailman/listinfo/cfrg>, <mailto:cfrg-request@irtf.org?subject=unsubscribe>
List-Archive: <http://www.irtf.org/mail-archive/web/cfrg>
List-Post: <mailto:cfrg@irtf.org>
List-Help: <mailto:cfrg-request@irtf.org?subject=help>
List-Subscribe: <http://www.irtf.org/mailman/listinfo/cfrg>, <mailto:cfrg-request@irtf.org?subject=subscribe>
X-List-Received-Date: Thu, 22 Oct 2009 01:38:30 -0000
Here is a list of questions people asked related to HKDF in the last two days with some answers. I wrote it in this way as it seems more manageable than answering 20 or so messages, many raising similar questions. SEE ALSO MY LONG MESSAGE FROM EARLIER TODAY WITH MORE GENERAL RATIONALE. >Many comments ask "why we need yet another KDF, if we have so many". Answer: Because we have too many! (*) Designers of new protocols/applications do not have any guidance to choose between them. No one took the effort to sort out what properties we are looking for. What does each of these schemes provide and what they lack, etc etc. Well, I did. My work is far from a full taxonomy of properties, scenarios and existing schemes. But it does provide the main guidelines, and it does show (I want to believe) that analysis does favor some schemes over others. In particular, HKDF is the closest I could get to something that is general purpose enough and well-founded in analytics. (*) The best answer to the above question is given by Peter's own question: > In fact every protocol that I know of has invented its own KDF, incompatible > with every other KDF out there, thus my question about why we need yet another > one. We do not need another one, we need one... > Dan Harkins and others: How is HMAC(0,X) better than SHA256(X||X). Merkle-Damgard (MD), the iteration mode underlying SHA256, does not preserve "randomness" in the following sense: Even if the compression function that you use as the underlying hash is purely random, the result you get from M-D iterating this compression function is NOT random (actually not even a good extractor). HMAC, on the other hand, if you start with a random compression function you end with a random oracle (i.e., a function that looks random to an adversary that can only query the function in a bounded number of points). This is known in the literature as "random oracle indifferentiability". It is not a result about actual hash functions but a strong result about the STRUCTURAL STRENGTH OF HMAC (if it starts with something good it preserves the goodness). There are several papers on this; the main ones are [23,17] in my paper. >What is the value of HKDF when salt is set to 0. In this case we can only say useful things if we assume the compression function to be random. In this case, as said above in the context of indifferentiability, HMAC implies a random oracle and then good extraction properties while in the case of a plain M-D hash an underlying random compression function does not imply randomness and not even extraction. Moreover, the fact that you apply HMAC only once (with salt=0) to concentrate the input entropy in one key, and then use this key with the PRF is much better than the naive (unsalted) repetition SHA(X||1)||SHA(X||2)||... common to other KDFs -- for more on this see the paper. >Dan: HKDF is a GENERALIZATION of IKE's HKDF. True, but only for the unsalted case. Note that IKEv1 illustrates nicely the versatility of HKDF. It can use public nonces for salt in the signature mode, and it can use a secret salt in PK encryption mode. The latter gives you a much stronger KDF. One that is only based on the PRF properties of HMAC (since the salt acts as a secret key to the PRF). IKE is also a good example of how applications may use/find salt. Most other protocols do not use salt even if they have it available (e.g., SP 800-56A). Salting the KDF has important benefits for randomization and independence. >Uri repeatedly asks what makes HMAC appropriate for extraction/KDF. You should read the paper and the results cited there (especially section 8). Keep in mind that HMAC can only inherit its properties from the underlying hash function. It cannot transform a bad hash function into a good one, but for some functionalities (KDF is one of them) it can use weaker hash functions for stronger purposes than alternative schemes using the plain hash. If you are not willing to assume anything from the underlying hash function other than collision resistance, then there is only one result I can offer to you: Look at the result from [23] quoted in page 24 of my paper (second bullet). That result assumes properties of the hash function implied by collision resistance only. The downside: It applies only when you truncate the output of the extract part (for example, you use SHA-512 for extraction and SHA-256 for PRF). But it may not be totally foolish (smiley inserted) for you to make a heuristic leap of faith and accept this as an indication of strength also when you do not truncate. >Peter Gutman and Simon J.: PBKDF. PBKDF is quite specific to passwords since it has the iterations count that is not needed in general and you certainly do now want to artificially slow down your KDF when your source has decent entropy. I believe that PBKDF2 is a good design for passwords: it uses an iteration counter, salt, and HMAC as the underlying PRF (better in that sense than PBKDF1). You could make it general purpose by having the option of using it with iteration-count=1. However, even in this case the design is not great as a general purpose KDF since it uses the source keying material directly as a PRF key. There is nothing we can say in general if you key a PRF with a non-uniform key (some PRFs do not even accept arbitrary-long keys). That's why you first extract (or concentrate) the entropy of the input into a short key and then use that key for keying the PRF in the extraction stage. > Can we use HKDF for passwords? Not as defined. For passwords you need to replace the entropy of the source with slowing iterations of the function to slow down dictionary attacks (see page 25 in my paper). This can be added, but it is not included now. For example, for the specific case of passwords one could replace the extract part of HKDF with the F(P,S,c,1) function from PBKDF2. I am not sure if this is something we want to include in our draft. I will discuss with Pasi. >The input source is random (a strong cryptographic key), can we skip extract? When your input is already a strong cryptographic key you may choose to skip the extract part and go directly to expansion. For this any (variable output) PRF will do. The expansion part of HKDF is an implementation of such a PRF. >David McGrew raises the question that we may need different KDFs for different >settings. I believe that having one general-purpose KDF is IMPORTANT. It does not preclude an application to choose another one or a variant. But if we start standardizing a KDF for each usage scenario (including different types of key material sources) we will have to leave in the user's or protocol designer's hands to choose what is best for his setting. Not only this is not an easy task (understanding entropy and extraction in full generality is hard) but what you may assume for an application today may be different tomorrow. The one case where a special variant may be needed is for passwords (see above). >Are block ciphers better than hash for KDF? Block ciphers are good as a basis for a PRF (e.g via a variable input PRF mode). However, they are more problematic for extraction. For block ciphers as extractors we can only say things under the assumption that the block cipher is a truly random permutation. In the case of HMAC we have much richer results including those that do not idealize the hash function. (BTW, those that said to believe more on ideal ciphers than in random oracles, should pay attention to the related-key attacks against AES-256!) >David McGrew: Should we standardize on the extract part and allow for multiple >expansion modules? Good question. From a modularity point of view the answer is yes. >From the IETF point of view that wants to have as specific transforms as possible this may be problematic. In other words, my academic side would prefer modularity in the spec and independence between extraction and expansion (as long as the latter uses a good appropriate PRF mode). But I think that the IETF (well represented in this work by Pasi :) will not like that. >Zooko: How about outputting a random (algebraic) group element instead of a >(close to) uniform string? Even a "general purpose" KDF has limits to its generality. To me, the main need for applications is to have a KDF that produces uniform (or "close to uniform") strings of a certain length(s). In particular, the problem with choosing a random group element is that techniques to do so are group-dependent (you do it differently in a finite field multiplicative group than in an RSA group than in an elliptic curve; you may even have different techniques for different types of curves). So I do not see that you can find a "fits all" solution here. (Neither is HKDF a fit-all solution - for example it doesn't do what you want - but it as close as it get to serve the most common applications). > David McGrew: what is the precise security statement for HKDF? > What assumptions does one need to make about the hash function used in HKDF in > order that the security analysis applies? The paper says that "it is shown in > [23] (see Section 8) that using HMAC with a truncated output as an extractor > allows to prove security under considerably weaker assumptions on the > underlying hash function." However, both of the Lemmas in that paper (and the > implication in Section 8) make random oracle assumptions. Section 8 contains a summary of results from [23,17] regarding the properties of HMAC extractor. These results are quantitative, relating the strength of the underlying hash function with the quality of extraction. They cover a variety of situations depending on the entropy of the source and the existence of salt. The truncated output result that does not require any form of idealized assumption (uses almost universality and/or collision resistance) is quoted in my paper page 24 second bullet and is proven in [23]. I apologize for questions I missed. Hugo
- Re: [Cfrg] Answers to HKDF questions Zooko Wilcox-O'Hearn
- [Cfrg] Answers to HKDF questions Hugo Krawczyk
- Re: [Cfrg] Answers to HKDF questions David McGrew
- Re: [Cfrg] Answers to HKDF questions Hugo Krawczyk
- Re: [Cfrg] Answers to HKDF questions Blumenthal, Uri
- Re: [Cfrg] Answers to HKDF questions Hugo Krawczyk
- Re: [Cfrg] Answers to HKDF questions Zooko Wilcox-O'Hearn
- Re: [Cfrg] Answers to HKDF questions David McGrew
- [Cfrg] KDF==MAC? and: how about HKDF-Poly1305? Re… Zooko Wilcox-O'Hearn
- Re: [Cfrg] KDF==MAC? and: how about HKDF-Poly1305… David McGrew
- [Cfrg] Fwd: KDF==MAC? and: how about HKDF-Poly130… Hugo Krawczyk