Re: [Cfrg] Memory-efficient evaluation of data-independent memory-hard functions

Thank you for this post; the recent work in this area has been really
fascinating.

On Fri, Feb 12, 2016 at 11:12 AM, Joel Alwen <jalwen@ist.ac.at> wrote:

> Hi,
>
> I thought some of you might be interested in Jeremiah Blocki and my
> recent paper that just appeared on eprint
> (http://eprint.iacr.org/2016/115). It complements the recent work on
> data-dependent MHF (i.e. scrypt) by instead focusing on data
> *independent* MHFs (e.g. both Catena functions, Argon2i, all three
> Balloon Hashing functions, etc.). Some of you may find it relevant to
> recent discussion on the list about finding a password-hashing standard.
>
> In a nutshell we develop a new class of parallel evaluation algorithms
> for several rather general classes of MHFs (including all those I just
> listed) and show that the resulting memory complexity is a fair bit
> lower than one might hope for when allowing parallelism (like in an
> ASIC). Put differently, we give some better than hoped for
> time-memory-trade-offs by using parallelism to evaluate several
> instances of the MHF at once resulting in a low "Time X Memory" per
> instance. In other words just the kind of algorithm one might find
> useful for launching cheap(er) brute-force attacks on these iMHFs.
>
> From a more theoretical perspective we also show a general lower-bound.
> It states that no data-independent MHF (say doing 1-pass over n memory
> blocks) can achieve even Omega(n^2/log(n)) cumulative
> memory-complexity (let alone the Omega(n^2) which people have been
> shooting for up till now).
>
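
To get a feel for the size of that gap, here is a rough sketch of my own
(not taken from the paper) comparing the n^2 target with n^2/log(n) for a
few block counts. It assumes base-2 logs and ignores all constants; both
are my assumptions, and the helper names are made up for illustration. For
reference it also prints the cumulative memory of the naive sequential
evaluation that keeps every block, which in the simple model where step i
holds i blocks is 1 + 2 + ... + n = n(n+1)/2.

```python
import math


def naive_cumulative_memory(n: int) -> int:
    """Sum of blocks held over n steps if step i keeps blocks 1..i."""
    return n * (n + 1) // 2


def bound_n2_over_log(n: int) -> float:
    """n^2 / log2(n), constants ignored (base 2 is an assumption)."""
    return n * n / math.log2(n)


if __name__ == "__main__":
    for exp in (10, 20, 30):
        n = 2 ** exp
        # The ratio is simply log2(n), i.e. the factor given up versus n^2.
        ratio = (n * n) / bound_n2_over_log(n)
        print(f"n=2^{exp}: n^2 / (n^2/log2 n) = {ratio:.0f}, "
              f"naive CC = {naive_cumulative_memory(n)}")
```

So for block counts in this range the log factor alone is somewhere between
10 and 30, before any constants enter the picture.
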
> So where do these results leave us? AFAIK the closest we've got to an
> iMHF with a proof for large memory-complexity is the iMHF in my paper
> with Vladimir Serbinenko. Unfortunately the construction is based on a
> recursively built graph. What that boils down to is that if we want to
> compute the incoming edges of a node (i.e. which blocks from memory go
> into the next call to the compression function H) the straightforward
> computation (unravelling the recursion) takes log(n) operations. But
> really we want constant time (say a single call to a compression
> function + a mod operation as in scrypt) as otherwise the throughput of
> any implementation is going to suffer. Also the proof only shows
> something like Omega(n^2/log^10(n)) (with probably nasty constants)
> which is asymptotically way better than any of the above-mentioned
> iMHFs but actually pretty useless for practical parameters. I think
> it's easy to tighten the proof down to something closer to
> Omega(n^2/log^7(n)) but that's still pretty unsatisfactory and anyway
> it doesn't help make the incoming-edges function easier to compute.
>
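
For contrast with the log(n) unravelling, the constant-time flavour asked
for here might look roughly like the sketch below: one hash call on the
node index plus a mod. This is only my own illustration. It is not the
Alwen-Serbinenko graph, nor any of the functions listed above. SHA-256
stands in for the compression function, and `parent_index` and
`incoming_edges` are hypothetical names. Because the index depends only on
i (no password or salt goes in), the access pattern stays data-independent.
I make no claim that the resulting graph has good memory-hardness.

```python
import hashlib


def parent_index(i: int) -> int:
    """Pick a predecessor j < i for node i with one hash call and one mod.

    The result depends only on i, so the access pattern is data-independent.
    """
    if i < 1:
        raise ValueError("node 0 has no predecessor")
    digest = hashlib.sha256(i.to_bytes(8, "big")).digest()
    return int.from_bytes(digest[:8], "big") % i


def incoming_edges(i: int) -> tuple:
    """Node i reads the previous block and one hash-chosen earlier block."""
    return (i - 1, parent_index(i))
```

The point is only the cost profile: the edge lookup is O(1) per node, whereas
a recursive graph definition needs about log(n) steps to resolve.
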
> Anyway I hope you enjoy the paper and that it provides a bit of fuel for
> thought. :-) If anything, IMHO it points out room for a lot more work in
> this area (both from a practical and theoretical point of view) before I
> would feel confident in proposing a standard for wider adoption.
>
> - Joel
>
>
>
>
> PS. What does the theoretical lower-bound mean for our search for a new
> password-hashing standard? What can we still hope for?
>
> Here are a couple thoughts about those questions.
>
> 1) Well, we can always just ignore it if we find an iMHF which no one
> manages to attack after a convincing amount of effort has been exerted.
> After all we know efficient Random Oracles can't exist yet we still
> effectively ask that our new hash function candidates behave like
> one. This route does seem to give up on proof based evidence for (at
> least asymptotic) security though.
>
> 2) We could decide to settle for Omega(n^2/log(n)) assuming we can find
> a clean, simple and strongly explicit iMHF with accompanying proof
> matching this bound with small constants. (Notice that in the recent
> scrypt paper we only show Omega(n^2/log^2(n)) so it's not even clear that
> a data-*dependent* MHF could break that bound.)
>
> 3) Alternatively, if we want to still shoot for a proof of Omega(n^2)
> complexity, then one potential way to still get there could be to make
> an analysis for concrete practical parameter ranges (rather than
> asymptotic behaviour). The large constants in the lower-bound do not
> rule this out. (Though the parameters of our evaluation algorithms for
> almost all the iMHFs I listed above do rule this out for those
> functions.) See figures in the paper...
>
> 4) Another less preferable way IMHO would be to increase the number of
> inputs to the underlying hash function. (Something like the Single and
> Double Buffer balloon hashing functions do as they use 21 inputs.) E.g.
> I think the lower-bound lets us have Omega(n^2) memory-hardness if we
> use order log(n) (or maybe log^2(n)) inputs to the hash function
> (instead of a constant like 2 or 3 as in most current iMHFs). There are
> at least two reasons why I'm personally not super in favour of this
> approach though. Practically it means worse throughput for an
> implementation as evaluating the hash function takes a lot longer now.
> Theoretically I'd also call it cheating because it really boils down to
> making log(n) calls to a compression function (as was pointed out
> already in a couple papers on the subject). In particular, really we
> should be looking at MHFs defined over compression functions, not
> arbitrary input length hashes. But if we look at what happens to any proof
> of memory-hardness for the high-indegree MHF when we view the function as
> being over a compression function then AFAIK the proof loses a
> multiplicative factor cubic(!) in the indegree. So if we prove Omega(n^2)
> for an MHF with d inputs to the hash function then I only see how to
> prove that the same MHF viewed over the underlying compression function
> really has Omega(n^2/d^3) complexity. (See Alwen and Serbinenko for how
> that works.)
>
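
To put numbers on the cubic loss, here is a quick sketch with constants
ignored. The indegrees are my own picks for illustration: 2 and 3 as in
most current iMHFs, 21 as quoted above for the Balloon variants, and
20 = log2(n) for n = 2^20. `loss_factor` is a made-up helper name.

```python
def loss_factor(d: int) -> int:
    """Multiplicative loss d^3 when a d-input hash is viewed over a
    compression function, per the n^2 -> n^2/d^3 statement above."""
    return d ** 3


if __name__ == "__main__":
    n = 2 ** 20
    for d in (2, 3, 20, 21):
        factor = loss_factor(d)
        print(f"d={d}: n^2/d^3 = {n * n // factor} (loss x{factor})")
```

With d around 20 the provable bound drops by a factor of several thousand,
which is far more than the log(n) the extra inputs were meant to buy back.
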
>
> Also I'd love to hear other ideas for how to deal with the lower-bound
> if you have any!
>