Re: [Cfrg] draft-mcgrew-hash-sigs implementation and findings

"Scott Fluhrer (sfluhrer)" <sfluhrer@cisco.com> Thu, 05 April 2018 00:16 UTC

From: "Scott Fluhrer (sfluhrer)" <sfluhrer@cisco.com>
To: Paul Selkirk <paul@psgd.org>
CC: "cfrg@ietf.org" <cfrg@ietf.org>
Date: Thu, 05 Apr 2018 00:16:02 +0000
Message-ID: <f9d4c3bbf70c43948f2b49cef41de8a0@XCH-RTP-006.cisco.com>
References: <5d590027-50d9-637e-8ef0-9b5a8ac22565@psgd.org>
In-Reply-To: <5d590027-50d9-637e-8ef0-9b5a8ac22565@psgd.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/cfrg/Gk2zcSh3LEL0M24huIX9ecJApiY>

Ok, I went through all your suggestions (from both reviews), and I've modified the draft accordingly. I do have the following comments, either to justify why I didn't follow a suggestion or to answer questions that you raised. I'll wait a few days before posting the update, to give you a chance to respond to my comments (so that, ideally, I won't need a -12 version).

- One thing that was raised repeatedly was the relative lack of pseudocode. Part of this is intentional; we wanted people to understand what the data structures and algorithms are, and where we included pseudocode, it was with the hope that it would be informative, not normative. There are actually a number of reasonable algorithms for several of the operations, and we didn't want to preclude them. That said, where you suggested pseudocode and I thought it would help with understanding, I included it.

- " I would break up paragraph 4, and add a short explanation of the HSS"
Excellent suggestion; the only thing I changed from your suggested text was omitting "signature generation time is somewhat faster"; actually, depending on the algorithm, it could be faster, the same speed, or even slower (which is a rather odd case, but it can happen).

- "Security String"; you point out that its definition has a lot of forward references; I moved that into the Rationale section

- " q is the leaf number, or alternately the current index into the LMS secret key. This feels like a layer violation."
Actually, that's important for security; without it, someone could mount a multitarget attack: generate a lot of different OTS keys, hash them, and if a resulting hash matches any one of a number of targets, he wins. By including 'q' in the value being hashed, the attacker must select which leaf node he is attacking (similarly, by including 'I', the identifier for the public key, the attacker must select which LMS key he is attacking).
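
To make the binding concrete, here is a minimal Python sketch of hashing the OTS chain end values together with 'I' and 'q'. The field layout, the constant D_PBLC, and the function name are assumptions made for illustration; the draft text is the authority on the exact hash input.

```python
import hashlib
import os

# Assumed domain-separation constant marking "OTS public key" hashes.
D_PBLC = 0x8080

def ots_public_key_hash(I, q, chain_ends):
    """Hash the chain end values, prefixed by the tree identifier I and leaf number q.

    Because I and q are part of the input, a hash computed for one
    (I, q) pair cannot be reused as a target for any other pair.
    """
    h = hashlib.sha256()
    h.update(I)                          # 16-byte LMS key identifier
    h.update(q.to_bytes(4, "big"))       # leaf number as a 32-bit big-endian value
    h.update(D_PBLC.to_bytes(2, "big"))  # assumed domain separator
    for y in chain_ends:
        h.update(y)
    return h.digest()

I = os.urandom(16)
ends = [os.urandom(32) for _ in range(4)]  # toy size; a real parameter set has more chains
k0 = ots_public_key_hash(I, 0, ends)
k1 = ots_public_key_hash(I, 1, ends)
```

The same chain end values hash to unrelated outputs under q=0 and q=1, so work an attacker does against one leaf does not carry over to another.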

- " Using LM-OTS outside of LMS is outside the scope of this document."
As written, yes, it is; however, I left the reference in there just in case...

- " This is phrased awkwardly, and implies that it's someone else's job to generate I."
Actually, it is someone else's job to generate I (as it must be the same value used in the LMS tree); I rephrased it to make it clearer

- " As a side note, our project already uses RFC4122 version 4 UUIDs, which are effectively 122-bit random numbers with a few fixed bits. I'm assuming that the security of this system doesn't require a full 128 bits for I."
Actually, the goal is to have enough entropy that there's no single value that is used for too many distinct LMS trees; with the current "all 128 bits random" design, the likelihood of 2**64 public keys having a 4 way collision (that is, there's a single I value that is used by four of the keys) is < 2**-128. For something with 122 bits of entropy, well, as long as you don't generate more than 2**55 or so public keys, you're still as safe as the original design. And, against conventional computers, this is massively conservative; we're starting with requiring circa 2**248 hash operations expected, and so even if you had a few more collisions than expected, you are still plenty safe (actually, you could use the same I value everywhere, and you'd still be safe). If you assume a Quantum Computer equipped adversary, you are still likely safe (although in that case, I wouldn't use the same I value everywhere).
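
As a rough sanity check on those figures, the chance of a 4 way collision can be bounded with a simple union bound: at most C(N, 4) groups of four keys, each colliding with probability 2**(-3*bits). The sketch below is only that estimate (the function name is made up for illustration), not the analysis behind the draft.

```python
import math

def log2_multicollision_bound(log2_keys, id_bits, k=4):
    """Return log2 of an upper bound on the chance of any k-way collision.

    Uses the union bound C(N, k) / 2**(id_bits * (k - 1)) together with
    C(N, k) <= N**k / k!, where N = 2**log2_keys.
    """
    return k * log2_keys - math.log2(math.factorial(k)) - id_bits * (k - 1)

full_128 = log2_multicollision_bound(64, 128)  # all 128 bits random, 2**64 keys
uuid_122 = log2_multicollision_bound(55, 122)  # 122 random bits, 2**55 keys
```

Both values come out below -128, which is consistent with the claim that 122 bits of entropy remains as safe as the original design at 2**55 keys.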

- " I might also move this whole description of node numbers to section 5.3, which is the only place it's actually used."
Actually, they are also referenced in 5.4.2.

- "LMS public key generation pseudocode"
Actually, appendix C already contains pseudocode

- "LMS signature" explanation needs a lot of help
Yup, it sure does. I added a lot of text here, along with some ASCII art (a diagram is a great help in explaining what an authentication path is; ASCII art isn't great, but it's better than nothing)
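
For readers who prefer code to diagrams, an authentication path can be sketched as the list of sibling nodes met while walking from a leaf up to the root. This assumes the usual numbering where the root is node 1, node k has children 2k and 2k+1, and leaf q of a height-h tree is node 2**h + q; the function name is hypothetical.

```python
def authentication_path(h, q):
    """Return the node numbers on the authentication path for leaf q.

    Assumes root = 1, children of node k are 2k and 2k+1, and leaves
    occupy nodes 2**h .. 2**(h+1) - 1.
    """
    if not 0 <= q < (1 << h):
        raise ValueError("leaf number out of range")
    node = (1 << h) + q
    path = []
    while node > 1:
        path.append(node ^ 1)  # sibling of the current node
        node >>= 1             # move up to the parent
    return path

path = authentication_path(3, 2)  # leaf 2 in a height-3 tree, i.e. node 10
```

For h=3 and q=2 the path is nodes 11, 4 and 3: the verifier combines the leaf with each of these in turn and ends up at the root.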

- " L has a maximum value of 8, which seems oddly prescriptive. Is there an engineering reason for that limit?"
It's mostly there because we had to put in some sort of limit (so that implementations didn't have to pretend to handle, say, 1000 levels), and we picked 8 as a reasonable number (as we couldn't think of any reason to have more than that). If someone has a use case that requires more, we'd have no problem raising that limit.

- " HSS does not require all trees to use the same LMS type code (i.e. to be the same height). While most reasonable people would use same-height trees, is there a use case for trees of different heights"
Actually, yes. At the very least, if you go through my HSS implementation, there are good reasons for the top and bottom trees to be of different heights:
- The top level tree doesn't change, and so as an optimization to rebuilding the tree (in the case where we're resuming the signature operation after restarting the program) we can store (and then reuse) nodes from the top level tree; as they never change, we don't have to worry about updating them. This makes having an especially large top level tree more attractive.
- During the signature operation, all the churn happens on the bottom level tree; when you sign a message (and then update the state and the nodes you record), almost all the operations happen to the bottom level trees; hence to speed up the signature operation, it would make sense to make the bottom level trees smaller.
Of course, these reasons are quite specific to my implementation; your implementation (which needs to operate under very different constraints) is likely quite different.
- " or even different hash algorithms"
That is less clear; we just didn't see a good reason to disallow it.
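
To put numbers on the different-heights point, here is a small sketch: the total number of signatures depends only on the sum of the heights, while the size of the bottom tree decides how much tree has to be built each time a bottom tree is exhausted. The helper names and the example heights are illustrative assumptions, not recommendations.

```python
def hss_capacity(heights):
    """Total number of messages an HSS key can sign, given per-level tree heights (top first)."""
    total = 1
    for h in heights:
        total *= 1 << h  # each level multiplies capacity by its leaf count
    return total

def bottom_tree_leaves(heights):
    """Number of signatures produced before a fresh bottom tree is needed."""
    return 1 << heights[-1]

tall_top = [20, 5]   # large, static top tree; small, cheap bottom trees
tall_bottom = [5, 20]
same_capacity = hss_capacity(tall_top) == hss_capacity(tall_bottom)
```

Both layouts sign 2**25 messages, but with [20, 5] each bottom tree has only 32 leaves, so the per-signature churn stays in a small tree.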

- " As a notational issue, I think of the signature as belonging with the thing being signed, rather than with the key that is used to sign it"
Normally, I would agree with you; however, a problem comes up: what is Sig[h-1]? Is it the signature of the bottom level LMS public key, or is it the signature of the actual message? Now, I suppose we could come up with different terminology for the second use; I decided not to.

- Signature generation pseudocode
Actually, we had text there which said mostly what your pseudocode did; however I felt that the operation was sufficiently complex that both the text explanation and the pseudocode make sense.

- " Why is there a gap between hbs_reserved and lms_sha256_n32_h5? Is it because the completely different enum ots_algorithm_type has values 1-4?"
A previous version had 1-4 stand for 16 byte versions of the hash function (that is, n=16). We decided to omit them (as the goal of this draft is to be postquantum, and 16 byte hashes aren't that), but we didn't reorganize things. A similar gap doesn't appear in the OTS parameter sets because those sets had the 16 byte hashes after the 32 byte hashes (don't ask me why; that predates when I started getting involved).
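
As an illustration of the resulting numbering, the two type code spaces might be written out as below. Only hbs_reserved, lms_sha256_n32_h5, and the 1-4 range for the OTS types come from this thread; the remaining names and values are assumptions, so check them against the draft's tables.

```python
from enum import IntEnum

class OtsAlgorithmType(IntEnum):
    # Names other than the 1-4 range itself are assumed for illustration.
    lmots_reserved = 0
    lmots_sha256_n32_w1 = 1
    lmots_sha256_n32_w2 = 2
    lmots_sha256_n32_w4 = 3
    lmots_sha256_n32_w8 = 4

class LmsAlgorithmType(IntEnum):
    hbs_reserved = 0
    # 1-4 are left unassigned: they once denoted the dropped n=16 parameter sets.
    lms_sha256_n32_h5 = 5
    # The entries below are assumed values following on from h5.
    lms_sha256_n32_h10 = 6
    lms_sha256_n32_h15 = 7
    lms_sha256_n32_h20 = 8
    lms_sha256_n32_h25 = 9

unassigned_lms = [v for v in range(1, 5) if v not in set(LmsAlgorithmType)]
```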

- Section 4.4, Algorithm 1, step 2: n is not used here.
Actually, it's the output size of H. Depending on how your language deals with byte strings, you'll need it (Python wouldn't, a C implementation most likely would)

- Section 5.2, Algorithm 5, step 1: m is not used here.
Similarly, it's the output size of H.
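
To illustrate why a Python implementation never has to mention n or m, here is a simplified hash chain. Byte strings carry their own length, so the output size only shows up if you ask for it. The hash input layout here is deliberately simplified and is not the draft's exact format.

```python
import hashlib

def iterate_chain(x, steps, prefix=b""):
    """Apply SHA-256 repeatedly to x, mixing in a prefix and the step index.

    Simplified sketch: the real hash input contains more fields.
    """
    if not 0 <= steps <= 255:
        raise ValueError("steps must fit in one byte")
    tmp = x
    for j in range(steps):
        tmp = hashlib.sha256(prefix + bytes([j]) + tmp).digest()
    return tmp

n = hashlib.sha256().digest_size   # output size of H, available but never needed above
y = iterate_chain(b"\x00" * n, 15)
```

A C implementation, by contrast, would typically need n to size the buffers that hold tmp and the concatenated hash input.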

- In the LMS signature, it's odd that q is first and lms_type is after the ots_sig. However, that would be an incompatible change to the protocol, which I don't expect to happen at this late date.
Yes, unless something is flagged as a security issue (or someone asks for additional parameter sets), nothing substantive is likely to change
