[Cfrg] Re: Comments on SIV and draft-dharkins-siv-aes-00

Hi Dan,

On 10/19/07 1:23 PM, "Dan Harkins" <dharkins@lounge.org> wrote:

> 
>   Hi David,
> 
> On Thu, October 18, 2007 1:41 pm, mcgrew wrote:
>> Hi Dan,
>> 
>> I'm sorry to take so long in getting back to you.  The new draft looks
>> great - thanks for carrying it forward.  I have a bunch of comments,
>> some on the document, and some on SIV itself.
> 
>   I'm working on an -01 version of the draft right now so your comments
> are timely.

Super - I guess that I came in just under the wire.

I hope that other people on the mailing list can provide comments on the
draft too!

> 
>> First, a bunch of detailed comments.
>> 
>> The abstract says that "SIV takes a key, a plaintext, and a vector of
>> data".  I think the term "vector" will not be intuitive to many readers,
>> so perhaps it would make sense to say that the vector is "an array of
>> variable-length octet strings", or something like that.  I mean to
>> suggest that you add text describing what is meant, rather than changing
>> the terminology from what Phil and Tom wrote.
> 
>   OK
> 
>> For the key derivation application (Section 1.3.3), what would the SIV
>> plaintext input be equal to?  Would it be omitted?
> 
>   The key derivation application uses S2V, the vectorized PRF component
> of SIV. There is no encryption step.
> 
>> Also, I would guess that SIV-based key derivation would only be
>> appropriate for deriving keys from a given key, and that it may not be
>> suitable for use in deriving keys from data that is unpredictable but
>> not uniformly random, as is used e.g. in Diffie-Hellman.  At least, I
>> believe that this is outside of the scope of what is claimed in the
>> security analysis, and it would make sense to document that (after
>> verifying with Phil and Tom).
> 
>   Hopefully my explanation above addresses this comment. I will try to
> be more explicit in 1.3.3 that it is S2V being used for key derivation.
> 
>> I think that it might be useful to help explain the vector of inputs by
>> using an analogy to the POSIX "iovec" or scatter/gather functions readv
>> and writev; these functions also allow the user to avoid
>> data-marshalling, and they should be familiar to many implementers.  Of
>> course, the way that readv and writev work doesn't depend on the way
>> that data is broken into smaller elements, but SIV does.
> 
>   That's a good idea. readv and writev seem to be analogous. Each iovec
> structure represents an AD input and the array of such structures represents
> the vector of inputs. I will try to come up with some appropriate verbiage.
> 
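
To make the iovec analogy and its limit concrete, here is a minimal Python
sketch for a POSIX system.  The helper names (gather, toy_vector_prf) are
hypothetical, and the length-prefixed HMAC is only a stand-in for a
boundary-sensitive PRF; it is NOT S2V.  It shows that writev forgets how
the data was split, while a vector-input function must not.

```python
import hashlib
import hmac
import os


def gather(parts):
    """Send parts through a pipe with one writev call; return the byte stream."""
    r, w = os.pipe()
    try:
        os.writev(w, parts)  # element boundaries are lost here
        return os.read(r, 4096)
    finally:
        os.close(r)
        os.close(w)


def toy_vector_prf(key, parts):
    """Toy boundary-sensitive PRF (assumption for illustration, not S2V):
    length-prefix every element, then HMAC-SHA256 the encoding."""
    mac = hmac.new(key, digestmod=hashlib.sha256)
    for p in parts:
        mac.update(len(p).to_bytes(8, "big"))
        mac.update(p)
    return mac.digest()


key = b"k" * 16
split_a = [b"ab", b"c"]
split_b = [b"a", b"bc"]

same_stream = gather(split_a) == gather(split_b)
same_tag = toy_vector_prf(key, split_a) == toy_vector_prf(key, split_b)
print("writev output identical:", same_stream)
print("vector PRF tags identical:", same_tag)
```

The two splits produce the same bytes on the wire but different tags, which
is the property the analogy has to warn the reader about.
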
>> Does S2V mean "vector to string"?  Would "V2S" be sensible?
> 
>   "string to vector"
> 
>> Section 1.3.4 typo - "troughput".  Also, it might be useful to provide
>> the detail that SIV requires two passes over the data during an
>> encryption operation, and thus is less suitable for pipelined hardware
>> implementations.
> 
>   Pipelined hardware implementations are definitely not the place for
> something like SIV. I see the difference more in "control plane" versus
> "data plane" application. SIV is more appropriate for a control plane
> application which is typically something in user-space calling into a
> cryptographic library to obtain encryption services. In such a situation
> the application developer may not be aware of the requirements surrounding
> nonce use for a cipher or may miss a subtle nuance in those requirements
> and not be able to ensure the security of the application. A data plane
> application of an AEAD cipher would typically be able to control the
> nonce space (along the lines of something like what SP 800-38D requires).
> 
>   The fact that SIV requires 2 passes of the data while something like GCM
> only requires 1 really just underscores for me the appropriateness of
> the distinction above.
> 
>   I'll mention 2 passes in that section.
> 
>> Notation "X10*" - might be notationally clearer to define p(X) as a
>> padding function, since "X10" looks like a variable name.
> 
>   OK
> 
>> I like the compatibility between SIV-CTR and typical CTR implementations.
>> 
>> Sections 3 and 6 define how to use SIV as a nonce-based AEAD, and how to
>> use it as such in the context of [AEAD].  But I think that a bit more
>> specificity is needed here.  Section 3 seems to allow multiple
>> "associated data" inputs, while Section 6 will need to require that
>> there is just a single AD input.  So I think that Section 3 needs to add
>> a definition that's specific to the use of SIV together with [AEAD].
> 
>   It's a shame that [AEAD] requires a single AD input. I know Phil has
> commented that it should allow multiple inputs. Your response was that
> it is too late. Is it? [AEAD] is still an I-D.
> 
>   [AEAD] is supposed to provide a generic interface into AEAD cipher modes.
> It doesn't as long as it constrains valid modes.
> 
>   A single AD input is the degenerate case. SIV can handle a single AD
> input just fine but the generic interface to AEAD cipher modes should
> not force it.
> 
>   I don't think the changes needed in [AEAD] to remove the limitation on
> AD inputs are significant, and I'd be happy to suggest text on how to do
> that if you're willing to produce another rev of the draft.
> 
>> Next, some higher-level comments.
>> 
>> First, what's the motivation for key wrapping?  This is an important
>> question that a lot of people have wrestled with.  I understand from
>> Section 1.3.1 that nonceless AEAD is valuable because there are existing
>> protocols that do not make use of nonces, so SIV's capability for
>> nonceless AEAD enables it to be easily adopted by these protocols.  This
>> is a very good point.  Nonetheless, it does not address the question of
>> "when should a user use nonceless AEAD?" outside of those "legacy"
>> cases.  I would expect that we would want to provide guidance that users
>> SHOULD use nonces wherever possible, but MAY otherwise do without
>> nonces.  (Perhaps there should be an exception for cases in which
>> determinism is essential, e.g. database applications in which
>> plaintext-to-ciphertext mapping must be deterministic.  But this is
>> clearly a special case.)
> 
>   I think the motivation is to provide deterministic authenticated
> encryption for specialized data, such as cryptographic keys. The American
> Standards Committee Working Group X9F1 has come up with a draft standard
> for such a problem. S/MIME has RFCs on that problem. I do mention that in
> the draft.
> 
>   I'm a little reluctant to jump into that brier patch though. [DAE] does
> a very nice treatment of the reasons behind, and requirements for, key
> wrapping both informally (in the introduction) and formally (in appendix
> C). I will try to address your comment by pointing readers to [DAE] and
> X9F1. Would that be acceptable?

Seems to me that normative guidance on when nonces are/aren't needed should
be in the document, since it significantly affects security.  I propose at
least something like this:

<quote>
Applications SHOULD include a nonce in the associated data.  Either the
nonce must be generated uniformly at random and be at least as long as the
key, or each nonce value must be distinct for each distinct invocation of
the SIV encrypt function.  Applications MAY do without a nonce in the
associated data if the plaintext contains data that is unpredictable to an
adversary, e.g. a secret key.
</quote>

Details are up for debate, but I *think* that most people who care would
agree with that guidance.
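
As a rough illustration of the two nonce options in that text, here is a
minimal Python sketch.  All names (random_nonce, CounterNonce, build_ad)
are hypothetical, and placing the nonce as the last element of the
associated-data vector is my assumption, not something the proposed text
requires.

```python
import secrets


def random_nonce(key_len):
    """Option 1: uniformly random nonce, at least as long as the key."""
    return secrets.token_bytes(key_len)


class CounterNonce:
    """Option 2: a distinct nonce per invocation, taken from a counter."""

    def __init__(self, width=16):
        self._width = width
        self._next = 0

    def next(self):
        # Refuse to wrap around, since a repeated nonce breaks distinctness.
        if self._next >= 1 << (8 * self._width):
            raise OverflowError("nonce space exhausted")
        value = self._next.to_bytes(self._width, "big")
        self._next += 1
        return value


def build_ad(header, nonce=None):
    """Assemble the AD vector; the nonce, if any, is one more element.
    Putting it last is an assumption made for this sketch."""
    return [header] if nonce is None else [header, nonce]


key_len = 32
counter = CounterNonce()
ad_random = build_ad(b"hdr", random_nonce(key_len))
ad_counter = [build_ad(b"hdr", counter.next()) for _ in range(3)]
ad_keywrap = build_ad(b"hdr")  # nonceless: plaintext is itself a secret key
```
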

My concern here isn't specific to SIV at all - my concern is that, if we
define specifications of keywrap algorithms and recommend their use, we
should be very clear about where and how we expect them to be used.

Off topic: OK, now that I have noticed and read Appendix F of [DAE], I think
that it might be a good idea to use that appendix as a source of input data
for the test vectors ;-)

> 
>> Second, I'm skeptical about the value of the vector input, so I suggest
>> that more motivation, explanation, and an example usage or two, be added
>> to the draft.  I'll summarize my skepticism below in the hope that it
>> will be helpful.
> 
>   OK, I'll come up with an example on using a vector of inputs.
> 
>> As I understand it, the two benefits of the vector-input are that it
>> eliminates the need for the user to marshal multiple inputs into a
>> single input, and that it offers performance advantages in those cases
>> where there are repeated invocations of the crypto function in which
>> some of the inputs remain constant.
>> 
>> Regarding performance, any AEAD algorithm can be made to support a
>> scatter/gather or init/update/final interface as per RFC1321.  It is a
>> conventional technique to copy the intermediate state after an update
>> operation, and then use it to process different suffixes.  Beyond that,
>> there are functions that support an "incremental" interface, in the
>> sense of "Incremental Cryptography and Application to Virus Protection"
>> (27th ACM Symposium on the Theory of Computing, May 1995).  GMAC, and
>> many other functions that make use of universal hashing, can be used in
>> this way.  So it is possible to reap the performance benefit claimed for
>> SIV with some existing functions, and it's possible to realize the
>> performance advantage without using a vector of inputs.
> 
>   You seem to be arguing against the novelty of this idea but not against
> the idea itself.

I was trying to make the point that the performance advantages can be
realized without changing the user interface, so to speak.
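
For concreteness, the copy-the-state technique looks roughly like this with
an init/update/final interface.  This is a minimal Python sketch using the
standard library's HMAC-SHA256 purely as a stand-in for whatever MAC or
AEAD is in use; the key, prefix, and suffix values are made up.

```python
import hashlib
import hmac

key = bytes(range(32))
prefix = b"fixed context and label"
suffixes = [b"\x00\x11\x22\x33\x44\x55", b"\x66\x77\x88\x99\xaa\xbb"]

# Process the constant prefix once and keep the intermediate state.
state = hmac.new(key, prefix, hashlib.sha256)


def tag_for(suffix):
    """Fork the saved state and finish it with a per-invocation suffix."""
    fork = state.copy()
    fork.update(suffix)
    return fork.digest()


tags = [tag_for(s) for s in suffixes]

# Reference values computed the slow way, over one marshalled input.
full = [hmac.new(key, prefix + s, hashlib.sha256).digest() for s in suffixes]
print(tags == full)
```

The caller still sees a single-input function; the saving comes entirely
from reusing the state behind that interface.
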

> 
>   It is a natural way to deal with AD. In [AEAD] you say,
> 
>     "When using an AEAD to secure a network protocol, for example,
>      this input could include addresses, ports, sequence numbers,
>      protocol version numbers, and other fields that indicate how the
>      plaintext or ciphertext should be handled, forwarded, or processed."
> 
> That's, potentially, several distinct pieces of information. Some may
> be contiguous (addresses, ports and sequence numbers might all be in a
> single header) but others might not be. Some AD might not even transit
> with the authenticated and encrypted data.
> 
>   I guess I can try to highlight this concept but it seems that what
> should really be explained is why these multiple distinct pieces of
> information have to be viewed as a single component input to an AEAD
> cipher mode.

To play the devil's advocate: why, then, is there a single plaintext input
in SIV instead of a vector of plaintext inputs?

> 
>> As a concrete example, one could replace the use of AES-CMAC on a vector
>> of inputs in SIV with a polynomial hash function (such as GHASH, the
>> component of GCM/GMAC) applied to a single input.  This would allow even
>> *more* performance optimizations (in particular, it allows optimizations
>> whenever there are repeated invocations of the crypto function in which
>> *any part* of the input remains constant).
> 
>   This is the second time I have heard this fantastic idea.
> 
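
A toy illustration of the "any part of the input" optimization for a
polynomial hash follows, as a minimal Python sketch.  It works over the
prime field modulo 2^127 - 1 rather than GF(2^128), so it is NOT GHASH,
and it omits the masking step a real MAC needs; poly_hash and update_block
are hypothetical names.

```python
P = (1 << 127) - 1  # Mersenne prime, used here as a toy field modulus


def poly_hash(k, blocks):
    """Horner evaluation: sum of blocks[i] * k^(n - i) mod P, i from 0."""
    h = 0
    for m in blocks:
        h = (h + m) * k % P
    return h


def update_block(h, k, n, j, old, new):
    """Patch the hash after block j (0-based) of an n-block message changes,
    without touching the other blocks."""
    return (h + (new - old) * pow(k, n - j, P)) % P


k = 0x1234567890ABCDEF1234567890ABCDEF % P
blocks = [11, 22, 33, 44, 55]
h = poly_hash(k, blocks)

# Change one block in the middle and update incrementally.
patched = update_block(h, k, len(blocks), 2, blocks[2], 99)
recomputed = poly_hash(k, [11, 22, 99, 44, 55])
print(patched == recomputed)
```

The update costs one modular exponentiation regardless of message length,
and the changed block can sit anywhere in the input.
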
>> In practice, it seems that these optimizations aren't used so much.  I
>> believe that the reason is that the additional complexity doesn't seem
>> warranted when the amount of data that stays the same across invocations
>> is small compared to the entire data.  FWIW, I do think that there are
>> applications for incremental message authentication within the area of
>> security for data-at-rest.
>> 
>> The key derivation example that's used to motivate the vector-of-inputs
>> points out that in key derivation applications, it is common to have
>> multiple inputs to the KDF, some of which stay fixed across multiple
>> invocations of the KDF algorithm.  This is true, though I question the
>> performance gains, because I suspect that in the KDF case, there are many
>> small inputs.
> 
>   I attempted to get an S2V-based KDF adopted by the IEEE 802.11r (Fast
> Handoff) Task Group. Performance of the S2V KDF (using AES-CMAC) was
> four times faster than the HMAC-SHA256 KDF that 11r uses. This was a
> real-world example in which the context and label being bound into the
> derived key was constant and all that changed was the MAC address of the
> AP to whom the new key was to be delivered.

Great - might be good to add that to the draft.
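
To show where the saving in that kind of KDF comes from, here is a minimal
Python sketch of a vector-input PRF whose shape loosely follows S2V
(doubling and xor of per-element PRF outputs).  It is an illustration, not
the draft's algorithm: truncated HMAC-SHA256 stands in for AES-CMAC, the
empty-vector case is left out, and the label, context, and MAC-address
values are invented.

```python
import hashlib
import hmac

BLOCK = 16


def prf(key, data):
    # Stand-in for AES-CMAC (assumption): HMAC-SHA256 truncated to 16 bytes.
    return hmac.new(key, data, hashlib.sha256).digest()[:BLOCK]


def dbl(block):
    """Multiply by x in GF(2^128): shift left, fold the carry back as 0x87."""
    n = int.from_bytes(block, "big") << 1
    if n >> 128:
        n = (n & ((1 << 128) - 1)) ^ 0x87
    return n.to_bytes(BLOCK, "big")


def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def absorb(key, d, element):
    """Fold one non-final vector element into the running value."""
    return xor(dbl(d), prf(key, element))


def finish(key, d, last):
    """Fold in the final element and produce the output."""
    if len(last) >= BLOCK:
        t = last[:-BLOCK] + xor(last[-BLOCK:], d)
    else:
        padded = last + b"\x80" + b"\x00" * (BLOCK - len(last) - 1)
        t = xor(dbl(d), padded)
    return prf(key, t)


def s2v_like(key, elements):
    """Whole computation from scratch; needs at least one element."""
    d = prf(key, bytes(BLOCK))
    for e in elements[:-1]:
        d = absorb(key, d, e)
    return finish(key, d, elements[-1])


key = b"\x01" * 16
label, context = b"example label", b"example context"
macs = [bytes.fromhex("001122334455"), bytes.fromhex("66778899aabb")]

# The label and context never change, so absorb them once and cache.
cached = absorb(key, absorb(key, prf(key, bytes(BLOCK)), label), context)
derived = [finish(key, cached, mac) for mac in macs]
```

Each additional derived key then costs only the work in finish, since the
fixed elements have already been absorbed.
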

David

_______________________________________________
Cfrg mailing list
Cfrg@ietf.org
https://www1.ietf.org/mailman/listinfo/cfrg