Re: [Cfrg] RFC Draft for KangarooTwelve XOF draft-viguier-kangarootwelve-00

David Wong <> Thu, 15 June 2017 17:27 UTC

Return-Path: <>
Received: from localhost (localhost []) by (Postfix) with ESMTP id BC7671293D8 for <>; Thu, 15 Jun 2017 10:27:21 -0700 (PDT)
X-Virus-Scanned: amavisd-new at
X-Spam-Flag: NO
X-Spam-Score: -1.999
X-Spam-Status: No, score=-1.999 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, FREEMAIL_FROM=0.001, RCVD_IN_DNSWL_NONE=-0.0001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001] autolearn=ham autolearn_force=no
Authentication-Results: (amavisd-new); dkim=pass (2048-bit key)
Received: from ([]) by localhost ( []) (amavisd-new, port 10024) with ESMTP id 95a0c7T_KwzL for <>; Thu, 15 Jun 2017 10:27:19 -0700 (PDT)
Received: from ( [IPv6:2607:f8b0:400d:c0d::235]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by (Postfix) with ESMTPS id DE72D129329 for <>; Thu, 15 Jun 2017 10:27:18 -0700 (PDT)
Received: by with SMTP id w1so29155282qtg.2 for <>; Thu, 15 Jun 2017 10:27:18 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;; s=20161025; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc:content-transfer-encoding; bh=qcUfsi1HqcKaJbRoZSqkAiv72twLkga6Ld1d1/lLGXc=; b=mdkeU56CAeiuJRwQtGIrV9PT24bv95kb5i3Ohs51bcsKxXbSTri7dxeuHJG+w8KHh7 IgkNNZ3ebL+gwoVYEAi8KeQaNIkaz0AhCUq4itM39d232dH94Lnoxr6rvBtI+KqKGJ1J 5r+bXn3rI8Qata6LmdvVTjPDdSpWCkMfOEOinbeRQM1qxTSOecwhAEzATQ7CNX883Uvq UYLnUbvVLDDflbpV/R1vyFbzF9VxVIaaTRN8W6wE0x/f7mGcvOZTuzZgaqjS5JygCoo/ Vg5EwbqcZ4lwV75dMA4/7+d+Ed3KXoMFossrBTzNCraU7mkCg2fX8MheAfl47ZcmEI2A ux5Q==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc:content-transfer-encoding; bh=qcUfsi1HqcKaJbRoZSqkAiv72twLkga6Ld1d1/lLGXc=; b=huw6piEne1RmnFjnnSTmHMtPotwHfb7VZOz4UyWYu3Zn1SVKJ0tdvO5FhMY1lacXzc zjqwiR6NNZh7VvXkEzjZRMtjlZ7dqEtz2dCtRIL0IkCdIx249jIMJddZbRuFf8K443/J zcQwJTz2WufI7eUYEtcpJyUw7UQJqampvy+zP2AElc/4zw++qWdvGe1niWIxTGPnOONG zoiSiSuP31/qGJuxz4B0ACpv8jHND21VcHOwObjr+EFKua/DrEfqNTv3auMHJ9WOzRsl wy7E6t7o7MliZV4OXKuyacA6+a4jZno/nae8yxSPqzVaKi4CkFFRJAiR1gm6pT/6JOEP Ip7A==
X-Gm-Message-State: AKS2vOxi/ukHMr+JkAWvzfehRu7yfLPk+3EH7NI4obE8i8JRpYXhmWXp pKk+v1MAYjY8M+fm3nqTPKoVlPCqIw==
X-Received: by with SMTP id z134mr7783288qka.35.1497547637685; Thu, 15 Jun 2017 10:27:17 -0700 (PDT)
MIME-Version: 1.0
Received: by with HTTP; Thu, 15 Jun 2017 10:27:17 -0700 (PDT)
In-Reply-To: <>
References: <>
From: David Wong <>
Date: Thu, 15 Jun 2017 18:27:17 +0100
Message-ID: <>
To: =?UTF-8?Q?Beno=C3=AEt_Viguier?= <>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Archived-At: <>
Subject: Re: [Cfrg] RFC Draft for KangarooTwelve XOF draft-viguier-kangarootwelve-00
X-Mailman-Version: 2.1.22
Precedence: list
List-Id: Crypto Forum Research Group <>
List-Unsubscribe: <>, <>
List-Archive: <>
List-Post: <>
List-Help: <>
List-Subscribe: <>, <>
X-List-Received-Date: Thu, 15 Jun 2017 17:27:22 -0000

Hey Benoit,

I'm happy to see that the first draft is done!

The whole thing looks pretty good to me. I have some minor feedback as
well as a major one right below:

I think the specification mixes implementation tricks and the real
specification keccak. If you're planning on giving (reversed) values
in hexadecimal for the dsByte and the padding, I think it is worth
specifying the sponge function differently with inversed operations in
the permutation and a transformation from input to lanes, relevant to
the tricks that implementations use. This would also mean getting rid
of any LSB 0 bit-numbering mention in the convention section. I think
this RFC would be a really good opportunity to clear things up and
simplify the implementers' job.

I would start by defining F as F(input, domain_separation,
output_length). Then the following algorithm for the function F:

// Padding
1. append the domain_separation byte to the input
2. append zeros to the input until it becomes a multiple of the rate (168 bytes)
3. XOR the last byte of the padded input with 0x80

// XOR'ing
4. split the input in n blocks of 168 bytes (rate size): block_1, ..., block_n
5. For each block, apply the Absorb algorithm
      Absorb(block_1, state); ...; Absorb(block_n, state)
      (note that each Absorb call is taking an updated state from the
previous Absorb call)

// Output'ing
6. concatenate the outputs of Squeeze algorithm until the amount of
bytes collected is greater or equal
   to output_length
      output = Squeeze(state) || ... || Squeeze(state)
      (note that each Squeeze call is taking an updated state from the
previous Squeeze call)
7. truncate the output to the size of output_length and return the result

// Absorb algorithm
Absorb(input, state) takes an input of 168 bytes, the state of the
sponge and updates the state.

1. Split the input in 21 8-byte blocks: input_1, ..., input_21; such that
   input_i = {b_i_0, b_i_1, b_i_2, b_i_3, b_i_4, b_i_5, b_i_6, b_i_7}
2. For each input_i, reverse the order of the bytes, such that
   reversed_input_i = {b_i_7, b_i_6, b_i_5, b_i_4, b_i_3, b_i_2, b_i_1, b_i_0}
3. For the first 21 "public" lanes of the sponge seen as
   Lane_i = {L_i_0, L_i_1, L_i_2, L_i_3, L_i_4, L_i_5, L_i_6, L_i_7},
   XOR each reversed_input_i against each Lane_i, such that
   Lane_i = {L_i_0 + b_i_7, L_i_1 + b_i_6, L_i_2 + b_i_5, L_i_3 +
b_i_4, L_i_4 + b_i_3, L_i_5 + b_i_2, L_i_6 + b_i_1, L_i_7 + b_i_0}
4. Update the sponge state by calling the permutation with only the
last 12 rounds (see Appendix A)

// Squeeze algorithm
output = Squeeze(state) takes the state of the sponge and updates the
state, while returning a 168 bytes output

1. Copy out the 21 "public" lanes of the sponge state Lane_1, ...,
Lane_21 seen as
   Lane_i = {L_i_0, L_i_1, L_i_2, L_i_3, L_i_4, L_i_5, L_i_6, L_i_7}
2. Reverse the bytes of each copied lane:
   Reversed_Lane_i = {L_i_7, L_i_6, L_i_5, L_i_4, L_i_3, L_i_2, L_i_1, L_i_0}
3. Update the sponge state by calling the permutation with only the
last 12 rounds (see Appendix A)
4. Return Reversed_Lane_1 || ... || Reversed_Lane_21

Now, in the appendix A.1. of the permutation, I would be careful to
define every operation in the reverse direction. For example ROL64()
is done by a left shift << instead of a right shift >>. It should also
be mentionned that operations have been reversed since the lanes have
been reversed.

I think this is important because the point of such an RFC is to dumb
down the technical details and facilitate the implementation. A lot of
shortcuts can be taken here because we do not care about other
rate+capacity variants, or non-multiple of bytes.

I have some more feedback about details here and there. Feel free to
ignore or acknowledge as you deem relevant.

→ The abstract

  It could contain more information on the use-case rather than on
what type of primitive it is.
  I'm saying that because I often run into obscure RFC and have a hard
time understanding from the abstract what they are for. Here is how I
would think is important:

  * K12          → KangarooTwelve is an eXtendable Output Function (XOF)
  * trust        → The design is based on SHA-3 and done by the same inventors
  * XOF          → It provides an arbitrary output length that can be
truncated to any desired size
  * fast         → Its performs better than SHA-3, SHA-2 and SHA-1
  * big inputs   → It takes advantage of tree hashing to efficiently
parallelize hashing of big files
  * small inputs → Due to Sakura Coding and the padding, no penalty
exists for small inputs

  This could be written as:

  This document defines the KangarooTwelve eXtendable Output Function (XOF),
  a hash function based on SHA-3 and designed by the same inventors that
  provides outputs of arbitrary lengths. It performs better than SHA-3, SHA-2
  and SHA-1; taking advantage of parallelism to speed up hashing of big files
  while still being efficient by avoiding penalties for small inputs.

→ The introduction

  - It could use the same spiel about the no-penalty for small inputs.
  - You mention Keccak without defining what it is (you mention
Keccak-p before though).

→ Section 1.1. Conventions

  - Do you really need the MUST, SHOULD, etc... keywords?
    I'm not sure any of these makes too much sense in K12's context.
  - "LSB 0 bit convention" is also called "LSB 0 bit-numbering", which
might more formal and fit to the context?

> |s|  denotes the length of a byte string "s".  For example, |"FF FF"| = 2.

  I think this should emphasize the unit: "denotes the length of a
value in 'bytes'"

→ Section 2.1. Inner function: F

> Keccak-p[1600,n_r=12]

  I think it should be noted that these are the "last" 12 rounds of
the permutation with 24 rounds.

  Also I'd prefer a function F defined as F(input, domain_separation,
output_length) to simplify the
  other parts of the RFC.

> The input string S SHOULD be represented as a pair (Sbytes, dS),

  Not sure why the "SHOULD" or why the mention of a "pair". What's a
pair in C :) ?

>  multi-rate padding rule pad10*1

  - You mention the multi-rate padding without defining it.
  - Do we really need these explanations on the computation of `dS`?
Just giving the values should be enough imo

→ Section 2.2. Tree hashing over F

  - Could this section simply be renamed "Hashing"?
  - I would also enumerate the steps and simplify a few things. For example:

  KangarooTwelve( input, custom_string, output_length ) takes any
input to be hashed, a
  (optionally empty) customization string and a desired output length.
It then execute
  the following steps:

  1. Concatenate input, custom_string and right_encode(|custom_string|).
       S = input || custom_string || right_encode(|custom_string|)
  2. If the byte length of S is less than 8192 bytes, then run the
sponge function
     on S with dsByte = 0x07 and ignore the rest of these steps.
Otherwise skip this
     step and go directly to step 3.
       F(S, dsByte=0x07, output_length)
  3. Split the concatenated value into n chunks of 8192 bytes, where
the last one can
     be shorter or equal to 8192 bytes). For example with 5 chunks:
       S = S_0 || .. || S_4
       with |S_0| = |S_1| = |S_2| = |S_3| = 8192 bytes
       and |S_4| <= 8192 bytes
  4. Compute CV_1 to CV_4, optionally using the parallelism supported
by your architecture.
     (For example this is achievable on Intel architecture by using
SIMD instructions.)
       Node_i  = S_i || `110`
       CV_i    = F(Node_i, dsByte=0x0B, output_length=32)
  5. To compute the final input, start by contatenating the first
chunk with the hexadecimal string "03 00 00 00 00 00 00 00"
       final_input = S_0 || "03 00 00 00 00 00 00 00"
  6. Append the concatenation of all the the CV_i to the final input.
       final_input = final_input || CV_1 || CV_2 || CV_3 || CV_4
  7. Append the encoding of the number of chunks minus one (4 in our
example) to the final input.
       final_input = final_input || right_encoding(n-1)
  8. Append the hexadecimal string "FF FF" to the final input.
       final_input = final_input || "FF FF"
  9. Run the sponge function on the final input with dsByte = 0x06
       F(final_input, dsByte=0x06, output_length)

> For |S| > 8192 bytes

  I think diagrams should be in the appendix (or at least in a
different section). Also a diagram for `|S| <= 8192 bytes` would be
nice as well :)

                                 |    message   |
                                 | custom_string|
                     +-----------------------------------+  F(dsByte=0x07)
                     | right_encode(|custom_string|)
|-----------------> output

→ 2.3.  right_encode( x )

> x = x / 256

  I don't think you define the division here. I would define it in
this section as well as in the convention section (1.1):

     a / b outputs the quotient of the operation: a divided by b. For
example 10 = 3 * 9 + 1, so 3 would be outputed
     by the operation 10 / 9. Furthermore, since 240 = 0 * 256 + 240,
0 would be the output of 240 / 256.

→ Section 3. Test Vectors

  These should probably be in an Appendix.

→ Appendix A.3.

  You could also think of including (or replacing yours with) the
python function there:
  But I think it should be noted that this is not taking advantage of

I probably have more, but this is starting to get huge so... I'll wait
until you tell me if I'm being unreasonable.

Thanks again for your work Benoit!