Re: [Crypto-panel] Review of AES-GCM-SIV

Hi all,

Please find my updated review below:

================================================================================

Summary: almost ready

========================================
Major concerns:
========================================

none

========================================
Minor comments and recommendations:
========================================

The draft is somewhat unclear about whether plaintexts and additional
data (AD) are always assumed to be given as *byte*-strings, or whether
it is also possible to encrypt arbitrary-length *bit*-strings whose
length is not a multiple of 8.

On the one hand, at the beginning of Section 4 it is clearly stated that
the encryption algorithm takes "arbitrary-length plaintext & additional
data *byte*-strings".

On the other hand, it would then be more intuitive/consistent to include
the *byte*-length of plaintext and AD in the length block. The current
draft includes the bit-length. (This is of course technically fine and
essentially just a different notation, but *could* be misleading for
developers.) The example in Section 8 also mentions the bit-length of
plaintext/AD.

Hence, I suggest stating this assumption about the plaintext length
more clearly, even if it seems quite standard and will most likely hold
for most applications anyway.

In the worst case, if a developer misunderstands this and allows
encryption of plaintext and AD of arbitrary bit-length, this may make it
possible to break the integrity of ciphertexts (at least in theory), if
the bytelen() function is implemented in the natural way.

Let C = Enc(k,m,d,n) be a ciphertext, encrypting plaintext m with key k,
nonce n, and additional data d. Suppose that |d| = 7 bits, and that
bytelen() is implemented such that bytelen(d) = 1 (which seems natural,
even though bytelen(d) = 7/8 would actually be correct). Note that the
encryption algorithm pads d with zeroes to a multiple of 16 bytes before
it is processed by POLYVAL, so d and d||0 (d with one zero bit appended)
yield the same padded input and the same value of bytelen(). In
particular, it holds that

  C = Enc(k,m,d,n) = Enc(k,m,d||0,n)

and the decryption algorithm accepts both d and d||0 as "valid"
additional data for C.
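
To make the collision concrete, here is a minimal Python sketch. It is
not taken from the draft: bit strings are modelled as Python strings of
'0'/'1' characters, and the helper names (bytelen, zero_pad,
length_block, polyval_input) are hypothetical. It assumes that bytelen()
rounds up and that the length block holds 8 * bytelen() for AD and
plaintext as two little-endian 64-bit integers.

```python
import struct

BLOCK_BITS = 128  # POLYVAL operates on 16-byte blocks


def bytelen(bits: str) -> int:
    """'Natural' byte length: round the bit length up to whole bytes."""
    return (len(bits) + 7) // 8


def zero_pad(bits: str) -> str:
    """Append zero bits until the length is a multiple of 128 bits."""
    return bits + "0" * ((-len(bits)) % BLOCK_BITS)


def length_block(ad_bits: str, pt_bits: str) -> bytes:
    """Assumed layout: bit lengths derived from bytelen(), little-endian."""
    return struct.pack("<QQ", 8 * bytelen(ad_bits), 8 * bytelen(pt_bits))


def polyval_input(ad_bits: str, pt_bits: str):
    """Everything the authenticator sees for a given (AD, plaintext) pair."""
    return zero_pad(ad_bits), zero_pad(pt_bits), length_block(ad_bits, pt_bits)


d = "1011001"        # 7-bit additional data
m = "01001000" * 11  # some 11-byte plaintext

# Under the assumptions above, d and d||0 are indistinguishable.
print(polyval_input(d, m) == polyval_input(d + "0", m))
```

Since the authenticator input is identical for d and d||0, the tag and
hence the whole ciphertext are identical as well.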

Of course this attack is rather theoretical, but it can easily be
avoided either by including the precise *bit*-length of plaintext and AD
in the length block, or by letting the encryption algorithm abort if
the lengths of plaintext or AD are not a multiple of 8 bits (one could
skip this check in applications where this is guaranteed by the
environment, but this is of course something that only the application
developer can decide).
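
A minimal Python sketch of the second option (abort on inputs that are
not whole bytes) could look as follows. The function name
checked_length_block and the modelling of inputs as strings of '0'/'1'
characters are hypothetical; the little-endian 64-bit layout of the
length block is an assumption, not a quote from the draft.

```python
import struct


def checked_length_block(ad_bits: str, pt_bits: str) -> bytes:
    """Build the length block, rejecting inputs that are not whole bytes.

    Inputs are bit strings given as Python strings of '0'/'1' characters.
    """
    for name, bits in (("additional data", ad_bits), ("plaintext", pt_bits)):
        if len(bits) % 8 != 0:
            raise ValueError(f"{name} length is not a multiple of 8 bits")
    # After the check, the exact bit length and 8 * byte length coincide.
    return struct.pack("<QQ", len(ad_bits), len(pt_bits))


try:
    checked_length_block("1011001", "01001000")  # 7-bit AD
except ValueError as err:
    print("rejected:", err)
```

With this check in place, d and d||0 can no longer both be valid inputs,
so the ambiguity described above does not arise.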

========================================
Nitpicking:
========================================

Section 1 "Introduction", 1st paragraph: I suggest replacing
  "...that is easier for practitioners to use correctly."
with
  "that is easier to use correctly."

In Section 4, first paragraph, the text suggests that plaintexts and
additional data of arbitrary length can be encrypted. However, the
description of the decryption procedure in Section 5 rejects ciphertexts
of size larger than 2^36+16 bytes, and Section 6 gives upper bounds on
the plaintext and AD sizes P_MAX and A_MAX.

In Section 4, last paragraph, the result of encryption is the "resulting
ciphertext ... followed by the tag". Thus, in this notation, the tag is
not part of the "ciphertext", but it is separate and sent along with the
ciphertext.
However, at the beginning of Section 5, the decryption algorithm
receives as input key, nonce, AD, and a ciphertext, and the ciphertext
is split into the encrypted plaintext and the tag; thus the "ciphertext"
contains the tag here. One could unify this by always considering the
tag as part of the ciphertext.
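
The unified view, in which the tag is always the trailing part of the
ciphertext, can be sketched in Python as follows. The names TAG_BYTES,
MAX_CIPHERTEXT_BYTES and split_ciphertext are hypothetical. The upper
bound is the 2^36+16 bytes quoted above from Section 5; the 16-byte tag
length and the resulting lower bound are my assumptions.

```python
from typing import Tuple

TAG_BYTES = 16                          # assumed tag length
MAX_CIPHERTEXT_BYTES = 2**36 + TAG_BYTES  # upper bound quoted from Section 5


def split_ciphertext(ciphertext: bytes) -> Tuple[bytes, bytes]:
    """Split 'encrypted plaintext || tag' into its two parts."""
    if not TAG_BYTES <= len(ciphertext) <= MAX_CIPHERTEXT_BYTES:
        raise ValueError("ciphertext length out of range")
    return ciphertext[:-TAG_BYTES], ciphertext[-TAG_BYTES:]


body, tag = split_ciphertext(b"A" * 5 + b"T" * 16)
print(len(body), len(tag))
```

Encryption would then simply return body + tag as one byte string, so
both sections use "ciphertext" in the same sense.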

Section 8, very very nitpicking: One could mention here that the
plaintexts are the bit strings corresponding to the *ASCII encoding* of
"Hello world" and "example".

Section 8, 5th paragraph, again very nitpicking: Some developers may
have difficulties in understanding immediately which numbers are given
in hexadecimal notation, and which in decimal notation. For clarity, one
could write here something like:
"example": 7 characters = 56 bits = 0x38 bits
"Hello world": 11 characters = 88 bits = 0x58 bits
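
The numbers above can be reproduced with a few lines of Python; the
helper name describe is hypothetical and only serves this illustration.

```python
def describe(text: str) -> str:
    """Report the ASCII length of text in characters, bits, and hex bits."""
    data = text.encode("ascii")  # one byte per character
    bits = 8 * len(data)
    return f'"{text}": {len(data)} characters = {bits} bits = {bits:#x} bits'


for text in ("example", "Hello world"):
    print(describe(text))
```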

Section 9, 7th paragraph: "Suzuki et al. [multibirthday]", the reference
lists Kazuhiro as first author, so it seems this should be Kazuhiro et al.

I did not check the test vectors.

Regarding Scott's comment on the verbal description of the encryption
and decryption algorithms: I had the same impression; some pseudocode
may be helpful to clarify what is happening here.

Apart from the above minor comments, I think that this is an excellent
RFC, which is very clear, precise, and easy to read and understand. The
large number of test vectors will certainly be very helpful to many
implementers. I think it is very useful to have a nonce
misuse-resistant encryption scheme defined in an RFC, in particular if
it is as competitive with weaker solutions regarding implementation
difficulty and computational efficiency as this one.

================================================================================

Cheers,
Tibor

On 04/07/2017 15:53, Paterson, Kenny wrote:
> Thanks everyone for this helpful discussion.
> 
> If you want to update your reviews in the light of it, please go ahead and
> resend your reviews here. I'll then collate the three reviews we have to
> the CFRG list.
> 
> Cheers
> 
> Kenny 
> 
> On 04/07/2017 13:27, "Crypto-panel on behalf of Bjoern Tackmann"
> <crypto-panel-bounces@irtf.org on behalf of bjoern.tackmann@ieee.org>
> wrote:
> 
>> Hi all,
>>
>> On Sun, Jul 2, 2017 at 2:37 PM, Tibor Jager
>> <tibor.jager@upb.de> wrote:
>>
>>
>> On 01/07/2017 20:34, Bjoern Tackmann wrote:
>>> Please find my review below. It's a nice piece of work and overall in
>>> quite good shape.
>>>
>>> After looking at the other reviews: I do not quite understand Tibor's
>>> comment on the bit-length vs. byte-length, given that the draft states
>>> that the scheme takes "arbitrary-length plaintext & additional data
>>> byte-strings" -- and for me the term "byte-strings" means that the
>>> byte-length of the strings is an integer.
>>
>> Indeed, this is one of the sections that suggests that it is implicitly
>> assumed that "valid" plaintexts and AD have always a byte-length which
>> is an integer.
>>
>> What I found *potentially* confusing is:
>>
>> - Then it would also be somewhat more intuitive/consistent to include
>> the byte-length of plaintext and AD in the length block. The current
>> draft includes the bit-length. (This is of course technically fine and
>> essentially just a different notation, but *could* be confusing.)
>>
>> - Also the example in Section 8 mentions the bit-length.
>>
>>
>>
>>
>> I fully agree that it would be less ambiguous to do these computations in
>> terms of byte-length. I do not see any advantage of having the scheme
>> operate internally in terms of bit-length, when only byte-length strings
>> are allowed.
>>
>>
>>
>>
>> - It would also make sense to let the encryption algorithm abort, if the
>> lengths of plaintexts and AD are not a multiple of 8 bits (and one could
>> ignore this check in applications where this is guaranteed by the
>> environment - but this is of course something that only the application
>> developer can decide).
>>
>>
>>
>>
>> Agreed.
>>
>>
>>
>>
>> Best,
>> Björn 