[COSE] Benjamin Kaduk's Discuss on draft-ietf-cose-hash-algs-04: (with DISCUSS and COMMENT)

Benjamin Kaduk via Datatracker <noreply@ietf.org> Thu, 11 June 2020 05:55 UTC

Benjamin Kaduk has entered the following ballot position for
draft-ietf-cose-hash-algs-04: Discuss

When responding, please keep the subject line intact and reply to all
email addresses included in the To and CC lines. (Feel free to cut this
introductory paragraph, however.)


Please refer to https://www.ietf.org/iesg/statement/discuss-criteria.html
for more information about IESG DISCUSS and COMMENT positions.


The document, along with other ballot positions, can be found here:
https://datatracker.ietf.org/doc/draft-ietf-cose-hash-algs/



----------------------------------------------------------------------
DISCUSS:
----------------------------------------------------------------------

In Section 3.3:

   The SHA-3 hash algorithms have a significantly different structure
   than the SHA-2 hash algorithms.  One of the benefits of this
   differences is that when computing a shorter SHAKE hash value, the
   value is not a prefix of the result of computing the longer hash.

I did not think this was the case -- the sponge construction seems to
only use the 'd' parameter to truncate the output stream, but 'd' does
not seem to otherwise cause the output stream to vary.  Indeed, Section
4 of FIPS-202 concludes:

% Note that the input d determines the number of bits that Algorithm 8
% returns, but it does not affect their values.

Am I misunderstanding what the quoted statement is trying to convey?
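
For what it's worth, the prefix behavior is easy to check empirically.
A minimal sketch using Python's standard hashlib; the input message is
arbitrary and chosen only for illustration:

```python
import hashlib

msg = b"draft-ietf-cose-hash-algs"  # arbitrary illustrative input

# Same SHAKE128 instance, two different requested output lengths.
short = hashlib.shake_128(msg).digest(16)  # d = 128 bits
long_ = hashlib.shake_128(msg).digest(32)  # d = 256 bits

# The shorter output is exactly the leading bytes of the longer one,
# i.e. d only truncates the output stream.
is_prefix = long_.startswith(short)
```

Here is_prefix evaluates to True, which matches the FIPS-202 note quoted
above rather than the statement in the draft.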


----------------------------------------------------------------------
COMMENT:
----------------------------------------------------------------------

Section 1

   [CMS].  This omission was intentional as a structure consisting of
   just a digest identifier, the content, and a digest value does not by
   itself provide any strong security service.  Additionally, an
   application is going to be better off defining this type of structure
   so that it can include any additional data that needs to be hashed,
   as well as methods of obtaining the data.

(The "additionally" bits were also part of the original intentional
omission's justification.  I don't know that we need to make this clear
specifically, though.)

   signature to be validated without first downloading all of the
   content associated with the signature.  This capability can be of
   even greater importance in a constrained environment as not all of
   the content signed may be needed by the device.

nit (I think this came up in Warren's comments, too?): we should clarify
that this functionality is used when the content being signed is broken
up into multiple chunks, each of which is represented as a hash
value+hash algorithm -- for any individual signature output or hash
value, the full plaintext must be processed, but the chunks that are not
of interest do not need to be fetched+hashed -- only their
contribution to the signature is processed.

   common.  One of the primary things that has been identified by a hash
   function for secure message is a certificate.  Two examples of this

nit: the grammar doesn't look right; maybe "messages" plural?

Section 2

   signature or using the hash as part of the body to be signed.  Other
   uses of hash functions do not require the same level of strength.

This sounds like it's definitively saying that all other uses do not
require this strength, which doesn't seem right.  Maybe "may not
require"?

   them.  Applications should also make sure that the ability to change
   hash functions is part of the base design as cryptographic advances
   are sure to reduce the strength of a hash function.

BCP 201 would be a great reference here :)

   A hash function is a map from one, normally large, bit string to a
   second, usually smaller, bit string.  There are going to be
   collisions by a hash function.  The trick is to make sure that it is
   difficult to find two values that are going to map to the same output

side note: if I was writing this (but I'm not!), I'd say something like
"because the output range has so many fewer possible configurations than
the input domain, there will inherently be many collisions where
different input strings produce the same output string".
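
The counting argument can be illustrated with a toy example.  This is
only a sketch: tiny_hash is a hypothetical 8-bit "hash" (the first byte
of SHA-256), not anything defined in the draft.

```python
import hashlib


def tiny_hash(data: bytes) -> int:
    """Toy 8-bit hash: first byte of SHA-256 (illustration only)."""
    return hashlib.sha256(data).digest()[0]


def find_collision():
    """Return two distinct inputs that share a tiny_hash output."""
    seen = {}
    # 257 distinct inputs map into only 256 possible outputs, so at
    # least two of them must collide (pigeonhole principle).
    for i in range(257):
        data = i.to_bytes(2, "big")
        h = tiny_hash(data)
        if h in seen:
            return seen[h], data, h
        seen[h] = data
    raise AssertionError("unreachable: 257 inputs, 256 outputs")


a, b, h = find_collision()
```

With a real hash function the same argument holds; the output space is
just large enough that finding such a pair is meant to be infeasible.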

Section 2.1

   *  Additional data, this can be something as simple as a random value
      to make finding hash collisions slightly harder (as the value

(Do we want to use the word "salt"?)

      handed to the application cannot have been selected to have a
      collision), or as complicated as a set of processing instructions

I can't tell if this parenthetical is supposed to say "as long as the
[salt] value handed to the application[...]" or "as the data handed to
the application cannot have been selected to have a collision when
combined with the unknown-at-the-time [salt] value".

      hashed be included.  (Encoding as a CBOR array accomplished this
      requirement.)

nit: s/accomplished/accomplishes/

   COSE_Hash_V = (
       1 : int / tstr, # Algorithm identifier
       2 : bstr, # Hash value
       3 : tstr ?, # Location of object hashed
       4 : any ?   # object containing other details and things
       )

nit: the '?' goes before the optional element, not after.

   An alternative structure that could be used for situations where one
   is searching a group of objects for a match.  In this case, the

nit: sentence fragment.

Section 3.1

   Despite the above, there are still times where SHA-1 needs to be used
   and therefore it makes sense to assign a point for the use of this
   hash algorithm.  Some of these situations are with historic HSMs

nit: do you have a preference among "point", "code point", and
"codepoint"?

   Because of the known issues for SHA-1 and the fact that is should no

nit: s/is/it/

Section 3.2

   *  *SHA-256/64* provides a truncated hash.  The length of the
      truncation is designed to allow for smaller transmission size.
      The trade-off is that the odds that a collision will occur
      increase proportionally.  Locations that use this hash function

Pedantically, "proportionally" would require some explanation (and
possibly the word "exponentially").  I don't insist on any changes here,
though.
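
To make the scaling concrete, here is a rough sketch.  It assumes that
SHA-256/64 is the leftmost 64 bits of SHA-256 and uses the generic
birthday estimate of about 2^(n/2) hash evaluations for an n-bit output;
both function names are hypothetical.

```python
import hashlib


def sha256_64(data: bytes) -> bytes:
    """Assumed construction: leftmost 64 bits (8 bytes) of SHA-256."""
    return hashlib.sha256(data).digest()[:8]


def birthday_work(bits: int) -> int:
    """Rough number of hashes before a collision is likely: 2^(bits/2)."""
    return 2 ** (bits // 2)


full = birthday_work(256)      # about 2**128 evaluations
truncated = birthday_work(64)  # about 2**32 evaluations

# Cutting the output length by a factor of 4 cuts the expected collision
# work by a factor of 2**96, not by a factor of 4.
reduction = full // truncated
```

Under those assumptions the change in collision effort is exponential
in the number of bits removed, which is why "proportionally" reads
oddly.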

      need either to analysis the potential problems with having a

nit: s/analysis/analyze/

      collision occur, or where the only function of the hash is to
      narrow the possible choices.

nit: the grammar of the "or where [...]" clause doesn't match up with
the start of the sentence.

      The latter is the case for [I-D.ietf-cose-x509].  The hash value
      is used to select possible certificates and, if there are multiple
      choices then, each choice can be tested by using the public key.

nit: maybe "multiple choices remaining"?

Section 3.3

   The family of SHA-3 hash algorithms [FIPS-202] was the result of a
   competition run by NIST.  The pair of algorithms known as SHAKE-128
   and SHAKE-256 are the instances of SHA-3 that are currently being
   standardized in the IETF.

"But what about RFC 6931?"
(Yes, I see RFC 8692 and RFC 8702 that only do the SHAKEs.)

   Unlike the SHA-2 hash functions, no algorithm identifier is created
   for shorter lengths.  Applications can specify a minimum length for
   any hash function.  A validator can infer the actual length from the
   hash value in these cases.

In light of my DISCUSS point, I think this claim needs some
reconsideration as well -- absent some other mechanism to detect
modification (truncation), an attacker could truncate a longer output to
a shorter one, undetected by the validator, since the one stream is a
prefix of the other.
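
A sketch of the concern, assuming a hypothetical validator
(naive_validate) that infers the output length from the received value
and enforces only an application-chosen minimum length:

```python
import hashlib


def naive_validate(content: bytes, received: bytes, min_len: int = 16) -> bool:
    """Hypothetical validator: output length is inferred from the value."""
    if len(received) < min_len:
        return False
    expected = hashlib.shake_256(content).digest(len(received))
    return expected == received


content = b"example content"  # arbitrary illustrative input

original = hashlib.shake_256(content).digest(64)  # sender's 512-bit value
truncated = original[:16]                         # attacker drops 48 bytes

accepted_original = naive_validate(content, original)
accepted_truncated = naive_validate(content, truncated)
```

Both checks succeed, so this validator cannot tell that the value was
shortened in transit; the effective strength is whatever the minimum
length allows, not what the sender chose.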

   |SHAKE128|TBD10|128-bit SHAKE| []           | [This   | Yes         |

I'm not sure that we really want the description to just be "128-bit
SHAKE" -- the 128 and 256 relate to the "capacity" of the sponge
construction and are not tied to the output length.

Also, I think we should consider having some "minimum output length"
that has to be used in order to qualify for the "Recommended:Yes".

Section 4.1

   In addition, IANA is to add the value of 'Filter Only' to the set of
   legal values for the 'Recommended' column.  This value is only to be
   used for hash functions and indicates that it is not to be used for
   purposes which require collision resistance.  IANA is requested to
   add this document to the reference section for this table due to this
   addition.

So the ordering is now "Yes, No, Filter Only, Deprecated"?

Section 5

There are probably some considerations relating to the application's
"additional data" used in a CBOR hash structure such as the (first) one
in Section 2.1.  Making sure that all the relevant attributes/parameters
are bound into the computation/verification properly, etc.

   security all need to be included as part of this analysis.  In many
   cases the value being hashed is a public value, as such pre-image
   resistance is not part of this analysis.

nit: comma splice.

   Algorithm agility needs to be considered a requirement for any use of
   hash functions.  As with any cryptographic function, hash functions

(BCP 201 rears its head again)

Section 6

It's not really going to work so great to have normative references to
both RFC 8152 and the thing that obsoletes RFC 8152.  Given the one
place that we cite [COSE], moving it to informative seems reasonable.

Section 7

RFC 3174, on the other hand, is needed to implement one of the hashes
that we're allocating a codepoint for, so should probably be normative.
(It's already on the downrefs registry, so that's not a concern.)

Also, if we depend on the registry actions in 8152bis-algs, does that
make it a normative dependency?  (8152bis-struct will be taking care of
any downref concerns already, I think, for what it's worth.)