Paul Wouters' Discuss on draft-ietf-httpbis-digest-headers-12: (with DISCUSS and COMMENT)

Paul Wouters via Datatracker <noreply@ietf.org> Wed, 24 May 2023 17:18 UTC

From: Paul Wouters via Datatracker <noreply@ietf.org>
To: The IESG <iesg@ietf.org>
Cc: draft-ietf-httpbis-digest-headers@ietf.org, httpbis-chairs@ietf.org, ietf-http-wg@w3.org, mnot@mnot.net
Reply-To: Paul Wouters <paul.wouters@aiven.io>
Message-ID: <168494865615.46280.6960866178840067379@ietfa.amsl.com>
Date: Wed, 24 May 2023 10:17:36 -0700
Archived-At: <https://www.w3.org/mid/168494865615.46280.6960866178840067379@ietfa.amsl.com>

Paul Wouters has entered the following ballot position for
draft-ietf-httpbis-digest-headers-12: Discuss

When responding, please keep the subject line intact and reply to all
email addresses included in the To and CC lines. (Feel free to cut this
introductory paragraph, however.)

Please refer to https://www.ietf.org/about/groups/iesg/statements/handling-ballot-positions/
for more information about how to handle DISCUSS and COMMENT positions.

The document, along with other ballot positions, can be found here:
https://datatracker.ietf.org/doc/draft-ietf-httpbis-digest-headers/

----------------------------------------------------------------------
DISCUSS:
----------------------------------------------------------------------

I have a few DISCUSS questions about some of the security aspects of the draft,
that I would like the authors to at least consider.

It seems this new registry is getting pre-populated with a _lot_ of "insecure"
variants. Is there a strong reason not to add only one insecure entry (e.g. crc32c)?
New RFCs really should not be adding anything using MD5 or SHA-1, even if we
allow/accept that it is used insecurely (see below for details).

The other question on the IANA Registry I have is the format and registration
policy. Recently, most security related IANA Registries try to use a RECOMMENDED
column that can only be set to Y using a registration policy of RFC Required.
Any other registration (eg via specification required or FCFS) cannot get
RECOMMENDED Y. Also, changing the RECOMMENDED column requires a standards track
RFC (via RFC Required policy). Is there a reason we cannot do the same here? I
also think a Designated Expert should not be able to change what is "standard"
or not, as the word "standard" strongly implies "standard track" or at the very
least "IETF consensus" which is not the same as a DE making a decision on their own.

This would also resolve my issue with labelling things as "standard" (when they
didn't come in via a standards track RFC) and "insecure" (which really seems to
mean "secure for this type of usage, but not that type of usage"). I would
strongly prefer that this indirection be removed, and that the registry state a
usage of "checksum only" and "signed hash" instead.

Finally, I think some more careful writing is needed around the case of
integrity vs integrity+signature and what it protects against. I don't
think "unintended or malicious data corruption" should be used as a type.
Either talk about "unintended" or talk about "malicious" - the two cannot
be used within a single concept unless you mandate integrity+signature.

If multiple hashes are included, what should happen when the most preferred hash
fails to verify? Should verification be treated as failed, or should the receiver
try other less preferred but accepted hash algorithms? There should be text clarifying the
behaviour. Personally, I would prefer that only the most preferred hash is
checked and the other hashes are ignored, but perhaps I'm not fully aware of
common operational issues.

----------------------------------------------------------------------
COMMENT:
----------------------------------------------------------------------

In between connections there is a possibility for unintended
or malicious data corruption. An HTTP integrity mechanism can
provide the means for endpoints, or applications using HTTP,
to detect data corruption and make a choice about how to act on
it. An example use case is to aid fault detection and diagnosis
across system boundaries.

I think there is too much subtlety here between starting with "unintended
or malicious data corruption" and then following that with "detect
data corruption". It is clear that a hash can detect "unintended"
data corruption, but it is less clear it can detect "malicious" data
corruption, as a malicious actor will just update the digest value. And
the text is inconsistent with the Security Consideration in Section 6.1:

Integrity fields are not intended to be a general protection
against malicious tampering with HTTP messages.

Integrity fields do not provide integrity for [...] fields.

Perhaps "integrity field" then deserves a better name? I would propose
"digest field" but I guess RFC 3230 took that one and this document is
trying to obsolete that. Why not "checksum field" or "hash field" ?

The short fix for the sentence above could be:

Integrity fields do not provide authenticated integrity for [...]
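
The point above, that an unkeyed hash catches unintended corruption but not a
malicious actor who also updates the digest value, can be illustrated with a
minimal Python sketch. The helper names (sha256_digest_field, matches) are
hypothetical and exist only for this illustration. The "sha-256=:...:" form
follows the byte-sequence syntax the draft uses for digest values, but this is
not a full Structured Fields implementation.

```python
import base64
import hashlib
import hmac


def sha256_digest_field(body: bytes) -> str:
    """Hypothetical helper: render a body hash as 'sha-256=:<base64>:'."""
    encoded = base64.b64encode(hashlib.sha256(body).digest()).decode("ascii")
    return "sha-256=:" + encoded + ":"


def matches(body: bytes, field_value: str) -> bool:
    """Recompute the digest over the received body and compare to the field."""
    return hmac.compare_digest(sha256_digest_field(body), field_value)


original = b'{"hello": "world"}'
sent_field = sha256_digest_field(original)

# Unintended corruption: the body changes in transit but the field does not,
# so the receiver sees a mismatch.
corrupted = b'{"hello": "worle"}'
accidental_detected = not matches(corrupted, sent_field)

# Malicious change: the attacker rewrites the body AND recomputes the field,
# so the receiver's check passes and nothing is detected.
tampered = b'{"hello": "mallory"}'
forged_field = sha256_digest_field(tampered)
tamper_detected = not matches(tampered, forged_field)

print("accidental corruption detected:", accidental_detected)
print("malicious tampering detected:", tamper_detected)
```

Detecting the second case needs something the attacker cannot recompute, such
as a signature or MAC covering the field, which is the "integrity+signature"
case referred to in the DISCUSS.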

Requests to update or change the fields in an existing
registration are permitted. For example, this could allow for the
transition of an algorithm status from "standard" to "insecure"
as the security environment evolves.

As stated in my DISCUSS, it feels odd that a DE can change what "standard" means.
And as stated in my DISCUSS, the term "insecure" feels weird to me.

Integrity fields do not provide any mitigations for downgrade
or substitution attacks (see Section 1 of [RFC6211]) of the
hashing algorithm.

See also my DISCUSS. While one hash field does not, sending multiple
ones could provide some defence. If both a sha: and a sha-256: digest are sent,
the receiver supports sha-256, and the sha-256 hash is wrong but the
sha hash is fine, the receiver should consider the integrity broken (assuming it
deems sha-256 more secure than sha-1). However, such behaviour should
be defined in this document, i.e. what to do in the case of multiple
hashes. For example: only pick the most preferred one and ignore the rest,
so in the above example one wouldn't even look at the sha: hash at all
if the sha-256 hash failed.
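
A minimal Python sketch of the "only check the most preferred hash" behaviour
suggested above. This is one possible policy, not something the draft defines.
The preference list, the HASHERS table and the function name
verify_most_preferred are assumptions made for this illustration, and the
digests are passed in as already-decoded bytes keyed by algorithm name.

```python
import hashlib
from typing import Optional

# Assumed receiver-side preference, most preferred first (illustrative only).
PREFERENCE = ["sha-512", "sha-256", "sha"]

# Map of digest algorithm keys to hashlib constructors ("sha" meaning SHA-1).
HASHERS = {
    "sha-512": hashlib.sha512,
    "sha-256": hashlib.sha256,
    "sha": hashlib.sha1,
}


def verify_most_preferred(body: bytes, digests: dict) -> Optional[bool]:
    """Check only the most preferred algorithm present in `digests`.

    Returns True or False for that single check, or None when none of the
    supplied algorithms is supported. Less preferred digests are ignored,
    so a valid weaker hash can never rescue a failed stronger one.
    """
    for alg in PREFERENCE:
        if alg in digests:
            return HASHERS[alg](body).digest() == digests[alg]
    return None


body = b"hello world"
received = {
    "sha": hashlib.sha1(body).digest(),  # correct
    "sha-256": b"\x00" * 32,             # wrong
}
print(verify_most_preferred(body, received))  # sha-256 is checked, sha is not
```

With this policy the example from the text (wrong sha-256, correct sha) is
treated as an integrity failure, and the sha value is never consulted.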

Section 7.2:

Can we omit all the "insecure" entries from the registry? This draft is
something new, and whoever implements this should at this point have
sha-256 support available. Is there a performance issue? If there is a need
for something insecure and quick, can we limit this to one entry (e.g. crc32c)
and especially avoid MD5/SHA-1, due to complications of those functions
getting blocked in crypto libraries and system-wide crypto policies?

Also, the Content-MD5 is limited to one specific digest algorithm;
other algorithms, such as SHA-1 (Secure Hash Standard), may be more appropriate

Is there a reason not to use SHA-256 instead of SHA-1 in this example?
The use of SHA-1 (or MD5) would be good enough for the purpose of this
draft (ignoring signed hashes) but the MD5 or SHA-1 hash function might
be disallowed or removed from crypto library implementations. If the
difference in performance between SHA-1 and SHA-256 is not an issue, I
would like to see SHA-256 mentioned instead of SHA-1. (And as a separate
issue, I would prefer not to recommend or even define MD5 or SHA-1 in this
new Registry at all.)

whereas hashing algorithm keys are quoted

"hashing algorithm keys" is a strange term to me. It becomes clear a
few sentences down when we are talking about key/value pairs. Maybe
use "key value" and "hash value" (or qvalue as is used now?) when talking
about "keys" and "values"?

Section 4:

Why does Want-Repr-Digest: and Want-Content-Digest: need to have weights?
Are weight values used for anything else but a preference list? Eg why not
define Want-Repr-Digest: to have its most preferred algo listed first (left).

So instead of:

Want-Repr-Digest: sha-512=3, sha-256=10, unixsum=0

Why not:

Want-Repr-Digest: sha-512, sha-256, unixsum
(or even leave out unixsum if it is not supported or wanted)

The only argument I can see for weights is if you want to define multiple
hash algorithms with the exact same weight, which a left-to-right notation
wouldn't allow you to do. But is that a feature that is really needed?
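
To make the comparison concrete, here is a minimal Python sketch that reduces
a weighted Want-Repr-Digest value to a plain left-to-right preference list.
It assumes, as I read the draft, that higher weights are more preferred and
that a weight of 0 means "not acceptable". The function name preference_order
is hypothetical, and the parsing is a simple split on "," and "=", not a full
Structured Fields Dictionary parser.

```python
def preference_order(want_digest: str) -> list:
    """Turn 'sha-512=3, sha-256=10, unixsum=0' into an ordered algorithm list.

    Algorithms with weight 0 are dropped. The rest are sorted by descending
    weight. list.sort() is stable, so algorithms with equal weight keep their
    original left-to-right order.
    """
    weighted = []
    for item in want_digest.split(","):
        name, _, weight = item.strip().partition("=")
        value = int(weight)
        if value > 0:
            weighted.append((name, value))
    weighted.sort(key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in weighted]


print(preference_order("sha-512=3, sha-256=10, unixsum=0"))
```

For the weighted example above this yields sha-256 before sha-512, with
unixsum omitted. The only information lost in the reduction is a tie between
equal weights, which is the one case mentioned above that an ordered list
cannot express.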
