Re: [Gen-art] Gen-ART and OPS-Dir review of draft-ietf-httpbis-header-compression-10

Hervé Ruellan <herve.ruellan@crf.canon.fr> Thu, 22 January 2015 17:47 UTC

Return-Path: <Herve.Ruellan@crf.canon.fr>
X-Original-To: ietf@ietfa.amsl.com
Delivered-To: ietf@ietfa.amsl.com
Received: from localhost (ietfa.amsl.com [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id B262A1AC529; Thu, 22 Jan 2015 09:47:46 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -1.26
X-Spam-Level:
X-Spam-Status: No, score=-1.26 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, HELO_EQ_FR=0.35, MIME_8BIT_HEADER=0.3, RCVD_IN_DNSWL_NONE=-0.0001, T_RP_MATCHES_RCVD=-0.01] autolearn=no
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id rtnnQs00K6ZB; Thu, 22 Jan 2015 09:47:44 -0800 (PST)
Received: from inari-msr.crf.canon.fr (inari-msr.crf.canon.fr [194.2.158.67]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 856BD1A1B23; Thu, 22 Jan 2015 09:47:43 -0800 (PST)
Received: from mir-msr.corp.crf.canon.fr (mir-msr.corp.crf.canon.fr [172.19.77.98]) by inari-msr.crf.canon.fr (8.13.8/8.13.8) with ESMTP id t0MHlalF024968; Thu, 22 Jan 2015 18:47:36 +0100
Received: from Antiope.crf.canon.fr (antiope.fesl2.crf.canon.fr [172.19.70.56]) by mir-msr.corp.crf.canon.fr (8.13.8/8.13.8) with ESMTP id t0MHla46005609; Thu, 22 Jan 2015 18:47:36 +0100
Received: from timor.intra-usr.crf.canon.fr (172.20.8.117) by Antiope.crf.canon.fr (172.19.70.56) with Microsoft SMTP Server (TLS) id 15.0.995.29; Thu, 22 Jan 2015 18:47:36 +0100
Message-ID: <54C137B7.7020403@crf.canon.fr>
Date: Thu, 22 Jan 2015 18:47:35 +0100
From: Hervé Ruellan <herve.ruellan@crf.canon.fr>
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.4.0
MIME-Version: 1.0
To: "Black, David" <david.black@emc.com>, Martin Thomson <martin.thomson@gmail.com>
Subject: Re: [Gen-art] Gen-ART and OPS-Dir review of draft-ietf-httpbis-header-compression-10
References: <CE03DB3D7B45C245BCA0D243277949362DE459@MX104CL02.corp.emc.com>, <CABkgnnUwNQUcFg5w5HFpSQrAUxtbqG_UN-_WDGop1eqqoCS+Aw@mail.gmail.com> <1421779730757.42642@crf.canon.fr> <CE03DB3D7B45C245BCA0D243277949362E9050@MX104CL02.corp.emc.com>
In-Reply-To: <CE03DB3D7B45C245BCA0D243277949362E9050@MX104CL02.corp.emc.com>
Content-Type: text/plain; charset="windows-1252"; format="flowed"
Content-Transfer-Encoding: 8bit
X-Originating-IP: [172.20.8.117]
X-ClientProxiedBy: Antiope.crf.canon.fr (172.19.70.56) To Antiope.crf.canon.fr (172.19.70.56)
Archived-At: <http://mailarchive.ietf.org/arch/msg/ietf/iV6AVbnpusHSB23BNQq0clD1fR4>
X-Mailman-Approved-At: Fri, 23 Jan 2015 10:00:54 -0800
Cc: "fenix@google.com" <fenix@google.com>, "General Area Review Team (gen-art@ietf.org)" <gen-art@ietf.org>, "ietf@ietf.org" <ietf@ietf.org>, "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>
X-BeenThere: ietf@ietf.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: IETF-Discussion <ietf.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/ietf>, <mailto:ietf-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/ietf/>
List-Post: <mailto:ietf@ietf.org>
List-Help: <mailto:ietf-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/ietf>, <mailto:ietf-request@ietf.org?subject=subscribe>
X-List-Received-Date: Thu, 22 Jan 2015 17:47:46 -0000

David,

Some responses below.

On 01/20/2015 08:22 PM, Black, David wrote:
> Herve and Martin,
>
> -2-
>
> For major issue -2- regarding when to use the never-indexed format in order
> to avoid attacks like the CRIME attack on DEFLATE, I disagree with the
> discussion:
>
>>>> The second major issue is that I can't find the list of fields
>>>> that are required to use the never-indexed syntax for security
>>>> reasons.
>>>
>>> That list doesn't exist, because the need to avoid indexing is highly
>>> contextual.  Like padding, I don't expect this feature to be widely
>>> used, because it requires specific knowledge, but it is necessary to
>>> avoid low-entropy secrets from being exposed to CRIME-like attacks.
>
> Designing a mechanism to resist an attack but failing to provide guidance
> (either here or in the main HTTP v2 draft) on how to use it to resist that
> attack does not solve the actual problem, no matter how clever the
> mechanism.
>
> I hope the Security ADs notice this discussion, because the current
> situation looks like a security flaw from here (likely to result
> in insecure implementations, even though the implementations contain
> the mechanism needed to fix the security problem).
>
> If a list of specific fields is not reasonable, some specific guidance on
> where and why this mechanism needs to be applied is in order, as (IMHO)
> it is unreasonable to expect the entire global implementer community
> to have a detailed working command of the CRIME attack and its HPACK
> implications wrt specific pieces of information exchanged in the HTTP v2
> protocol.
>
>>> I can't confirm, but I think we're using it in Firefox for short
>>> cookies (over which we have little control, but still wish to
>>> protect).
>
> I expect HPACK to be implemented as part of HTTP v2 in systems whose
> implementation teams have far less security talent & insight by complarison
> to the major browser vendors, so while I appreciate the Firefox example,
> I don't consider it to be typical of many of the groups that are likely
> to  implement HTTP v2.

We started building HPACK when CRIME showed that using DEFLATE for 
compressing headers was not a secure solution.

The security review done when HPACK started to stabilize raise the issue 
that HPACK indexing table could still be used as an oracle for finding 
some header field values.

With the original HPACK, there were already several possible mitigation 
to prevent an attacker to take advantage of this oracle:
- The value space to search should be large enough to make the attack 
impractical.
- If this is not sufficient, a deployment of HPACK can select not to 
index sensitive header field values, or even not to index any header field.

To go further, it was decided to add a specific op-code for signalling 
to an intermediary that a header field was not indexed because it was 
deemed sensitive and should not be indexed further down the path.

The WG added this mechanism to make HPACK robust against foreseen 
possible attacks. However, security is a moving target, and I think that 
a list of header fields that should use it is difficult to build: this 
depends on the application/deployment, and this will evolve with time.

So I think we can offer guidance on how to use the "never indexed 
literals" mechanism but not much more. Section 7.1 already goes into 
details on how to protect header fields from unwanted probing. The last 
paragraph of section 7.1.2 and the whole section 7.1.3 targets in 
particular the "never indexed literals" mechanism. If you find these 
section insufficient or not clear enough, I can try to improve them.

>
> -- Everything beyond here is an editorial nit --
>
>>>> -- Section 1:
>>>>
>>>>     As
>>>>     Web pages have grown to include dozens to hundreds of requests,
>>>>
>>>> "include dozens to hundreds of requests" ->
>>>>          "require dozens to hundreds of requests to retrieve"
>>>
>>> "to retrieve" is too specific.
>
> How about "to access"?  If not, I guess the text is ok.

I think I like Martin's version:
    require dozens to hundreds of requests,
>
>>>>     Header Field:  A name-value pair.  Both the name and value are
>>>>        treated as opaque sequences of octets.
>>>>
>>>> Indicate what header or headers these fields come from.
>>>
>>> ?
>
> For HPACK, these fields are from the HTTP v2 header, right?

Yes.

>
>>>>     Static Table:  The static table (see Section 2.3.1) is a header table
>>>>        used to associate static header fields to index values.
>>>>
>>>> "associate static header fields" ->
>>>>          "statically associate commonly used header fields"
>>>
>>> I'm going to avoid the "commonly used" value judgment.  Whether they
>>> are commonly used or not isn't pertinent.  I'll let the editors decide
>>> if they would prefer your text.
>>
>> I also prefer to avoid "commonly used".
>
> How about "statically associate header fields that occur frequently in practice" ?
>
> The intent of the suggested edit it to provide some insight into why the static
> fields exist.

I understand your intent and I'm going to incorporate your suggestion.

> Thanks,
> --David
>
>
>> -----Original Message-----
>> From: RUELLAN Herve [mailto:Herve.Ruellan@crf.canon.fr]
>> Sent: Tuesday, January 20, 2015 1:49 PM
>> To: Martin Thomson; Black, David
>> Cc: fenix@google.com; General Area Review Team (gen-art@ietf.org);
>> ietf@ietf.org; ietf-http-wg@w3.org
>> Subject: RE: [Gen-art] Gen-ART and OPS-Dir review of draft-ietf-httpbis-
>> header-compression-10
>>
>> Thanks for all these edits, Martin.
>>
>> I did a few additional ones in:
>> https://github.com/http2/http2-spec/pull/694
>>
>> See below for my comments.
>>
>> Hervé.
>> ________________________________________
>>> From: Martin Thomson <martin.thomson@gmail.com>
>>> Sent: Tuesday, January 20, 2015 09:22
>>> To: Black, David
>>> Cc: fenix@google.com; RUELLAN Herve; General Area Review Team (gen-
>> art@ietf.org); ietf@ietf.org; ietf-http-wg@w3.org
>>> Subject: Re: [Gen-art] Gen-ART and OPS-Dir review of draft-ietf-httpbis-
>> header-compression-10
>>>
>>> In the absence of answers from the editors, I can provide an
>>> explanation for the choices you have identified as being issues.
>>>
>>> I've also turned your comments into a pull request.
>>>
>>> https://github.com/http2/http2-spec/pull/693
>>>
>>> You can review that; but the editors will likely have some more to say
>>> about this.
>>>
>>> (Note: I dropped opsdir from this reply, there was no substance there.
>>> I look forward to a review of HTTP/2 proper.)
>>>
>>> On 12 January 2015 at 18:12, Black, David <david.black@emc.com> wrote:
>>>> The first major issue involves the dense packing of static and
>>>> dynamic table indices, and what appears to be an inability to
>>>> ever change this  and HPACK in general (if that's a "feature,"
>>>> an explanation of why is in order).
>>>
>>> That is entirely intentional.  The same question has been raised
>>> several times, and the answer from the working group has been
>>> consistent:
>>>
>>> This compression scheme will not be the fastest, or give the best
>>> compression ratios.  It will have those limitations, but it will be
>>> easy to understand, implement and get correct and it will provide good
>>> enough compression performance.
>>>
>>> It will also be completely inflexible, but if that turns out to be a
>>> problem, we will fix it in HTTP/3, even if that means that HTTP/3 is
>>> almost entirely identical to HTTP/2.
>>>
>>> A few people objected to this on the grounds that this flies in the
>>> face of a body of accepted wisdom on extensibility in protocols.  But
>>> that flexibility turns out to be contrary to the aforementioned goals.
>>> Ultimately, this choice was very clearly and consistently the
>>> consensus of the working group.
>>>
>>> I'm going to propose the following text be added:
>>>
>>>     The HPACK format is intentionally simple and inflexible.  Both
>>>     characteristics reduce the risk of interoperability or security
>>>     issues based by implementation error.  No extensibility mechanisms
>>>     are defined; changes to the format are only possible by defining a
>>>     complete replacement.
>>>
>>>> The second major issue is that I can't find the list of fields
>>>> that are required to use the never-indexed syntax for security
>>>> reasons.
>>>
>>> That list doesn't exist, because the need to avoid indexing is highly
>>> contextual.  Like padding, I don't expect this feature to be widely
>>> used, because it requires specific knowledge, but it is necessary to
>>> avoid low-entropy secrets from being exposed to CRIME-like attacks.
>>>
>>> I'll note that the combination of "low-entropy" and "secret"
>>> immediately makes this a scenario into which only the very careful and
>>> knowledgeable should venture.  But apparently some do and those few
>>> (along with the paranoid) need the mechanism.  The rest of us can just
>>> use more entropy on our secrets.
>>>
>>> I can't confirm, but I think we're using it in Firefox for short
>>> cookies (over which we have little control, but still wish to
>>> protect).
>>>
>>> [snip]
>>>> Minor issues:
>>>>
>>>> -A- Section 1.3:
>>>>
>>>>     Dynamic Table:  The dynamic table (see Section 2.3.2) is a header
>>>>        table used to associate stored header fields to index values.
>>>>        This table is dynamic and specific to an encoding or decoding
>>>>        context.
>>>>
>>>> Need to define "header table" before using it in this definition, or
>>>> point to the discussion of the term in Section 1.
>>>
>>> Or you could not use "header table":
>>>
>>>     Dynamic Table:  The dynamic table (see Section 2.3.2) is a table that
>>>        associates stored header fields with index values.  This table is
>>>        dynamic and specific to an encoding or decoding context.
>>>
>>>> -B- Section 4.2
>>>>
>>>> This paragraph is unclear on what has to be communicated:
>>> [snip]
>>>> I suggest:
>>>>
>>>>     Multiple updates to the maximum table size can occur between the
>>>>     sending of two header blocks.  In the case that this size
>>>>     is changed more than once in this interval, the smallest
>>>>     maximum table size that occurs in that interval MUST
>>>>     be sent in an encoding context update.  The final maximum size is
>>>>     always sent, resulting in at most two encoding context updates.  This
>>>>     ensures that the decoder is able to perform eviction based on
>>>>     reductions in decoder table size (see Section 4.3).
>>>
>>> LGTM (I apologize, that was my text).
>>>
>>>> -C- Section 4.4:
>>>>
>>>> This paragraph is unclear on whether eviction occurs before or after
>>>> adding an entry:
>>>>
>>>>     Whenever a new entry is to be added to the dynamic table, entries are
>>>>     evicted from the end of the dynamic table until the size of the
>>>>     dynamic table is less than or equal to (maximum size - new entry
>>>>     size), or until the table is empty.
>>>>
>>>> I suggest inserting "(before the new entry is added)" after
>>>> "until the size of the dynamic table"
>>>
>>> How about this instead:
>>>
>>> s/Whenever a new entry is to be added.../Before a new entry is added.../
>>>
>>>> -D- Section 4.4:
>>>>
>>>>     If the representation of the added entry references the name of an
>>>>     entry in the dynamic table, the referenced name is cached prior to
>>>>     performing eviction to avoid having the name inadvertently evicted.
>>>>
>>>> Cached where and how?  Please explain.
>>>
>>> I don't find this unclear, would "saved" cause less confusion than "cached"?
>>
>> I think that this paragraph is too implementation oriented. In particular, the
>> "cached" is implementation-dependent. I propose changing it to:
>>
>>                      A new entry can reference the name of an entry in the
>>                      dynamic table that will be evicted when adding this new
>>                      entry into the dynamic table. Implementations have to take
>>                      care not deleting the referenced name from memory while
>>                      evicting the referenced entry from the dynamic table.
>>
>>
>>>
>>>> -E- Section 5.1
>>>>
>>>> N is supposed to be the number of bits in the prefix, which makes the
>>>> use of N in "Value (N)" incorrect in Figure 2.  I think just deleting
>>>> "(N)" in the figure will fix this.
>>>
>>> I think that's a fair point; I think removing the "(7)" from Figure 3
>>> is necessary though.
>>
>> The "(N)" in the figure was supposed to indicate the size of the value in
>> bits. But I'm OK with removing it.
>>
>>>>
>>>> -F- Section 7.1.3
>>>>
>>>> This section applies only to intermediaries that are aware of HPACK
>>>> (and presumably use it).  That should be stated, along with a reminder
>>>> that an HPACK-unaware intermediary that does HPACK-unaware compression
>>>> may create vulnerabilities to attacks like CRIME.
>>>>
>>>> Nits/editorial comments:
>>>>
>>>> -- Section 1:
>>>>
>>>>     As
>>>>     Web pages have grown to include dozens to hundreds of requests,
>>>>
>>>> "include dozens to hundreds of requests" ->
>>>>          "require dozens to hundreds of requests to retrieve"
>>>
>>> "to retrieve" is too specific.
>>>
>>>> -- Section 1.3:
>>>>
>>>>     Header Field:  A name-value pair.  Both the name and value are
>>>>        treated as opaque sequences of octets.
>>>>
>>>> Indicate what header or headers these fields come from.
>>>
>>> ?
>>>
>>>>     Static Table:  The static table (see Section 2.3.1) is a header table
>>>>        used to associate static header fields to index values.
>>>>
>>>> "associate static header fields" ->
>>>>          "statically associate commonly used header fields"
>>>
>>> I'm going to avoid the "commonly used" value judgment.  Whether they
>>> are commonly used or not isn't pertinent.  I'll let the editors decide
>>> if they would prefer your text.
>>
>> I also prefer to avoid "commonly used".
>>
>>>
>>>> -- Section 2.4:
>>>>
>>>> The rationale for the additional format that forbids ever encoding a
>>>> field should be repeated here (it's stated in Section 2.3).
>>>
>>> With the forward reference to 6.2.3, which contains an explanation, I
>>> think that this is adequately covered already.
>>
>> I added it here, because it is somewhat hidden in 6.2.3 (or at least less
>> visible).
>>
>>>
>>>> -- Section 5.1:
>>>
>>> I think moving the figures and related text is better.
>>>
>>>
>>>> idnits complained that it couldn't find an IANA Considerations
>>>> section.  Please add an empty one (stating that there are no IANA
>>>> Considerations) if/when the draft is revised.
>>>
>>> I tend to think that absence of "IANA Considerations" and a section
>>> with "This document has no IANA actions." should be treated as
>>> precisely equivalent.  But I guess that ship sailed already.
>>>
>>
>> I added this "IANA Considerations" section.