Re: Gen-art last call review of draft-ietf-httpbis-http2-16

Martin Thomson <martin.thomson@gmail.com> Tue, 20 January 2015 07:17 UTC

Return-Path: <martin.thomson@gmail.com>
X-Original-To: ietf@ietfa.amsl.com
Delivered-To: ietf@ietfa.amsl.com
Received: from localhost (ietfa.amsl.com [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id D27001B2D68; Mon, 19 Jan 2015 23:17:20 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -2
X-Spam-Level:
X-Spam-Status: No, score=-2 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, FREEMAIL_FROM=0.001, SPF_PASS=-0.001] autolearn=ham
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id u3BM1Z6AXsyk; Mon, 19 Jan 2015 23:17:16 -0800 (PST)
Received: from mail-oi0-x232.google.com (mail-oi0-x232.google.com [IPv6:2607:f8b0:4003:c06::232]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 0B5041AD0A8; Mon, 19 Jan 2015 23:17:16 -0800 (PST)
Received: by mail-oi0-f50.google.com with SMTP id v63so11789759oia.9; Mon, 19 Jan 2015 23:17:15 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=1Eyak7wn/vN+DJ4D0K2WQKvSXipp086w0unm1XemXxU=; b=LSFJsVVLEhW9QHAfYmQd2POmP30frfZ1wiAob4SaaLjgXIloFPUpyMqnvRNyotsDdy tpBOLzvx7c+rQfQWDqu2uzOiADUhLjL0Mjytj7NsKXumFA2rlFE9zHkXaJOScKGLIM+l clbENXNcGDpGGGwj/2fG0FIf9P0v1gj7SFf2LLHqvBfss3YO57uwNF+RIBjrrl+Vf3rv OnmNlYsk+FU1DaVApggoMtSpdp/Cpj77hsM2md+9LMu+7ZAIm4LKvfgJn1CJ9gfAYNdH T1tGwj+jIVxT0727IJ6W4ta3M3gu/BFCR06/+LlzwcHQos2taSJhO7LHdArQKhGgnRUC JDzw==
MIME-Version: 1.0
X-Received: by 10.202.75.202 with SMTP id y193mr19540129oia.12.1421738235293; Mon, 19 Jan 2015 23:17:15 -0800 (PST)
Received: by 10.202.226.136 with HTTP; Mon, 19 Jan 2015 23:17:15 -0800 (PST)
In-Reply-To: <54BCDE62.9090902@dial.pipex.com>
References: <54BCDE62.9090902@dial.pipex.com>
Date: Mon, 19 Jan 2015 23:17:15 -0800
Message-ID: <CABkgnnUidujHKnNYKdzYV24pWP6G24_2brMB1fypYQSqmXeDqA@mail.gmail.com>
Subject: Re: Gen-art last call review of draft-ietf-httpbis-http2-16
From: Martin Thomson <martin.thomson@gmail.com>
To: Elwyn Davies <elwynd@dial.pipex.com>
Content-Type: text/plain; charset="UTF-8"
Archived-At: <http://mailarchive.ietf.org/arch/msg/ietf/WnyidnWPdHynF6qt1TirZUfK0oM>
Cc: General area reviewing team <gen-art@ietf.org>, draft-ietf-httpbis-http2.all@tools.ietf.org, IETF Discussion List <ietf@ietf.org>
X-BeenThere: ietf@ietf.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: IETF-Discussion <ietf.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/ietf>, <mailto:ietf-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/ietf/>
List-Post: <mailto:ietf@ietf.org>
List-Help: <mailto:ietf-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/ietf>, <mailto:ietf-request@ietf.org?subject=subscribe>
X-List-Received-Date: Tue, 20 Jan 2015 07:17:21 -0000

Thanks for the thorough review Elwyn.

I've made changes in the form of a proposed change set against the
editor's copy.

https://github.com/http2/http2-spec/pull/692

You can review the changes at that link.  Note that some of the typos
you caught were already fixed, or will be fixed in other proposed
change sets.

As is customary, I will await a sanity check and AD/shepherd approval
to merge the changes into the master copy.  I expect that to occur
after the IESG discuss approval.

Responses inline.

On 19 January 2015 at 02:37, Elwyn Davies <elwynd@dial.pipex.com> wrote:
> Summary:
> Almost ready.  A well written document with just a few nits really.  I am
> slightly surprised (having not been following httpbis in detail) that HTTP/2
> is so tightly wedded to TCP - this is doubtless pragmatic but it adds to the
> ossification of the Internet and makes me slightly suspicious that it is an
> attempt to really confirm HTTP/2 as the waist of the neo-Internet.  Can't we
> ever use any other transport?

I think that - overall - the desire for the timely replacement of
HTTP/1.1 was stronger than the desire to attain perfection.

And rather than ossifying, the general view is that ALPN clears a path
to more changes in the future.

If you want to talk about the waist of the neo-Internet, I think
identifying HTTP more generally is appropriate; we're only really
upgrading a small part of it.  And there are ongoing plans to continue
to upgrade the entire protocol.  But our relevance is only defined by
what we ship, and this is a significant improvement that isn't worth
holding back for years while we ponder more fundamental changes.

> A couple of minor issues: (1) I am not sure
> that I see the (security) value of the padding mechanisms specified, but I
> am not an expert so I will defer to the security experts, and (2) the
> extension method for SETTINGS seems to be flawed.  Apologies for the
> slightly delayed review.

I'll address those below.  And don't concern yourself with the delay,
it's not a small document.

> Major Issues:
> s3, para 1: Just checking: HTTP/2 is defined to run over TCP connections
> only.  I take it that this is something that the WG has formally decided
> upon.  Is there something that stops HTTP/2 running over any other reliable
> byte stream protocol with in-order delivery?  Could there be a more general
> statement?

Such a statement was debated at some length.  The current view (unless
I'm mistaken) was to recognize that HTTP/2 uses TCP.  More definitive
statements were avoided to ensure that if a future document chose to
define how HTTP/2 used some other transport that wouldn't be too
disruptive.

And who said it needed to have in-order delivery?

Referring to the cited section, the accepted view on ALPN tokens is
that they identify a particular application protocol profile; a new
protocol that layered something that was partially or substantially
like HTTP/2 over another transport would have a new identifier.

> Minor Issues:
> s4.3, next to last para; s5, bullet #1:
> (Just a bit more than a nit)
> s4.3 says:
>
> Header blocks MUST be
>    transmitted as a contiguous sequence of frames, with no interleaved
>    frames of any other type or from any other stream.
>
> s5 says:
>
>    o  A single HTTP/2 connection can contain multiple concurrently open
>       streams, with either endpoint interleaving frames from multiple
>       streams.
>
> If s4.3 is correct (which multiple repetitions elsewhere indicate that it
> is) then the bullet in s5 needs to reflect the constraint in s4.3 as they
> are currently inconsistent.

Section 5 is summary information. I don't think that further
qualification is needed at this point; as you note, the constraint is
repeated frequently.

> I am not quite sure whether the last para of s4.3 implies that the client
> must wait until
> it has the whole header block before trying to decompress or whether
> decompression
> might happen as fragments are received - please clarify this in the text.

That's a clarification that belongs in the header compression
document.  It appears in Section 3.1 of that document.

> s6.1 and s6.2: I am dubious about the value of the padding story in DATA and
> HEADERS frames, even given the caveats in s10.7.  An attacker can use the
> header to deduce the padding section.  It seems a bit odd to say that you
> can add arbitrary padding and then insist it is all zeroes.  However, I am
> not an expert in this area and will defer to security experts.

Hard as it is to get right, padding here is important in some use
cases.  We've had a lot of separate requests for the feature.

The all zeros thing is safe if your cipher provides IND-CCA, and that
is table-stakes.

You are right though about the arbitrary thing, I've removed the word
"arbitrary".

> s6.5.3: If an endpoint sends a SETTINGS value that the receiving endpoint
> doesn't understand (for an extension, say), the receiver will ignore it but
> still send an ACK.  This leaves the sending endpoint unaware that the other
> endpoint didn't understand it.  I suspect that this makes the extension
> mechanism for settings effectively useless.  I note that s6.5 says that
> implementations MUST support the values defined in the draft so that the ACK
> mechanism works fine for the base system, but any extension of SETTINGS
> cannot be used to limit the behaviour of the receiving endpoint because the
> sender can't rely on the receiver actioning the limitation; the only useful
> things might be to expand a limit that the sender would otherwise have to
> conform to.  It would be possible to alter the spec so that the receiver can
> send back any values it didn't understand with a frame with ACK set - not
> exactly negotiation but just rejection.

I don't think this is a problem.  Most proposals that have been
floated use a reciprocal SETTINGS frame to signal that the value was
understood.  The only failure mode there occurs when there is a
collision between two extensions that results in two conflicting views
of the setting semantics.  Of course, that is equally possible with a
rejection-based scheme.

> Editorial/Nits:
> Abstract:
>
> allowing multiple
>    concurrent messages on the same connection.
>
> Technically the *messages* are not on the connection concurrently.  How
> about:
>      allowing a server to simultaneously process multiple requests submitted
> via a single
>     connection with arbitrary ordering of responses.

Technically, it is possible to have multiple messages concurrently in
progress, is only frames that don't interleave.

Would s/messages/exchanges/ work for you?  I value brevity over
absolute correctness for this sort of language.

> s2, last para:
>
> allowing many requests to be compressed into one TCP packet.
>
> At this point in the document, nothing has been said about TCP.
> Better:
>    allowing many requests to be compressed into one transport connection
> packet.

Or, I could simply s/TCP//

> s2.2: I think definitions or references or deprecations for message/message
> body/message payload/message headers/payload data and  entity/entity
> headers/entity body are needed.  With HTTP/2 the distinction between message
> body and entity body is no longer needed I think.  It appears that message
> body is not currently used but message payload, payload data and entity body
> are.  I think that it might be useful to stick to message payload in the
> body of the draft and add a note in terminology to indicate that message
> payload covers both message body and entity body in HTTP/1.x and stress that
> there is only one encoding.

I've referenced the RFC7230 definitions of message body and payload
body, which have very specific meaning there.  There were a couple of
places that might have been ambiguous that I've corrected.

> s3.2.1, para 4:
> OLD:
> production for "token68"
> NEW:
> production for "token68", allowing all the characters in a base64url string,

Hmm.

> s3.4, para 3: s/sever/server/

Thanks.

> s3.5, paras 2-5: It would be worth explaining that what is being done here
> is to send what would be a method request for the PRI method which will not
> be recognized by well-implemented HTTP/1.x servers.  The PRI method is then
> reserved for HTTP/1.1 so that it can't be accidentally implemented later
> (see Section 11.6).

We've discussed providing explanatory text of this nature in the
working group on a few occasions, but avoided it.  In this case, the
main beneficiaries of that information are people who aren't reading
and implementing this specification, so we decided against more
verbiage to that effect.

> s4.1, Figure 1 (and other figures): The frame header layout doesn't match
> the usual conventions for protocol component specifications where each 32
> bit 'word' is filled out.  I'm not sure how strictly we would want to adhere
> to this convention... consult an AD.
>
> s4.1: Would performance be helped by aligning the length and stream
> identifiers on
> 32 bit boundaries?

Not really; you would also have to pad the tail end of the various
frame types to get any advantage for anything other than the first
request.

> s4.1, Flags:  For the avoidance of doubt it would be good to label the flag
> bits with the numbers that are used later in the document.

Not sure what you mean here.  Adding "|7  Flags(7)  0|" to the diagram
seems more likely to be confusing to me.

> s4.1, Stream Identifier:  s5.1 and s5.1.1 say the identifiers are integers -
> this should be reflected here.  Also consistency between s4.1 and s5.1.1
> should have the reserved id as either 0 or 0x0 (OK, this is really picky!)

Done.

> s5, bullets #2 and #5:
>
>    o  Streams can be established and used unilaterally or shared by
>       either the client or server.
>
>    o  Streams are identified by an integer.  Stream identifiers are
>       assigned to streams by the endpoint initiating the stream.
>
> [Is there a risk of a race condition in which one end allocates a stream id
> and sends using it and the other end allocates the same id for a different
> purpose while the first frame is in transit?  Maybe clients should always
> use odd numbered streams and servers even numbered .. or is this a use for
> the reserved bit to be used by servers?]
>
> OK.. I correctly identified the possibility of race conditions.. Please add
> a pointer to s5.1.1 which tells you to do what I just proposed.   Another
> pointer in s4.1 definition of stream id would also help.

5.1.1 is as early as I can put that definition.  The introductory text
is just an overview; I don't think that it needs to be complete,
meticulously correct, or fully referenced.  It merely needs to
establish the purpose of Section 5: which is to more precisely
describe how the enumerated properties are realized.

Section 4.1 already references 5.1.1.  I consider that sufficient.

> s5.1:
> OLD:
>    half closed (local):
>       A stream that is in the "half closed (local)" state cannot be used
>       for sending frames.  Only WINDOW_UPDATE, PRIORITY and RST_STREAM
>       frames can be sent in this state.
> NEW:
>    half closed (local):
>       A stream that is in the "half closed (local)" state cannot be used
>       for sending frames with the exceptions of  WINDOW_UPDATE,
>       PRIORITY and RST_STREAM frames.

Thanks.

> Also: Would it be worth saying that any type of frame can be received in
> this state?

Good catch.  Added:

"An endpoint can receive any type of frame in this state.  Providing
flow control credit using WINDOW_UPDATE frames is necessary to
continue receiving flow controlled frames."

> s5.1:
>
>    half closed (remote):
>       A stream that is "half closed (remote)" is no longer being used by
>       the peer to send frames.  In this state, an endpoint is no longer
>       obligated to maintain a receiver flow control window if it
>       performs flow control.
>
> Needs a pointer to the section where the flow control window is discussed.

I disagree.  Though your quoting made me notice a mistake here.  The
"if" clause needs to go (it's a vestige of a previous iteration of the
protocol that didn't make flow control mandatory).

> s5.1, next to last para:
>
>    In the absence of more specific guidance elsewhere in this document,
>    implementations SHOULD treat the receipt of a frame that is not
>    expressly permitted in the description of a state as a connection
>    error (Section 5.4.1) of type PROTOCOL_ERROR.
>
> There are not many other sorts of frames, so I think it would help (possibly
> even essential, since there doesn't seem to be anywhere else this is stated)
> to explicitly specify what sorts of frames can be expected to be received in
> the 'active' states (Open, Half closed(local) at least)

Those states permit all frame types in the active directions.

This is a catch-all only.  We had a couple of corner cases come up
when someone did the complete enumeration of states and frame types.
This was an easier fix than a tedious enumeration, which was found to
adversely affect comprehension.

> s5.1, next to last para: s/Frame of unknown/Frames of unknown/

Ack

> s5.1, next to last para; s5.5, para 3, s6.2, END_HEADERS, para 2:
> s5.1 says:
> OLD:
> Frame of unknown types are ignored.
>
> s5.5 and s6.2 qualify this by saying that frames of unknown types inside
> HEADER frame sequences have to be treated as connection errors.  The two
> sections need to be synced up.  Maybe:
> NEW:
> Frames of unknown types are ignored except when they occur in HEADER frame
> sequences when they have to be treated as a connection error of type
> PROTOCOL_ERROR (see Sections 5.5 and 6.2).

But then you missed PUSH_PROMISE.  The CONTINUATION business is
important, but not important enough to turn a clear and concise
statement into a paragraph.  I have little concern that the feature
will be accidentally missed.

> s5.1.1, last para/s3.4, last para, s9.1, para 3:  Having to open a new
> connection when stream ids run out interacts with the point made in s3.4
> that there is no guarantee that the new connection will support HTTP/2.  It
> might be worth pointing this out in s5.1.1 and s9.1 - it means the client
> has to be able to switch back to HTTP/1.1 in the very rare case that this
> happens. Thought: Would it be useful for a server to have a SETTINGS flag
> that guarantees that it would do HTTP/2 on all subsequent connections?

Yes, and the fallback isn't necessarily pretty.  At least not for
performance.  I think that the view is that this is a low risk
scenario and not worthy of special attention.

As for the setting, a promise like that would be hard to keep, since
to some extent this is out of the control of the server.  There is
always a risk that a different intermediary could be present next
time.


>  s5.3.1, para 1: To be absolutely explicit: s/on another stream/on exactly
> one other stream/

Being too precise here risks misinterpretation in the opposite
direction: since there is a dependency graph, a stream actually
transitively depends on 0..n other streams.  I don't think that it's
worth risking the interpretation that - because there is a dependency
on exactly one stream - the tree is at most one layer deep.

> s6.2, END_STREAM flag, 2nd para: s/A HEADERS frame carries/A HEADERS frame
> can carry/

It always carries the flag, it is whether it is set that matters.

> s6.2;s6.6: The wording on padding should be synchronized with that in s6.1
> or a note added to s6.2 to refer readers to the padding comments in s6.1
> similar to that in s6.6.  Synching 6.2 and s6.6 would be sensible.

I obviously missed trimming 6.2 to match 6.6, thanks.

> s6.3, Weight: Should be explicitly said to be an (unsigned) integer.

Done.

> s6.5.2, SETTINGS_HEADER_TABLE_SIZE: Should have a reference to
> [COMPRESSION].

OK

> s8.1.3: For clarity, the convention "+ END_HEADERS"/"- END_HEADERS" should
> be explained.

The text includes an explanation of the presence of flags.  Since a
great deal of inference is required to make sense of other aspects of
the examples, I think that it's OK as is, and avoiding more paragraphs
is nice too.

> s8.1.3, last example: It might be worth noting that there could be
> significant time between the two responses being sent.

I don't think that there is much value to that here.

> s9.1.1, para 1: s/to an origin servers/to an origin server/
>
> s10.3, para 2: s/translater verbatim/translated verbatim/
>
> s10.5.1, para 1:
> OLD:
>     For this
>    an other reasons, such as ensuring cache correctness, means that an
>    endpoint might need to buffer the entire header block.
> NEW:
>    This ordering and other reasons, such as ensuring cache correctness, mean
> that an
>    endpoint might need to buffer the entire header block.
>
> s10.5.1, para 2: s/This setting specific/This setting is specific/

All done.