Re: Benjamin Kaduk's Discuss on draft-ietf-httpbis-header-structure-18: (with DISCUSS and COMMENT)
Barry Leiba <barryleiba@computer.org> Sat, 06 June 2020 02:57 UTC
References: <158985680600.32294.14997877272521602557@ietfa.amsl.com>
In-Reply-To: <158985680600.32294.14997877272521602557@ietfa.amsl.com>
From: Barry Leiba <barryleiba@computer.org>
Date: Fri, 05 Jun 2020 22:53:58 -0400
Message-ID: <CALaySJ+2xHThd6PxDH_jr2P0xu+JRggPtrh2gRbgsZwwB2dFQw@mail.gmail.com>
To: Benjamin Kaduk <kaduk@mit.edu>
Cc: The IESG <iesg@ietf.org>, draft-ietf-httpbis-header-structure@ietf.org, httpbis-chairs@ietf.org, ietf-http-wg@w3.org, tpauly@apple.com
Subject: Re: Benjamin Kaduk's Discuss on draft-ietf-httpbis-header-structure-18: (with DISCUSS and COMMENT)
Archived-At: <https://www.w3.org/mid/CALaySJ+2xHThd6PxDH_jr2P0xu+JRggPtrh2gRbgsZwwB2dFQw@mail.gmail.com>

Hi, Ben.

Can you check version -19 and see if it resolves your DISCUSS?

Thanks,
Barry

On Mon, May 18, 2020 at 10:53 PM Benjamin Kaduk via Datatracker <noreply@ietf.org> wrote:

> Benjamin Kaduk has entered the following ballot position for
> draft-ietf-httpbis-header-structure-18: Discuss
>
> When responding, please keep the subject line intact and reply to all
> email addresses included in the To and CC lines. (Feel free to cut this
> introductory paragraph, however.)
>
> Please refer to https://www.ietf.org/iesg/statement/discuss-criteria.html
> for more information about IESG DISCUSS and COMMENT positions.
>
> The document, along with other ballot positions, can be found here:
> https://datatracker.ietf.org/doc/draft-ietf-httpbis-header-structure/
>
> ----------------------------------------------------------------------
> DISCUSS:
> ----------------------------------------------------------------------
>
> Thanks for this document; it will be very nice to have this
> more-structured mechanism available for future HTTP header (and trailer)
> fields.
>
> (0) There seem to still be a few lingering internal inconsistencies that
> merit further discussion.
>
> Most notably, there is the inherent risk of skew when both prose
> algorithms and ABNF constructions are provided for the same structures.
> While Section 1.2 is careful to disclaim that the prose algorithm takes
> precedence over the ABNF for parsing, to my reading the coverage in the
> following paragraph of serialization procedures imply that it is the
> ABNF that is authoritative. In particular, "[i]mplementations MAY vary
> from the specified behavior so long as the output still matches the
> ABNF" seems to admit deviations from the prose algorithms but require
> compliance with the ABNF, in effect making the ABNF take precedence over
> the prose algorithm. Having a different description of the procedure
> normative for generation vs. consumption invites
> interoperability-affecting feature skew, such as the handling of empty
> lists as Julian noted on the list.
>
> Similarly, Section 3.1.1's prose says that inner lists are delimited "by
> a single space", but the ABNF uses (1*SP), allowing for more than
> one space.
>
> Additionally, when Section 4.2.3.2 discusses parsing parameter map keys,
> the handling for duplicate map key names is specified as overwriting the
> previous value, in contrast to the prose description (Section 3.1.2)
> that describes these keys as "unique within the scope [of] the
> Parameters they occur within". (While dictionary key names might be
> expected to have a similar behavior, I did not find conflicting text for
> that behavior.)
>
> Finally, at a prose level we employ needlessly different descriptions in
> several places for what is effectively the same procedure; while I do
> not think any of these affect interoperability (and thus the full
> details are in the COMMENT section), it does seem to be continuing the
> theme. (These are things like how we describe the algorithm to skip
> implicit-true for booleans, whether we initialize the output string to
> the empty string only to immediately add a constant literal character to
> it vs. initializing the output string to that literal character, etc.)
>
> A couple other points that may need further discussion:
>
> (1) What aspect(s) of structured field processing are case (in)sensitive?
> The only mentions I see of case sensitivity are in Section 4.2 discussing
> header field names and (implicitly) Section 4.2.2 discussing a
> "character-for-character" comparison of dictionary key names, but of
> course we cite RFC 5234 for ABNF, which uses case-insensitive matching.
> On the other hand, base64 encoding requires case sensitivity for
> successful round-tripping of arbitrary binary data.
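
To make the base64 remark in point (1) concrete, here is a minimal Python sketch using only the standard library's `base64` module. The three sample bytes are an arbitrary choice made for illustration, not something taken from the draft; the sketch only shows that folding the case of an encoded Byte Sequence yields different bytes on decode.

```python
import base64

# Arbitrary sample bytes, chosen for illustration only (not from the draft).
original = bytes([0x00, 0x10, 0x83])

# Standard base64 alphabet (RFC 4648, Section 4).
encoded = base64.b64encode(original).decode("ascii")
round_tripped = base64.b64decode(encoded)

# Simulate a recipient that treats the encoded text as case-insensitive.
case_folded = base64.b64decode(encoded.lower())

print(encoded)                     # the encoded form of the sample bytes
print(round_tripped == original)   # exact-case decode recovers the input
print(case_folded == original)     # case-folded decode does not
```
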
>
> (2) One of the stated goals of this document is to define intentionally
> strict processing rules, but there are several places where we could
> have removed additional optionality but elected to not do so. What is
> the criterion for "too far" towards strictness? For example, Section
> 4.2.7 gives optionality with respect to base64 padding (see COMMENT).
>
> ----------------------------------------------------------------------
> COMMENT:
> ----------------------------------------------------------------------
>
> Is the document toolchain used for this document responsible for the
> rendering of em dashes as a single hyphen, as opposed to normal RFC
> style which uses two hyphens for an em dash?
>
> What's the motivation for "MUST omit values of boolean true" (in
> parameters and dictionaries)? It seems to make the output rules more
> complicated without a significant gain in encoding size.
>
> Section 1.2
>
>    When parsing from HTTP fields, implementations MUST follow the
>    algorithms, but MAY vary in implementation so as the behaviors are
>    indistinguishable from specified behavior. If there is disagreement
>
> nit: was this intended to be "so long as"?
>
> Section 2
>
>    o Define the semantics of those structures.
>
> nit: it's not clear to me what the referent for "those" is intended to
> be (as while the previous bullet point does give a list of types of
> structures, it seems to be saying that any given field has to pick
> exactly one, which is inconsistent with the plural "those").
>
>    o Specify any additional constraints upon the structures used, as
>      well as the consequences when those constraints are violated.
>
> nit: similarly, the plural "structures" may not be needed here.
>
>    Typically, this means that a field definition will specify the top-
>    level type - List, Dictionary or Item - and then define its allowable
>    types, and constraints upon them. For example, a header defined as a
>
> It's a little unfortunate that we task the word "type" with two
> different meanings (the top-level List/Dictionary/Item and also the more
> fine-grained Integer/String/etc.). We also haven't mentioned the latter
> type of "type" yet, though perhaps that's not an issue.
>
>    List might have all Integer members, or a mix of types; a header
>    defined as an Item might allow only Strings, and additionally only
>    strings beginning with the letter "Q". Likewise, Inner Lists are
>    only valid when a field definition explicitly allows them.
>
> (We haven't mentioned Inner Lists yet, so the reader might not know what
> to expect from them.)
>
>    When parsing fails, the entire field is ignored (see Section 4.2); in
>    most situations, violating field-specific constraints should have the
>    same effect. Thus, if a header is defined as an Item and required to
>    be an Integer, but a String is received, the field will by default be
>    ignored. If the field requires different error handling, this should
>    be explicitly specified.
>
> Explicitly specified in the specification that defines the structured
> field, I trust?
>
> Also, I worry that just leaving this as "in most cases" is a bit
> antithetical to the goals of structured headers -- don't we want to nail
> things down as tightly as we can? If someone wants an exception, they
> don't have to use structured headers...
>
>    To further assure that this extensibility is available in the future,
>    and to encourage consumers to use a complete parser implementation, a
>    field definition can specify that "grease" Parameters be added by
>
> Do you want to reference RFC 8701 here?
>
>    An extension to a structured field can then require that an entire
>    field value be ignored by a recipient that understands the extension
>    if constraints on the value it defines are not met.
>
> This seems to set us up for a situation in which an implementation that
> does not understand the extension will happily process the header field
> and ignore the extension, but implementations that do understand the
> extension will ignore the header field entirely. Is that really what's
> intended? (When might this be desirable?)
>
> Section 3
>
>    This section defines the abstract value types that can be composed
>    into Structured Fields. The ABNF provided represents the on-wire
>
> nit: I don't understand the sense in which "composed into" is used --
> don't individual Structured Fields need to be exactly one of these three
> types, irregardless of whether there is additional substructure?
>
> Section 3.1
>
>    An empty List is denoted by not serializing the field at all.
>
> So any structured field definition that allows for an empty list has to
> define its semantics to be equivalent to the semantics used for when the
> peer does not support that structured field at all?
>
> Section 3.1.1
>
>    Inner Lists are denoted by surrounding parenthesis, and have their
>    values delimited by a single space. A field whose value is defined
>
> The ABNF does not seem to support the "single" in "single space".
>
>    A header field whose value is defined as a List of Inner Lists with
>    Parameters at both levels could look like:
>
> Is this also an inner list of strings?
>
> Section 3.1.2
>
>    Parameters are an ordered map of key-values pairs that are associated
>
> nit: is the plural "values" intentional?
>
>    parameters    = *( ";" *SP parameter )
>    parameter     = param-name [ "=" param-value ]
>    param-name    = key
>    key           = ( lcalpha / "*" )
>                    *( lcalpha / DIGIT / "_" / "-" / "." / "*" )
>
> Huh, so I could have a key that was "*" or "*****"? (I assume that no
> special "wildcard" semantics are afforded to the asterisk in either
> usage...)
>
>    Example-ParamListHeader: abc;a=1;b=2; cde_456, (ghi;jk=4 l);q="9";r=w
>
> Is it worth writing out which item/list each parameter applies to?
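
The `key` ABNF quoted above can be transcribed into a regular expression to check the question about `*`. This is a rough sketch only: `KEY_RE` and `is_valid_key` are names made up for the illustration, it assumes `lcalpha` means `a` through `z`, and it applies no length limit.

```python
import re

# Transcription of:
#   key = ( lcalpha / "*" ) *( lcalpha / DIGIT / "_" / "-" / "." / "*" )
# assuming lcalpha is the range a-z.
KEY_RE = re.compile(r"[a-z*][a-z0-9_\-.*]*")


def is_valid_key(candidate: str) -> bool:
    """Return True if the whole candidate string matches the key rule."""
    return KEY_RE.fullmatch(candidate) is not None


# "*" and "*****" are accepted by the grammar as written; uppercase,
# leading digits, and the empty string are not.
for candidate in ["*", "*****", "cde_456", "Abc", "1a", ""]:
    print(repr(candidate), is_valid_key(candidate))
```

Nothing in the grammar itself gives the asterisk wildcard meaning; the sketch treats it as an ordinary key character.
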
>
>    Parsers MUST support at least 256 parameters on an Item or Inner
>    List, and support parameter keys with at least 64 characters. Field
>    specifications can constrain the types and cardinality of individual
>    parameter names and values as they require.
>
> In what way might I further constrain the *type* of a parameter *name*?
>
> Section 3.2
>
>    Implementations MUST provide access to Dictionaries both by index and
>    by name. Specifications MAY use either means of accessing the
>    members.
>
>    Example-DictHeader: en="Applepie", da=:w4ZibGV0w6ZydGU=:
>
> I, for one, would appreciate a bit of commentary on the interpretation
> of the final '=', even if just a forward reference to §3.3.5.
>
>    Typically, a field specification will define the semantics of
>    Dictionaries by specifying the allowed type(s) for individual member
>    names, as well as whether their presence is required or optional.
>
> Similarly to my previous comment, I'm not sure I understand what is
> meant by the "type" of a member name.
>
> Section 3.3.3
>
> Are the parentheses around "chr" in the sh-string construction doing
> anything?
>
> Section 3.3.5
>
>    A Byte Sequence is delimited with colons and encoded using base64
>    ([RFC4648], Section 4). For example:
>
> Thank you for the section reference to disambiguate base64 and
> base64url!
>
> Section 4
>
>    This section defines how to serialize and parse Structured Fields in
>    field values, and protocols compatible with them (e.g., in HTTP/2
>    [RFC7540] before HPACK [RFC7541] is applied).
>
> nit: I'm not sure I'm parsing this sentence correctly: what does
> "protocols compatible with them" bind to -- is it part of what "this
> section defines"?
> (I don't actually see any text anywhere in this document that I would
> consider to be described by "protocols compatible with them".)
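
On the final `=` in the `Example-DictHeader` value quoted above: a minimal Python sketch, assuming the `da` member value is taken verbatim from that example and that the colons are the Byte Sequence delimiters described in Section 3.3.5. Decoding with the standard `base64` module suggests the trailing `=` is base64 padding rather than a second key/value separator. The variable names are illustrative.

```python
import base64

# Value of the "da" member, copied from the Example-DictHeader line.
member_value = ":w4ZibGV0w6ZydGU=:"

# Remove the leading and trailing colon delimiters; what is left is base64 text.
encoded = member_value.strip(":")
decoded = base64.b64decode(encoded)

print(len(encoded), len(decoded))   # encoded length vs. decoded byte count
print(decoded.decode("utf-8"))      # the bytes read as UTF-8 text
```
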
>
> Section 4.1
>
> It's interesting to note that some of the subsections treat "input item
> of wrong type" as an explicit failure case (e.g., 4.1.3.1) whereas
> others implicitly assume that the input is one of the appropriate types
> (e.g., 4.1.1).
>
>    Given a structure defined in this specification, return an ASCII
>    string suitable for use in a HTTP field value.
>
> nit: I would have expected wording like "the following procedure
> returns" or "return [...] value as follows" or similar; the current
> sentence is an imperative that does not indicate how to fulfil it.
> (The same comment applies to basically all of the subsections, and a
> similar one to § 4.2 and its subsections.)
>
>    6. Return output_string converted into an array of bytes, using
>       ASCII encoding [RFC0020].
>
> This implicitly assumes that output_string contains only characters
> representable in ASCII. I believe this is currently true based on the
> relevant ABNF constructions and serialization procedures, but it still
> feels like a needless dangling edge case.
>
> Section 4.1.2
>
> Is the nesting level correct for these (sub)procedures? It seems like
> top-level steps 3, 4, and 5 are logically children of step 2.
>
> Section 4.1.6
>
>    3. Let output be an empty string.
>
>    4. Append DQUOTE to output.
>
> Can't we consolidate steps 3 and 4 (as is done in § 4.1.1.1 where output
> is initialized to "("?
>
> Section 4.1.8
>
>    1. If input_bytes is not a sequence of bytes, fail serialization.
>
> side note: perhaps this is just my roots as a C programmer showing
> through, but how could you end up with something that's not a sequence
> of bytes? :)
>
> Section 4.2
>
>    Strings split across multiple field lines will have unpredictable
>    results, because comma(s) and whitespace inserted upon combination
>    will become part of the string output by the parser. Since
>    concatenation might be done by an upstream intermediary, the results
>    are not under the control of the serializer or the parser.
>
> This seems to have the implication that the results are unpredictable
> even if a serializer and parser collaborate to use a procedure that, in
> disregard of this specification, splits individual list and/or
> dictionary entries across multiple field lines, due to the potential for
> an intermediary that is not complicit in the deviation from the spec.
> In some sense, that might be the key implication of this statement, in
> which case it's surprising to not see it spelled out more clearly.
>
> Section 4.2.1.2
>
>    3. While input_string is not empty:
>
>       1. Discard any leading SP characters from input_string.
>
>       2. If the first character of input_string is ")":
>
> Don't we need to check if the input_string is empty again after
> consuming leading SP but before attempting to consume anything else?
>
> Section 4.2.2
>
>    Given an ASCII string as input_string, return an ordered map whose
>    values are (item_or_inner_list, parameters) tuples. input_string is
>
> Should we say anything about the map keys?
>
> Section 4.2.3.2
>
> Why do we use a different procedure to implement the "implicit true"
> property of booleans than we did in Section 4.2.2?
>
> Section 4.2.4
>
>    4. Otherwise, prepend char to input_string, and exit the loop.
>
> This is the only place in the document where we use "prepend" in an
> algorithm. That might make it surprising to the reader; is it easy to
> reformulate this without greedily consuming input that does not match?
>
>    2. If output_number is outside the range -999,999,999,999,999
>       to 999,999,999,999,999 inclusive, fail parsing.
>
> [I'm not sure how this check could trigger given the length check in 7.5]
>
> Section 4.2.7
>
>    Because some implementations of base64 do not allow reject of encoded
>    data that is not properly "=" padded (see [RFC4648], Section 3.2),
>    parsers SHOULD NOT fail when it is not present, unless they cannot be
>    configured to do so.
>
> This is a new protocol mechanism; why do we need to be so accommodating
> to inflexible implementations?
> (Also, nit: "rejection")
>
>    Because some implementations of base64 do not allow rejection of
>    encoded data that has non-zero pad bits (see [RFC4648], Section 3.5),
>    parsers SHOULD NOT fail when it is present, unless they cannot be
>    configured to do so.
>
> (ibid)
> Also, nit: singular/plural mismatch "bits"/"it is"
>
> Section 6
>
> It seems worth mentioning the handling for duplicated key names (e.g.,
> in parameters and dictionaries) w.r.t. overwrite or must-be-unique, and
> how there have been previous vulnerabilities relating to different
> implementations choosing "first one wins" vs. "last one wins".
>
> Appendix B
>
>    Implementers should note that Dictionaries and Parameters are order-
>    preserving maps. Some fields may not convey meaning in the ordering
>    of these data types, but it should still be exposed so that
>    applications which need to use it will have it available.
>
> This far through the document I still have no idea when the order might
> actually be useful; even this text is only noting that (at least
> sometimes) the order does not convey meaning.
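
For the duplicate-key point under Section 6, a small Python sketch of the two policies named there: "last one wins" (overwriting, which is how the ballot describes Section 4.2.3.2) and "first one wins". The list of `(key, value)` pairs stands in for already-parsed parameters, and `last_wins` and `first_wins` are hypothetical helpers, not names from the draft. Python's `dict` preserves insertion order, so an overwritten key keeps the position of its first occurrence; whether that ordering is what the draft intends is not settled by the text quoted here.

```python
from typing import Dict, List, Tuple

Pair = Tuple[str, int]


def last_wins(pairs: List[Pair]) -> Dict[str, int]:
    """Overwrite policy: a later duplicate replaces the earlier value."""
    result: Dict[str, int] = {}
    for key, value in pairs:
        result[key] = value
    return result


def first_wins(pairs: List[Pair]) -> Dict[str, int]:
    """Alternative policy: keep the earliest value and ignore later duplicates."""
    result: Dict[str, int] = {}
    for key, value in pairs:
        result.setdefault(key, value)
    return result


# Illustrative input with a duplicated key "a".
pairs = [("a", 1), ("b", 2), ("a", 3)]
print(last_wins(pairs))
print(first_wins(pairs))
```

Two recipients that each pick a different helper would disagree about the value of `a` for the same input, which is the kind of divergence the Section 6 comment warns about.
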