Re: Permitted characters in HTTP/2 fields

Greg Wilkins <gregw@webtide.com> Fri, 21 May 2021 02:37 UTC


This conversation started with:

> At our last interim, we discussed potential ways in which HTTP/2 was
> probably too strict about characters (octets really) in field names and
> values.
> The conclusion then was to loosen the restriction and mandate only a small
> set of checks.  This should match what implementations already do.


Any chance of describing exactly what those reasons are? It's lost on me
exactly what problem is being solved here. If we don't have a full brief
for these changes, then how are we meant to evaluate them, or indeed record
the reason for posterity? Neither #815 nor #846 explains the problem other
than to say the text is confusing. There is no motivation given for why
validation should be less strict than what is needed to carry HTTP fields
plus pseudo fields.

I don't mind the current text so much, as it says I can validate against
HTTP semantic fields as defined by
https://www.ietf.org/archive/id/draft-ietf-httpbis-semantics-15.html#section-5,
so I will. I'm just going to reject any other fields, and I'm allowed to,
so I'm happy. But I have no idea why we want to allow implementations to
send non-compliant fields around. Isn't that just asking for problems?
If it is because some existing implementations are already sending invalid
fields, then they are doing so regardless, and unless you say an impl must
accept them, any impl may reject them as invalid. So changing the spec
to be less strict makes no difference so long as impls are allowed to
actually enforce correct validation.
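
To make "validate against HTTP semantic fields" concrete, here is a rough
sketch in Python of the kind of check I mean. It is only my reading of the
field-name (token) and field-value grammar in section 5 of the semantics
draft; the function names are made up for illustration and this is not
code from any real implementation.

```python
import string

# tchar set from the "token" grammar that field-name uses.
TCHAR = frozenset(
    b"!#$%&'*+-.^_`|~" + string.digits.encode() + string.ascii_letters.encode()
)


def is_field_vchar(octet: int) -> bool:
    """field-vchar = VCHAR / obs-text (assumed reading of the draft grammar)."""
    return 0x21 <= octet <= 0x7E or octet >= 0x80


def is_valid_field_name(name: bytes) -> bool:
    """A field name is a non-empty token: one or more tchar octets."""
    return len(name) > 0 and all(octet in TCHAR for octet in name)


def is_valid_field_value(value: bytes) -> bool:
    """Empty, or starts and ends with field-vchar with only SP/HTAB/field-vchar inside."""
    if not value:
        return True
    if not (is_field_vchar(value[0]) and is_field_vchar(value[-1])):
        return False
    return all(is_field_vchar(o) or o in (0x20, 0x09) for o in value)


def accept_field(name: bytes, value: bytes) -> bool:
    """Hypothetical receiver policy: reject anything that fails either check."""
    return is_valid_field_name(name) and is_valid_field_value(value)
```

A receiver built this way would accept `accept_field(b"content-type",
b"text/html")` and reject a name containing a space or a value containing
CR or LF, which is all I am asking to remain allowed to do.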

Finally, when the "Brief" says we should match what implementations already
do, the question is which implementations are to be matched. If
there are some implementations that already enforce the precise spec for
HTTP headers, then should we match those impls, or are some implementations
more match-worthy than others?



On Fri, 21 May 2021 at 11:22, Martin Thomson <mt@lowentropy.net> wrote:

> Hey Willy,
>
> On Fri, May 21, 2021, at 02:59, Willy Tarreau wrote:
> > I really agree. I don't remember if 0x80 and above are forbidden in H2 but
> > I'd personally prefer to block them so that we don't needlessly introduce
> > the risk of aliasing due to different codings being used. Protocol elements
> > that define how messages should be delimited/routed/etc must be strictly
> > defined and easy to enforce in implementations and applications.
>
> We never really said before.  I'm happy to extend the 0x7f to 0x7f-0xff if
> that is what others want.  It's not quite the same as limiting the grammar
> to what is permitted for field names, but it might be OK.
>
> field-name is "!" / "#" / "$" / "%" / "&" / "'" / "*" / "+" / "-" / "." /
> "^" / "_" / "`" / "|" / "~" / DIGIT / ALPHA
>
> That amounts to a whole bunch of characters less than %x21-7E (minus
> ':').  A simpler check for c >= 0x21 && c <= 0x7e && c != ':' seems
> reasonable to me.  Then we don't have to worry about Unicode field names.
> That's not a whole lot different than c >= 0x21 && c != 0x7e && c != ':' as
> the current PR has.
>
> I had the distinct impression that we DID see Unicode field names in some
> cases though.
>
> We wanted to avoid backward incompatibility issues that might result from
> tighter constraints on field *values*, which is why we never said anything
> before, but names might be easier.
>
>
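
For reference, the difference between the quoted field-name grammar and the
simpler range check c >= 0x21 && c <= 0x7e && c != ':' can be enumerated
directly. This is a small Python sketch; the helper names are mine and do
not come from the PR or the drafts.

```python
import string

# The field-name character set exactly as quoted above.
TCHAR = frozenset(
    b"!#$%&'*+-.^_`|~" + string.digits.encode() + string.ascii_letters.encode()
)


def is_tchar(c: int) -> bool:
    """Strict check: the octet is one of the listed field-name characters."""
    return c in TCHAR


def in_simple_range(c: int) -> bool:
    """Looser check: c >= 0x21 && c <= 0x7e && c != ':'."""
    return 0x21 <= c <= 0x7E and c != ord(":")


# Octets the range check lets through that the grammar does not.
extra = bytes(c for c in range(256) if in_simple_range(c) and not is_tchar(c))
print(extra.decode("ascii"))  # prints "(),/;<=>?@[\]{}
```

So the looser check admits 16 delimiter characters beyond the grammar, and
nothing at 0x80 or above.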

-- 
Greg Wilkins <gregw@webtide.com> CTO http://webtide.com