Re: RFC 9113 and :authority header field

David Schinazi <dschinazi.ietf@gmail.com> Wed, 29 June 2022 19:46 UTC

From: David Schinazi <dschinazi.ietf@gmail.com>
Date: Wed, 29 Jun 2022 12:42:30 -0700
Message-ID: <CAPDSy+41Cu8We=FTZr3PnMq+S1Vw75-YZo4LM3OLZEfnXCE5aQ@mail.gmail.com>
To: "Roy T. Fielding" <fielding@gbiv.com>
Cc: Willy Tarreau <w@1wt.eu>, Tatsuhiro Tsujikawa <tatsuhiro.t@gmail.com>, HTTP <ietf-http-wg@w3.org>
Subject: Re: RFC 9113 and :authority header field
Archived-At: <https://www.w3.org/mid/CAPDSy+41Cu8We=FTZr3PnMq+S1Vw75-YZo4LM3OLZEfnXCE5aQ@mail.gmail.com>

I might be misunderstanding something, but from my reading of RFC 9113
(Section 8.3.1), sending an empty :authority pseudo-header field alongside
a non-empty Host header field is invalid for HTTP/2 requests. That's why
google.com rejects these with 400.

<<Clients that generate HTTP/2 requests directly MUST use the ":authority"
pseudo-header field to convey authority information, unless there is no
authority information to convey>>
<<An intermediary that forwards a request over HTTP/2 MUST construct an
":authority" pseudo-header field using the authority information from the
control data of the original request, unless the original request's target
URI does not contain authority information>>

Sending a non-empty Host header field means that the target URI contains
authority information, so it triggers the HTTP/2 requirement to send
:authority.
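
As a rough illustration of that reading, here is a minimal Python sketch.
The function name and the choice to pass the two field values as optional
strings are hypothetical, not anything RFC 9113 defines or any server is
known to run. It only covers the case discussed here (an :authority that is
present but empty); a request that omits :authority entirely is deliberately
left out of the sketch.

```python
from typing import Optional


def violates_authority_rule(authority: Optional[str], host: Optional[str]) -> bool:
    """Return True when Host carries authority information but the
    :authority pseudo-header field is present and empty.

    `authority` is the value of :authority (None if the field is absent).
    `host` is the value of the Host header field (None if absent).
    Both parameter names are assumptions made for this illustration.
    """
    host_has_info = bool(host and host.strip())
    authority_is_empty = authority is not None and authority.strip() == ""
    return host_has_info and authority_is_empty


# Empty :authority next to a non-empty Host: the case rejected with 400.
print(violates_authority_rule("", "example.com"))
# Matching :authority and Host: nothing to flag.
print(violates_authority_rule("example.com", "example.com"))
```

Only the first call is flagged under this reading.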

Am I misunderstanding something?
David

On Wed, Jun 29, 2022 at 10:54 AM Roy T. Fielding <fielding@gbiv.com> wrote:

> > On Jun 28, 2022, at 10:52 PM, Willy Tarreau <w@1wt.eu> wrote:
> >
> > Hi Tatsuhiro,
> >
> > On Wed, Jun 29, 2022 at 08:58:47AM +0900, Tatsuhiro Tsujikawa wrote:
> >> RFC 7540 even says that an intermediary MUST omit :authority "when
> >> translating from an HTTP/1.1 request that has a request target in
> >> origin or asterisk form (see [RFC7230], Section 5.3)."
> >>
> >> Now RFC 9113 has this text:
> >>
> >>      An intermediary that forwards a request over HTTP/2 MUST construct
> >>      an ":authority" pseudo-header field using the authority
> >>      information from the control data of the original request, unless
> >>      the original request's target URI does not contain authority
> >>      information (in which case it MUST NOT generate ":authority").
> >>      Note that the Host header field is not the sole source of this
> >>      information; see Section 7.2 of [HTTP].
> >>
> >> This means :authority must be included if the host header field exists
> >> in an HTTP/1.1 request.
> >
> > My understanding is that Host doesn't necessarily count as "control data"
> > here, and that the goal was to accurately represent an HTTP/1.x request
> > targeting an HTTP/1.0 server after being transported over HTTP/2. For
> > example, let's say that a client passes this to a proxy:
> >
> >     GET http://example.com/ HTTP/1.0
> >     Proxy-connection: keep-alive
> >
> > and nothing more. If instead it gets sent via a gateway that transports
> > it over H2, it could make sense to consider that the scheme is "http",
> > the authority is "example.com", that there's no host, hence the request
> > would be passed as:
> >
> >     :method: GET
> >     :scheme: http
> >     :authority: example.com
> >
> > and that's all. Conversely, let's see the same HTTP/1.0 request sent
> > directly to the origin server:
> >
> >     GET / HTTP/1.0
> >
> > There's no more authority nor host, so a gateway receiving that cannot
> > invent one, unless it uses its own configured name corresponding to its
> > own address, that it expects the client used to construct the request.
> >
> > With HTTP/1.1 there are fewer ambiguities since Host is mandatory, but
> > the distinction between "proxy requests" and origin requests is still
> > relevant, especially when you don't know whether or not the origin
> > server supports HTTP/1.1 or only 1.0 (and may be confused by the
> > presence of an authority in the request line). For example, if a
> > client sends:
> >
> >  GET / HTTP/1.1
> >  Host: example.com
> >
> > to an HTTP/1.0 server that parses Host, it will work. If it sends
> >
> >  GET http://example.com/ HTTP/1.1
> >  Host: example.com
> >
> > to an HTTP/1.1 server, it will work as well, but it may fail against an
> > HTTP/1.0 server (or worse, loop over itself if it supports proxying
> > requests and resolves itself as example.com).
>
> Well, this ship has sailed, but I must have missed that original
> discussion.
>
> The premise is incorrect in all respects, since all of those HTTP/1.1
> requests are also valid HTTP/1.0 requests (even with an absolute URI)
> and so is the presence of Host in those requests.
>
> Host is an HTTP/1.x field that was used in HTTP/1.0 requests (in 1995)
> as soon as we reached consensus on the field name. That was long before
> 1.1 was finished and 1.0 obsoleted. Host is a required part of HTTP/1.0 now
> just by virtue of the Internet as deployed, regardless of the
> informational RFC.
>
> [The idea was originally proposed in 1994 by John Franks
>
>
> https://lists.w3.org/Archives/Public/ietf-http-wg-old/1994SepDec/0019.html
>
> but it took a long time to converge on a single syntax
>
>
> https://lists.w3.org/Archives/Public/ietf-http-wg-old/1995JanApr/0067.html
>
> https://lists.w3.org/Archives/Public/ietf-http-wg-old/1995JanApr/0084.html
>
> https://lists.w3.org/Archives/Public/ietf-http-wg-old/1995JanApr/0130.html
>
> https://lists.w3.org/Archives/Public/ietf-http-wg-old/1995SepDec/0291.html
>
> and while we still talk about it as an important addition of HTTP/1.1
> (because that's where we chose to document it), the feature is required
> for 1.0 to work with deployed servers.]
>
> So, an HTTP proxy recipient that receives any form of authority/host
> information must forward that information in either Host or :authority,
> no matter what version it is using. Failure to do so introduces a
> security bypass because L7 routers act on that information whether
> or not the client/server pair is aware of their presence.
>
> Hence, an HTTP/1.0 proxy that receives your first example should forward
> that as
>
>     GET / HTTP/1.0
>     Host: example.com
>     Proxy-connection: keep-alive
>
> because the routing doesn't work otherwise due to name-based hosts
> being deployed before HTTP/1.1.
>
> And, no, there is absolutely no reason to concern ourselves with proxies
> that loop over their own hostnames, since that is a self-correcting error
> whenever a full URI is received as the request target.
>
> > If the first request is transported over H2, thus converted from H1 to
> > H2 then back from H2 to H1, adding an authority that was not initially
> > present would introduce exactly this problem. By not adding it and using
> > Host only, the request representation is preserved, and the origin server
> > can receive the same request that the client took care to encode, and not
> > be confused. That's why I'm saying that in this case it's clearly visible
> > that Host isn't part of the "control data" and must not appear in an
> > authority that was not initially encoded.
> >
> > I know it's a bit complicated but we have to deal with history. What
> > we're doing in haproxy is that both Host and :authority are used
> > interchangeably after having been checked for proper matching, and are
> > modified at the same time if needed, and we have a flag indicating if an
> > authority was present in the incoming request to know if we have to
> > produce one on output or not. That's in the end what seems to preserve
> > the most accurate representation along a chain of multiple versions.
> > This allows us to emit a Host field only if one was present, and an
> > authority only if one was present, regardless of the HTTP version. I
> > don't think that RFC9113 brings any changes regarding this, it might
> > only be a matter of what constitutes "control data".
>
> Sorry, that is a broken implementation. You need to send Host regardless
> of the original request version.
>
> ....Roy
>
>
>
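
For reference, the forwarding behaviour Roy describes in the quoted text
above (an absolute-form request target rewritten to origin-form, with the
authority carried in Host) can be sketched in Python as follows. This is
only an illustration under stated assumptions: to_origin_form is a
hypothetical helper, header names are assumed to arrive with the canonical
"Host" spelling, and userinfo in the URI is not handled.

```python
from typing import Dict, Tuple
from urllib.parse import urlsplit


def to_origin_form(request_target: str, headers: Dict[str, str]) -> Tuple[str, Dict[str, str]]:
    """Rewrite an absolute-form request target to origin-form and carry
    the authority in the Host header field.

    Hypothetical helper for illustration; a target that has no authority
    component (already origin-form, or "*") is returned unchanged.
    """
    parts = urlsplit(request_target)
    forwarded = dict(headers)  # copy so the caller's headers are not mutated
    if not parts.netloc:
        return request_target, forwarded

    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    forwarded["Host"] = parts.netloc
    return path, forwarded


target, fields = to_origin_form(
    "http://example.com/", {"Proxy-connection": "keep-alive"}
)
print(target)
print(fields)
```

With the first example from the thread as input, the target becomes "/" and
a Host field with the value example.com is added next to Proxy-connection.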