Re: RFC 9113 and :authority header field

"Roy T. Fielding" <fielding@gbiv.com> Wed, 29 June 2022 17:53 UTC

From: "Roy T. Fielding" <fielding@gbiv.com>
In-Reply-To: <20220629055254.GA18881@1wt.eu>
Date: Wed, 29 Jun 2022 10:49:58 -0700
Cc: Tatsuhiro Tsujikawa <tatsuhiro.t@gmail.com>, HTTP <ietf-http-wg@w3.org>
Message-Id: <34B74169-9A07-4003-8F76-1B518DE3A3A0@gbiv.com>
References: <CAPyZ6=+q+MoOOwoCxbtFjt+gqsjHBqTzz9KXNVcs3EP-4VFp=Q@mail.gmail.com> <D7142A8A-5B80-46F5-A653-2307EE2DC5D8@gbiv.com> <CAPyZ6=LCSDAsPoFCQ2cRO-i+dpo5vnp2L5A7ZLw8dvRtDs6HUg@mail.gmail.com> <20220629055254.GA18881@1wt.eu>
To: Willy Tarreau <w@1wt.eu>
Archived-At: <https://www.w3.org/mid/34B74169-9A07-4003-8F76-1B518DE3A3A0@gbiv.com>

> On Jun 28, 2022, at 10:52 PM, Willy Tarreau <w@1wt.eu> wrote:
> 
> Hi Tatsuhiro,
> 
> On Wed, Jun 29, 2022 at 08:58:47AM +0900, Tatsuhiro Tsujikawa wrote:
>> RFC 7540 even says that an intermediary MUST omit :authority "when translating
>> from an HTTP/1.1 request that has a request target in
>> origin or asterisk form (see [RFC7230], Section 5.3)."
>> 
>> Now RFC 9113 has this text:
>> 
>>      An intermediary that forwards a request over HTTP/2 MUST construct
>>      an ":authority" pseudo-header field using the authority
>>      information from the control data of the original request, unless
>>      the original request's target URI does not contain authority
>>      information (in which case it MUST NOT generate ":authority").
>>      Note that the Host header field is not the sole source of this
>>      information; see Section 7.2 of [HTTP].
>> 
>> This means :authority must be included if the host header field exists in
>> an HTTP/1.1 request.
> 
> My understanding is that Host doesn't necessarily count as "control data"
> here, and that the goal was to accurately represent an HTTP/1.x request
> targeting an HTTP/1.0 server after being transported over HTTP/2. For
> example, let's say that a client passes this to a proxy:
> 
>     GET http://example.com/ HTTP/1.0
>     Proxy-connection: keep-alive
> 
> and nothing more. If instead it gets sent via a gateway that transports
> it over H2, it could make sense to consider that the scheme is "http",
> the authority is "example.com", that there's no host, hence the request
> would be passed as:
> 
>     :method: GET
>     :scheme: http
>     :authority: example.com
> 
> and that's all. Conversely, let's see the same HTTP/1.0 request sent
> directly to the origin server:
> 
>     GET / HTTP/1.0
> 
> There's no more authority nor host, so a gateway receiving that cannot
> invent one, unless it uses its own configured name corresponding to its
> own address, that it expects the client used to construct the request.
> 
> With HTTP/1.1 there are fewer ambiguities since Host is mandatory, but
> the distinction between "proxy requests" and origin requests is still
> relevant, especially when you don't know whether or not the origin
> server supports HTTP/1.1 or only 1.0 (and may be confused by the
> presence of an authority in the request line). For example, if a
> client sends:
> 
>  GET / HTTP/1.1
>  Host: example.com
> 
> to an HTTP/1.0 server that parses Host, it will work. If it sends
> 
>  GET http://example.com/ HTTP/1.1
>  Host: example.com
> 
> to an HTTP/1.1 server, it will work as well, but it may fail to an HTTP/1.0
> server (or worse, loop over itself if it supports proxying requests and
> resolves itself as example.com).

Well, this ship has sailed, but I must have missed that original discussion.

The premise is incorrect in all respects, since all of those HTTP/1.1
requests are also valid HTTP/1.0 requests (even with an absolute URI)
and so is the presence of Host in those requests.

Host is an HTTP/1.x field that was used in HTTP/1.0 requests (in 1995)
as soon as we reached consensus on the field name. That was long before
1.1 was finished and 1.0 obsoleted. Host is a required part of HTTP/1.0 now
just by virtue of the Internet as deployed, regardless of the informational RFC.

[The idea was originally proposed in 1994 by John Franks

   https://lists.w3.org/Archives/Public/ietf-http-wg-old/1994SepDec/0019.html

but it took a long time to converge on a single syntax

   https://lists.w3.org/Archives/Public/ietf-http-wg-old/1995JanApr/0067.html
   https://lists.w3.org/Archives/Public/ietf-http-wg-old/1995JanApr/0084.html
   https://lists.w3.org/Archives/Public/ietf-http-wg-old/1995JanApr/0130.html
   https://lists.w3.org/Archives/Public/ietf-http-wg-old/1995SepDec/0291.html

and while we still talk about it as an important addition of HTTP/1.1 (because
that's where we chose to document it), the feature is required for 1.0 to
work with deployed servers.]

So, an HTTP proxy recipient that receives any form of authority/host
information must forward that information in either Host or :authority,
no matter what version it is using. Failure to do so introduces a
security bypass because L7 routers act on that information whether
or not the client/server pair is aware of their presence.
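
As a rough illustration of that rule, here is a minimal Python sketch.
The function names and the dict-based header representation are
hypothetical, chosen only for this example; it assumes lowercase field
names and ignores details such as userinfo in the authority.

```python
from urllib.parse import urlsplit


def extract_authority(target, headers):
    """Return the authority supplied with a request, or None.

    `target` is the request target (or :path for HTTP/2) and `headers`
    maps lowercase field names to values. Both names are assumptions
    made for this sketch, not part of any real proxy API.
    """
    # An absolute-form target carries its own authority and wins.
    if "://" in target:
        netloc = urlsplit(target).netloc
        if netloc:
            return netloc
    # Otherwise fall back to whatever host information was received.
    return headers.get(":authority") or headers.get("host")


def forwarded_fields(target, headers, outbound_http2):
    """Build the host-identifying field for the outbound request."""
    authority = extract_authority(target, headers)
    if authority is None:
        return {}  # nothing was received, so nothing can be forwarded
    name = ":authority" if outbound_http2 else "host"
    return {name: authority}
```

The point of the sketch is only that the received authority is never
dropped: it comes out as Host on an HTTP/1.x hop and as :authority on
an HTTP/2 hop.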

Hence, an HTTP/1.0 proxy that receives your first example should forward
that as

    GET / HTTP/1.0
    Host: example.com
    Proxy-connection: keep-alive

because the routing doesn't work otherwise due to name-based hosts
being deployed before HTTP/1.1.
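
That rewrite could be sketched in Python roughly as follows. The
helper name to_origin_form and the list-of-strings header format are
assumptions for illustration only, and the sketch does not handle
fragments, malformed request lines, or hop-by-hop field removal.

```python
from urllib.parse import urlsplit


def to_origin_form(request_line, header_lines):
    """Rewrite an absolute-form request into origin form plus Host.

    Returns a (request_line, header_lines) pair. Requests that are
    already in origin form are returned unchanged.
    """
    method, target, version = request_line.split(" ")
    parts = urlsplit(target)
    if not parts.netloc:
        return request_line, list(header_lines)

    # Keep the path and query; an empty path becomes "/".
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query

    # The authority in the request target replaces any received Host.
    kept = [h for h in header_lines if not h.lower().startswith("host:")]
    return f"{method} {path} {version}", [f"Host: {parts.netloc}"] + kept


line, fields = to_origin_form(
    "GET http://example.com/ HTTP/1.0",
    ["Proxy-connection: keep-alive"],
)
print(line)
for field in fields:
    print(field)
```

The HTTP version in the request line is passed through untouched, so
the same code emits Host for 1.0 and 1.1 requests alike.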

And, no, there is absolutely no reason to concern ourselves with proxies
that loop over their own hostnames, since that is a self-correcting error
whenever a full URI is received as the request target.

> If the first request is transported over H2, thus converted from H1 to
> H2 then back from H2 to H1, adding an authority that was not initially
> present would introduce exactly this problem. By not adding it and using
> Host only, the request representation is preserved, and the origin server
> can receive the same request that the client took care to encode, and not
> be confused. That's why I'm saying that in this case it's clearly visible
> that Host isn't part of the "control data" and must not appear in an
> authority that was not initially encoded.
> 
> I know it's a bit complicated but we have to deal with history. What we're
> doing in haproxy is that both Host and :authority are used interchangeably
> after having been checked for proper matching, and are modified at the
> same time if needed, and we have a flag indicating if an authority was
> present in the incoming request to know if we have to produce one on
> output or not. That's in the end what seems to preserve the most accurate
> representation along a chain of multiple versions. This allows us to emit
> a Host field only if one was present, and an authority only if one was
> present, regardless of the HTTP version. I don't think that RFC9113 brings
> any changes regarding this, it might only be a matter of what constitutes
> "control data".

Sorry, that is a broken implementation. You need to send Host regardless
of the original request version.

....Roy