[core] Fwd: Subtle incompatibility between H2 and H1's :path

Carsten Bormann <cabo@tzi.org> Thu, 19 August 2021 12:25 UTC

From: Carsten Bormann <cabo@tzi.org>
Date: Thu, 19 Aug 2021 14:25:01 +0200
Message-Id: <65E74A10-7B77-4F45-887B-ABECD76FA5CE@tzi.org>
References: <20210819055955.GB8102@1wt.eu>
To: core@ietf.org
Archived-At: <https://mailarchive.ietf.org/arch/msg/core/RzowkmZl6Ji85pltTob9zr15ojM>
Subject: [core] Fwd: Subtle incompatibility between H2 and H1's :path

As a little nugget of content during our summer vacations, here is an interesting observation about URI paths from the HTTP working group mailing list.
AFAICT, RFC 7252 (Sections 5.10.1, 6.5) is already able to handle what is being proposed here; the comment is mostly relevant to core-href (which I believe also has all that’s needed).
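
To illustrate why CoAP copes with this, here is a minimal Python sketch of how a path such as "//static/image.jpg" could be split into Uri-Path option values. It is an assumption based on my reading of the URI decomposition rules in RFC 7252, not text from that RFC; the function name uri_path_options is hypothetical.

```python
from typing import List
from urllib.parse import unquote_to_bytes


def uri_path_options(path: str) -> List[bytes]:
    """Split a URI path into CoAP Uri-Path option values (sketch only)."""
    # An empty path or a lone "/" produces no Uri-Path option at all.
    if path in ("", "/"):
        return []
    # Drop the leading slash; every remaining "/" delimits one segment.
    # Empty segments are kept, so a leading "//" survives as an empty option.
    return [unquote_to_bytes(segment) for segment in path[1:].split("/")]


print(uri_path_options("//static/image.jpg"))
```

The empty first segment simply becomes a zero-length option value, so no special case for a leading "//" is needed in this sketch.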

Grüße, Carsten


> Begin forwarded message:
> 
> From: Willy Tarreau <w@1wt.eu>
> Subject: Subtle incompatibility between H2 and H1's :path
> Date: 2021-08-19 at 07:59:55 CEST
> To: HTTP Working Group <ietf-http-wg@w3.org>
> Archived-At: <https://www.w3.org/mid/20210819055955.GB8102@1wt.eu>
> 
> Hello,
> 
> After tightening up the :path parser in haproxy to strictly comply with
> both RFC7540 and the latest draft, one user of a large hosting platform
> reported breakage of at least one hosted site which contains a few HTML
> links whose path begins with two slashes, resulting from the
> concatenation of a base URL ending with a slash and a prefix. E.g.:
> 
>    <img src="https://site.example.org//static/image.jpg">
> 
> At first I responded "that's expected as it is explicitly forbidden by
> the H2 spec (RFC7540), which says":
> 
>     "The ":path" pseudo-header field includes the path and query parts
>      of the target URI (the "path-absolute" production and optionally a
>      '?' character followed by the "query" production (see Sections 3.3
>      and 3.4 of [RFC3986])."
> 
>   And RFC3986#3.3:
> 
>      path-absolute   ; begins with "/" but not "//"
>      path-absolute = "/" [ segment-nz *( "/" segment ) ]
>      segment-nz    = 1*pchar
>      segment       = *pchar
> 
> Then I wondered why, before this change, the request was processed by the
> HTTP/1.1 backend server: had it been too lenient, or was there a difference
> in the protocol spec? The answer is the latter. In RFC7230 #2.7, a
> purposely different absolute-path is defined:
> 
>  An "absolute-path" rule is defined for protocol elements that can
>  contain a non-empty path component.  (This rule differs slightly from
>  the path-abempty rule of RFC 3986, which allows for an empty path to
>  be used in references, and path-absolute rule, which does not allow
>  paths that begin with "//".)
> 
>     request-line   = method SP request-target SP HTTP-version CRLF
>     request-target = origin-form
>                    / absolute-form
>                    / authority-form
>                    / asterisk-form
> 
>     origin-form    = absolute-path [ "?" query ]
>     absolute-path  = 1*( "/" segment )
> 
> And this version is the one that was adopted by the HTTP core spec, but
> the H2 spec still differs by using path-absolute, which cannot start
> with "//", even in the latest draft.
> 
> This use of "path-absolute" was introduced into the H2 spec between draft
> 04 and draft 05 when trying to make the definition of :path more precise.
> I think that at that point the difference between HTTP/1's absolute-path
> and RFC3986's path-absolute was simply overlooked.
> 
> Given that in the report above the browsers happily sent the request using
> the HTTP definition of absolute-path and not RFC3986's definition of
> path-absolute (thus violating RFC7540), that sites *are* written to rely
> on this, that this seems to be how other H2 implementations are currently
> handling it, and that the new HTTP spec defines the format of a request-target
> in origin form as an absolute-path as well, I think we should fix the latest
> H2 draft to adopt the common definition of absolute-path (which explicitly
> permits "//") and stop keeping a non-interoperable exception here.
> 
> Does anyone disagree?
> 
> Thanks,
> Willy
>
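
For readers who want to see the difference between the two grammars concretely, the following Python sketch transcribes path-absolute (RFC 3986) and absolute-path (RFC 7230) into regular expressions, using the ABNF quoted above plus the pchar definition from RFC 3986. The names PCHAR, PATH_ABSOLUTE, ABSOLUTE_PATH and classify are hypothetical, and this is only an illustration, not how haproxy or any other implementation parses :path.

```python
import re

# pchar = unreserved / pct-encoded / sub-delims / ":" / "@"  (RFC 3986)
PCHAR = r"(?:[A-Za-z0-9\-._~!$&'()*+,;=:@]|%[0-9A-Fa-f]{2})"

# RFC 3986: path-absolute = "/" [ segment-nz *( "/" segment ) ]
PATH_ABSOLUTE = re.compile(rf"/(?:{PCHAR}+(?:/{PCHAR}*)*)?")

# RFC 7230: absolute-path = 1*( "/" segment )
ABSOLUTE_PATH = re.compile(rf"(?:/{PCHAR}*)+")


def classify(path: str) -> dict:
    """Report which of the two grammars accepts the whole path."""
    return {
        "path-absolute": PATH_ABSOLUTE.fullmatch(path) is not None,
        "absolute-path": ABSOLUTE_PATH.fullmatch(path) is not None,
    }


for candidate in ("/static/image.jpg", "//static/image.jpg", "/"):
    print(candidate, classify(candidate))
```

Only the path with the leading "//" is treated differently: absolute-path accepts it because its first segment may be empty, while path-absolute requires a non-empty first segment whenever anything follows the initial slash.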