Re: Ambiguity on HTTP/3 HEADERS and QUIC STREAM FIN requirement

Willy Tarreau <w@1wt.eu> Fri, 17 June 2022 05:37 UTC

Date: Fri, 17 Jun 2022 07:37:00 +0200
From: Willy Tarreau <w@1wt.eu>
To: Martin Thomson <mt@lowentropy.net>
Cc: ietf-http-wg@w3.org, Amaury Denoyelle <adenoyelle@haproxy.com>

On Fri, Jun 17, 2022 at 03:02:56PM +1000, Martin Thomson wrote:
> Hey Willy,
> 
> On Fri, Jun 17, 2022, at 14:53, Willy Tarreau wrote:
> > Probably, but I'm still seeing it as a workaround. I mean, since HTTP/1.0
> > we've been used to knowing, on receipt of a full headers block, the entire
> > list of the header fields. And it looks like with H3 the headers block
> > is uncertain at the end of a HEADERS frame. 
> 
> I don't think that's right.  There are two layers at work here, but you do
> have a clear marker for the end of the header block.
> 
> The HEADERS frame has ended in this case, so you have a clear indication that
> you have all the headers.

Not exactly. In HTTP/1 we used to have Transfer-Encoding which was a
connection-level header field to bridge the gap between what was explicitly
advertised in headers and what could have been ambiguous at the connection
level (such as receiving a FIN late). In HTTP/2 we could get rid of this
header field because the HEADERS frame was able to carry the END_STREAM
flag to indicate that there was no body following. And in H2, it's not the
same to send a HEADERS + ES and a HEADERS followed by DATA+ES. The first
one doesn't have a body, the second one has an empty body. Passing that
to HTTP/1 results in a transfer-encoding header being added to the second
one. So for H1 and H2 the presence or absence of a body has always been
explicit and known upon receipt of a complete headers block.
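
A minimal sketch of that H2-to-H1 rule, in Python. The Frame type and the
function names are hypothetical and only serve the illustration; CONTINUATION
frames and trailers are ignored, and "chunked" is assumed to be the framing
added when no content-length is known:

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Frame:
    """Simplified H2 frame (hypothetical type, not from a real stack)."""
    kind: str                 # "HEADERS" or "DATA"
    end_stream: bool = False  # the ES flag
    payload: bytes = b""


def h2_request_has_body(frames: List[Frame]) -> bool:
    """A body is present iff the initial HEADERS frame lacks END_STREAM."""
    first = frames[0]
    if first.kind != "HEADERS":
        raise ValueError("a request must start with a HEADERS frame")
    return not first.end_stream


def h1_framing_headers(frames: List[Frame],
                       has_content_length: bool = False) -> List[str]:
    """Header fields a gateway would add when forwarding to HTTP/1."""
    if h2_request_has_body(frames) and not has_content_length:
        # Body present (possibly empty) and of unknown length.
        return ["transfer-encoding: chunked"]
    return []


# HEADERS+ES: no body, nothing to add.
no_body = [Frame("HEADERS", end_stream=True)]
# HEADERS then an empty DATA+ES: an empty body, which is still a body.
empty_body = [Frame("HEADERS"), Frame("DATA", end_stream=True)]
```

The point is that the decision only needs the first HEADERS frame: nothing
that arrives later can change it.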

With H3 you have neither the transfer-encoding header nor the ES bit on
the frame to indicate that presence/absence. The only indication that
matches the H2 ES is the QUIC FIN that also signals the end of stream,
albeit at a lower level. That's why I think we've slowly deviated from
something very explicit (H1) to something subtly explicit (H2) then
something ambiguous (H3).
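
To illustrate the difference, here is a rough sketch in Python of what a
receiver knows once an H3 HEADERS frame is complete. The enum and the
function are hypothetical names, and frame types other than DATA are
deliberately ignored:

```python
from enum import Enum
from typing import Optional


class BodyPresence(Enum):
    ABSENT = "absent"
    PRESENT = "present"
    UNKNOWN = "unknown"


def h3_body_presence(fin_seen: bool,
                     next_frame: Optional[str] = None) -> BodyPresence:
    """State of knowledge after a complete H3 HEADERS frame.

    fin_seen:   the QUIC stream FIN has been received.
    next_frame: type of the next frame seen on the stream, if any.
    """
    if next_frame == "DATA":
        return BodyPresence.PRESENT
    if fin_seen:
        # The only equivalent of the H2 ES flag, one layer below.
        return BodyPresence.ABSENT
    # Neither a header field nor a frame flag says anything here.
    return BodyPresence.UNKNOWN
```

With H2 the third state cannot occur once the HEADERS frame has been
parsed; with H3 it lasts until something else arrives on the stream.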

> The QUIC stream, which in this case hasn't ended, so you aren't sure if
> content or trailers might follow.

Exactly.

> >    A sender which knows that no data follows the headers block SHOULD
> >    signal the end of the message by setting the FIN bit along with the
> >    HEADERS frame. A receiver that processes a HEADERS frame without
> >    seeing a FIN bit MAY expect more data to follow regardless of the
> >    HTTP method used.
> 
> This is entirely sensible advice to give, but it isn't always possible due to
> how different pieces of software work.  There are some rather fundamental
> design choices involved that can make it hard to guarantee that these two
> things leave at the same time.  As you say, most people can deliver on this,
> but not all.

I'm well aware. But a rule is easier to respect when it is stated upfront
than when nobody told you there was a problem. Just as we've all had to
adjust certain painful things in our H2 implementations over time,
violating multiple layers, which resulted in new text being written into
RFC 9113, various implementations will need adjustments to fix trouble
detected in the field.

> > Sure but while you can often do this when you're a server, you practically
> > can't when you're a gateway, because that would require adjusting the
> > behavior per-URI. 
> 
> Yeah, a gateway is in a bad position here because they don't really speak for
> the resource and so - without extra information about resources, which could
> be made for all resources on a server, but probably won't be - they have to
> sit on their hands, buffer requests, and pay the ridiculous cost of doing
> that.

Paying the cost of making two ends understand each other is the daily
job of a gateway :-)  Regardless, it's also the one that takes all the
dirty stuff in the face, and it needs to be robust by design. My concern
here is precisely that waiting will both make it less robust *and*
possibly not work with some clients which forget to send their FIN.

> But if you take the fact that you have a clear signal that the headers are
> done, you can - even as a gateway - make some decisions.  It might not be
> 100% safe, but I can't see any origin servers complaining if you started
> processing from that point, for GET and HEAD requests at least.

Sadly that's not even true :-(  We've seen this recently; I think it was
Elasticsearch that takes JSON requests sent as the body of a GET
request. So now that we've managed to better define the presence/absence
of a body in a request, we're back to trying to guess it with a certain
probability based on the method, and I'd definitely not encourage
implementations to start guessing again.
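
As a small illustration of why the guess is unsafe, here is a sketch in
Python comparing a method-based heuristic with the explicit signal. Both
function names are made up for the example, and the scenario assumes a GET
whose body has not arrived yet:

```python
def guess_complete_by_method(method: str) -> bool:
    """Heuristic: assume GET and HEAD never carry a body."""
    return method.upper() in ("GET", "HEAD")


def complete_by_signal(fin_seen: bool) -> bool:
    """Explicit rule: the request is complete only once FIN is seen."""
    return fin_seen


# A GET carrying a JSON body: HEADERS received, DATA and FIN still in flight.
method, fin_seen = "GET", False

heuristic_says_complete = guess_complete_by_method(method)
signal_says_complete = complete_by_signal(fin_seen)
```

Here the heuristic would forward the request without its body, while the
explicit rule keeps waiting, which is exactly the case a gateway cannot
tell apart from a client that forgot its FIN.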

Cheers,
Willy