Re: Structured request headers deployment issues

Mike West <mkwst@google.com> Fri, 19 June 2020 07:06 UTC

From: Mike West <mkwst@google.com>
Date: Fri, 19 Jun 2020 09:03:18 +0200
Message-ID: <CAKXHy=fbcL4XEf8NeQ8kW-+d4c8Q6EwO3U0Ara626FX7hmY07A@mail.gmail.com>
To: Mark Nottingham <mnot@mnot.net>
Cc: Yoav Weiss <yoav@yoav.ws>, "ietf-http-wg@w3.org Group" <ietf-http-wg@w3.org>, Tommy Pauly <tpauly@apple.com>, Ilya Grigorik <igrigorik@gmail.com>
Archived-At: <https://www.w3.org/mid/CAKXHy=fbcL4XEf8NeQ8kW-+d4c8Q6EwO3U0Ara626FX7hmY07A@mail.gmail.com>

On Fri, Jun 19, 2020 at 7:13 AM Mark Nottingham <mnot@mnot.net> wrote:

> > On 16 Jun 2020, at 8:15 am, Yoav Weiss <yoav@yoav.ws> wrote:
> >
> > Hey all,
> >
> > Chromium M84 (whose Chrome equivalent is now in Beta) has User-Agent
> > Client Hints enabled by default, which is using Structured Headers.
> >
> > As a result of that, we found multiple sites which seem to have a
> > somewhat allergic reaction to the presence of certain characters (that are
> > part of the SH format) in request values.
> > While each site in question is different (in what appears to be coming
> > from different stacks), we've seen sites that reject requests with quotes,
> > question marks or equals signs in them.
> > It's still early, so it's hard to know how widespread the issue is, but
> > we seem to be adding sites to the list at a faster pace than the pace of
> > removing fixed ones from it.
>

Yoav, have we experimented at all with alternate spellings? Do
single-quotes have the same impact as double-quotes, for example? Are other
(uglier) non-alphanumeric delimiters less likely to cause trouble? There's
a limited appetite for breakage, and it might be worth exploring alternate
aesthetics rather than forcing through the particular spelling we like the
most.

> > So, I wanted to give this group a heads-up on that front, and maybe get
> > folks' opinions regarding possible things we could do on that front, other
> > than outreach and waiting for said sites to fix themselves.
>
> AIUI these aren't new; e.g., IIRC quite a few months ago Chrome
> encountered several Austrian sites that had this problem, traced back to a
> local(?) WAF vendor there. I believe that's been corrected since, after
> reaching out to them.
>

I think you might be referring to some conversations around the
(since-renamed) `Sec-Metadata` header in 2018, like those captured in
https://bugs.chromium.org/p/chromium/issues/detail?id=861678#c11.

> Personally, I think that outreach and waiting is the right approach; if
> browsers consistently send these headers, they'll adapt, and the numbers
> are still relatively small -- or at least small enough that it's not likely
> the numbers will be reduced if the syntax is changed (due to _other_ WAFs'
> opinions about what a "good" request is).
>

The flip-side of this is that we break sites users rely upon. That's tough
to do at scale. In this case in particular, I think I agree with the
underlying assertion that we'd like this syntax to work, and it seems
reasonable to me to roll it out to a small percentage of stable users. That
said, if we end up breaking the internet for even a small percentage of
users, it doesn't seem like a good idea to hold them hostage in the hopes
of increasing pressure on WAF vendors.

My expectation is that we'd roll this out to a small percentage of stable
users, see a spike in bug reports (especially from enterprise folks who are
likely to have WAF deployments, unlikely to broadly deploy beta channels, and
unlikely to choose to enable usage statistics or histograms), see a
corresponding spike in error codes for top-level navigations, and roll it
back.

I think it might be reasonable to explore alternate spellings at the same
time, perhaps with some A/B testing to evaluate how well each weaves its
way through middleware.
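
As a rough sketch of what such a comparison might look like (not a
description of any existing experiment), the snippet below tallies failure
rates per spelling. The spelling names, the example values, and the helper
functions are all hypothetical; only the double-quoted, `;v=` form and the
`?0` boolean follow Structured Headers syntax, and the status-code pairs are
assumed to come from whatever test harness sends the requests.

```python
from collections import Counter

# Hypothetical candidate spellings of one header value. Only the first
# follows Structured Headers syntax; the others are illustrative only.
SPELLINGS = {
    "sh-double-quote": '"Chromium";v="84"',
    "single-quote": "'Chromium';v='84'",
    "bare-token": "Chromium/84",
}

# Characters reported in this thread as triggering rejections:
# quotes, question marks, and equals signs.
SUSPECT_CHARS = '"?='


def suspect_chars(value):
    """Return the set of reported-problematic characters present in value."""
    return {c for c in value if c in SUSPECT_CHARS}


def failure_rates(results):
    """Tally results given as (spelling_name, http_status) pairs.

    Returns a dict mapping each spelling name to the fraction of its
    responses with a status code of 400 or above.
    """
    totals, failures = Counter(), Counter()
    for name, status in results:
        totals[name] += 1
        if status >= 400:
            failures[name] += 1
    return {name: failures[name] / totals[name] for name in totals}
```

Feeding `failure_rates` the per-request outcomes for each experiment arm
would give a first-order view of which spelling survives middleware best,
though real telemetry would also need to account for sites that fail for
unrelated reasons.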

> Related, we're also seeing more examples of WAFs limiting how we can evolve
> the protocol (e.g., <https://github.com/coreruleset/coreruleset/pull/1777>).
> There's been a bit of background chatter about writing something about this
> and creating better communication with that community; I'm not sure what
> that will look like yet, but if anyone has ideas or is interested, please
> say so.
>

I am interested!

-mike