Re: Migrating some high-entropy HTTP headers to Client Hints.

Mike West <mkwst@google.com> Thu, 29 November 2018 12:57 UTC

From: Mike West <mkwst@google.com>
Date: Thu, 29 Nov 2018 13:54:53 +0100
Message-ID: <CAKXHy=f1n_MyY0=jhKwCsJ6=7pk80eAuUsL87vmmFwbWNx3uDg@mail.gmail.com>
To: hidinginthebbc@gmail.com
Cc: HTTP Working Group <ietf-http-wg@w3.org>
Subject: Re: Migrating some high-entropy HTTP headers to Client Hints.
Archived-At: <https://www.w3.org/mid/CAKXHy=f1n_MyY0=jhKwCsJ6=7pk80eAuUsL87vmmFwbWNx3uDg@mail.gmail.com>

Thanks for the feedback!

On Thu, Nov 29, 2018 at 1:08 PM Thomas Peterson <hidinginthebbc@gmail.com>
wrote:

> I would propose that all Accept* headers be included in Client Hints, as
> all of them can be used for some level of fingerprinting, e.g. Accept can
> be used to distinguish between desktop browsers (which typically list
> html/xml MIME types) and cURL/wget, which by default send '*/*'.


The philosophy in https://tools.ietf.org/html/draft-west-ua-client-hints is
that it's reasonable to expose basic information about the user agent (e.g.
it's Firefox, not cURL). That level of information seems quite difficult to
hide (given differences in behavior, network stacks, etc.) and quite
valuable to developers, which tips the balance for me towards exposing
brand and major version by default.

With that in mind, `Accept` and `Accept-Encoding` seem to be fairly static
in their relationship to the UA brand and version. Chrome, for instance,
more or less hard-codes both based on the kind of resource being asked for
(see
https://cs.chromium.org/chromium/src/net/url_request/url_request_http_job.cc?g=0&l=666
and places like
https://cs.chromium.org/chromium/src/media/blink/resource_multibuffer_data_provider.cc?rcl=e53d19f7befd7927b6b9727dc88b9ee295c6fa05&l=110
and
https://cs.chromium.org/chromium/src/content/renderer/loader/web_url_loader_impl.cc?rcl=e53d19f7befd7927b6b9727dc88b9ee295c6fa05&l=672
).

With the caveat that I'm sometimes prone to a myopic view of the world from
the standpoint of a web browser: `User-Agent` and `Accept-Language` seem to
contain significantly more entropy, and therefore feel like the right place
to start. I certainly wouldn't suggest that that's where we ought to stop.
:)


> Many user agents
> also do their own guess work on response bodies anyway (such as looking
> at the magic number) to determine content type or encoding, so the
> impact of a "failed negotiation" of content can be limited.
>
> Also, is there a particular reason why Sec-CH-Lang omits Quality Values?
>

https://tools.ietf.org/html/draft-west-lang-client-hint-00#section-4.3
addresses this. In a nutshell, quality values seem like cruft: some
widely-used user agents (I spot-checked Chrome and Firefox) derive the
weights purely from the list order. That semantic makes sense to me, and
anything more doesn't seem to be necessary.
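
To make the "weights are a function of list order" point concrete, here is a
rough sketch rather than any browser's actual code. The function names, the
0.1 step, and the 0.1 floor are all assumptions chosen for illustration. It
builds an `Accept-Language` value from an ordered list, then recovers the same
order from the header, which suggests the q-values carry no information beyond
position.

```python
def accept_language_from_order(languages, step=0.1, floor=0.1):
    """Build an Accept-Language value whose q-values come only from position.

    `step` and `floor` are illustrative assumptions, not values taken from
    any specification or from a particular browser.
    """
    parts = []
    for index, tag in enumerate(languages):
        # Weight decreases with position and never drops below the floor.
        q = round(max(floor, 1.0 - index * step), 3)
        if q >= 1.0:
            parts.append(tag)  # q=1 is the default, so it can be omitted
        else:
            parts.append(f"{tag};q={q:g}")
    return ",".join(parts)


def order_from_accept_language(header):
    """Recover the preference order from an Accept-Language value."""
    weighted = []
    for position, item in enumerate(header.split(",")):
        tag, _, params = item.strip().partition(";")
        params = params.strip()
        q = float(params[2:]) if params.startswith("q=") else 1.0
        # Sort by descending weight; ties keep their original position.
        weighted.append((-q, position, tag.strip()))
    return [tag for _, _, tag in sorted(weighted)]


if __name__ == "__main__":
    preferred = ["en-US", "en", "de"]
    header = accept_language_from_order(preferred)
    print(header)
    print(order_from_accept_language(header) == preferred)
```

Under those assumptions, a server that sorts by q-value ends up with exactly
the list the user configured, so sending the bare ordered list would lose
nothing.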

I might well be missing use cases here, which I'd be thrilled to hear about!

-mike


>
> Regards
>
>
> On 29/11/2018 10:22, Mike West wrote:
> > Hey folks,
> >
> > Section 9.7 of RFC7231
> > <https://tools.ietf.org/html/rfc7231#section-9.7> rightly notes that
> > some of the content negotiation headers user agents deliver in HTTP
> > requests create substantial fingerprinting surface. I think it would
> > be beneficial if we took steps to reduce their prevalence on the wire,
> > and Client Hints looks like a reasonable infrastructure on top of
> > which to build.
> >
> > `User-Agent` and `Accept-Language` seem like particularly tasty and
> > low-hanging fruit, and I've sketched out two proposals as proofs of
> > concept:
> >
> > *   `User-Agent` could be represented as ~four distinct hints: `UA`,
> > `Model`, `Platform`, and `Arch`:
> > https://github.com/mikewest/ua-client-hints is a high-level explainer,
> > and https://tools.ietf.org/html/draft-west-ua-client-hints a sketchy
> > ID for the new headers.
> >
> > *   `Accept-Language` could be represented as a `Lang` hint:
> > https://github.com/mikewest/lang-client-hint is a high-level
> > explainer, https://tools.ietf.org/html/draft-west-lang-client-hint an
> > equally sketchy ID for the new header.
> >
> > I'd appreciate y'all's feedback. Thanks!
> >
> > -mike
>