Re: Migrating some high-entropy HTTP headers to Client Hints.

Thomas Peterson <hidinginthebbc@gmail.com> Thu, 29 November 2018 12:10 UTC

To: Mike West <mkwst@google.com>, HTTP Working Group <ietf-http-wg@w3.org>
References: <CAKXHy=eHiMtXi8vkDYtADHdU0tnUfd3p+Wfy7vSkLgT7cA1W0w@mail.gmail.com>
From: Thomas Peterson <hidinginthebbc@gmail.com>
Message-ID: <f042d223-85ee-fd16-74fa-7d6d993f817f@gmail.com>
Date: Thu, 29 Nov 2018 12:08:05 +0000
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:60.0) Gecko/20100101 Thunderbird/60.3.1
MIME-Version: 1.0
In-Reply-To: <CAKXHy=eHiMtXi8vkDYtADHdU0tnUfd3p+Wfy7vSkLgT7cA1W0w@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"; format="flowed"
Content-Transfer-Encoding: 8bit
Content-Language: en-GB
Subject: Re: Migrating some high-entropy HTTP headers to Client Hints.
Archived-At: <https://www.w3.org/mid/f042d223-85ee-fd16-74fa-7d6d993f817f@gmail.com>

I would propose that all Accept* headers be included in Client Hints, as
all of them can be used for some level of fingerprinting. For example,
Accept can be used to distinguish between desktop browsers (which
typically list HTML/XML MIME types) and cURL/wget, which send '*/*' by
default. Many user agents also do their own guesswork on response bodies
anyway (such as looking at the magic number) to determine content type
or encoding, so the impact of a "failed negotiation" of content can be
limited.
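
To illustrate the Accept point, here is a rough sketch of how a server
could separate the two groups. The function names and the sample header
values are my own assumptions for illustration, not taken from any
draft, and a real fingerprinter would look at far more than this one
header.

```python
from typing import List


def media_types(accept: str) -> List[str]:
    """Return the media types named in an Accept value, dropping parameters such as q."""
    return [
        part.split(";")[0].strip().lower()
        for part in accept.split(",")
        if part.strip()
    ]


def looks_like_browser(accept: str) -> bool:
    """Heuristic (assumption): browsers name HTML/XML types, cURL/wget send only */*."""
    markup_types = {"text/html", "application/xhtml+xml", "application/xml"}
    return any(t in markup_types for t in media_types(accept))


# Illustrative header values only.
browser_accept = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"
curl_accept = "*/*"

print(looks_like_browser(browser_accept))
print(looks_like_browser(curl_accept))
```

Even this crude check splits clients into two buckets, which is the
kind of passive signal I have in mind.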

Also, is there a particular reason why Sec-CH-Lang omits Quality Values?
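
For context, this is the information Quality Values carry in
Accept-Language today. The sketch below is a simplified parser of my
own (the function name and sample value are assumptions); it only
shows how q weights rank languages independently of list order, and
says nothing about how Sec-CH-Lang is meant to be parsed.

```python
from typing import List, Tuple


def parse_accept_language(value: str) -> List[Tuple[str, float]]:
    """Return (language tag, q) pairs, highest q first; a missing q means 1.0."""
    prefs = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        tag, *params = [p.strip() for p in item.split(";")]
        q = 1.0
        for param in params:
            name, _, raw = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(raw)
                except ValueError:
                    pass  # simplification: a malformed weight keeps the default
        prefs.append((tag, q))
    # sorted() is stable, so tags with equal q keep their original order.
    return sorted(prefs, key=lambda pref: pref[1], reverse=True)


# Illustrative value: the weights, not the positions, decide the ranking.
print(parse_accept_language("fr;q=0.5, en-GB, en;q=0.8"))
```

Without the weights, a server could only fall back on list order.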


Regards


On 29/11/2018 10:22, Mike West wrote:
> Hey folks,
>
> Section 9.7 of RFC7231 
> <https://tools.ietf.org/html/rfc7231#section-9.7> rightly notes that 
> some of the content negotiation headers user agents deliver in HTTP 
> requests create substantial fingerprinting surface. I think it would 
> be beneficial if we took steps to reduce their prevalence on the wire, 
> and Client Hints looks like a reasonable infrastructure on top of 
> which to build.
>
> `User-Agent` and `Accept-Language` seem like particularly tasty and 
> low-hanging fruit, and I've sketched out two proposals as proofs of 
> concept:
>
> *   `User-Agent` could be represented as ~four distinct hints: `UA`, 
> `Model`, `Platform`, and `Arch`: 
> https://github.com/mikewest/ua-client-hints is a high-level explainer, 
> and https://tools.ietf.org/html/draft-west-ua-client-hints a sketchy 
> ID for the new headers.
>
> *   `Accept-Language` could be represented as a `Lang` hint: 
> https://github.com/mikewest/lang-client-hint is a high-level 
> explainer, https://tools.ietf.org/html/draft-west-lang-client-hint an 
> equally sketchy ID for the new header.
>
> I'd appreciate y'all's feedback. Thanks!
>
> -mike