Re: Migrating some high-entropy HTTP headers to Client Hints.

Ronan Cremin <rcremin@afilias.info> Thu, 11 April 2019 12:11 UTC

To: Thomas Peterson <hidinginthebbc@gmail.com>, Mike West <mkwst@google.com>, HTTP Working Group <ietf-http-wg@w3.org>
References: <CAKXHy=eHiMtXi8vkDYtADHdU0tnUfd3p+Wfy7vSkLgT7cA1W0w@mail.gmail.com> <f042d223-85ee-fd16-74fa-7d6d993f817f@gmail.com>
From: Ronan Cremin <rcremin@afilias.info>
Message-ID: <4d321ba1-f6f1-05c3-5b76-24f6a9b89525@afilias.info>
Date: Thu, 11 Apr 2019 12:58:10 +0100
In-Reply-To: <f042d223-85ee-fd16-74fa-7d6d993f817f@gmail.com>
Archived-At: <https://www.w3.org/mid/4d321ba1-f6f1-05c3-5b76-24f6a9b89525@afilias.info>

Hi,

My name is Ronan Cremin. I help build a device recognition product that
is widely used in the web analytics, publishing and advertising
industries. Full disclosure: my employer profits from analysis of UA
strings, though moving the same information to Client Hints is not
expected to affect this materially.

One concern about moving UA string information to Client Hints is that
the information required to publish device-specific responses arrives
only in the second request from the client. This imposes a performance
penalty on publishers that serve a device-tailored HTML document. As
Mike mentioned, RWD notwithstanding, many publishers employ
device-specific responses as envisaged in RFC 1945, usually to tailor
the experience to a class of device, e.g. smartphone, tablet or desktop.
Publishers endeavour to fit everything required for the first screen of
content into this first response, so any delay in tailoring it hurts
performance. The last time I checked, more than 80% of the top 100
websites used this technique.
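
To make the two-request problem concrete, here is a minimal sketch of
the opt-in flow as I understand it. It is a toy model, not a real HTTP
stack: the Sec-CH-Model header name, the Accept-CH value and the
classify() rule are illustrative assumptions loosely based on the
drafts discussed in this thread, not a definitive reading of them.

```python
# Toy model of the Client Hints opt-in flow; not a real HTTP stack.
# Header names and the classify() rule are illustrative assumptions.

def classify(headers):
    """Return a device class, or None if the request carries no model hint."""
    model = headers.get("Sec-CH-Model")
    if model is None:
        return None
    return "smartphone" if model.startswith("Pixel") else "desktop"


def server(headers):
    """Serve a tailored page if hints are present; otherwise opt in and fall back."""
    device = classify(headers)
    if device is None:
        # No hint yet: ask for it, and serve an untailored document meanwhile.
        return {"Accept-CH": "Model"}, "generic page"
    return {}, device + " page"


class Client:
    """Hypothetical user agent that sends hints only after the server opts in."""

    def __init__(self, model):
        self.model = model
        self.opted_in = False

    def get(self):
        request = {"Sec-CH-Model": self.model} if self.opted_in else {}
        response_headers, body = server(request)
        if "Accept-CH" in response_headers:
            self.opted_in = True  # hints flow from the next request onwards
        return body


client = Client("Pixel 3")
first = client.get()   # server cannot tailor this response
second = client.get()  # hint now present, tailored response possible
print(first)
print(second)
```

In this sketch the first response is necessarily the generic one, which
is exactly the response publishers try hardest to tailor.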

Web analytics might also be affected. Most web analytics solutions
support a JavaScript-free integration based on embedding a single-pixel
image hosted by the analytics platform. This approach is impaired for
the same reason: the information required for analytics becomes
available only on the second request from the client.

Has thought been given to the performance impact of the proposal? Yoav 
mentions this issue in his Client Hints infrastructure document 
(https://github.com/yoavweiss/client-hints-infrastructure) but I haven't 
seen any attempt to quantify the impact.

Regards,
Ronan

On 29/11/2018 12:08, Thomas Peterson wrote:
> I would propose that all Accept* headers are included in Client Hints 
> as all can be used for some level of fingerprinting, e.g. Accept can 
> be used to distinguish between desktop browsers (which typically have 
> html/xml MIME types) and cURL/wget which by default have '*/*'. Many 
> user agents also do their own guess work on response bodies anyway 
> (such as looking at the magic number) to determine content type or 
> encoding, so the impact of a "failed negotiation" of content can be 
> limited.
>
> Also, is there a particular reason why Sec-CH-Lang omits Quality Values?
>
>
> Regards
>
>
> On 29/11/2018 10:22, Mike West wrote:
>> Hey folks,
>>
>> Section 9.7 of RFC7231 
>> <https://tools.ietf.org/html/rfc7231#section-9.7> rightly notes that 
>> some of the content negotiation headers user agents deliver in HTTP 
>> requests create substantial fingerprinting surface. I think it would 
>> be beneficial if we took steps to reduce their prevalence on the 
>> wire, and Client Hints looks like a reasonable infrastructure on top 
>> of which to build.
>>
>> `User-Agent` and `Accept-Language` seem like particularly tasty and 
>> low-hanging fruit, and I've sketched out two proposals as proofs of 
>> concept:
>>
>> *   `User-Agent` could be represented as ~four distinct hints: `UA`, 
>> `Model`, `Platform`, and `Arch`: 
>> https://github.com/mikewest/ua-client-hints is a high-level 
>> explainer, and https://tools.ietf.org/html/draft-west-ua-client-hints 
>> a sketchy ID for the new headers.
>>
>> *   `Accept-Language` could be represented as a `Lang` hint: 
>> https://github.com/mikewest/lang-client-hint is a high-level 
>> explainer, https://tools.ietf.org/html/draft-west-lang-client-hint an 
>> equally sketchy ID for the new header.
>>
>> I'd appreciate y'all's feedback. Thanks!
>>
>> -mike
>