Re: Benjamin Kaduk's No Objection on draft-ietf-httpbis-client-hints-14: (with COMMENT)

Yoav Weiss <yoav@yoav.ws> Fri, 19 June 2020 11:39 UTC

References: <158992178960.5956.2137971544232835817@ietfa.amsl.com> <CACj=BEiezqmP5AszaCC=jt5igYudGs-QQeejEr-2PFqvKDUbyw@mail.gmail.com> <BE79CBD3-AD98-4A54-9596-9318E46752A2@mnot.net>
In-Reply-To: <BE79CBD3-AD98-4A54-9596-9318E46752A2@mnot.net>
From: Yoav Weiss <yoav@yoav.ws>
Date: Fri, 19 Jun 2020 13:35:26 +0200
Message-ID: <CACj=BEhx7Q97bfC-adfz_83-wWEogms=SNH-Z2pTZzRbXZYSnw@mail.gmail.com>
To: Mark Nottingham <mnot@mnot.net>
Cc: Benjamin Kaduk <kaduk@mit.edu>, The IESG <iesg@ietf.org>, draft-ietf-httpbis-client-hints@ietf.org, httpbis-chairs@ietf.org, "ietf-http-wg@w3.org Group" <ietf-http-wg@w3.org>
Subject: Re: Benjamin Kaduk's No Objection on draft-ietf-httpbis-client-hints-14: (with COMMENT)
Archived-At: <https://www.w3.org/mid/CACj=BEhx7Q97bfC-adfz_83-wWEogms=SNH-Z2pTZzRbXZYSnw@mail.gmail.com>

Thanks! Reverted
<https://github.com/httpwg/http-extensions/pull/1220/commits/ca921fd687a5b4571a35f16ef2c92aa87fb95a54>
those two changes in the PR.

On Fri, Jun 19, 2020 at 4:06 AM Mark Nottingham <mnot@mnot.net> wrote:

> Just adding some detail --
>
> > On 17 Jun 2020, at 6:47 pm, Yoav Weiss <yoav@yoav.ws> wrote:
> >
> > Thanks for reviewing and apologies for the delayed reply :/
> >
> > Comments addressed below and incorporated into
> https://github.com/httpwg/http-extensions/pull/1220
> > Your review would be appreciated :)
> >
> > On Tue, May 19, 2020 at 10:56 PM Benjamin Kaduk via Datatracker <
> noreply@ietf.org> wrote:
> > Benjamin Kaduk has entered the following ballot position for
> > draft-ietf-httpbis-client-hints-14: No Objection
> >
> > When responding, please keep the subject line intact and reply to all
> > email addresses included in the To and CC lines. (Feel free to cut this
> > introductory paragraph, however.)
> >
> >
> > Please refer to
> https://www.ietf.org/iesg/statement/discuss-criteria.html
> > for more information about IESG DISCUSS and COMMENT positions.
> >
> >
> > The document, along with other ballot positions, can be found here:
> > https://datatracker.ietf.org/doc/draft-ietf-httpbis-client-hints/
> >
> >
> >
> > ----------------------------------------------------------------------
> > COMMENT:
> > ----------------------------------------------------------------------
> >
> > Section 1
> >
> >    There are thousands of different devices accessing the web, each with
> >    different device capabilities and preference information.  These
> >    device capabilities include hardware and software characteristics, as
> >    well as dynamic user and user agent preferences.  Historically,
> >
> > nit: should "user-agent" be hyphenated?
> >
> > In web specifications it typically isn't. RFC 7231 also doesn't seem to
> hyphenate it.
>
> Yes. "User Agent" is a concept; "User-Agent" is an HTTP header field.
>
>
> >    applications that wanted to allow the server to optimize content
> >    delivery and user experience based on such capabilities had to rely
> >    on passive identification (e.g., by matching the User-Agent header
> >
> > nit: it feels like "allow the server" would be something that involves
> > granting permission or the client sending an active signal (as proposed
> > by this document), as opposed to just the application that "wanted the
> > server to optimize" and had to make do with such limited signal as was
> > already available.
> >
> > OK. Removing "allow the".
> >
> >
> >    field (Section 5.5.3 of [RFC7231]) against an established database of
> >    user agent signatures), use HTTP cookies [RFC6265] and URL
> >
> > nit: hyphenate user-agent again, used as an adjective.
> >
> > TIL: compound adjective
> > Done!
>
> The problem is that this confuses the reader between the concept and the
> header field. In general, I'd prefer that the RFC Editor handle this level
> of detail regarding English usage, so that we don't needlessly go
> back-and-forth.
>
>
> >    o  User agent detection cannot reliably identify all static
> >       variables, cannot infer dynamic user agent preferences, requires
> >       external device database, is not cache friendly, and is reliant on
> >
> > nit: singular/plural mismatch ("an external device database" or
> > "external device databases")
> >
> > Done
> >
> >    o  Cookie-based approaches are not portable across applications and
> >       servers, impose additional client-side latency by requiring
> >       JavaScript execution, and are not cache friendly.
> >
> > (I think I missed a step in why a cookie-based approach inherently
> > requires JavaScript execution, though maybe it doesn't matter.)
> >
> > Essentially, if you want to dynamically set your cookies based on
> client-side information, you need JavaScript to do that.
> >
> >
> >    Proactive content negotiation (Section 3.4.1 of [RFC7231]) offers an
> >    alternative approach; user agents use specified, well-defined request
> >    headers to advertise their capabilities and characteristics, so that
> >
> > Chasing the reference, it's not clear that it supports quite this strong
> > of a statement: in addition to the explicit negotiation fields, it also
> > allows using implicit characteristics such as client IP address and
> > User-Agent.
> >
> > Would ending that section with the following work?
> > ", so that servers can select (or formulate) an appropriate response,
> based on those request headers (or on other, implicit characteristics)."
> >
> >
> > Section 2.1
> >
> >    access of third parties to those same header fields.  Without such an
> >    opt-in, user agents SHOULD NOT send high-entropy hints, but MAY send
> >    low-entropy ones [CLIENT-HINTS-INFRASTRUCTURE].
> >
> > It looks like the reference only defines a registry for low-entropy
> > hints, and we are inferring that any hints not listed in that table are
> > to be treated as "high-entropy".  Perhaps we could reword both
> > directions of this directive to refer only to the registry of
> > low-entropy hints (e.g., "SHOULD NOT send hints that are not listed in
> > [registry]")?
> >
> > Makes sense.
> >
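A minimal sketch of the reworded rule, phrased only in terms of the low-entropy registry: hints listed there may be sent without an opt-in, and anything else is sent only after the origin has opted in. The registry contents and the function name `hints_to_send` are hypothetical placeholders, not taken from [CLIENT-HINTS-INFRASTRUCTURE].

```python
# Hypothetical stand-in for the low-entropy registry; the real entries
# live in [CLIENT-HINTS-INFRASTRUCTURE] and are not reproduced here.
LOW_ENTROPY_REGISTRY = frozenset({"sec-ch-example-low"})


def hints_to_send(available, opted_in):
    """Return the hint header names a user agent would attach to a request.

    available: hint names the user agent is able to send.
    opted_in:  hint names the origin has opted in to receiving.
    """
    opted = {name.lower() for name in opted_in}
    selected = []
    for name in available:
        key = name.lower()  # header field names are case-insensitive
        # Registry hints need no opt-in; everything else requires one.
        if key in LOW_ENTROPY_REGISTRY or key in opted:
            selected.append(name)
    return selected
```

With no opt-in, only the registry-listed hint is selected; once the origin opts in to `Sec-CH-Example`, both are.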
> >
> >    Implementers need to be aware of the passive fingerprinting
> >    implications when implementing support for Client Hints, and follow
> >    the considerations outlined in the Security Considerations
> >    (Section 4) section of this document.
> >
> > side note: in some sense the Accept-CH mechanism transforms it from a
> > passive to an active fingerprinting mechanism.
> >
> > Good point! Removed "passive" here.
> >
> >
> > Section 2.2
> >
> >    information in them.  When doing so, and if the resource is
> >    cacheable, the server MUST also generate a Vary response header field
> >    (Section 7.1.4 of [RFC7231]) to indicate which hints can affect the
> >    selected response and whether the selected response is appropriate
> >    for a later request.
> >
> > side note: I suspect the answer I want is already present with a
> > detailed reading of RFC 7231, but I wonder if it's worth saying
> > something here about whether the Vary response header could/should
> > include registered client hint header field names that were not present
> > in the request in question.
> >
> > https://tools.ietf.org/html/rfc7231#section-7.1.4 implies that Vary can
> be set to header names that are missing from the request. ("or lack
> thereof")
> > I'm not sure we should mention that explicitly here.
>
> Our general practice in this sort of situation is to mention setting Vary
> *if* the response is cacheable, to remind the reader, but _not_ to make it
> a requirement, since that requirement is already made by HTTP.
>
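To illustrate the "or lack thereof" point from RFC 7231 Section 7.1.4, here is a rough sketch of how a cache might derive a secondary key from Vary. A hint named in Vary contributes to the key even when the request did not carry it. The function name `vary_cache_key` is an assumption for illustration, and real caches do more (for example, value normalization).

```python
def vary_cache_key(vary, request_headers):
    """Build a simplified secondary cache key from a Vary field value.

    Returns None for "Vary: *", which this sketch treats as "never reuse".
    A header named in Vary but absent from the request is recorded as None,
    so its absence still distinguishes the stored response.
    """
    names = [n.strip().lower() for n in vary.split(",") if n.strip()]
    if "*" in names:
        return None
    lowered = {k.lower(): v for k, v in request_headers.items()}
    return tuple((n, lowered.get(n)) for n in sorted(names))
```

A request with `Sec-CH-Example: 1` and a request without the field produce different keys, so a response selected for one is not reused for the other.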
> >
> > Section 3.1
> >
> >    Based on the Accept-CH example above, which is received in response
> >    to a user agent navigating to "https://example.com", and delivered
> >    over a secure transport, a user agent will have to persist an Accept-
> >    CH preference bound to "https://example.com".  It will then use it
> >
> > What level of requirement is implied by "will have to" here?  IIUC, it's
> > just that "if anything is persisted, it must be keyed on" but with no
> > obligation to do any persistence.  If so, perhaps a wording like "any
> > persisted Accept-CH preference will be bound to" would be better?
> >
> > The normative requirement in the paragraph above it is SHOULD.
> > I'll modify the wording to your suggested one.
> >
> >
> >    for navigations to e.g. "https://example.com/foobar.html", but not to
> >    e.g. "https://foobar.example.com/".  It will similarly use the
> >    preference for any same-origin resource requests (e.g. to
> >
> > nit: comma after "e.g." (throughout).
> >
> > OK
> >
> >
> >    "https://example.com/image.jpg") initiated by the page constructed
> >    from the navigation's response, but not to cross-origin resource
> >    requests (e.g. "https://thirdparty.com/resource.js").  This
> >    preference will not extend to resource requests initiated to
> >    "https://example.com" from other origins (e.g. from navigations to
> >    "https://other-example.com/").
> >
> > Perhaps thirdparty.example and other.example, to stay within the BCP32
> > space?
> >
> > Done
> >
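The origin binding described in the Section 3.1 text can be sketched roughly as follows. This is a simplified illustration, not the algorithm from any specification: the class `AcceptCHStore` and its method names are hypothetical, "secure transport" is approximated by an `https` scheme check, and delegation to other origins is ignored.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"https": 443, "http": 80}


def origin(url):
    """Return a (scheme, host, port) tuple for the URL."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)


class AcceptCHStore:
    """Persisted Accept-CH preferences, keyed by origin."""

    def __init__(self):
        self._prefs = {}

    def persist(self, response_url, hints):
        # Only record preferences delivered over a secure transport.
        if urlsplit(response_url).scheme != "https":
            return
        self._prefs[origin(response_url)] = frozenset(h.lower() for h in hints)

    def hints_for(self, request_url, top_level_url):
        """Hints to send on a request made while top_level_url is the page.

        For a navigation, pass the same URL for both arguments.
        """
        target = origin(request_url)
        # No hints for cross-origin requests, in either direction.
        if target != origin(top_level_url):
            return frozenset()
        return self._prefs.get(target, frozenset())
```

After `persist("https://example.com", ["Sec-CH-Example"])`, the preference applies to a navigation to `https://example.com/foobar.html` and to `https://example.com/image.jpg` loaded by that page, but not to `https://foobar.example.com/`, to `https://thirdparty.example/resource.js`, or to requests for `https://example.com` made from a page on `https://other.example/`.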
> >
> > Section 3.2
> >
> >    When selecting a response based on one or more Client Hints, and if
> >    the resource is cacheable, the server needs to generate a Vary
> >    response header field ([RFC7234]) to indicate which hints can affect
> >    the selected response and whether the selected response is
> >    appropriate for a later request.
> >
> > Is BCP 14 language appropriate here?
> >
> > Indeed. Changed to SHOULD.
>
> As per above, we try not to restate requirements that are already
> specified elsewhere.
>
>
> >
> >
> >    Above example indicates that the cache key needs to include the Sec-
> >    CH-Example header field.
> >
> > nit: please add the article "the" to make this a complete sentence.
> >
> > Yup
> >
> >
> > Section 4
> >
> > While I don't expect that I can tell the major browser vendors anything
> > new about the privacy considerations to client hints, I do think that we
> > should give some guidance to implementors of other HTTP clients, who may
> > not have such extensive depth of knowledge, on the general landscape in
> > which this mechanism is set.  The subsections hereof do a great job
> > covering a lot of relevant details and specific factors to consider;
> > thank you!  I think it may also be appropriate to have some more generic
> > lead-in text, noting that in the worst case, merely converting a passive
> > fingerprinting mechanism to an active fingerprinting mechanism with
> > server opt-in does not actually provide any privacy benefit (the worst
> > case being when all servers ask for all the data and clients accede)!
> > While we might hope that the need to jump through an extra hoop to
> > access fingerprinting information might dissuade some servers from
> > asking for it, it seems imprudent to assume that it will happen, so in
> > order to obtain real privacy benefit there need to be some additional
> > policy controls in the client and in what hints are defined/implemented.
> > As I mentioned already, we already have a lot of the details for how to
> > apply such policy controls, and limitations to only define hints that
> > expose information already available in other means; what I'd like to
> > see is the high-level picture that ties them together.
> >
> >
> > OK. Added something. I'd appreciate your review to see if it matches
> what you had in mind.
> >
> > Section 4.1
> >
> >    upon it.  The header-based opt-in means that we can remove passive
> >    fingerprinting vectors, such as the User-Agent string (enabling
> >    active access to that information through User-Agent Client Hints
> >    [4]), or otherwise expose information already available through
> >
> > I think this [4] is the same as [UA-CH].
> >
> > It's pointing to a specific section of UA-CH. I'm not sure if this is
> critical.
> >
> >
> > Also, use of the first person ("we") is somewhat unusual in RFC style.
> >
> > Changed.
> >
> >
> >    Therefore, features relying on this document to define Client Hint
> >    headers MUST NOT provide new information that is otherwise not
> >    available to the application via other means, such as existing
> >    request headers, HTML, CSS, or JavaScript.
> >
> > As written, this is a fairly weird condition.  What constitutes
> > "available to the application via other means"?  Does "put up an
> > interstitial until the user provides the information in question" count?
> >
> > Changed to "not made available to the application by the user agent"
> >
> >
> >    o  Entropy - Exposing highly granular data can be used to help
> >       identify users across multiple requests to different origins.
> >       Reducing the set of header field values that can be expressed, or
> >       restricting them to an enumerated range where the advertised value
> >       is close but is not an exact representation of the current value,
> >
> > nit: "close to" seems like it would scan better.
> >
> > Yup
> >
> >
> >    Different features will be positioned in different points in the
> >    space between low-entropy, non-sensitive and static information (e.g.
> >    user agent information), and high-entropy, sensitive and dynamic
> >    information (e.g. geolocation).  User agents need to consider the
> >    value provided by a particular feature vs these considerations, and
> >    MAY have different policies regarding that tradeoff on a per-feature
> >    basis.
> >
> > How about on a per-origin basis (and, e.g., domain reputation)?  An
> > "entropy budget" where an origin that asks for too many distinct hints
> > won't get all of them?
> >
> > Those are definitely policies that user agents can apply (e.g. one
> concrete proposal that looks a lot like your "entropy budget" is
> https://github.com/bslassey/privacy-budget)
> >
> > (I also wonder if a descriptive "may wish to have" is better than the
> > normative "MAY", here.)
> >
> > Sure.
> >
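As a purely illustrative sketch of the per-origin "entropy budget" idea raised above: a user agent could assign each hint a cost and stop granting hints to an origin once its budget is spent. The function name, the per-hint costs, and the budget value are all hypothetical, and this is not a description of the linked privacy-budget proposal.

```python
def grant_within_budget(requested, cost_bits, budget_bits):
    """Grant requested hints in order until the origin's budget is spent.

    requested:   hint names the origin asked for, in request order.
    cost_bits:   assumed entropy cost per hint (illustrative numbers only).
    budget_bits: total entropy this origin is allowed to learn.
    """
    granted = []
    spent = 0.0
    for name in requested:
        cost = cost_bits.get(name)
        if cost is None:
            continue  # hints with no known cost are never granted here
        if spent + cost <= budget_bits:
            granted.append(name)
            spent += cost
    return granted
```

An origin that asks for too many distinct hints simply does not receive the ones that would push it over the budget. Which hints to prefer when the budget is tight is a policy choice that this sketch settles by request order.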
> >    o  Implementers SHOULD restrict delivery of some or all Client Hints
> >       header fields to the opt-in origin only, unless the opt-in origin
> >       has explicitly delegated permission to another origin to request
> >       Client Hints header fields.
> >
> > Am I reading things right that this document does not define any such
> > delegation mechanisms but is just admitting the possibility of such
> > mechanisms being defined in the future?  I'd suggest clarifying up in
> > §2.1 with a parenthetical (akin to the "outlined below" note about the
> > opt-in mechanism).
> >
> > Added an "(as outlined in {{CLIENT-HINTS-INFRASTRUCTURE}})"
> clarification to 2.1
> >
> >
> >    Implementers SHOULD support Client Hints opt-in mechanisms and MUST
> >    clear persisted opt-in preferences when any one of site data,
> >    browsing history, browsing cache, cookies, or similar, are cleared.
> >
> > Who is the target audience for this SHOULD?  If it's just "people
> > implementing this document", it seems ineffectual, and if it's any
> > broader scope it seems unenforceable.
> >
> > Removed the SHOULD here as it's already defined elsewhere that
> high-entropy hints require an opt-in.
> > Also changed "implementers" to "user agents".
> >
> >
> > Section 4.3
> >
> >    Research into abuse of Client Hints might look at how HTTP responses
> >    that contain Client Hints differ from those with different values,
> >
> > nit: what are "responses that contain Client Hints"?  We have discussed
> > Accept-CH header fields in responses, and client hints in requests, but
> > the only mention I recall of hints in responses was in the Vary header
> > field, and it's not clear that that is what was intended.
> >
> > Good catch! Changed to "responses to requests that contain Client
> Hints".
> >
> >
> > Section 5
> >
> >    While HTTP header compression schemes reduce the cost of adding HTTP
> >    header fields, sending Client Hints to the server incurs an increase
> >    in request byte size.  Servers SHOULD take that into account when
> >
> > nit: I wonder if this would be more clear as:
> >
> > % Sending Client Hints to the server incurs an increase in request byte
> > % size.  Some of this increase can be mitigated by HTTP header
> > % compression schemes, but each new hint will still lead to some
> > % increased bandwidth usage.  Servers SHOULD [...]
> >
> > Changed.
> >
> > Section 7.1
> >
> > I'm not sure I understand why [FETCH] is listed as a normative
> > reference.
> >
> > Moved it to be informative.
> >
> >
> > I find it amusing that we reference both 7231 and 7234 for Vary, though
> > to my untrained eye the current references both seem appropriate in
> > their respective locations.
> >
> > Section 7.2
> >
> > If [CLIENT-HINTS-INFRASTRUCTURE] is to be the source of truth for
> > low-entropy (and, by deduction, high-entropy) hints, it seems like it
> > should be normative.
> >
> > Moved.
>
> --
> Mark Nottingham   https://www.mnot.net/
>
>