[Wpack] Problem statement and scope for BoF

Jeffrey Yasskin <jyasskin@google.com> Mon, 05 August 2019 17:50 UTC

From: Jeffrey Yasskin <jyasskin@google.com>
Date: Mon, 5 Aug 2019 10:49:34 -0700
Message-ID: <CANh-dXkRGGVAzRPC5k8p0u6b7NuidkOt81in70eF3Nwe_BAZQw@mail.gmail.com>
To: wpack@ietf.org
Content-Type: text/plain; charset="UTF-8"
Archived-At: <https://mailarchive.ietf.org/arch/msg/wpack/ctrnRXXda2X6z0Z6creNcTQX9AM>
Subject: [Wpack] Problem statement and scope for BoF

I've drafted an initial problem statement and scope for the WPACK
working group we're hoping to create at a BoF at IETF106 in Singapore.
I've pasted it below, and you can suggest changes at
https://bit.ly/wpack-draft-charter. I intend to draft a charter once I
have some indication that the group here likes the scope I'm
proposing.

Thanks,
Jeffrey

Problem
=======

There are large populations of users who have trouble making direct
connections to HTTPS origin servers over the public internet. For
example:

* 55% of internet users live in countries where political, social, or
religious content was blocked online
(https://freedomhouse.org/sites/default/files/FOTN_2018_Final%20Booklet_11_1_2018.pdf).

* Other users run out of paid-for data in their mobile plan part-way
through a month, or aggressively disable mobile data to make sure it's
not wasted.

* Others use satellite connections with high latency and packet loss.
Anecdotally, see
https://meyerweb.com/eric/thoughts/2018/08/07/securing-sites-made-them-less-accessible/.

At best, these users currently have to tell their browsers to
pre-fetch sites while they have a cheap real-time connection available,
or wait until they find such a connection. At worst, they can't browse
the content at all.

Even users with highly-available internet connections want to be able
to read and interact with web pages as quickly as possible after
clicking a link. Needing to make extra connections in the critical
path, and having multiple connections competing for bandwidth without
the ability to prioritize, interfere with that goal.

All of these users deserve to know whether the content they're reading
or application they're running actually comes from the origin it
claims, and the publishers of those origins value their users being
able to verify that the content is authentic. Both groups also value
the users' ability to store data within websites and web applications
(e.g. preferences and content they've created) and have it available
the next time they load the same URL.

Scope
=====

1. Define a way to package the public portion of one or more websites
or web apps into one or more files that can be distributed and used
without a direct connection to the origin server.

2. Define a way to sign these packages such that the result gives the
recipient confidence in the integrity and authenticity of what they
received. This work will ensure that a client loading a Web Package
signed using the initial specification has security and privacy
properties that are at least as good as transferring the contained
resources over HTTPS+TLS1.3 from the publishers' origin server(s)
except that:

    a. Packages do not guarantee confidentiality, but if the channel
that transfers the package provides confidentiality guarantees,
packages at most compromise that by revealing information to the
original origin server of the resources in the package and by making a
TLS1.3-or-later private connection to that origin server.

    b. Packages only guarantee their contents were vouched for by the
origin servers' keys within the life of the signature, not during the
life of the connection(s) that transferred them.

    c. Packages do not provide any of the security "guarantees" of the
DNS system.

    d. A package generated for one client can be shared with other clients.

    e. TBD: Are the above sufficient? Can/should we give the WG
permission to change this list?

3. Optionally define a way for clients to check that an origin server
still vouches for its package in real time.

4. Define a way for an end-user to package their view of a site in
order to show a peer what they saw. This could just be an update of
RFC 2557 (MHTML) but could instead be an unsigned variant of the same
format as above. This kind of package will not guarantee integrity or
authenticity from the origin server, but might from the end-user who
packaged the site.

5. Define the resulting security and privacy model and compare it to
the status quo security and privacy model of HTTPS. Search for places
the existing web platform depends on the fact that HTTPS provides
transport security, and describe whether and how those have to change
given Packaging's addition of object security. This may involve working with
W3C groups like WebAppSec and PING.

6. Formats that this group defines should be usable efficiently:

    a. When individual contained resources are either small (~10s of
bytes) or large (~GB),

    b. When the total number of contained resources is either small
(1) or large (thousands for the initial format, with enough
extensibility for an update to expand it to millions or billions),

    c. When the total size of all contained resources is either small
(~KB) or large (~TB initially, matching El Paquete Semanal in Cuba,
with enough extensibility for an update to expand it to EB), and

    d. When the format is either streamed to the client or loaded from
a slow but random-access medium like an SD card.

7. Ensure that any formats or protocols this group defines can be
migrated to better cryptography when their original cryptography is
broken.

8. As long as it doesn't compromise or delay the above goals, try to:

    a. Support signed statements about the content beyond just
assertions that it's the representation of a particular URL. For
example, that it appears on a transparency log, that it passed a
certain kind of static analysis, or that a particular real-world
entity vouches for it.

    b. Address the threat model of a website whose frontend might be
compromised after a user first uses the site.

    c. Support books being published in the format.

    d. Support long-lived archival storage.

    e. Optimize transport of large numbers of small same-origin resources.

    f. Allow the format to be used in self-extracting executables.

    g. Allow publishers to efficiently combine sub-packages from other
publishers.
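
To make requirement 6.d above a little more concrete, here is a
minimal sketch of a layout that can be read either as a forward-only
stream or by seeking on a random-access medium. It is only an
illustration and not a proposed format: the "TOYP" magic number, the
JSON index, and the function names write_package, read_index,
read_one, and stream_all are all hypothetical, and the sketch ignores
signing, resource headers, and very large packages.

```python
import io
import json
import struct

MAGIC = b"TOYP"  # hypothetical magic number, not taken from any WPACK draft


def write_package(resources):
    """Serialize {url: body_bytes} as: magic, index length, JSON index, bodies."""
    index = {}
    offset = 0
    for url, body in resources.items():
        # Offsets are relative to the start of the body section.
        index[url] = [offset, len(body)]
        offset += len(body)
    index_bytes = json.dumps(index).encode("utf-8")
    header = MAGIC + struct.pack(">I", len(index_bytes))
    return header + index_bytes + b"".join(resources.values())


def read_index(f):
    """Read the header and index; return (index, absolute offset of the bodies)."""
    if f.read(4) != MAGIC:
        raise ValueError("not a toy package")
    (index_len,) = struct.unpack(">I", f.read(4))
    index = json.loads(f.read(index_len).decode("utf-8"))
    return index, 8 + index_len


def read_one(f, url):
    """Random access: read the index, then seek once to a single resource."""
    f.seek(0)
    index, body_start = read_index(f)
    offset, length = index[url]
    f.seek(body_start + offset)
    return f.read(length)


def stream_all(f):
    """Streaming: yield (url, body) in stored order using only forward reads."""
    index, _ = read_index(f)
    in_order = sorted(index.items(), key=lambda item: item[1][0])
    for url, (_offset, length) in in_order:
        yield url, f.read(length)


# Usage: the same bytes serve both access patterns.
package = write_package({
    "https://example.com/": b"<html>",
    "https://example.com/a.js": b"alert(1)",
})
script = read_one(io.BytesIO(package), "https://example.com/a.js")
everything = list(stream_all(io.BytesIO(package)))
```

Putting the index ahead of the bodies is what lets one layout serve
both cases in this sketch: a streaming client learns every resource's
length before the first body arrives, and a client reading from an SD
card needs only the index plus one seek per resource. A real format
would have to weigh that against the cost of knowing all lengths
before writing starts.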

Out of scope
------------

1. DRM

2. A way to distribute the private portions of a website. For example,
WPACK might define a way to distribute Gmail's application but
wouldn't define a way to distribute individual emails without a direct
connection to Gmail's origin server.

3. Defining the details of how web browsers load the formats and
interact with any protocols we define here. The W3C and/or WHATWG are
more appropriate fora for that effort.

4. A way to automatically discover the URL for an accessible package
that includes content for a blocked or expensive-to-access URL that
the user wants to browse.