[Wpack] Counter-proposal where bundles only contain a single origin
Jeffrey Yasskin <jyasskin@google.com> Wed, 26 August 2020 18:00 UTC
From: Jeffrey Yasskin <jyasskin@google.com>
Date: Wed, 26 Aug 2020 11:00:34 -0700
Message-ID: <CANh-dXkC8i6F1gxoD6nTJ4bp=7TVyy1fcN3v1vurj6h4+cZqiQ@mail.gmail.com>
To: WPACK List <wpack@ietf.org>
Cc: Martin Thomson <mt@mozilla.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/wpack/8fFVJv0AIksODEha8iyJrVDvOek>
I worked with Martin to flesh out his ideas for simplifying the bundle URL and origin design, and we came up with the following. I still prefer the proposal I presented a couple of weeks ago <https://github.com/WICG/webpackage/blob/master/explainers/bundle-urls-and-origins.md#proposal-a-package-scheme>, based on the tradeoffs at the bottom of this message, but it's quite possible I've missed things. I'd appreciate this group's input on which way we should go.

The basic idea here is that:

* Each bundle defines a single origin.
* Bundles identify their contents with paths, not full URLs.
* Metadata can provide a base URL to resolve those paths against.
* We allow a single file to contain multiple origins using nested bundles.

Use cases:

* Users should be able to share subsets of the web that can link within themselves, in a way that different sites within those subsets don't have their storage collide.
* One request for ads should be able to return the contents of multiple iframes in a way that those contents can't modify each other.

# URLs for nested bundle resources

```
package://distributor.example/bundle.wbn$app/foo.wbn$bar.html
package:///c:/Users/name/Downloads/bundle.wbn$app/foo.wbn$bar.html
```

This fetches the outer bundle by removing the first `$` and everything after it, and:

* If the URL has an authority, replacing the `package:` with `https:`.
* If the URL doesn't have an authority, replacing the `package:` with `file:`.

To allow the outer bundle to be fetched using a third scheme, we would need to add a matching new scheme for addressing inside it.

Subsequently, each segment separated by `$`s is the path to look up in a nested bundle. Any `$`s inside a segment are percent-encoded.

# Origins for nested bundle resources

The origin of one of these holds the information from the URL up to the last `$`.
So for

```
package://distributor.example/bundle.wbn$app/foo.wbn$bar.html
```

* scheme: package:
* host: distributor.example/bundle.wbn$app/foo.wbn
* port: null
* domain: null

We could also hold the part after the first `/` in a new component, similar to Gecko's OriginAttributes <https://wiki.mozilla.org/Security/Contextual_Identity_Project/Containers#An_extended_origin>, but OriginAttributes are mostly undocumented, and origins define "opaque hosts <https://url.spec.whatwg.org/#opaque-host>" to encode this sort of information into the existing field.

# Navigating across nested bundles

Within something like El Paquete Semanal or the Web Archive, it's straightforward to store each origin in separate nested bundles, but we want links from one origin to another to also work within the same top-level bundle. Addressing something in a sibling or parent bundle is reasonably straightforward, with something similar to the following syntax:

```
<a href="package:..$/https/other.site.example.wbn$/path.html">
```

The downside is that if a user wants to take one site out of the big bundle and use it on its own, and they want links outside that site to land on the internet, they have to know the mapping from URLs to bundle names and undo it in all the links. If they don't do this, the links are broken instead. Similarly, if someone wants to compose a couple of pre-existing bundles, they have to rewrite links to point to sibling bundles appropriately.

# Comparison

We need to compare "single-origin bundles", described above, with "multi-origin bundles", described at https://github.com/WICG/webpackage/blob/master/explainers/bundle-urls-and-origins.md#proposal-a-package-scheme . Both use a package:bundle-location$within-bundle format.
Because the single-origin proposal chooses not to encode the whole origin to fit in the authority URL component, and implies rather than states the bundle's scheme, I'll compare it to a variant of the multi-origin proposal that does the same, yielding:

```
package://archive.example/2020-04-01.wbn$https://camera.example/edit.html
```

The difference becomes whether the origin computation includes an authority component in the last `$`-delimited segment.

## Single-origin bundles are better:

* Implementers can use a simpler algorithm to compute the origin, ending at the last `$` instead of having to also parse a URL from the last component.
* Including just a single origin may avoid the need for signatures to specify a subset of the bundle, which could simplify that section.
* Naming subresources with paths instead of URLs is more consistent with other archiving formats.

## Multi-origin bundles are better:

* Users can expect simple tools to combine and split pre-existing bundles.
* When authors are composing a bundle, cross-origin resources can go directly into the bundle, instead of needing to rewrite them to same-origin or put them in a nested bundle.
* Implementations only need to spin up the bundle parser once, which could affect performance.
* With single-origin bundles, implementers need to write and maintain tools to rewrite cross-origin URLs when saving a bundle from a website.
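
As a rough illustration of the single-origin URL and origin rules described under "URLs for nested bundle resources" and "Origins for nested bundle resources", here is a minimal Python sketch. The function names are hypothetical and not part of either proposal. Percent-decoding each segment with `unquote` is an assumption about how a segment would be turned back into a lookup path, and the host produced for authority-less (`file:`) URLs is also an assumption, since only the authority case is spelled out above.

```python
from urllib.parse import unquote

SCHEME = "package:"


def split_package_url(url):
    """Return (URL to fetch the outer bundle from, list of nested lookup paths)."""
    if not url.startswith(SCHEME):
        raise ValueError("not a package: URL")
    rest = url[len(SCHEME):]
    # Everything before the first `$` locates the outer bundle; each later
    # `$`-separated segment is a path to look up in the next nested bundle.
    location, *segments = rest.split("$")
    if not segments:
        raise ValueError("package: URL has no `$` separator")
    # "//host/..." has an authority; "///path" (empty authority) does not.
    has_authority = location.startswith("//") and not location.startswith("///")
    fetch_scheme = "https:" if has_authority else "file:"
    # Assumption: literal `$`s inside a segment arrive percent-encoded (%24),
    # so decode each segment before using it as a path.
    return fetch_scheme + location, [unquote(s) for s in segments]


def single_origin(url):
    """Return (scheme, host, port, domain), ending the host at the last `$`."""
    if not url.startswith(SCHEME):
        raise ValueError("not a package: URL")
    prefix, sep, _ = url[len(SCHEME):].rpartition("$")
    if not sep:
        raise ValueError("package: URL has no `$` separator")
    # Drop the leading "//" so the host matches the example in this message.
    host = prefix[2:] if prefix.startswith("//") else prefix
    return ("package:", host, None, None)


if __name__ == "__main__":
    url = "package://distributor.example/bundle.wbn$app/foo.wbn$bar.html"
    print(split_package_url(url))
    print(single_origin(url))
```

Note that the origin computation never has to parse a URL out of the last segment, which is the simplification listed in favor of single-origin bundles. The multi-origin variant would additionally need to extract a scheme and authority from that final segment.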