Re: [Wpack] Counter-proposal where bundles only contain a single origin

Martin Thomson <mt@lowentropy.net> Thu, 27 August 2020 05:45 UTC

Mime-Version: 1.0
Message-Id: <614809c3-02d7-4e8e-a63d-1f2bd6307789@www.fastmail.com>
In-Reply-To: <CANh-dXkC8i6F1gxoD6nTJ4bp=7TVyy1fcN3v1vurj6h4+cZqiQ@mail.gmail.com>
References: <CANh-dXkC8i6F1gxoD6nTJ4bp=7TVyy1fcN3v1vurj6h4+cZqiQ@mail.gmail.com>
Date: Thu, 27 Aug 2020 15:44:36 +1000
From: Martin Thomson <mt@lowentropy.net>
To: wpack@ietf.org
Content-Type: text/plain
Archived-At: <https://mailarchive.ietf.org/arch/msg/wpack/9d-hJPynWdpAMSIDVier8Xp97Io>
Subject: Re: [Wpack] Counter-proposal where bundles only contain a single origin


On Thu, Aug 27, 2020, at 04:00, Jeffrey Yasskin wrote:
> # Origins for nested bundle resources
> 
> The origin of one of these holds the information from the URL up to the 
> last `$`. 

This is one option.  I tend to think that adding to the tuple is also reasonable (it doesn't have to be origin attributes).  Suborigins proposed a new origin tuple item, for instance.

> The difference becomes whether the origin computation includes an 
> authority component in the last $-delimited segment.

This is the bit that I want to draw on a little harder.  If the URI for a resource is its identity, and we are basing the fact that x$https://example.com/foo comes from example.com purely on an unsubstantiated assertion, why would we consider example.com to be a necessary part of the identity of the resource?  Or the identity of the origin for that matter?
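
To make the two origin computations concrete, here is a minimal sketch.  It assumes the `outer$inner` URL shape used in the example above, where "x" is only a placeholder for the URL of the enclosing bundle; the function names are hypothetical and nothing here is taken from a draft.

```python
from typing import Tuple
from urllib.parse import urlsplit


def origin_up_to_last_dollar(url: str) -> str:
    """Option A: the origin is everything in the URL before the last '$'."""
    outer, sep, _inner = url.rpartition("$")
    if not sep:
        raise ValueError("not a nested bundle URL: no '$' found")
    return outer


def origin_with_claimed_authority(url: str) -> Tuple[str, str]:
    """Option B: the origin also carries the scheme and authority claimed
    in the last $-delimited segment (an unverified assertion)."""
    outer, sep, inner = url.rpartition("$")
    if not sep:
        raise ValueError("not a nested bundle URL: no '$' found")
    parts = urlsplit(inner)
    return (outer, f"{parts.scheme}://{parts.netloc}")


url = "x$https://example.com/foo"
print(origin_up_to_last_dollar(url))
print(origin_with_claimed_authority(url))
```

The only difference between the two results is the claimed "https://example.com" component, which is exactly the part that nothing in the bundle substantiates.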

> ## Multi-origin bundles are better:
> 
> * Users can expect simple tools to combine and split pre-existing bundles.

I think that this is not correct, or at least founded on a poor assumption.  

I believe that your assumption is that a resource can refer to content by an https:// URL (or otherwise "canonical" URL) without identifying the bundle in which it appears.

It's a good assumption when content is properly attributed to an https:// (or otherwise "canonical") source through something like a signature.  But also completely unnecessary in that case (we no longer need the new scheme).  In that case saying "https://example.com/resource" is enough.  It perfectly matches with the treatment of the referenced content.  But I don't believe that you can safely identify content without also identifying the bundle in which it appears unless you have that guarantee.

The narrow case in which this is interesting is when you have resources from multiple origins with multiple cross references.  If your CSS is in a different origin and it references content from yet another origin, then those links need to be adjusted based on the bundle structure.  (Which was your next point; but see below.)

> * When authors are composing a bundle, cross-origin resources can go 
> directly into the bundle, instead of needing to rewrite them to 
> same-origin or put them in a nested bundle.

There is another option, which is to adopt resources.  Many subresources are effectively adopted anyway (JS, CSS, CORS images).  Bundling removes confidentiality advantages conferred to others (non-CORS images).  A simple bundler gains nothing by retaining isolation for either; only a bundler that intends to have the content re-attributed to the "online" origin gains anything by isolating the latter.  

The main reasons I can think of to maintain any isolation are: reuse (a large bundle like El Paquete Semanal might want to avoid resource duplication), or sandboxing (for re-attribution to online origins, or isolation of mutually distrustful content like iframes).  Both of which are things I might argue are sufficiently advanced as to justify the non-trivial extra effort.

There are other options of course.  If you find that link writing is awkward, then a different authority form for the package: scheme might help.  Let's say we choose a non-reg-name form, maybe something that starts with a $ and reserve that.  Then we can name packages and reference content in them.  Then you might name a shared package of images as eps-images and refer to an image in it using <package://$eps-images/people/person.jpg>.

This is as meaningful as the pseudo-origin that appears after the $ in your examples, it just doesn't pretend to be something it isn't.

You could use the same methods for avoiding collisions (i.e., a domain name), but you can also use short mnemonics or UUIDs.   The phishing potential of this simple design leads me to wonder if it is the best strategy, but maybe we can just ensure that that string never appears in security-sensitive UX.
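
As a rough illustration of the named-package form, the sketch below splits such a reference into a package name and a path.  The package: scheme and the $-prefixed authority are only the suggestion made above, not anything specified; `PackageRef` and `parse_package_ref` are hypothetical names.

```python
from typing import NamedTuple
from urllib.parse import urlsplit


class PackageRef(NamedTuple):
    name: str  # package name, without the leading '$'
    path: str  # path of the resource inside that package


def parse_package_ref(url: str) -> PackageRef:
    """Parse a reference such as package://$eps-images/people/person.jpg."""
    parts = urlsplit(url)
    if parts.scheme != "package":
        raise ValueError("expected the (hypothetical) package: scheme")
    authority = parts.netloc
    # The reserved non-reg-name form: the authority must start with '$'
    # and must have a non-empty name after it.
    if not authority.startswith("$") or len(authority) == 1:
        raise ValueError("authority is not a $-prefixed package name")
    return PackageRef(name=authority[1:], path=parts.path)


ref = parse_package_ref("package://$eps-images/people/person.jpg")
print(ref.name, ref.path)
```

Because the name is just an opaque label, the same parser works whether the label is a short mnemonic, a UUID, or a domain name.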

> * Implementations only need to spin up the bundle parser once, which 
> could affect performance.

Gee, I hope it doesn't cost that much.  A better baseline comparison is not zero overhead, but a new TLS connection.  And if we can't beat that, we'd have to be trying not to.

> * Implementers need to write and maintain tools to rewrite cross-origin 
> URLs when saving a bundle from a website

See above.  It might be enough to avoid that at a granular level.