Re: [Wpack] package: URL scheme

Jeffrey Yasskin <jyasskin@google.com> Thu, 11 June 2020 18:59 UTC

References: <CANh-dXndPaue3zAADhpc+wyNb8dxs=nVKOAp1n=6SMCKoUe=eQ@mail.gmail.com> <CA+9kkMAPqvXUwq3XqeqzkurmPbMVvR3bc_YPG6xQK0PUS4em-Q@mail.gmail.com> <032201d6401f$a3e58460$ebb08d20$@acm.org>
In-Reply-To: <032201d6401f$a3e58460$ebb08d20$@acm.org>
From: Jeffrey Yasskin <jyasskin@google.com>
Date: Thu, 11 Jun 2020 11:59:34 -0700
Message-ID: <CANh-dX=sp-S_DPKO4adFgX3GBpo7QyYD3o7rrUJ+Sv=_qKYPYQ@mail.gmail.com>
To: Larry Masinter <LMM@acm.org>
Cc: Ted Hardie <ted.ietf@gmail.com>, WPACK List <wpack@ietf.org>
Content-Type: multipart/alternative; boundary="000000000000686cdf05a7d390f3"
Archived-At: <https://mailarchive.ietf.org/arch/msg/wpack/EYtnuIMWrBkthdZ95nlA9T3-DLU>
Subject: Re: [Wpack] package: URL scheme

This use of a fragment matches the way the TAG's packaging draft identified
subresources:
https://www.w3.org/TR/2015/WD-web-packaging-20150115/#fragment-identifiers.
I have some discussion of the option at
https://docs.google.com/document/d/1BYQEi8xkXDAg9lxm3PaoMzEutuQAZi1r8Y0pLaFJQoo/edit#heading=h.jkcl1u8boyar,
but I realize I didn't include the part that worries me most there:

We need to make the nested URL contribute to the resource's origin, or else
when folks create a bundle of multiple sites for distribution together,
their storage will collide. Including part of the fragment in the origin
would be novel and could break assumptions and cause bugs. However,
it could be the best option anyway.
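
To make the collision concrete, here is a minimal sketch in Python. It
assumes the conventional (scheme, host, port) origin tuple and the "cid="
fragment form from the quoted proposal below. The hosts a.example and
b.example and the bundled_origin() helper are hypothetical illustrations,
not anything a browser implements today.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}

def origin(url):
    """Conventional (scheme, host, port) origin; the fragment plays no part."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)

def bundled_origin(url):
    """Hypothetical: pair the bundle's origin with the origin of the
    resource URL carried in a 'cid=' fragment."""
    parts = urlsplit(url)
    key, _, nested = parts.fragment.partition("=")
    if key != "cid" or not nested:
        return (origin(url), None)
    return (origin(url), origin(nested))

# Two different sites shipped in the same bundle (hypothetical hosts).
site_a = "pkg+https://distributor.example/package.wbn#cid=https://a.example/index.html"
site_b = "pkg+https://distributor.example/package.wbn#cid=https://b.example/index.html"

# Fragment ignored: both sites land in one origin, so their storage collides.
print(origin(site_a) == origin(site_b))

# Nested URL folded in: the two sites get distinct origins.
print(bundled_origin(site_a) == bundled_origin(site_b))
```

The first comparison is the status quo for fragments; the second is the
novel behavior that existing code may not expect.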

Jeffrey

On Thu, Jun 11, 2020 at 11:39 AM Larry Masinter <LMM@acm.org> wrote:

> I think this would work and avoid some of the squirrely awkward encodings
> ("," for "/", etc.)
>
>
>
>
>
> pkg+https://distributor.example/package.wbn#cid=https://publisher.example?q=query
>
>
>
> Define pkg+originalscheme to always return the application/wbn
> content type, which accepts a fragment identifier that identifies the
> package component.
>
>
>
> This form avoids encoding or transforming any part of the original except
> the scheme and the fragment identifier.
>
>
>
> You can even apply a fragment identifier to the publisher URL.
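
A rough sketch of how this form could be built and taken apart, in Python.
The function names are hypothetical, and splitting at the first "#" is my
assumption, chosen so that a fragment on the publisher URL survives the
round trip.

```python
def make_pkg_url(package_url, resource_url):
    """Prefix the scheme with 'pkg+' and carry the resource URL,
    untransformed, in a 'cid=' fragment."""
    return "pkg+" + package_url + "#cid=" + resource_url

def split_pkg_url(pkg_url):
    """Recover (package_url, resource_url) from the pkg+ form."""
    if not pkg_url.startswith("pkg+"):
        raise ValueError("not a pkg+ URL")
    # Split at the first '#': everything after it belongs to the fragment,
    # including any '#' that is part of the resource URL itself.
    package_url, sep, fragment = pkg_url[len("pkg+"):].partition("#")
    if not sep or not fragment.startswith("cid="):
        raise ValueError("missing cid= fragment")
    return package_url, fragment[len("cid="):]

url = make_pkg_url("https://distributor.example/package.wbn",
                   "https://publisher.example?q=query")
print(url)
print(split_pkg_url(url))
```

Nothing but the scheme and the fragment is touched, so the package URL can
be recovered by stripping the prefix.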
>
>
>
>
>
> *From:* Wpack <wpack-bounces@ietf.org> *On Behalf Of *Ted Hardie
> *Sent:* Wednesday, June 10, 2020 3:44 PM
> *To:* Jeffrey Yasskin <jyasskin=40google.com@dmarc.ietf.org>
> *Cc:* WPACK List <wpack@ietf.org>
> *Subject:* Re: [Wpack] package: URL scheme
>
>
>
> Hi Jeffrey,
>
>
>
> On Wed, Jun 10, 2020 at 3:00 PM Jeffrey Yasskin <jyasskin=40google.com@dmarc.ietf.org> wrote:
>
> Hi all,
>
>
>
> I wanted to raise awareness of a discussion about the URL scheme for
> addressing resources within bundles (draft-yasskin-wpack-bundled-exchanges).
>
>
>
> We seem to be heading toward a URL of the form
> package:<encoded-package-url>$<encoded-resource-uri>, which for a package
> URL of https://distributor.example/package.wbn and resource URI of
> https://publisher.example/page.html?q=query would lead to a URL of:
>
>
>
> *package:https:,,distributor.example,package.wbn;q=query$https:,,publisher.example/page.html?q=query*
>
>
>
> RFC 3986, section 2.2 notes:
>
>    If data for a URI component would conflict with a reserved
>    character's purpose as a delimiter, then the conflicting data must be
>    percent-encoded before the URI is formed.
>
> Wouldn't that apply to a couple of the delimiters in the package example
> above, e.g. the  ":" in "https:"?  If so, that seems like it would change
> the display form a bit (though not the basic idea).  If that is needed, I
> think it reinforces the notion that showing these to users or expecting the
> users to grok them is a non-starter.
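
For comparison, here is a sketch of what strict percent-encoding of both
parts would look like, using Python's urllib.parse. Encoding every
reserved character (safe="") is one possible reading of the RFC text, not
something the draft specifies, and the function names are hypothetical.

```python
from urllib.parse import quote, unquote

def encode_package_url(package_url, resource_url):
    """Percent-encode both URLs in full, then join them with '$'."""
    return ("package:" + quote(package_url, safe="")
            + "$" + quote(resource_url, safe=""))

def decode_package_url(url):
    """Inverse of encode_package_url."""
    scheme, _, rest = url.partition(":")
    if scheme != "package":
        raise ValueError("not a package: URL")
    # Any '$' inside either URL was encoded as %24, so the first literal
    # '$' is the separator.
    enc_package, sep, enc_resource = rest.partition("$")
    if not sep:
        raise ValueError("missing '$' separator")
    return unquote(enc_package), unquote(enc_resource)

encoded = encode_package_url("https://distributor.example/package.wbn",
                             "https://publisher.example/page.html?q=query")
print(encoded)
# package:https%3A%2F%2Fdistributor.example%2Fpackage.wbn$https%3A%2F%2Fpublisher.example%2Fpage.html%3Fq%3Dquery
print(decode_package_url(encoded))
```

The result is unambiguous to parse but even harder to read than the
comma form, which supports the point about not showing it to users.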
>
>
>
> regards,
>
>
>
> Ted Hardie
>
>
>
>
>
> This arises from several considerations:
>
> 1. A bundle is served from a URL.
>
> 2. After a user downloads the bundle, it gets a new URL, often file:///...
>
> 3. We can also hash the bundle to get a URI that stays stable across
> transfers.
>
> 4. Resources inside a bundle are named by URIs (which, since the bundle
> has an index, are also URLs even if, like urn:uuid:..., they wouldn't
> normally be locators).
>
> 5. Once a user downloads a bundle, for web browsers to give its content
> storage that's persistent across reloads, as requested in
> https://github.com/WICG/webpackage/issues/498, the content needs to be
> assigned a non-opaque origin.
>
>
>
> I'm updating one of the documents about this in
> https://github.com/WICG/webpackage/pull/584 and would welcome comments
> here or there.
>
>
>
> The URLs are obviously gross, so
> https://github.com/WICG/webpackage/pull/560 suggests that browsers avoid
> showing them to users in most cases.
>
>
>
> We could potentially simplify things if packages named things with just
> paths instead of full URIs. We'd then name things based on the bundle's
> origin. However, this loses archiving use cases.
>
>
>
> This is all further discussed in the following documents and issues, but
> you shouldn't feel responsible to read everything here:
>
>
>
> * https://docs.google.com/document/d/1BYQEi8xkXDAg9lxm3PaoMzEutuQAZi1r8Y0pLaFJQoo/edit
>
> * https://chromium-review.googlesource.com/c/chromium/src/+/2226248/7#message-0a3efda5aff84770a1729422a5b26aeca3ee4e80
>
> * https://github.com/WICG/webpackage/issues/583
>
> * https://github.com/WICG/webpackage/blob/master/explainers/navigation-to-unsigned-bundles.md#urls-for-bundle-components
>
> * https://lists.w3.org/Archives/Public/uri/2019Nov/0000.html
>
>
>
> Thanks,
>
> Jeffrey
>