Re: [Wpack] Signed Exchanges and Web Archiving Use Case

Ilya Kreymer <ikreymer@gmail.com> Sat, 19 October 2019 00:39 UTC

MIME-Version: 1.0
References: <CANAUx6i5VCRnq3nM+OtXQAPVKq_V+N_-_JdbkWx1rE_L+JyXDA@mail.gmail.com> <CA+6j2ggk2nZXCwULqN_MPigcKP47yRmpSO=GN6HTA5_euTeAeA@mail.gmail.com>
In-Reply-To: <CA+6j2ggk2nZXCwULqN_MPigcKP47yRmpSO=GN6HTA5_euTeAeA@mail.gmail.com>
From: Ilya Kreymer <ikreymer@gmail.com>
Date: Fri, 18 Oct 2019 17:38:55 -0700
Message-ID: <CANAUx6gK8oLihgcKTOjC0NNSocekQe33Z3=xmo7YY3nSqbWShw@mail.gmail.com>
To: Jeffrey Yasskin <jyasskin@google.com>, wpack@ietf.org
Content-Type: multipart/alternative; boundary="00000000000091c7b0059538ade8"
Archived-At: <https://mailarchive.ietf.org/arch/msg/wpack/1ibfSwbFsMKfM-Fo-eCcMsn3yt8>
Subject: Re: [Wpack] Signed Exchanges and Web Archiving Use Case
X-BeenThere: wpack@ietf.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Web Packaging <wpack.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/wpack>, <mailto:wpack-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/wpack/>
List-Post: <mailto:wpack@ietf.org>
List-Help: <mailto:wpack-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/wpack>, <mailto:wpack-request@ietf.org?subject=subscribe>
X-List-Received-Date: Sat, 19 Oct 2019 02:44:05 -0000

Hi Jeffrey, all,

(Moving to this list at Jeffrey's request)

I am very interested in the 'archival' use case of web packaging. In particular,
I have a prototype, https://wab.ac/, that can
'replay' a HAR or WARC file (WARC is the standard format used by the
Internet Archive and others).
The tool replays individual HAR or WARC files, using service workers,
providing a rendering of the network traffic contained (and a lot of JS
emulation).

The prototype demonstrates that such replay is possible, at least when
using service workers and a custom replay engine, so verification would
be extremely useful to have.

Currently, the 'emulation' of a particular time is already handled by the
replay system, as is isolation between different WARC/HAR files and
isolation of other built-in primitives (like local storage, etc.).

But the service worker has to rewrite the responses in order to render them
at a different domain. This probably limits what can be
rendered directly.
Just verifying the raw data, and being able to show that the
responses a service worker reads are 'verified', would probably be a good
first step.
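
As a rough illustration of what 'verifying the raw data' could mean at its
simplest, the sketch below checks a response body against a labelled digest
of the kind WARC records carry in WARC-Payload-Digest (an algorithm name,
a colon, then typically a Base32 value). The function names are made up for
this example and are not part of wab.ac. Note that this only shows integrity
against a recorded digest, not that the origin server actually sent the
bytes, which is what a signature would add.

```python
import base64
import hashlib

# Algorithms this sketch accepts; an assumption, not a list from any spec.
SUPPORTED = {"sha1", "sha256"}


def labelled_digest(body: bytes, algorithm: str = "sha1") -> str:
    """Return 'algorithm:BASE32' for the raw body bytes (hypothetical helper)."""
    raw = hashlib.new(algorithm, body).digest()
    return f"{algorithm}:{base64.b32encode(raw).decode('ascii')}"


def body_matches(body: bytes, recorded: str) -> bool:
    """Check raw, un-rewritten bytes against a recorded 'algorithm:digest' label."""
    algorithm, sep, _ = recorded.partition(":")
    algorithm = algorithm.lower()
    if not sep or algorithm not in SUPPORTED:
        return False
    # Compare case-insensitively, since Base32 values may be stored in either case.
    return labelled_digest(body, algorithm).upper() == recorded.upper()
```

The check has to run on the bytes as captured, before the service worker
rewrites anything, since any rewriting changes the digest.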

The ability to have longer-term signatures would also be key for the
archival use case. Could the client browser create counter-signatures
itself?
For the archiving use case, being able to prove that the exchange happened
from the client browser's perspective would be even more useful,
but I don't know if that is within scope at all, e.g. if the browser had
its own certificate and keys.
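
To make the counter-signature idea a bit more concrete: one way to read the
'contiguous chain' Jeffrey describes in the quoted message below is that each
counter-signature has to be made while the signature it covers is still
valid. The sketch below checks only that timing property. The Link record
and its field names are hypothetical, and it deliberately leaves out the
actual cryptographic verification of each signature, which a real system
would need.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Link:
    """One (counter-)signature in the chain; all fields are illustrative."""
    signer: str
    signed_at: int   # Unix seconds when this signature was made
    expires_at: int  # Unix seconds after which it is no longer trusted alone


def chain_is_contiguous(chain: List[Link], now: int) -> bool:
    """True if every link was added while the previous one was still valid,
    and the newest link is itself valid at time `now`."""
    if not chain:
        return False
    for prev, cur in zip(chain, chain[1:]):
        # A counter-signature made after the previous link expired leaves a gap.
        if not (prev.signed_at <= cur.signed_at <= prev.expires_at):
            return False
    last = chain[-1]
    return last.signed_at <= now <= last.expires_at
```

With a short-lived origin signature as the first link, the chain stays
usable only as long as someone keeps adding a fresh link before the newest
one expires.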

Happy to discuss further. I'm not sure how much of this is doable within the
current spec, but I'm definitely interested in wpack and signed exchanges and
how they may help make web archiving more reliable/verifiable for users.

Thanks,
Ilya


On Fri, Oct 11, 2019 at 12:40 PM Jeffrey Yasskin <jyasskin@google.com>
wrote:

> https://wab.ac/ looks like an awesome tool. I suspect it can't provide
> isolation between the JavaScript storage used by different WARC files. We
> plan to do something for that in the bundle format for unsigned/untrusted
> content. Sawood Alam also suggested being able to pretend the archive is
> being run at a particular time, which seems sensible but is currently low
> priority.
>
> The remaining difficulty we've noticed in verifying that the content is
> accurate is that Signed Exchange signatures expire in a short amount of
> time, because we don't want browsers to trust JavaScript whose security
> holes might since have been fixed. Even if the archival system trusts the signature for
> longer, we still don't expect web server operators to be able to keep a
> private key secret for multiple years. So you need a system of timestamping
> servers, and a contiguous chain of their (counter-)signatures back to the
> original server signature. Signed Exchanges don't currently support
> counter-signatures well, but that's something that could be added. I
> wouldn't expect browsers' UI teams to decide that chain of trust is worth
> showing in the UI, although I could always be surprised.
>
> Would you mind having this conversation on the wpack@ietf.org list
> <https://www.ietf.org/mailman/listinfo/Wpack> so more people can see your
> interest and can chime in?
>
> I unfortunately won't be at the Chrome Dev Summit. I'll be at the WPACK
> BoF the next week in Singapore:
> https://trac.tools.ietf.org/bof/trac/wiki#WPACK.
>
> Jeffrey
>
> On Fri, Oct 11, 2019 at 11:12 AM Ilya Kreymer <ikreymer@gmail.com> wrote:
>
>> Hi Jeffrey,
>>
>> I wanted to reach out to you regarding the archival use case for web
>> packaging and signed exchanges. I work on a project called Webrecorder,
>> which includes a hosted service (https://webrecorder.io/) and desktop
>> app to allow users to create and view their own web archives. To support
>> more decentralized web archives,
>> verification that the content is authentic and was actually served by the
>> web server is imperative, and for this reason I'm very interested in the
>> signed exchange efforts.
>> Like most other web archiving efforts, Webrecorder uses standard WARC
>> files, but if verification were an option, it could certainly support a new
>> package format.
>>
>> I also wanted to share a new development, https://wab.ac/ which allows
>> the browser to open and render WARC files directly, without requiring a
>> server, by using service workers. This also works with HAR files, which can
>> be created with Chrome DevTools and then loaded in the tool.
>> This demonstrates a proof-of-concept of browser support for WARC files,
>> as mentioned in
>> https://wicg.github.io/webpackage/draft-yasskin-webpackage-use-cases.html#rfc.section.2.2.9
>>
>> Here are some example links, which load a WARC from GitHub Pages and
>> redirect to the rendered page:
>>
>> https://wab.ac/?coll_example=examples/netpreserve-twitter.warc&url=/example/https://netpreserveblog.wordpress.com/2019/05/29/warc-10th-anniversary/
>>
>> https://wab.ac/?coll_example=examples/netpreserve-twitter.warc&url=/example/https://twitter.com/netpreserve
>>
>> This should allow for rendering of any web archive directly in the
>> browser, but there is no way to verify that the content is accurate, so
>> signed exchanges or some other form of verification become really imperative.
>>
>> From a web archiving perspective, we want to be able to sign every HTTP
>> exchange from the server, and if that's not possible, at least from the
>> client, so that future users can verify the authenticity of the archive.
>>
>> I'd love to chat further if you have time and would be interested.
>>
>> I'm based in SF, and applied to attend the Chrome Dev Summit in November,
>> in case you'll be here then.
>>
>> Thank you,
>> Ilya