Re: Atom Link Extensions Use Case
Tim Bray <twbray@google.com> Fri, 08 June 2012 14:48 UTC
From: Tim Bray <twbray@google.com>
Date: Fri, 08 Jun 2012 07:41:00 -0700
Message-ID: <CA+ZpN27_XfQnedj1v0BgS7G1BLR2Yq5ETkwROLXnCZbSEJLZ5A@mail.gmail.com>
Subject: Re: Atom Link Extensions Use Case
To: Ed Summers <ehs@pobox.com>
Cc: atom-syntax <atom-syntax@imc.org>, James Snell <jasnell@gmail.com>

Why not just drop an element into the <entry> in your own namespace? This doesn’t feel like any kind of a link to me.

<feed xmlns:loc="http://whatever.loc.gov">
...
<entry>
...
<loc:checksum>3c89ea593c01483fd091</loc:checksum>
...

On Fri, Jun 8, 2012 at 6:04 AM, Ed Summers <ehs@pobox.com> wrote:
>
> Hi all,
>
> I am using Atom to syndicate access to data dumps at the Library of
> Congress. We have a web application that provides access to historic
> newspapers [1], and we have received requests for access to the
> underlying OCR data for research and commercial purposes. Despite the
> fact that this is historic data, we are routinely adding new content
> as it is digitized. Rather than require clients to issue millions of
> requests to get at the OCR data (which is actually web addressable)
> the plan is to periodically create a tarred and compressed dump file
> of new OCR content, and publish the availability of the file in an
> Atom feed, which interested parties can subscribe to. It's a similar
> model to what Wikimedia does for various Wikipedia projects [2].
>
> Here's a minimal example, to give you an idea of what I mean (warning
> URLs don't currently resolve):
>
> <?xml version="1.0" encoding="utf-8"?>
> <feed xmlns="http://www.w3.org/2005/Atom">
> <title>Chronicling America OCR Dumps</title>
> <link rel="self" type="application/atom+xml"
> href="http://chroniclingamerica.loc.gov/dumps/ocr/feed/" />
> <id>info:lc/ndnp/dumps/ocr</id>
> <author>
> <name>Library of Congress</name>
> <uri>http://loc.gov</uri>
> </author>
> <updated>2012-06-08T08:35:27-04:00</updated>
> <entry>
> <title>part-00001.tar.bz2</title>
> <link rel="alternate" type="application/x-bzip2"
> href="http://chroniclingamerica.loc.gov/data/dumps/ocr/part-00001.tar.bz2"
> />
> <id>info:lc/ndnp/dump/ocr/part-00001.tar.bz2</id>
> <updated>2012-06-07T13:57:23-04:00</updated>
> <summary type="xhtml"><div
> xmlns="http://www.w3.org/1999/xhtml">OCR dump file <a
> href="http://chroniclingamerica.loc.gov/data/dumps/ocr/part-00001.tar.bz2
> ">part-00001.tar.bz2</a>
> with size 162.7 MB generated June 7, 2012, 1:57 p.m.</div></summary>
> </entry>
> </feed>
>
> So the reason why I am writing here is that I would like to add
> checksum information to the feed to let clients verify that they have
> downloaded the data dump file correctly. An argument could be made
> that it's not necessary since a corrupted bz2 file would likely not
> decompress. An argument could also be made that the Content-MD5 header
> could be used. But I like the idea of making an explicit assertion
> about the checksum in the Atom document.
>
> After a bit of googling I ran across James Snell's Atom Link
> Extensions draft, which provides a pattern for including an md5
> checksum in the <link> element like so:
>
> <link rel="alternate" type="application/x-bzip2"
> hash="md5:579758192095fde80896058af4ce0aee"
> href="http://chroniclingamerica.loc.gov/data/dumps/ocr/part-00001.tar.bz2"
> />
>
> Unfortunately it looks like the draft has expired. I was wondering:
>
> a) are there other established patterns for adding checksum
> information for resources in Atom
> b) if it's worth it for James to update the draft and try to push it
> forwards to an Informational status
>
> As more and more data providers make dumps of their data available to
> reduce crawling (like Wikipedia) it seems like a good use case for
> Atom to support.
>
> //Ed
>
> [1] http://chroniclingamerica.loc.gov
> [2]
> http://dumps.wikimedia.org/enwiki/latest/enwiki-latest-abstract.xml-rss.xml
> [3] http://tools.ietf.org/html/draft-snell-atompub-link-extensions-08
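
For illustration only, here is a rough sketch of how a client might consume a namespaced checksum element like the one suggested at the top of this message. It is not from either author. The namespace URI is the placeholder used in the reply, the helper names `entry_checksums` and `verify` are hypothetical, and treating the value as a hex MD5 digest is an assumption taken from the md5 example in the quoted message; the thread does not settle on an algorithm.

```python
import hashlib
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
LOC_NS = "http://whatever.loc.gov"  # placeholder namespace from the reply, not a real one
NS = {"atom": ATOM_NS, "loc": LOC_NS}


def entry_checksums(feed_xml):
    """Map each entry's rel="alternate" href to its loc:checksum text."""
    root = ET.fromstring(feed_xml)
    found = {}
    for entry in root.findall("atom:entry", NS):
        link = entry.find("atom:link[@rel='alternate']", NS)
        checksum = entry.find("loc:checksum", NS)
        if link is not None and checksum is not None and checksum.text:
            found[link.get("href")] = checksum.text.strip().lower()
    return found


def verify(data, expected_hex):
    """Compare the MD5 of the downloaded bytes with the feed's value.

    MD5 is an assumption here; swap in another hashlib algorithm if the
    publisher documents a different one.
    """
    return hashlib.md5(data).hexdigest() == expected_hex.strip().lower()


# Stand-in bytes instead of a real download, since the URLs do not resolve.
payload = b"stand-in for the contents of part-00001.tar.bz2"
href = "http://chroniclingamerica.loc.gov/data/dumps/ocr/part-00001.tar.bz2"
feed = f"""<feed xmlns="{ATOM_NS}" xmlns:loc="{LOC_NS}">
  <entry>
    <title>part-00001.tar.bz2</title>
    <link rel="alternate" type="application/x-bzip2" href="{href}"/>
    <loc:checksum>{hashlib.md5(payload).hexdigest()}</loc:checksum>
  </entry>
</feed>"""

checksums = entry_checksums(feed)
print(verify(payload, checksums[href]))
```

Clients that do not know the extension namespace can simply skip the foreign element, which is part of the appeal of this approach over a new attribute on <link>.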