Re: Atom Content Negotiation (follow-up)

Erik Wilde <dret@berkeley.edu> Mon, 02 May 2011 16:15 UTC


hello james.

> With HTTP pipelining those multiple GETs shouldn't have that much of a
> negative impact, although I'll admit that in practice pipelining does
> have potential problems.
> But even without pipelining, after the first feed retrieval, I would
> still expect those multiple GETs to be more efficient on subsequent
> updates. Two small GETs to retrieve one new feed entry would seem more
> efficient than one giant GET of a 20 entry feed when you only want the
> latest entry (unless you're also supporting RFC3229+feed?)
> I guess this does depend to some extent on the nature of your data,
> the frequency of updates, and the kind of clients you expect to have.

of course it does. and please note that everything i propose is entirely 
optional. if you want to do it old-style, then that's fine. 
but in particular in the case of push with fat ping, there is a very 
substantial difference between being able to receive the data in the 
required format via push updates, or having to GET the alternate version 
of every single update.

> I still think you are going about it the wrong way. If the content-type
> for Atom doesn't sufficiently distinguish between the types of feed you
> want to serve for HTTP content negotiation to work, it seems to me you
> should be looking at ways to extend the content-type (like the type
> parameter that was proposed to distinguish between Atom entry documents
> and Atom feed documents). Inventing a new form of content negotiation
> that requires parsing links from the top of an Atom feed is just twisted.

i am not saying that i am proposing the perfect solution, and there are 
certainly different ways of approaching the problem i am looking at. it 
seems to me, however, that solving it in a cleaner way would require 
very substantial changes to the core of media types and HTTP content 
negotiation. it would be very nice to have HTTP content negotiation 
doing this, but unfortunately, atom sort of "hides" the "real content" 
(the /feed/entry/content) from visibility on the HTTP level.
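
to illustrate the visibility point, here is a minimal python sketch. it is 
not part of any proposal: the sample feed, its media types, and the 
function name entry_content_types are made up for illustration. an HTTP 
response carrying this feed would be labeled only with the container type 
(application/atom+xml), so the per-entry content types only become visible 
after parsing the body.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "{http://www.w3.org/2005/Atom}"

# hypothetical sample feed; on the HTTP level it would only be labeled
# with the container type, e.g. Content-Type: application/atom+xml
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>example</title>
  <id>urn:example:feed</id>
  <updated>2011-05-02T16:15:00Z</updated>
  <entry>
    <title>first</title>
    <id>urn:example:1</id>
    <updated>2011-05-02T16:15:00Z</updated>
    <content type="application/rdf+xml"><rdf:RDF
      xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"/></content>
  </entry>
  <entry>
    <title>second</title>
    <id>urn:example:2</id>
    <updated>2011-05-02T16:15:00Z</updated>
    <content type="html">&lt;p&gt;hello&lt;/p&gt;</content>
  </entry>
</feed>"""


def entry_content_types(feed_xml):
    """Return the type attribute of each /feed/entry/content element."""
    root = ET.fromstring(feed_xml)
    types = []
    for entry in root.findall(ATOM_NS + "entry"):
        content = entry.find(ATOM_NS + "content")
        if content is not None:
            # atom treats a missing type attribute as "text"
            types.append(content.get("type", "text"))
    return types


print(entry_content_types(SAMPLE_FEED))
```

an Accept header can match the container type, but it has no way to 
express a preference among the content types found inside the entries.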

> But the real problem may just be an inappropriate use of Atom as a
> general purpose container format. If you have a client that prefers
> content as RDF, why not give them a true RDF feed? Why RDF embedded in
> Atom? That kind of misses the point of RDF. Even worse, if a client
> prefers JSON, do you really think it makes sense to serve them that JSON
> as base-64 encoded blobs inside an XML container? I can assure you that
> nobody is going to thank you for that option.

saying that atom is inappropriate as a general purpose container format 
is your opinion. i happen to think that atom is well suited to exactly 
that role. afaik, RDF 
does not have feeds, and if i as a service designer choose to model my 
service in a RESTful way based on atom abstractions, then atom probably 
is a very good way to represent that. serving RDF content as well would 
just be a convenience feature for those who would prefer to get RDF 
representations of what my service exposes.

here is something i wrote a while ago about atom as a general container:

http://dret.typepad.com/dretblog/2009/05/atoms-future-as-a-generalpurpose-format.html

you might not agree with what i am saying here, but there are more and 
more services where feeds are used as the RESTful abstraction of the 
service model, and this is the area i am looking at for the proposal i 
have made.

> If you served RDF clients a real RDF feed, and JSON clients a real JSON
> format, then you could be using standard HTTP content negotiation - at
> least for most cases. If you wanted to provide an "HTML friendly" feed
> that was separate from the one with raw XML embedded, you'd still need a
> way to differentiate between the two, but at that point I would think it
> not worth the complication. I'd just make your feed content the raw XML,
> include a short summary, and then an alternate link to a more detailed
> HTML page if really necessary.

again, you're saying "real RDF feed" and "real JSON feed" and these are 
not things i am aware of. it might be interesting to think of RDF and 
JSON serializations of the atom data model, but this would actually be 
very hard to do well because of atom's openness and support of XML 
namespaces. it's a very different discussion anyway, but we are 
currently in the process of encoding atom as an RDF ontology, and i can 
tell you that it's surprisingly complicated.

so my assumption is to have atom XML as the feed container, so that there 
is only one representation of the feed abstraction that can be handled 
by intermediaries and value-added components (such as push frameworks), 
and then to make sure that various types of content can be transported 
in that framework in a well-defined and flexible way.

cheers,

dret.

-- 
erik wilde | mailto:dret@berkeley.edu  -  tel:+1-510-6432253 |
            | UC Berkeley  -  School of Information (ISchool) |
            | http://dret.net/netdret http://twitter.com/dret |