Re: Reliable links [Was: Stab in the dark ]

"Daniel W. Connolly" <connolly@hal.com> Sat, 19 March 1994 00:42 UTC

Received: from ietf.nri.reston.va.us by IETF.CNRI.Reston.VA.US id aa15497; 18 Mar 94 19:42 EST
Received: from CNRI.RESTON.VA.US by IETF.CNRI.Reston.VA.US id aa15493; 18 Mar 94 19:42 EST
Received: from mocha.bunyip.com by CNRI.Reston.VA.US id aa24093; 18 Mar 94 19:42 EST
Received: by mocha.bunyip.com (5.65a/IDA-1.4.2b/CC-Guru-2b) id AA20695 on Fri, 18 Mar 94 16:06:30 -0500
Received: from hal.COM by mocha.bunyip.com with SMTP (5.65a/IDA-1.4.2b/CC-Guru-2b) id AA20691 (mail destined for /usr/lib/sendmail -odq -oi -furi-request uri-out) on Fri, 18 Mar 94 16:06:09 -0500
Received: from ulua.hal.com by hal.com (4.1/SMI-4.1.1) id AA14234; Fri, 18 Mar 94 13:05:44 PST
Received: from localhost by ulua.hal.com (4.1/SMI-4.1.2) id AA12042; Fri, 18 Mar 94 14:54:18 CST
Message-Id: <9403182054.AA12042@ulua.hal.com>
To: Tony Sanders <sanders@bsdi.com>
Cc: Multiple recipients of list <www-talk@www0.cern.ch>, Multiple recipients of list <uri@bunyip.com>
Subject: Re: Reliable links [Was: Stab in the dark ]
In-Reply-To: Your message of "Fri, 18 Mar 1994 14:20:50 CST." <199403182020.OAA15469@austin.BSDI.COM>
Date: Fri, 18 Mar 1994 14:54:17 -0600
Sender: ietf-archive-request@IETF.CNRI.Reston.VA.US
From: "Daniel W. Connolly" <connolly@hal.com>

In message <199403182020.OAA15469@austin.BSDI.COM>, Tony Sanders writes:
>"Daniel W. Connolly" writes:
>> But in either case, you can give the same url twice and there's no
>> mechanism to guarantee that you'll get the same thing back,
>This is true with a given URL but note the following from the HTTP spec
>where it talks about the URI: header:
>
>    However, it is guaranteed that if an object is successfully retrieved
>    using that URI it will be to a certain given degree the same object as
>    this one.  If the URI is used to refer to a set of variants, then the
>    dimensions in which the variants may differ must be given with the "vary"
>    parameter:
>
>    Syntax          URI: <uri>  [ ; vary = dimension [ , dimension ]* ]
>    dimension       content-type[12] | language[13] | version[14]
>
>    If no "vary" parameters are given, then the URI may not return anything
>    other than the same bit stream as this object.
>
>    Multiple occurrences of this field give alternative access names or
>
>I think this addresses a lot of the points you made but even more important
>it makes it clear that reliable references to bitstreams have been thought
>about.  However, *MOST* references should not be reliable in this fashion.
>For example, you almost always want a vary=language, vary=content-type.

Ah! So the issue has been addressed somewhere... but (1) the scope of
this mechanism is only HTTP -- I can't make reliable links to FTP
files, and (2) shouldn't the URI: header tell where this document is
on the various dimensions so that I can retrieve it again?

For example, suppose I ask:

	GET /foo/bar

and the server says:

	HTTP/1.0 200 Message follows
	URI: http://host/foo/bar ; vary=version

How do I make a reference to this document? (or what do I scribble in
my cache to uniquely identify this doc?) It needs to say something like:

	URI: http://host/foo/bar ; vary=version=1.0

so that I can write

	<A HREF="http://host/foo/bar" VERSION="1.0">
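To make the proposed header concrete, here is a minimal sketch of a parser for it. It assumes the "URI: <uri> [ ; vary = dimension [ , dimension ]* ]" syntax quoted above, extended with the hypothetical "dimension=value" form suggested here. The function name parse_uri_header is made up for illustration, and the sketch assumes the URL itself contains no ";".

```python
def parse_uri_header(value):
    """Split 'URL ; vary=dim[=val], dim[=val]' into (url, {dim: val or None}).

    The 'dim=val' form is the extension proposed in this message, not
    something the HTTP spec defines.
    """
    url, _, params = value.partition(";")
    dims = {}
    params = params.strip()
    if params:
        name, _, rest = params.partition("=")
        if name.strip().lower() != "vary":
            raise ValueError("expected a vary parameter")
        for item in rest.split(","):
            dim, _, val = item.partition("=")
            # A dimension with no value means "may vary here, position unknown".
            dims[dim.strip()] = val.strip() or None
    return url.strip(), dims
```

With a pinned value, the (url, dims) pair is enough to use as a cache key or to fill in the attributes of a link; with only "vary=version" the value comes back as None, which is exactly the case where a reliable reference cannot be made.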

I hashed this over with a friend last night, and we talked a lot about
what it would take to migrate documents around the net, something like
NNTP broadcasting or IP routing tables and such. We decided there
wasn't a clear scalable strategy, but for the case, we came up with
a workable solution. The GET request should say something like:

	"Give me any copy of /foo/bar dated March 1 thru March 30"

and proxy servers keep a "lifetime" for each document in the cache.
Some documents, FAQ postings for example, explicitly contain the
lifetime. The NCSA folks worked out a set of heuristics for other
types of documents.

Then, when the proxy server gets a "GET(doc, t0, t1)" request, it
looks up doc in its cache, and if the lifetime intersects [t0, t1],
the query is resolved. Else, it turns around and makes the request to
the original server (or some neighbors or some such...).
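The lookup described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the class name ProxyCache and the fetch_from_origin callback are hypothetical, a lifetime is modelled as a closed date interval, and the fallback goes straight to the origin rather than to neighbors.

```python
from datetime import date


class ProxyCache:
    """Toy cache answering GET(doc, t0, t1) from stored document lifetimes."""

    def __init__(self, fetch_from_origin):
        # doc -> (valid_from, valid_until, body)
        self._entries = {}
        # Hypothetical callback: doc -> (valid_from, valid_until, body)
        self._fetch = fetch_from_origin

    def get(self, doc, t0, t1):
        entry = self._entries.get(doc)
        if entry is not None:
            valid_from, valid_until, body = entry
            # Two closed intervals intersect iff each starts before the other ends.
            if valid_from <= t1 and t0 <= valid_until:
                return body
        # Miss, or the cached lifetime does not intersect [t0, t1]: ask the origin.
        valid_from, valid_until, body = self._fetch(doc)
        self._entries[doc] = (valid_from, valid_until, body)
        return body
```

A request such as cache.get("/foo/bar", date(1994, 3, 1), date(1994, 3, 30)) is answered from the cache whenever the stored lifetime overlaps that range, and otherwise falls through to the callback.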

This generalizes fairly well... things like CGI script results should
have very short lifetimes. RFC documents should have very long
lifetimes.

For the content-type dimension, the format negotiation algorithm in
HTTP works pretty well...

But in all these cases, I'd like to be able to put version, format,
language, etc. info in the reference itself, if I choose. For example,
I may know that
	ftp://foo.com/lksjfli4jlij43
is a PostScript file. But there's currently no way to express this.
And Mosaic, for example, will assume it's a plain text file.

Dan