X.500, SR/Z39.50 and networked information retrieval

Karen Rosin Sollins <sollins@lcs.mit.edu> Thu, 19 December 1991 17:12 UTC

Date: Thu, 19 Dec 1991 11:56:37 -0500
Message-Id: <9112191656.AA01204@PTT.LCS.MIT.EDU>
From: Karen Rosin Sollins <sollins@lcs.mit.edu>
Sender: sollins@ALLSPICE.LCS.MIT.EDU
To: bajan@mocha.cc.mcgill.ca
Cc: S.Kille@cs.ucl.ac.uk, Jill.Foster@newcastle.ac.uk, emv@cic.net, disi, wais-talk@quake.think.com, nic-interest@cicm.net, osi-ds@cs.ucl.ac.uk, rare-wg3-usis@newcastle.ac.uk, peterd@expresso.cc.mcgill.ca, yeongw@psi.com, clw, timbl@nxoc01.cern.ch, p.barker@cs.ucl.ac.uk, ghb@concert.net, bajan@cc.mcgill.ca, AWG@se.sunet, STGEORGE@bootes.unm.edu, jkrey@isi.edu, map@lcs.mit.edu, sollins@lcs.mit.edu
In-Reply-To: bajan@mocha.cc.mcgill.ca's message of Wed, 18 Dec 1991 17:27:58 -0500 <9112182227.AA12327@mocha.cc.mcgill.ca>
Subject: X.500, SR/Z39.50 and networked information retrieval

I am very interested in living documents and believe that there are
some problems not yet addressed by WAIS, X.500 and Prospero.  Also,
the problem is larger than simply information retrieval; it includes
access but also more (for example, maintenance, monitoring, access
control, and cost recovery - see below).  The research we are now
initiating, the Information Mesh, takes as a starting point several
assumptions or goals that are particularly relevant to living
documents:

1) The future of networks will be to provide the connectivity (links)
among nodes of information.

2) No single organization (e.g., a single hierarchy) will suffice to
organize the connectivity of all such nodes of information.

3) The links will contain names that are not necessarily user friendly
- humans won't see these names.  These names serve two functions: the
ability to test for equality and the ability to locate the node at the
other end of a link.

4) The names in links must be independent of ownership, storage
facility, storage organization, location service, and any other
semantics besides the uniqueness of the node.  This is important
because nodes and links are expected both to outlive all of these and
to be mobile.  We are assuming that one wants to allow for lifetimes
of at least 20 years, and the lifetimes that libraries offer (a
promise to "archive" for 100 years is standard) must be taken
seriously.  In this time, the original owner of a node, the storage
device, or even the storage organization (such as the Unix file
system) may become outdated and be replaced.  But if the node still
exists and someone has a link to it, it ought to be findable without
having to revise the contents of old links.

5) Types play a significant role here - both the types of nodes and
the types of links.  In this area we still have much to learn, but
there are some things to say even now.  The name in a link (that
unique id) is unrelated to either the type of the link or the type of
the node.  On the other hand, typing is relevant to the next point
below about regions, and how to declare and use them.  In addition, we
must be able to handle typing with enormous flexibility, since we must
be able to handle not only existing typed information, but future type
systems as they evolve.  It is our belief that most of the importance
of typing is in the applications themselves, and that the
infrastructure may not need to have much of an understanding of type
systems.

6) Links must be able to relate regions of multidimensional nodes to
each other.  In terms of documents, a simple example is the inclusion
of a quote from one document in another.  The whole quote might be
included and linked to the region of the bit-map of the newspaper
page from which it was taken.  Following the link from anywhere
within the quote would then present the whole page of the newspaper,
so that one can see the other articles placed near the quote.

7) In this world we must be prepared for a great many nodes.  We are
currently estimating 10**18, just to have something large to aim for.

8) In order to make all this work, a general form of location
architecture is needed as well.  This must accommodate unknown and
future location services.  The architecture will also include
a hinting mechanism to help find useful location services.

9) The whole architecture must be prepared for the possibility of
failure.  In a system of the size and scope we are discussing, even if
the destination node exists, it is possible that one can't find it.
Two possible reasons for this are partitioning, and an inability to
find a location service that knows about the node in question.  Thus,
a node may be temporarily unfindable.

10) We realize that in order to make such an infrastructure useful, it
will be necessary to provide an extensible set of tools.  These will
support such things as replication, a variety of location services,
organization, and perhaps such facilities as security and charging
services.  (We are also working on protocols for charging services and
mechanisms to off-load security mechanisms from end-hosts as somewhat
separate projects at present.)
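
To make points 3, 4, 6, 8 and 9 a little more concrete, here is a
rough Python sketch.  It is only an illustration under stated
assumptions, not a design for the Information Mesh: every name in it
(NodeName, Region, Link, Status, resolve, the service labels and the
location strings) is hypothetical, and a random UUID merely stands in
for "a unique id with no other semantics".

```python
import uuid
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict, Optional, Tuple


@dataclass(frozen=True)
class NodeName:
    """Opaque unique id: usable for equality tests and lookup only (points 3, 4)."""
    value: str

    @staticmethod
    def fresh() -> "NodeName":
        # Assumption: a random UUID encodes no owner, host, or storage layout.
        return NodeName(uuid.uuid4().hex)


@dataclass(frozen=True)
class Region:
    """Hypothetical rectangular region of a node, e.g. part of a page bit-map (point 6)."""
    x: int
    y: int
    width: int
    height: int


@dataclass(frozen=True)
class Link:
    target: NodeName
    region: Optional[Region] = None   # None means the whole node
    hints: Tuple[str, ...] = ()       # labels of location services to try first (point 8)


class Status(Enum):
    FOUND = "found"
    UNFINDABLE_FOR_NOW = "unfindable for now"  # point 9: not a claim of non-existence


# A location service maps a name to a current location, or None if it does
# not know the node.  It may raise ConnectionError to model a partition.
LocationService = Callable[[NodeName], Optional[str]]


def resolve(link: Link,
            services: Dict[str, LocationService]) -> Tuple[Status, Optional[str]]:
    """Try hinted services first, then all others; tolerate failures."""
    ordered = [label for label in link.hints if label in services]
    ordered += [label for label in services if label not in ordered]
    for label in ordered:
        try:
            location = services[label](link.target)
        except ConnectionError:
            continue  # partitioned away; another service may still answer
        if location is not None:
            return Status.FOUND, location
    return Status.UNFINDABLE_FOR_NOW, None


# Usage: the node moves between stores, yet the old link is never rewritten.
page = NodeName.fresh()
link = Link(page, Region(10, 20, 300, 80), hints=("archive",))
archive = {page: "store-A/page-3"}
print(resolve(link, {"archive": archive.get}))
archive[page] = "store-B/object-17"   # storage replaced; same name, same link
print(resolve(link, {"archive": archive.get}))
```

The sketch returns "unfindable for now" rather than "not found"
because, as in point 9, a failed lookup does not show that the node
has ceased to exist.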

Well, I could go on and on about this as you may have guessed, but
should stop for now.  I am incredibly enthusiastic about having real,
live applications for our Information Mesh.  In fact, only yesterday I
was talking with Mike Patton, who had come back from the IETF with a
report that the NOC Tools people were having problems with living
documents, and that there clearly were other such groups as well.

I have talked with Brewster Kahle (WAIS) about some of these problems,
but haven't had a chance to talk with Steve Hardcastle-Kille (QUIPU)
or Cliff Neuman (Prospero) or the others on this list about them.
Count me in on a BOF or whatever having to do with this area, travel
schedule permitting.  In any case, I would definitely like to know as
much more about the problems of living documents as I can learn.  I
therefore solicit comments either to me personally or to this (these)
group(s).  I will summarize if I get lots that is not sent to the
whole group (after New Year's).

			Karen Sollins
			sollins@lcs.mit.edu