Document representation and optimized retrieval

Thomas Johannsen <> Fri, 17 April 1992 08:13 UTC

Date: Fri, 17 Apr 1992 15:57:48 -0000
From: Thomas Johannsen <>
Subject: Document representation and optimized retrieval

There was recently a discussion on this list about document representation
and retrieval. The following may be of interest to you:

AIC Systems Laboratories, Tohoku University and WIDE (all in Japan) are
running a project (the Soft Pages Project) that deals with representing and
retrieving fileserver information by means of the X.500 Directory.

The aim is to tell users not only which fileservers hold the files they
want but also which of those sites is the most cost-effective (from the
network management point of view) to retrieve the file from (by ftp, etc.).
Optimization can be done for speed, cost, traffic load and so on. In short,
users will get comprehensive support for file retrieval.
The project covers two aspects. The first is the Directory representation
of the contents of fileservers. We have done this in a manner similar to
that of W. Yeong (see his entries at o=Internet@ou=FTP Archives).

The second aspect is the storage of network configuration information,
which is (in our opinion) a very useful application of the Directory and can
be used by a multitude of applications. An image of the network (in future,
the aim is an image of the Internet) in terms of trunk lines and (major)
nodes/gateways is represented in the Directory. Line and node objects carry
attributes (such as speed, cost and delay) that allow network connections
to be evaluated and compared.
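
As a rough illustration only (this is not the project's Directory schema,
which is not given in this message), line and node objects with such
attributes could be modelled as below. The class names, field names, units
and node names are all assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str        # hypothetical name of a major node/gateway
    delay_ms: float  # assumed per-node forwarding delay


@dataclass(frozen=True)
class Line:
    a: str             # node name at one end of the trunk line
    b: str             # node name at the other end
    speed_kbps: float  # assumed unit for line speed
    cost: float        # assumed abstract cost units
    delay_ms: float    # assumed propagation delay


def path_metrics(lines, nodes=()):
    """Summarize a connection built from trunk lines and transit nodes."""
    if not lines:
        raise ValueError("a connection needs at least one line")
    return {
        # The slowest line limits the throughput of the whole connection.
        "bottleneck_kbps": min(line.speed_kbps for line in lines),
        "total_cost": sum(line.cost for line in lines),
        "total_delay_ms": sum(line.delay_ms for line in lines)
        + sum(node.delay_ms for node in nodes),
    }


# Two hypothetical connections from the same starting point.
gateway = Node("national-gw", delay_ms=5.0)
local = Line("user-lan", "national-gw", speed_kbps=192.0, cost=1.0, delay_ms=20.0)
overseas = Line("national-gw", "overseas-gw", speed_kbps=64.0, cost=10.0, delay_ms=300.0)

domestic_path = path_metrics([local])
overseas_path = path_metrics([local, overseas], nodes=[gateway])
```

Comparing `domestic_path` with `overseas_path` field by field is the kind of
evaluation the attributes are meant to support; how the individual figures
are weighted against each other is left open here.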

The Soft Pages Project uses both of the above to point users to a
cost-effective ftp site for the file they are looking for.
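
A minimal sketch of such a site selection follows, assuming the network
image has been read into an in-memory graph whose edges carry a single cost
figure. This message does not say which algorithm Soft Pages uses; Dijkstra's
shortest-path algorithm is substituted here as a stand-in, and the function
names, node names and cost values are hypothetical.

```python
import heapq


def cheapest_costs(graph, source):
    """Dijkstra: least total cost from source to every reachable node."""
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry, a cheaper route was already found
        for neighbour, line_cost in graph.get(node, {}).items():
            new_cost = cost + line_cost
            if new_cost < best.get(neighbour, float("inf")):
                best[neighbour] = new_cost
                heapq.heappush(queue, (new_cost, neighbour))
    return best


def pick_site(graph, user_node, sites):
    """Return the reachable site with the lowest path cost, or None."""
    costs = cheapest_costs(graph, user_node)
    reachable = [site for site in sites if site in costs]
    if not reachable:
        return None
    return min(reachable, key=lambda site: costs[site])


# Hypothetical topology; each trunk line is listed in both directions.
graph = {
    "user-lan": {"national-gw": 1.0},
    "national-gw": {"user-lan": 1.0, "domestic-ftp": 2.0, "overseas-gw": 10.0},
    "domestic-ftp": {"national-gw": 2.0},
    "overseas-gw": {"national-gw": 10.0, "overseas-ftp": 1.0},
    "overseas-ftp": {"overseas-gw": 1.0},
}

# Sites assumed to hold a copy of the sought file.
sites = ["overseas-ftp", "domestic-ftp"]
chosen = pick_site(graph, "user-lan", sites)
```

With these made-up costs the domestic copy is chosen over the overseas one.
Optimizing for speed or traffic load instead would only change which
attribute is used as the edge weight.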

A lot of costly bandwidth, particularly on overseas links, is used for ftp.
We believe that much of it would be freed if local alternatives for file
retrieval were used more often, because files exist in multiple copies not
only all over the world but even within one country or organization. The
problem we face today is that even users who are aware that their file
retrieval is inefficient lack knowledge of the alternatives.

The document "Optimizing Document Retrieval by Using X.500 Directory
 * Soft Pages Project" gives a detailed description. The ASCII version can
be obtained from me by e-mail, and we will put a PostScript version on
an anonymous ftp server soon.

Any comments are welcome. Please either post to this list or e-mail me.
There is also a mailing list '' discussing Soft Pages issues.
To join, send mail to ''.


|   Thomas Johannsen                     Internet:   | 
|   AIC Systems Lab.                     BITNET: JOHANNSE at DDDTU1   | 
|   Sendai (Japan)      Tel: +81 22 279 3310   Fax: +81 22 279 3640   |