Re: root knowledge

yeongw@spartacus.psi.com Wed, 13 May 1992 02:19 UTC

Message-Id: <9205130219.AA01128@spartacus.psi.com>
To: osi-ds@cs.ucl.ac.uk
Subject: Re: root knowledge
Cc: yeongw@psi.com
Reply-To: osi-ds@cs.ucl.ac.uk
In-Reply-To: Your message of Wed, 13 May 92 11:15:51 +1000. <9205130115.AA12015@squid.mel.dit.CSIRO.AU>
Date: Tue, 12 May 92 22:19:21 -0400
From: yeongw@spartacus.psi.com

> But, to be fair, I doubt that the hierarchical database
> was chosen lightly or without due thought for the consequences.

No, of course not. There are some very good reasons for the hierarchical
structure too, as you point out in your message. Sorry if my message
seemed overly negative. Fundamentally, I think X.500 is a great thing,
even though it has its problems.

[There. Can I please be readmitted to the society of X.500 admirers
 now? :-) :-)]

> X.500 was designed as a directory which, potentially, could store billions
> of entries ... [so how does one distribute knowledge and authority
> for naming?] The solution adopted by X.500 [to the problem of
> distributing knowledge and authority for naming] was the DIT hierarchy
> and the hierarchical distribution of this DIT amongst DSAs with
> appropriate linkages.

Right. The hierarchy provides a very easy way to distribute knowledge
and authority for naming, so the name resolution algorithm works
in a fairly obvious manner. But I don't think this does anything
for the searching problem I was referring to.

The problem is that the hierarchy isn't much help when you can't 'find'
things by 'bullet reading', i.e. by reading an entry whose name you
already know.

Maybe this is just a terminology problem, but it sounds like we're talking
about searching in two different contexts: I think you're talking about
'finding' an entry given its name, whereas I'm talking about 'finding'
an entry based on nondistinguished attribute values.
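
To make the distinction concrete, here is a minimal sketch in Python.
It is not anything from the X.500 documents: the toy DIT, the DSA names,
and the helpers `Entry`, `read` and `search` are all made up for
illustration. Under those assumptions, a 'read' by name follows a single
path down the tree, while a search on a nondistinguished attribute gets
no help from the hierarchy and has to visit every entry below the base.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    rdn: str                                   # relative distinguished name
    dsa: str                                   # hypothetical DSA holding the entry
    attrs: dict = field(default_factory=dict)  # nondistinguished attributes
    children: dict = field(default_factory=dict)

    def add(self, child):
        self.children[child.rdn] = child
        return child


def read(root, dn):
    """Name resolution: one step per RDN, so only one path is followed."""
    node, contacted = root, [root.dsa]
    for rdn in dn:
        node = node.children[rdn]
        if node.dsa != contacted[-1]:
            contacted.append(node.dsa)
    return node, contacted


def search(base, attr, value):
    """Attribute search: the tree gives no hint, so walk the whole subtree."""
    hits, contacted, stack = [], set(), [base]
    while stack:
        node = stack.pop()
        contacted.add(node.dsa)
        if node.attrs.get(attr) == value:
            hits.append(node)
        stack.extend(node.children.values())
    return hits, contacted


# Toy DIT (illustrative data only).
root = Entry("", "root-dsa")
us = root.add(Entry("c=US", "us-dsa"))
au = root.add(Entry("c=AU", "au-dsa"))
org_a = us.add(Entry("o=OrgA", "orga-dsa"))
org_b = au.add(Entry("o=OrgB", "orgb-dsa"))
org_a.add(Entry("cn=A", "orga-dsa", {"title": "engineer"}))
org_b.add(Entry("cn=B", "orgb-dsa", {"title": "engineer"}))

entry, path = read(root, ["c=US", "o=OrgA", "cn=A"])
hits, touched = search(root, "title", "engineer")
```

In this toy tree the read contacts three DSAs along one branch, whereas
the search from the root ends up contacting all five. A real DSA would
of course index its own entries, but that only helps within one DSA.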

> I found it interesting that you refer to the Internet domain name space and
> address space as an example of this problem. It is interesting because the
> DNS uses exactly the same solution as X.500 (or should I say X.500 uses the
> same solution as DNS :-). The DNS has a hierarchical tree and the information
> is stored in a hierarchically arranged system of servers. DNS has, of course,
> exactly the same problem.

Not exactly. Actually from my perspective, the DNS doesn't have this problem,
because the DNS really doesn't provide much beyond a 'read' operation.
So the hierarchy is perfectly reasonable there, because everything
is basically name resolution, which the hierarchy is structured specifically
to support. The problem I'm referring to only occurs with systems that
provide access to data by means other than 'read'.

> Remember that a 'DSA' is a logical construct.

Well, that's one way of looking at it. I tend to take the more
literal view that a DSA is a real, logically monolithic entity,
and that the DIB subset that it stores cannot be viewed as having
any logical structure outside of the X.500 system itself. What this
translates to is that each of the different entities that make
up a 'DSA' to you is a 'DSA' in its own right to me. Actually
if 'DMD' were an Information Framework term, your 'DSA' would be my 'DMD'
(of course it's not meaningful to talk about DMDs in terms of information
storage, but who cares, I've already been burnt at the stake once :-)).

But 'that which we call a rose by any other name is still as sweet'.
It's just terminology, and I hear you. Although, you'll notice that
if we both use 'DSA' to mean the same thing, we both made the exact
same statement (that local optimizations within a single physically
realized entity solve only half the problem, and there is a need
to solve the other half, either by means outside of X.500 -- your
'quite different methods' for data storage/replication -- or by
trying to work around the tree-limitation within X.500).

> One final point. I hope I haven't given the impression that I think that X.500
> is the be all and end all of distributed databases. It is not. It does have
> serious problems when used for certain applications. However, many of these
> problems are caused by the limitations imposed by the highly distributed
> requirements. When designing an application we must be aware of the strengths
> and weaknesses of X.500 and choose the right tool for the job. In many cases
> this will not be X.500. In many cases, though, it will and there are ways
> around some of the problems.
> 

Yes, we are violently agreeing, and saying essentially the same thing
from different perspectives.

I have to say though that I think you're underestimating the problems
caused by the hierarchy. Some of the limitations are due not so much
to something intrinsic to distributed databases, but to the fact that
X.500 (for all the good reasons you've outlined above) distributes 
along hierarchical lines.

Not that I could suggest any better way of doing the distribution,
mind you, without drastically increasing the complexity of the
technology (which really doesn't need to be even more complex,
I think :-) :-)).


Wengyik