Re: URCs and URL resolution
Paul Francis <francis@cactus.slab.ntt.jp> Thu, 29 September 1994 05:40 UTC
Received: from ietf.nri.reston.va.us by IETF.CNRI.Reston.VA.US id aa21914; 29 Sep 94 1:40 EDT
Received: from CNRI.Reston.VA.US by IETF.CNRI.Reston.VA.US id aa21910; 29 Sep 94 1:40 EDT
Received: from mocha.Bunyip.Com by CNRI.Reston.VA.US id aa26134; 29 Sep 94 1:40 EDT
Received: by mocha.bunyip.com (5.65a/IDA-1.4.2b/CC-Guru-2b) id AA28374 on Thu, 29 Sep 94 00:10:57 -0400
Received: from mail.ntt.jp by mocha.bunyip.com with SMTP (5.65a/IDA-1.4.2b/CC-Guru-2b) id AA28370 (mail destined for /usr/lib/sendmail -odq -oi -furi-request uri-out) on Thu, 29 Sep 94 00:10:52 -0400
Received: by mail.core.ntt.jp (8.6.9/COREMAIL.4); Thu, 29 Sep 1994 13:10:16 +0900
Received: by slab.ntt.jp (8.6.9/core-slab.s5+) id NAA12388; Thu, 29 Sep 1994 13:10:15 +0900
Received: by cactus.slab.ntt.jp (4.1/core*slab.s5) id AA03599; Thu, 29 Sep 94 13:10:03 JST
Date: Thu, 29 Sep 1994 13:10:03 -0000
Sender: ietf-archive-request@IETF.CNRI.Reston.VA.US
From: Paul Francis <francis@cactus.slab.ntt.jp>
Message-Id: <9409290410.AA03599@cactus.slab.ntt.jp>
To: Michael.Mealling@oit.gatech.edu, wade@cs.utk.edu
Subject: Re: URCs and URL resolution
Cc: moore@cs.utk.edu, sgreen@cs.utk.edu, uri@bunyip.com
> >
> > The other--for URL resolution would use a very lightweight
> > RPC or RPC-like mechanism.
>
> I've been toying with a lightweight version of whois++ that is basically
> a stripped version of the protocol that runs under udp. It has no support
> for the HOLD constraint and doesn't support centroids or fancy return
> types. So basically your UDP packet would contain:
>
> template=URC;URN=bla:match=exact
>

I see there is a lot of talk on this list about using whois++ as the service for both "URC" and "URN" based searches. (To be clear, by a URN-based search I mean one where the query contains the URN and the answer contains the URC/URL. By a URC-based search I mean the more general case where the query contains some descriptive information, such as keywords, and the answer contains the URC/URN.)

I want to insert a word of caution concerning easy assumptions about what whois++ can do and when it will be able to do it.

First, concerning the "easy" problem---URN-based searches (well, easy compared to URC-based searches). As near as I can tell from my reading of the whois++ doc draft-ietf-wnils-whois-03.txt, whois++ forward knowledge is currently unable to encode the information needed to process the URN-based search given in the example of Michael's paper draft-ietf-mealling-urc-spec-00.txt:

    % 220 Enter search string or type 'help' for help.
    template=urn;URN=IANA:623:oit:cs:ftp-and-telnet

This query contains URN=<string>. As near as I can tell, whois++ forward knowledge works on a per-attribute basis, not a part-of-an-attribute basis. However, for the search in the above example to scale, the (hierarchical) URN attribute must itself be parsed (i.e., the IANA server sends the query to the 623 server, that server sends it to the oit server, etc.).

One fix to this, I suppose, is to have multiple attributes URNL0, URNL1, URNL2, ..., where:

    URNL0 = IANA
    URNL1 = 623
    URNL2 = oit
    etc.

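To make the per-level idea concrete, here is a minimal sketch of how a hierarchical URN could be split into URNL0, URNL1, ... attributes, along with the prefixes that successive servers would handle. The function names and the assumption that ":" is the only level separator are illustrative, not anything defined in the whois++ or URC drafts.

```python
def urn_to_level_attributes(urn: str, sep: str = ":") -> dict:
    """Split a hierarchical URN into per-level attributes URNL0, URNL1, ...

    Assumption: every occurrence of `sep` marks a level boundary.
    """
    return {f"URNL{i}": part for i, part in enumerate(urn.split(sep))}


def delegation_prefixes(urn: str, sep: str = ":") -> list:
    """List the URN prefixes that successive servers would be responsible for.

    The first entry corresponds to the top-level server, the last to the
    server that finally holds the record (hypothetical delegation model).
    """
    parts = urn.split(sep)
    return [sep.join(parts[: i + 1]) for i in range(len(parts))]


if __name__ == "__main__":
    example = "IANA:623:oit:cs:ftp-and-telnet"
    for name, value in urn_to_level_attributes(example).items():
        print(f"{name} = {value}")
    for hop, prefix in enumerate(delegation_prefixes(example)):
        print(f"hop {hop}: {prefix}")
```

The split itself is trivial. The open question is how forward knowledge would have to be defined over such attributes so that each server can pick the next hop.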
I don't know the pros and cons of this approach versus some other approach (offhand it looks ugly), but the fact that I can't find any specific information about how whois++ would work as a hierarchical URN lookup service makes me think that whois++ has a long way to go.

Of course, strictly speaking, the above "fix" requires no "changes" to whois++ at all---it just requires that the right attributes be defined. But defining the right attributes is the hard part of the problem, since that is where the real functionality comes from. So the simple fact that whois++ is capable of encoding the right information (because it is a general type-value kind of system) doesn't mean the problem is solved, or even easily solved.

The use of whois++ for a more general URC-based lookup is much more vague. I have yet to see a cogent argument (or much of any argument) as to how centroids are expected to scale in terms of BOTH the number of queries AND the amount of memory needed.

If the hierarchy is deep, then the amount of information in the index servers at the upper parts of the hierarchy is reasonable---basically the size of the vocabulary times a relatively small factor (the number of index servers at the next level down). But the number of resources that each index server represents is enormous, meaning that almost all queries would go to almost all index servers, so scaling by number of queries is bad.

If, on the other hand, the hierarchy is shallow, then the amount of information represented by each index server can be made smaller, so in theory the queries can be spread to only a few of them. But since the amount of memory needed by the servers at the top scales with the number of index servers at the next level down, the amount of memory (and updating of said memory) at the top must be large.

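The deep-versus-shallow trade-off above can be put into rough numbers. The sketch below assumes a perfectly balanced tree and uses the estimate from the text: memory at the top is about the vocabulary size times the number of index servers one level down. The vocabulary size, fan-outs, and depths are made-up values for illustration only, not measurements of any whois++ deployment.

```python
from dataclasses import dataclass


@dataclass
class Hierarchy:
    """A balanced index-server tree (simplifying assumption)."""
    fan_out: int  # index servers directly below each server
    depth: int    # number of levels below the top server

    def leaf_servers(self) -> int:
        """Total number of servers at the bottom of the tree."""
        return self.fan_out ** self.depth

    def top_level_entries(self, vocabulary: int) -> int:
        """Rough memory at the top: vocabulary size times fan-out."""
        return vocabulary * self.fan_out

    def leaves_behind_each_child(self) -> int:
        """Leaf servers summarized by one child of the top server.

        The larger this is, the more likely any given query term appears
        somewhere below that child, so the more queries get forwarded to it.
        """
        return self.fan_out ** (self.depth - 1)


if __name__ == "__main__":
    vocabulary = 100_000  # hypothetical number of distinct terms
    for label, h in [("deep", Hierarchy(fan_out=10, depth=6)),
                     ("shallow", Hierarchy(fan_out=1000, depth=2))]:
        print(label,
              "leaves:", h.leaf_servers(),
              "top entries:", h.top_level_entries(vocabulary),
              "leaves per child:", h.leaves_behind_each_child())
```

Both shapes cover the same number of leaf servers. The deep tree keeps the top small but puts a large share of all resources behind every child, while the shallow tree does the reverse. This says nothing about where an acceptable middle ground lies.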
In addition, the performance of centroids (for instance, in terms of the percentage of queries that go to machines that don't have a resource that satisfies the query) is highly dependent on what information goes where. It may be possible to fine-tune the hierarchy to a high degree, and it may even be possible to find an acceptable middle ground between number of queries and amount of memory. But I haven't seen any good description of how this might be done, or even of what the characteristics of centroids are in general. (I've seen "anecdotal" examples, but not a general description of characteristics.)

Well, this message seems to be more about whois++ than UR*, but I guess most of the whois++ readership is on this list too, so I won't cross-post it.

PF