Re: New URC Specification is ready....
"Ronald E. Daniel" <rdaniel@acl.lanl.gov> Wed, 06 July 1994 22:23 UTC
Sender: ietf-archive-request@IETF.CNRI.Reston.VA.US
From: "Ronald E. Daniel" <rdaniel@acl.lanl.gov>
Date: Wed, 06 Jul 1994 11:31:09 -0600
Message-Id: <199407061731.LAA02219@idaknow.acl.lanl.gov>
To: ccoprmm@oit.gatech.edu
Subject: Re: New URC Specification is ready....
Cc: pays@faugeres.inria.fr, rdaniel@lanl.gov, uri@bunyip.com
Hi Michael,

Really good job on the URC spec. There are, of course, a few nits to
pick; please don't take the size of my response to mean that I think
poorly of your work. Overall it looks really good.

General comments:

I agree with pays@faugeres.inria.fr that the whois++ stuff should not be
in the final version of this document; it should be in a separate
document. I also agree with him(?) in not liking the precedence rules
approach. Reasons why are given below. I agree with you and disagree
with him(?) about preferring a single value for each attribute. Your
example of Author(s): is a perfect illustration.

Now, on to the specific comments.

> 2: Design Goals
> ===============
...
> o Simplicity: A URC specification must be simple enough for practically
>   anyone to understand or to encode. This allows users to encode and
>   maintain a given URC without the need for esoteric computer science
>   knowledge.

I agree that this is a *very* worthwhile goal. I don't know that the
precedence rules you define later meet this goal. Right now they are not
too bad. However, I can certainly think of other elements that could be
very useful and would require that proposed elements be changed to have
precedence rules. (An example? Abstract. Currently it is a text field.
However, since the best abstract for a picture is a thumbnail, the best
abstract for a movie is a trailer, etc., it is not too hard to imagine
making Abstract: into something with precedence so that the abstract can
have a content-type, length, URLs, etc.) As the system is used over the
next 15 years there will no doubt be lots of things added, and a desire
to change the operation of particular elements. I don't think implicit
precedence rules are going to age gracefully.

> o Compatibility: Since URCs will be utilized by vastly different systems
>   on vastly different networks it must be encoded in such a way as to
>   allow very complex systems to communication complex information
>   via very simple gateways and access methods.

Perhaps a mention of base-64 encoding, quoting rules, etc. is necessary?
Yuck. See what comes out of the URL discussion and see if it can be
adopted.

> o Use of existing and developing technology: In order to be able to
>   implement something soon, an encoding specification should allow
>   existing systems to be easily retrofitted to use URCs. The use of
>   existing systems that already support object similar to URCs is
>   encouraged.

A nice goal. I certainly don't think we should have gratuitous
differences with existing and developing systems (such as whois++).
However, I think there are fundamental requirements of the system that
must be addressed, and that whois++ does not currently handle. (Here we
go again :-)

The fundamental requirement that does not seem to have received adequate
consideration is, in a word, security. There has been no consideration
of how to prevent me from issuing URNs that apparently originate from
any publisher I choose. There has been no discussion of how to prevent
me from forging authorship of resources. There has been only the most
trivial discussion of ensuring the integrity of the resources I
retrieve. An MD5 field in the URC is nice, but if I can get into the URC
info I can provide an MD5 that is accurate for what you get when you
access my URL. Too bad that what is hiding at my URL is not what the
original author intended. I *strongly* believe we need to give security
serious consideration now, rather than try to hack it in later.

> It then becomes a simple
> exercise of selecting the equality character and specifying some method of
> encoding special situations such as character quoting and line continuation.

Nicely put.

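On the MD5 point above, a small Python sketch may make the concern concrete. It assumes a URC is just a record with a URL field and an MD5 field; the record layout, the example hosts, and the helper names md5_hex and verify are all made up for illustration and are not taken from the draft.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Hex MD5 digest of a resource body."""
    return hashlib.md5(data).hexdigest()


def verify(urc: dict, fetched: bytes) -> bool:
    """Client-side check: does the fetched resource match the URC's MD5?"""
    return urc["MD5"] == md5_hex(fetched)


original = b"the document the author published"
substitute = b"something else entirely"

# Honest record, as the publisher wrote it (hypothetical host).
urc = {"URL": "http://publisher.example/doc", "MD5": md5_hex(original)}

# Someone with write access to the URC info changes both fields together.
forged = {"URL": "http://attacker.example/doc", "MD5": md5_hex(substitute)}

honest_ok = verify(urc, original)               # untouched record, untouched resource
tamper_detected = not verify(urc, substitute)   # resource swapped, URC intact
forgery_passes = verify(forged, substitute)     # URC itself rewritten
```

The checksum catches a swapped resource only while the URC record itself is trustworthy; once the record can be rewritten, the substituted content verifies cleanly, which is why the record would need its own protection.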
As I mentioned earlier, some discussion of character quoting would not
be amiss. Perhaps just a statement that you are monitoring the character
handling discussions in other working groups and will adopt one of the
resulting approaches?

> Experimental attribute_names should be encoded with the
> [X-attribute_name] notation.

This reminded me of a purported bug in Mosaic - case sensitivity where
X-foo is different than x-foo. Should we specify right now that
attribute names are case insensitive? What does MIME say?

> There are no attribute/value pairs that are required to be a part of a URC.

Not even a URN? Seems mighty handy to require this so that when people
do searches on Author, URL, ... they end up with something they can do
more with.

> It is intended that any additions or subtractions from this list will be
> handled by the Uniform Resource Identifier Working Group. It is also
> intended that this list should be extended since the full usefulness of
> URCs is beyond the scope of these pairs listed.

We should also start a registry for the current types so that people
don't have to wade through the archives of the mailing list to find out
what is the current set of well-known attributes.

> o URL:
>     This pair must conform to the current Uniform Resource Locator
>     specification as defined in the URL Internet-Draft[Berners-Lee 94-1].
>
>     Example:
>
>     URL:http://www.gatech.edu/ietf/urc.encoding.html

What? No angle brackets? Aren't these responses text/plain? :-) More
seriously, should we get a MIME Content-type for the URC responses?
Might make dealing with them easier.

>     Since many cultures have different ways of writing names
>     there are no requirements on how a name should be written. Thus it is
>     encouraged that users encode names in the most common format i.e.
>     first, middle and last in English societies.

Nicely put.

> o TTL:
>     This pair encodes a Time To Live measured in seconds. Infinity is
>     denoted by the '+' character.
>     This element references the attribute/value
>     pair directly preceding it (see section 4) and is meant as a caching aid.
>
>     Example:
>
>     TTL:86400

This works well for the resources identified by a URN or URL. However,
the URC information for a resource is something that will also be cached
in order to avoid unnecessary expense in the URN->URL resolution. We
need a means of specifying the TTL for URC info, which will not be the
same as the time for the resource itself. Perhaps the TTL associated
with the URN tells the time to live for the URC info, while TTLs
associated with URLs tell the time to cache particular resources. e.g.:

    URN:IANA:foo:bar:123434523
    TTL:36000      // Cache URC info for 10 hours
    URL:http://www.bar.foo/huh.html
    TTL:+          // Cache the html until LRU rules kick it out.

BTW - what are the units in the TTL field? Seconds? Microseconds?

> o Abstract:
>     This pair encodes a short abstract about the given resource. Any
>     characters are allowed. Line continuation follows normal rules.
>
>     Example:
>
>     Abstract:
>     This document explores the various flight patterns and speeds of
>     unladden African and European swallows. A companion document concerning
>     the relative velocities of swallows ladden with coconuts is available.

Pretty soon we could imagine using thumbnails as abstracts of images,
trailers as abstracts for movies, etc. How about we change Abstract: to
be the URN for an abstracted version of the resource, and Text-Abstract
to be an ASCII description of the resource. (Of course, this doesn't
solve the problem of knowing what language the text description was
written in).

> o Version header field:
>     In order to give some ability to utilize different version schemes it is
>     recommended that the Version field be given the idea of schemas so
>     that machine based algorithms can be used to differentiate resources.
>     For this specification only one schema is given but more can be
>     developed.
>
>     o Schema 1: decimal This schema specifies the use of the
>       standard decimal type of version enumeration. For example,
>       this is version 1.0 of this document. At the authors or publishers
>       whim it can change to version 1.1 or even 2.0.
>
>       Example:
>
>       Version:decimal:1.0

Really good plan to specify the scheme. The whole issue of versioning
could stand a good deal of serious thought. We may want Supersedes: and
Superseded-By: fields to hold URNs, but that is a back-burner thing that
should go through an X- phase first.

> ... there must be some structure to a URC. The easiest and most elegant
                                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
> method is simply to introduce a set of precedence rules onto the above set of
> attribute/value pairs.

Well, this is a point where reasonable people may disagree. As I
mentioned earlier, I am concerned about how precedence rules might have
to change in the future. Explicit delimiters do not have that problem.
Some people are concerned about complexity with explicit delimiters;
that has not been my (admittedly limited) experience.

> As above, this set of precedence rules is extensible by the IETF URI Working
> Group.

We probably need a protocol version identifier in the URC so that the
client will know what precedence rules and attribute definitions to use.

>   o URNs have precedence over all other pairs, except for LIFNs.

OK.

>   o LIFNs are equal to URNs in precedence only.
>
>     An LIFN has many of the same characteristics of a URN. While there
>     is no current specification of exactly what a LIFN is or does this paper
      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>     will attempt to place them in the structure of a URC. This is definitely
>     up for discussion.

Well, I'm just an Okie, but if we don't know what a LIFN is or what it
is supposed to do, it seems kind of strange to be specifying a standard
field that clients have to interpret consistently.
How about we let people play around with X-LIFN, and assume that it has
no precedence rule?

>   o URLs have precedence over all other pairs except for URNs
>     and LIFNs until the occurrence of a new URL.

I can go with this.

>   o TTLs have no precedence over any other attribute/value pair
>     and therefore describe any element directly before it.

I am not comfortable with this for two reasons. The first is the need to
be able to put a TTL on the URC info. The second is that I think you are
being too restrictive with the placement of the TTL field. It, along
with things like Content-type and Content-length, should talk about the
*section* they are currently in. If we can elaborate on the notion of
sections we may be able to come up with a scheme that will generalize
over many extensions to the set of known elements. For example, divide
the fields into "Section-starting fields" and filler fields. Section
starting fields, in order of priority, are URN and URL. Sections can
nest, but cannot partially overlap. Therefore, the second URN: field in
the example below closes both the first URN section and the URL
subsection.

    URN:bla
    Author:foo
    URL:bar
    Content-type:text/html
    URL:baz
    Content-type:text/plain
    URN:bletch

>     Example:
>
>     URN:IANA:626:oit.5674
>     TTL: +
>
>     In this example the first TTL is not needed since a URN has an infinite
>     time to live. This one is simply used as an illustration.

I would recommend you take out the TTL: + field in the example above.
The later TTL field showed a legitimate use of it; this is not a good
use unless such a positioning is adopted as the TTL for the URC info.

> 4.1 Other possible elements and precedence rules
> ++++++++++++++++++++++++++++++++++++++++++++++++
...
>   o Collection:

I suppose we can go around and around on this again, but probably the
best thing to do is use X-Collection for a while, as well as
X-References and X-Related, and see which one(s) receive enough use to
promote to full status.

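The section idea from the TTL comments above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: URN and URL are taken to be the only section-starting fields, attribute names are treated as case insensitive (still an open question), and the names SECTION_PRIORITY, parse_urc and ttl_seconds are invented here rather than taken from any draft.

```python
import math

# Section-starting fields in order of priority (lower number = outer section).
SECTION_PRIORITY = {"URN": 0, "URL": 1}


def parse_urc(lines):
    """Return the top-level sections of a URC as nested dicts."""
    top = []     # sections with no enclosing section
    stack = []   # currently open sections, outermost first
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        name, _, value = line.partition(":")
        name, value = name.strip(), value.strip()
        key = name.upper()  # assumption: attribute names are case insensitive
        if key in SECTION_PRIORITY:
            # A new section closes every open section of equal or lower
            # priority, so sections nest but never partially overlap.
            while stack and (
                SECTION_PRIORITY[stack[-1]["kind"]] >= SECTION_PRIORITY[key]
            ):
                stack.pop()
            section = {"kind": key, "value": value, "fields": {}, "children": []}
            (stack[-1]["children"] if stack else top).append(section)
            stack.append(section)
        elif stack:
            # Filler fields (TTL, Content-type, ...) describe the section
            # they appear in, not just the pair directly before them.
            stack[-1]["fields"][name] = value
        else:
            raise ValueError("filler field outside any section: " + line)
    return top


def ttl_seconds(section):
    """TTL of a section in seconds: math.inf for '+', None if absent."""
    value = section["fields"].get("TTL")
    if value is None:
        return None
    return math.inf if value == "+" else int(value)


example = """\
URN:bla
Author:foo
URL:bar
Content-type:text/html
URL:baz
Content-type:text/plain
URN:bletch
"""

sections = parse_urc(example.splitlines())
```

With this reading, a TTL placed directly under a URN becomes the lifetime of the URC info, and a TTL under a URL becomes the lifetime of that particular copy of the resource, without any field-specific precedence rule.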
>   o Authoritative:
>
>     This pair will give the location of the authoritative URC server for the
>     given URN. This will serve as a pointer of last resort for the URC of
>     the given URN. This would require some method of being able to
>     identify a given URC database server.

I don't see this as being necessary or desirable. First of all, to ever
see it you have to have contacted *some* sort of URC service. My worry
is that people will use this as a quick hack rather than a last resort
and won't do a proper job on the distributed aspects of the URC service.

> Possible future precedence rules:
>
>   o Multiple URNs in the same URC denote simple relationship
>
>     This is simply used as a method for the URC server to return
>     additional URNs that it thinks may be of value to the client. This is
>     useful if the server can do link prediction. If a client can already have a
>     URC for a given URN cached then it doesn not have to do a network call for
>     that related resource.
>
>     Example:
>
>     URN:IANA:626:oit.5674
>     URL:http://www.gatech.edu/iiir/urc2.paper.html
>     URN:IANA:1:ietf-uri-002
>     URL:http://cnri.reston.va.us/internet-drafts/draft-ietf-uri-urn2urc.txt
>
>     This simply is the server's way of telling the client if the user is
>     interested in this resource that he/she may also be interested in the
>     other one.

Hmm, having multiple URNs in a URC is certainly something that will
happen because of the query language access that is needed for the URC
service. However, having that mean an implicit "Prefetch-URC" doesn't
seem a good idea. An explicit X-prefetch field is OK. I would rather
leave it up to the browser to do implicit prefetches of the URC info,
presumably by linearly scanning the current document the user is reading
and doing the prefetches from another thread. Really looks like a
browser-side efficiency hack to me, not an implicit meaning we want to
assign to a very common search result.

>   o Relationship operations denoted by special attribute/value pairs
>
>     Attribute/value pairs could be specified that allow different types of
>     precedence rules to apply in different instances. A Block: pair could
>     specify a set of values that describe a specific URL or URN without
>     interacting with the given external precedence rules. These block
>     pairs would have numbers assigned to denote block nesting.
>
>     Example:
>
>     URN:IANA:626:oit.5674
>     Authoritative:URL:whois://whois.gatech.edu:7070/template=urc
>     Block:1
>     URN:IANA:626:oit.5600
>     URN:IANA:626:oit.5601
>     URN:IANA:626:oit.5602
>     Block:1
>
>     This illustrates that the URNs in block number 1 also have the
>     given authoritative site as their authoritative URC server.

If you are going to do this, why not just use something like braces or
parentheses, have general grouping, and do away with the precedence
rules? Block seems to combine the worst of both worlds - you don't have
a flat structure and you still have fragile precedence rules. Also,
BLOCK and ENDBLOCK will let you do away with the digits, and should be
easier to parse.

> Currently there are several
> systems that could be retrofitted to handle URCs. One of the best suited
> services is the draft whois++ [Deutsch 94]. Whois++ is an extension to the
> trivial WHOIS service which allows servers to make more structure
> information available. Additions to the trivial WHOIS protocol allow for
> communication between whois++ servers so that information can be shared
> across collections of servers.
>
> The two primary advantages to using whois++ are that the data is structured
> in the same template format as URCs and that the distributed nature allows a
> search to start local and expand globally as required.

Whois++ probably makes a good basis for the system, but there are a few
places where (I think) retrofitting is needed. The first is in security;
the current draft I saw was not adequate in this area.
Second, I was trying to find the Internet draft on the architecture of
the Whois++ indexing a few days ago; it seemed to have been withdrawn or
not provided in the first place. Without seeing that, I don't know how
easy it is to change some of the "forward knowledge" that is used for
directing queries to remote servers. For a global system, I am concerned
about centroids and would like to see something that cached
"authoritative" servers for publishers in one area, and cached casual
URC info for various URNs in another area.

> Below are several
> sessions between some client and a whois++ server in which example URCs
> are given:

Nice examples. It would be nice to see something about how a server that
can answer a query is located. There is Mitra's suggestion of adding
.uri.int to the publisher ID. I am working on a scheme more like DNS and
hope to have something to post to the URI list in a few weeks.

Good work, Michael!

Ron Daniel Jr.                 email: rdaniel@acl.lanl.gov
Advanced Computing Lab         voice: (505) 665-0139
MS B-287 TA-3 Bldg. 2011       fax:   (505) 665-4939
Los Alamos National Lab        http://www.acl.lanl.gov/~rdaniel/Home.html
Los Alamos, NM, 87545          tautology: "Conformity is very popular"