Keywords, direct navigation, and search layer 2 (was: RE: "so-called" keyword and layer 3)
John C Klensin <klensin@jck.com> Thu, 06 December 2001 17:18 UTC
Date: Thu, 06 Dec 2001 12:15:01 -0500
From: John C Klensin <klensin@jck.com>
Subject: Keywords, direct navigation, and search layer 2 (was: RE: "so-called" keyword and layer 3)
In-reply-to: <7FC3066C236FD511BC5900508BAC86FE4D7823@trestles.internal.realnames.com>
To: Yves Arrouye <yves@realnames.com>
Cc: ietf-irnss@lists.elistx.com
Message-id: <122895213.1007640901@P2>
MIME-version: 1.0
X-Mailer: Mulberry/2.1.1 (Win32)
Content-type: text/plain; charset="us-ascii"
Content-transfer-encoding: 7bit
Content-disposition: inline
References: <7FC3066C236FD511BC5900508BAC86FE4D7823@trestles.internal.realnames.com>
--On Thursday, 06 December, 2001 06:44 -0800 Yves Arrouye
<yves@realnames.com> wrote:

>> (i) I'm worried about scaling with them, and especially about
>> creating yet another situation in which someone has to decide
>> who is entitled ("has rights to", "has the best claim on",
>> "most closely matches") some word or string.  In a way, that
>> is another kind of economic constraint, but, if we can meet
>> the technical and end-user requirements without having to
>> implicitly write ICANN, WIPO, or the equivalent into the
>> protocol, I think that is desirable.  I believe that the "no
>> overseer" requirement is more easily satisfied with keywords
>> at sublayer three than at two.
>
> I am worried about having categories in the layer 2 for that
> reason. To me, industry categories implicitly mean WIPO. As a
> matter of fact, the authors of the SLS document explicitly
> refer to the Nice agreement:
>...

Oh, yes, indeed.  "dns search" explicitly refers to the Nice
agreement.

> I know that the introduction of category helps widen the space
> of potential common names for a given service type, but
> basically, it means IP lawyers still decide who can get what.

Oh, no.  That is the beauty of this model.  Let me try to review
it, since I gather "dns search" isn't clear enough:

In most countries, trademark "rights", and the resultant
trademark lawyers, are a fact of life.  No amount of wishing
will cause them to disappear.  And, for organizations doing
business internationally, WIPO is a fact of life for similar
reasons.  The idea is not to undo either, but to try to avoid
Internet-specific regulation or rules, or things that force the
Internet to be treated differently than anything else.

(I'm going to ignore "well-known marks" in what follows -- they
are a mess in and of themselves, but really don't change
anything but arguments about scope.)

When one registers a trademark in any country which is part of
the WIPO treaties (most of them) and, I've been told, in most of
the others, one has to identify the types of businesses or
services to which it applies.  Ultimately, that is done by
selecting business name categories from the nationally-accepted
list.  The Nice treaty list isn't definitive --some countries
don't use it in explicit form-- but it is the only
internationally-agreed-upon list, and it is pretty
representative of the genre.  That is a reality of trademark
registration; it has nothing to do with the Internet.

It is important to note that, at least in most countries, no one
tells a potential registrant what categories they can list.  A
longer list expands the scope of coverage and widens the range
of possible challenges (those tradeoffs are ordinary business
decisions) and, in some areas, may increase fees.  A too-broad
list may also lead to later challenges on the grounds that the
name isn't being used "that way", but that is dealt with in
established ways, too.  And again, this way of doing things
predates the Internet by many years and really has nothing to do
with it.

If one trademark holder tries to challenge another over the
name, or a trademark holder challenges a use of a similar name
as infringing, the first question is, more or less, "are they in
the same industry sector" and, again, these lists are very
important and we are stuck with them.

Application of this model in the faceted system differs from the
current DNS model in one absolutely essential way: WIPO's role
is passive with regard to any registration or set of
registrations.  They establish a list of category values -- a
restricted vocabulary if you want to look at it from an
information retrieval point of view-- or, more specifically, we
adopt a list they have already established.
And after that, they are out of the picture: no dispute
resolution mechanisms rooted in specific names, no reports about
how little green rocks are actually apples (ok, they haven't
done that, but it has felt that way).  Instead, someone comes to
a database provider and says "I want to be listed this way, with
a name string, a country, an industry code, etc".  And they can
say that several times if they want to, varying the value for
any of those facets.  If someone doesn't like it, they mount a
challenge using _conventional_ mechanisms: the database
providers are no more part of the problem than a newspaper would
be if it ran an advertisement for a company under a name that
was later challenged.

In case it isn't clear, WIPO's category-value-list specification
role is strictly limited to that particular facet, too.  The
other facets are Not Their Problem and should not be formally
visible to them.

Yes, WIPO staff (very senior staff) has reviewed, and agreed
with, the analysis that underlies the above, although they
(obviously) haven't looked at my summary of it in this note.

>> If a "keyword system" is structured as
>>    (a) {common name, country, language, service type},
>> then it meets my criteria for a sublayer two system (although
>>...
> I see keyword systems, from a direct navigation standpoint,
> as (a). The common name is the key (along with country,
> language, and service type) used to get an object descriptor
> that contains a set of facets. The fact that this descriptor
> may also hold other facets in order to help additional
> applications on top of the layer 2 lookup system, is not
> really relevant to the fact that it does behave properly with
> a set of key facets. We tried to explain that, and the
> importance of having these facets that are part of a key, in
> draft-arrouye-kls-00.txt.
> The fact that an implementation may
> want to add non-key facets to directly support higher level
> services is more of an implementation and ease of subscription
> issue than an acknowledgement that this implementation does
> not want to participate in a layer 2 lookup system.

Ok.  draft-arrouye-kls-00.txt is not as clear as it could be on
this point, just as, obviously now,
draft-klensin-dns-search-02.txt isn't clear enough.  I'd welcome
text from you, and will look at your draft again and try to
supply text.

To see if we finally understand each other, let me try to
restate your comment above into the language of "dns search"
(and the relevant information retrieval and classificatory
systems literature as I understand it):

The database(s) for search layer 2 are going to contain a full
set of facets.  One can "leave one out" by asserting that any
searches that involve that facet should always match, i.e., by
giving it a "matches everything" value.  I wouldn't recommend
it, but that is a business decision which we don't need to
resolve and the marketplace will figure out who is right.  And,
since uniqueness is not required in the database itself, I'm not
going to use the word "key": the database is not intrinsically
relational in normal form (although one might implement it that
way); keys are a function of search and retrieval strategies.

A search in that search layer can specify values for any
combination of facets that the searcher, or search-vendor, finds
appropriate.  Leaving one out is equivalent to "match anything
that happens to be there".  And the question of how much
fuzziness to permit is also a function of the search mechanism.
(I would love to see a search product as general as what I'm
saying here implies, but I don't expect to see one out of the
laboratory, nor would I expect it to work at scale or to be
economically viable.  But I could be wrong.)
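
The matching rules in the two paragraphs above can be sketched
roughly as follows.  This is only an illustration under stated
assumptions: the "*" marker, the function names, and the sample
records (including the "restaurants" and "software" industry
values, which are made up and are not Nice entries) do not come
from any draft.  Fuzziness is left out, so every comparison is an
exact match.

```python
# Hypothetical marker for a facet stored with a "matches everything" value.
ANY = "*"


def matches(record, query):
    """Return True if the record satisfies every facet named in the query.

    A facet omitted from the query is not checked at all ("match anything
    that happens to be there").  A facet stored as ANY always matches.
    """
    for facet, wanted in query.items():
        stored = record.get(facet, ANY)
        if stored != ANY and stored != wanted:
            return False
    return True


def search(database, query):
    """Return every matching record; uniqueness is not assumed."""
    return [record for record in database if matches(record, query)]


# Illustrative entries only; the same name string is listed three times.
DATABASE = [
    {"name-string": "Joe's Pizza", "geographic-location": "US",
     "language": "en", "industry-code": "restaurants"},
    {"name-string": "Joe's Pizza", "geographic-location": "IT",
     "language": "it", "industry-code": "restaurants"},
    {"name-string": "Joe's Pizza", "geographic-location": ANY,
     "language": "en", "industry-code": "software"},
]

if __name__ == "__main__":
    # Name only: all three listings come back.
    print(len(search(DATABASE, {"name-string": "Joe's Pizza"})))
    # Adding a country narrows the result, but the ANY listing still matches.
    print(len(search(DATABASE, {"name-string": "Joe's Pizza",
                                "geographic-location": "US"})))
```

Returning a list rather than a single record is deliberate here:
since the database does not require uniqueness, how many results
come back depends entirely on which facets the search supplies.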

To invent a plausible notation for talking about this (but just
a notation, not a norm), we might talk about a search as
specifying (since we have readers of this list who may not be
familiar with ABNF, I'm going to use "pure" BNF for the
notational syntax):

   <search> ::= "{" <facet-tuple-list> <referral-range> "}"
   <facet-tuple-list> ::= "{" <facet-tuple> [<facet-tuple-list>] "}"
   <facet-tuple> ::= <facet-name> <facet-value> <distance-indicator>

The (vague) semantics for the things that may not be obvious,
are:

   "distance-indicator" specifies the degree of fuzziness
   permitted.

   "referral-range" specifies how far to go down a referral (or
   "search the next database") chain if a match is not found in
   the initial database search.  I'd guess it would be expressed
   in a hop-count TTL, but haven't worked all of the cases
   through.

In both cases, I'm not sure that the value can really be
expressed as an integer or real scalar, but let's try to keep at
least the example simple.

Now, in that language, your "direct navigation" keyword lookup
process (and key), as I now understand it, might be expressed
as:

   {{ name-string "common name" 0 }
    { geographic-location "country" 0 }
    { language "language-id" 0 }
    { industry-code "service type" 0 }
    0 }

Where the first four null values indicate "exact match" (i.e.,
no distance permitted between the search value and the database
value) and the last one would indicate "no referrals" (i.e., if
it isn't found in your database, quit and return "not found").
The latter is, I believe, necessary to enforce/preserve your
particular brand of keyword-based uniqueness.

Aside: Where your system and this may get into trouble is that
this model assumes that the geographic-location, language, and
industry-code facets will have values based on established,
consensus-standardized, lists from which the database entries
are merely choices.  Search vendors don't get to make up either
the facet names or the names of the category values.
Without that constraint, we have a bigger mess than the LDAP
one, with everyone essentially selecting their own schema and
values.  In particular, if your "service type" isn't isomorphic
with the WIPO/Nice list (or whatever else is chosen), you will
need a mapping function, such that the <facet-tuple> element
becomes

   { industry-code MapToNice ("service type") 0 }

I don't see a problem with doing that, and suspect your going
through your service-types and checking them against the Nice
list might be intellectually interesting (and that it would
ultimately provide value to your customers).

Now, I hope obviously, the important business issue here is
whether generalizing that to the point that users can specify
the additional facets, or different degrees of fuzziness, or
different referral properties, makes sense.  UIs are hard, users
like things simple, and general solutions don't have a good
history in the marketplace, often for precisely those reasons.
So permitting only an exact-match {common name, country,
language, service type} mechanism to be exposed may make much
more sense than doing something more general.  But the general
_model_ is important, if only because I can prove it is scalable
while I believe that, in the last analysis, your system is still
subject to what we have come to call the "Joe's Pizza" problem
-- far less quickly than "joes.com" or "joes-pizza.com" would
be, but the potential is there.

     john
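
For anyone who wants to experiment with the notation, here is a
minimal sketch of the direct-navigation search and the MapToNice
step written as Python data structures.  All of it is assumption
rather than specification: the class and function names are
invented for illustration, the rendering follows the bracing of
the worked example (each facet tuple in its own braces), and the
mapping table holds made-up placeholder strings, not actual Nice
classes.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class FacetTuple:
    name: str          # facet-name, e.g. "name-string"
    value: str         # facet-value
    distance: int = 0  # distance-indicator; 0 means exact match

    def render(self) -> str:
        return f'{{ {self.name} "{self.value}" {self.distance} }}'


@dataclass(frozen=True)
class Search:
    tuples: Tuple[FacetTuple, ...]
    referral_range: int = 0  # 0 means "no referrals"

    def render(self) -> str:
        body = " ".join(t.render() for t in self.tuples)
        return f"{{{body} {self.referral_range} }}"


# Placeholder table: a vendor's own service types mapped to invented
# stand-ins for entries on a standardized industry list.
PLACEHOLDER_NICE_TABLE = {
    "restaurant": "nice-placeholder-food-service",
    "bookshop": "nice-placeholder-retail",
}


def map_to_nice(service_type: str) -> str:
    """Stand-in for MapToNice; refuses values it cannot map."""
    try:
        return PLACEHOLDER_NICE_TABLE[service_type]
    except KeyError:
        raise ValueError(f"no industry-code mapping for {service_type!r}") from None


def direct_navigation(common_name: str, country: str,
                      language: str, service_type: str) -> Search:
    """Exact match on all four facets, with no referrals."""
    return Search(
        tuples=(
            FacetTuple("name-string", common_name),
            FacetTuple("geographic-location", country),
            FacetTuple("language", language),
            FacetTuple("industry-code", map_to_nice(service_type)),
        ),
        referral_range=0,
    )


if __name__ == "__main__":
    print(direct_navigation("Joe's Pizza", "US", "en", "restaurant").render())
```

Raising an error for an unmapped service type, rather than
passing the vendor's own value through, is one way to respect the
constraint that category values come from an agreed list; a real
system might well choose differently.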