RE: "so-called" keyword and layer 3

Nicolas Popp <nico@realnames.com> Wed, 05 December 2001 17:36 UTC

Date: Wed, 05 Dec 2001 09:31:36 -0800
From: Nicolas Popp <nico@realnames.com>
Subject: RE: "so-called" keyword and layer 3
To: 'John C Klensin' <klensin@jck.com>
Cc: ietf-irnss@lists.elistx.com
Message-id: <7FC3066C236FD511BC5900508BAC86FE4364BA@trestles.internal.realnames.com>

John.

Thanks for the clarification. I finally understand what a "keyword system"
means to you.

>(a)	{common name, country, language, service type},
> then it meets my criteria for a sublayer two system (although I'm
> still concerned about scaling).

So, just to confirm, the architecture of the RealNames system is indeed of
type (a), not (b) or (c) (hence, as you say, it meets your definition of a
layer 2 service).

> (i) I'm worried about scaling with them, and especially about
> creating yet another situation in which someone has to decide who
> is entitled ("has rights to", "has the best claim on", "most
> closely matches") some word or string.   In a way, that is
> another kind of economic constraint, but, if we can meet the
> technical and end-user requirements without having to implicitly
> write ICANN, WIPO, or the equivalent into the protocol, I think
> that is desirable.   I believe that the "no overseer" requirement
> is more easily satisfied with keywords at sublayer three than at
> two.

I see two different things in your comment:
1. scaling concern
2. regulation 

Could you clarify what you mean by "scaling concerns" (1)? I think you mean
name collisions, but I am not sure I understand, considering that the
uniqueness context (country, language, service type) is already fairly large
and should accommodate name collisions.
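
To make the uniqueness point concrete, here is a minimal sketch, assuming a
type (a) entry is keyed on the full tuple {common name, country, language,
service type}. The class names, sample names and URLs are hypothetical, not
taken from any deployed system.

```python
from typing import Dict, NamedTuple, Optional


class NameKey(NamedTuple):
    """Hypothetical key for a type (a) entry: one name string plus context."""
    common_name: str
    country: str
    language: str
    service_type: str


class Registry:
    """Hypothetical in-memory registry that enforces one entry per full tuple."""

    def __init__(self) -> None:
        self._entries: Dict[NameKey, str] = {}

    def register(self, key: NameKey, target: str) -> bool:
        # A collision occurs only when all four facets match an existing entry.
        if key in self._entries:
            return False
        self._entries[key] = target
        return True

    def resolve(self, key: NameKey) -> Optional[str]:
        # Exact-match lookup: no ranking, no partial matching.
        return self._entries.get(key)


registry = Registry()
us_key = NameKey("apple", "US", "en", "web")
fr_key = NameKey("apple", "FR", "fr", "web")

first = registry.register(us_key, "http://example.com/us")
second = registry.register(fr_key, "http://example.com/fr")
duplicate = registry.register(us_key, "http://example.org/other")
```

In this sketch the same common name is accepted twice because the context
differs, and only the exact repeat of all four facets is refused.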

As for (2), do you envision, as in Michael and Leslie's proposal,
multiple layer 2 services (registries and lookup systems), or do you envision
a unique layer 2 data set with multiple registrars? In other words, do you
see layer 2 standardizing the data structure / schema, or do you also see it
standardizing the data set? It seems to me that if you have multiple
competing services at layer 2, then overseers (like ICANN or WIPO) are not
necessary, and each service can decide what rules/policies it wants to use to
decide who is entitled to what name & facets.

-Nico

-----Original Message-----
From: John C Klensin [mailto:klensin@jck.com]
Sent: Tuesday, December 04, 2001 10:37 PM
To: Nicolas Popp
Cc: ietf-irnss@lists.elistx.com
Subject: RE: "so-called" keyword and layer 3


Just for calibration, and for whatever it is worth, I agree with
Nico's analysis, although not necessarily his conclusion (see
below).

Building on my earlier note, which I'm not going to repeat, two
observations:

* The white pages/ yellow pages analogy was in the document
because it seemed useful.  One of my inferences from Nico's note
is that it isn't as useful as I thought and that it might be
confusing things a bit.  Analogies are like that.

* I have been trying to push keyword systems out to sublayer
three of the model for two reasons, which should probably be in
the "dns search" document.  Next time.

(i) I'm worried about scaling with them, and especially about
creating yet another situation in which someone has to decide who
is entitled ("has rights to", "has the best claim on", "most
closely matches") some word or string.   In a way, that is
another kind of economic constraint, but, if we can meet the
technical and end-user requirements without having to implicitly
write ICANN, WIPO, or the equivalent into the protocol, I think
that is desirable.   I believe that the "no overseer" requirement
is more easily satisfied with keywords at sublayer three than at
two.

(ii) Many years ago, I spent time working on information
retrieval and automatic indexing systems.   The experience left
me quite afraid of the "is that a string, or something else, and,
if so, what" meta-question.   End users don't understand it (but,
worse, often think they do) and the systems turn out to be harder
to implement in an intelligent and consistent way than one would
like.

So, sublayer two is defined with the first facet as
"name-string".  It isn't "name-string, or unordered list of
keywords", or "phrase with proximity weighting among words", or
any of the other options.

If a "keyword system" is structured as 
 (a)	{common name, country, language, service type},
then it meets my criteria for a sublayer two system (although I'm
still concerned about scaling).  It is only when we have
 (b)	{{descriptive-keyword1, descriptive-keyword2,...},
	country, language, service type}
or
 (c)	{{common-name, descriptive-keyword1,
	descriptive-keyword2,...}, country, language, service
	type}
that I start getting anxious and pushing toward sublayer three.
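
As a rough illustration only, the three shapes could be written as data
types along the following lines. All class and field names are hypothetical,
and (c) is modelled with the common name held apart from the keyword set,
which is one possible reading of the notation above.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Context:
    """The facets shared by all three shapes."""
    country: str
    language: str
    service_type: str


@dataclass(frozen=True)
class ShapeA:
    """(a): exactly one name string plus the context facets."""
    common_name: str
    context: Context


@dataclass(frozen=True)
class ShapeB:
    """(b): an unordered set of descriptive keywords plus the context."""
    keywords: FrozenSet[str]
    context: Context


@dataclass(frozen=True)
class ShapeC:
    """(c): a common name mixed with descriptive keywords, plus the context."""
    common_name: str
    keywords: FrozenSet[str]
    context: Context


def fits_sublayer_two(entry: object) -> bool:
    # Assumption drawn from the text: only shape (a), whose first facet is a
    # single name string, meets the sublayer two criterion.
    return isinstance(entry, ShapeA)
```

The point of the sketch is that in (a) the first facet is a single string
that can be matched exactly, while in (b) and (c) it is a set whose matching
rules still have to be defined.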

But most of the keyword systems I've seen described seem to be
more like (b) or (c) than like (a).  And, in case it hasn't been
clear, I have been using terminology such as "so-called" because,
in the traditional terminology of information retrieval (at least
as I was taught it), only (b) and (c) contain "keyword systems".
Even they are not strictly keyword systems, because the additional
facets are arguably something else (e.g., the geographical
location facet described in "dns search" would presumably be a
set of numeric values from a continuous scale).  And (a) isn't a
keyword system at all, but a "name and additional facets" system,
or, if one prefers, an aliasing system (it was presumably no
accident that, historically, the company Nico works for is called
"RealNames", not "Real Keywords").

If it is useful from a marketing standpoint to call these things
"keywords" and "keyword systems", so be it.  But, here, a little
bit of precision about terminology may be helpful.

    john

--On Tuesday, 04 December, 2001 09:56 -0800 Nicolas Popp
<nico@realnames.com> wrote:

> 
> James. 
> 
> I think you are missing Mr Ko's point.
> 
> John asserts that Keyword Systems are best viewed as layer 3
> services. At the same time, all the Keyword System implementers
> are saying that this is not a faithful reflection of reality. In
> fact, what does "real-world deployment" experience tell us?
> It tells us the following:
> 
> 1.	None of the deployed Keyword Systems (AOL, Netpia, CNNIC,
> TWNIC, 3721, RealNames...) have been deployed as yellow page
> services (layer 3). The breadth of their data and registered
> metadata (today) actually makes them inferior directory
> solutions to the Yahoos or Looksmarts of the world (or layer
> 4 services like search engines). As for local information,
> besides country and language, they don't host any local
> information like that which John describes for layer 3 services.
> 
> 2.	On the other hand, ALL deployed Keyword Systems have been
> deployed for direct navigation (AOL, Netpia, CNNIC, TWNIC,
> 3721, RealNames...). In the context of direct navigation, a
> Keyword System unambiguously looks like a white page service
> with a strong uniqueness requirement (each tuple {common name,
> country, language, service type} is unique). That, in turn,
> does look like a layer 2 service to me from what I read in
> John's paper.
> 
> Now, you can argue that this is a choice of business model.
> However, that would be missing the larger picture, and the larger
> picture is: "how do you solve the chicken and egg problem of
> deploying a large-scale, high quality directory service on the
> network"?
> 
> The answer from all the Keyword Systems has been the same:
> first you lay an "egg". Laying an egg means you restrict the
> scope of the facets and build a layer two system with a focus on
> one differentiated application that destination directory
> services cannot compete with and that users want (direct
> navigation in all scripts with simple names that have no
> syntax). Once you have enough data, deployment and adoption,
> you build the "chicken" (you add more metadata and build a
> layer 3 service).
> 
> What I am saying is that a layer 2 service must come first and
> can then grow into differentiated directory services. Besides
> business models, that's why all Keyword Systems are layer 2
> systems. They have been trying to bootstrap layer two for 4
> years now. They are indeed layer two services. They could not
> be anything else.
> 
> -Nico