RE: Transport requirements for DNS-like protocols

John C Klensin <klensin@jck.com> Fri, 28 June 2002 17:51 UTC

Date: Fri, 28 Jun 2002 13:51:31 -0400
From: John C Klensin <klensin@jck.com>
Subject: RE: Transport requirements for DNS-like protocols
In-reply-to: <7FC3066C236FD511BC5900508BAC86FED21D3B@trestles.internal.realnames.com>
To: Nicolas Popp <nico@realnames.com>
Cc: Rob Austein <sra@hactrn.net>, ietf-irnss@lists.elistx.com
Message-id: <91515694.1025272291@localhost>
MIME-version: 1.0
X-Mailer: Mulberry/3.0.0a3 (Win32)
Content-type: text/plain; charset="us-ascii"
Content-transfer-encoding: 7bit
Content-disposition: inline
References: <7FC3066C236FD511BC5900508BAC86FED21D3B@trestles.internal.realnames.com>

Nico,

A few observations here.   Let me stress that I'm trying to keep
an open mind on some of these issues, so I am less positive about
the positions I state than I might otherwise be.

--On Friday, 28 June, 2002 10:24 -0700 Nicolas Popp
<nico@realnames.com> wrote:

> On the client versus server discussion, I strongly disagree
> with any approach that requires pre-processing on the client,
> even localization / normalization type of processing. John has
> already emphasized some of the main advantages to putting the
> IRNSS smarts on the server side. I would add to that the issue
> of writing, maintaining and pushing new client code, which
> over time becomes a huge penalty that our design should
> minimize if not preclude, especially if we want to make IRNSS
> real in our lifetime (I would have hoped that the slowness of
> IDNA deployment had made it clear by now). 

I think this is exactly correct.   Doing things server-side also
reduces the need to negotiate with each client vendor
separately, and you can probably give the rest of us lessons on
the difficulties of such strategies.

That said, part of the language I've been trying to use has been
fairly careful to preserve the option of various mixed
strategies.   For example, as some of you know, I think that
some clients (like 3G cellphones) are going to end up doing IDNA
by sending whatever the user keys in to intermediate servers
that, in turn, do the nameprep and other IDNA operations, then
query the DNS.   I don't find anything in the
present IDNA spec that prohibits that strategy.    More
generally, I can imagine a search layer two implementation in
which a client, given a faceted query:

	(i) Passes the raw form of the query, whatever that
	means, off to a _generic_ server that canonicalizes it
	into standard form (whatever _that_ means) and returns
	the result.  Note that this server would not need access
	to the database of registered entries and so could be
	replicated and distributed as much as needed.
	
	(ii) Consults a cache database that is user-specific but
	possibly on some server somewhere rather than
	client-local, and returns either a result or a failure
	indication.
	
	(iii) If results are not in the cache, consults the
	IRNSS server using the canonical form and a full set of
	facets.
	
	(iv) Caches the results, by storing back to that
	user-specific cache.
	
	(v) Takes the URL and does its thing.

I think that is more or less the right order.  Verification of
signed data records (or equivalent) and authentication to the
cache might imply additional steps.  But the important thing is
that this is an "almost everything on the server" model without
"big load on the database servers".    Intuitively, it has a lot
of appeal.
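
As a rough illustration of steps (i) through (v), the flow can be
sketched as below.  This is only a sketch under stated assumptions:
the function names, the facet dictionary, the in-memory cache, and
the use of NFKC plus case folding as the "standard form" are all
hypothetical stand-ins, since none of them has been specified.

```python
import unicodedata
from typing import Callable, Dict, Optional, Tuple

CacheKey = Tuple[str, Tuple[Tuple[str, str], ...]]


def canonicalize(raw: str) -> str:
    """Step (i): stand-in for the generic canonicalization server.

    NFKC plus case folding is an assumption for illustration only,
    not a proposed IRNSS rule.  Note that it needs no access to the
    database of registered entries.
    """
    return unicodedata.normalize("NFKC", raw).casefold().strip()


def resolve(
    raw_query: str,
    facets: Dict[str, str],
    user_cache: Dict[CacheKey, str],
    irnss_lookup: Callable[[str, Dict[str, str]], Optional[str]],
) -> Optional[str]:
    """Return a URL for the query, or None if nothing was found."""
    canonical = canonicalize(raw_query)                 # step (i)
    key = (canonical, tuple(sorted(facets.items())))

    cached = user_cache.get(key)                        # step (ii)
    if cached is not None:
        return cached

    url = irnss_lookup(canonical, facets)               # step (iii)
    if url is not None:
        user_cache[key] = url                           # step (iv)
    return url                                          # step (v) is the caller's job
```

Only step (iii) touches the database servers; everything else can
run on generic or per-user machines.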

> Also note that the name-string pre-processing (normalization
> and so forth) which is CPU bound is relatively cheap on the
> server side (more importantly highly predictable so that you
> can adjust your number of servers to query rates). What is
> always much much much more costly is record lookup from the
> data store. As soon as you do fuzzy matching that forces you
> to retrieve multiple records and rank them, the operational
> complexity is increased ten-fold (and your query response time
> becomes way more unpredictable unless you do a few "right
> things"). At that point, the string pre-processing time
> becomes relatively irrelevant and you may as well go all the
> way and do everything on the server side. That way, your
> client libraries are trivial to develop and maintain across OS
> platforms, client applications, and devices.

Yes.  The question is whether we can take the load.  And, given
the advantages of highly-optimized database search engines, it
still might make sense to split things up a bit.   Another
advantage of the sort of split above is that the client has
access to _all_ of the forms of a name, the queries, and the
response.  For what I would consider a well-written client, that
implies the ability to return error messages that include the
user-provided name as well as more specific information.  I can
easily imagine other schemes requiring a good deal of complexity
to avoid returning the equivalent of 
    not found - <some ACE-ish gibberish>
and no other information.
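
To make the point about error messages concrete, here is a minimal
sketch of a client that keeps every form of the name and reports
them together.  The `NameForms` type, the helper names, and the
choice of case folding as the canonical form are hypothetical; the
ACE form is produced with Python's built-in "idna" codec purely as
an example of an ACE-style encoding.

```python
from dataclasses import dataclass


@dataclass
class NameForms:
    raw: str        # exactly what the user keyed in
    canonical: str  # assumed canonical form (illustrative only)
    ace: str        # ASCII-compatible encoding sent on the wire


def make_forms(raw: str) -> NameForms:
    """Derive all forms of a name while keeping the original."""
    canonical = raw.strip().casefold()
    ace = canonical.encode("idna").decode("ascii")
    return NameForms(raw=raw, canonical=canonical, ace=ace)


def not_found_message(forms: NameForms) -> str:
    """Build an error that leads with the name the user recognizes."""
    return (
        f'not found - "{forms.raw}" '
        f'(canonical form "{forms.canonical}", queried as {forms.ace})'
    )


if __name__ == "__main__":
    print(not_found_message(make_forms("Bücher")))
```

A scheme in which the client only ever sees the encoded form has
nothing but `forms.ace` to put in that message.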

> On the other hand, I agree that we should design IRNSS with
> server-side optimizations in mind (such as data set
> distribution, caching strategies and stateless servers...).
> 
> Lastly, operations wise, don't you think that an IRNSS name
> service will be closer to a search engine than DNS? In fact, I
> would anticipate that most of the traffic will still be repeat
> traffic using straight layer 1 identifiers.

Partially yes to the first -- it depends a bit on what you mean
by "search engine", and I would have said "standard database
search" instead.  To the second, I think also yes, which is why
really smart caching strategies are going to be very important.
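
As one small example of what a caching strategy for repeat layer 1
traffic might look like, the sketch below is a bounded
least-recently-used cache with hit and miss counters.  LRU is only
one possible policy, chosen here for illustration; the class and
its capacity parameter are hypothetical, and a real design would
also need expiry and invalidation.

```python
from collections import OrderedDict
from typing import Optional


class LRUCache:
    """Bounded cache that evicts the least recently used entry."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._entries: "OrderedDict[str, str]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, identifier: str) -> Optional[str]:
        if identifier in self._entries:
            # Mark as most recently used so repeat traffic stays cached.
            self._entries.move_to_end(identifier)
            self.hits += 1
            return self._entries[identifier]
        self.misses += 1
        return None

    def put(self, identifier: str, url: str) -> None:
        self._entries[identifier] = url
        self._entries.move_to_end(identifier)
        if len(self._entries) > self.capacity:
            # Drop the entry that has gone unused the longest.
            self._entries.popitem(last=False)
```

The hit and miss counters give a direct measure of how much repeat
traffic the cache is keeping away from the database servers.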

      john