Re: Transport requirements for DNS-like protocols

John C Klensin <klensin@jck.com> Fri, 28 June 2002 13:30 UTC

Date: Fri, 28 Jun 2002 09:29:13 -0400
From: John C Klensin <klensin@jck.com>
Subject: Re: Transport requirements for DNS-like protocols
In-reply-to: <15430645.1025273478@localhost>
To: Patrik Fältström <paf@cisco.com>, Rob Austein <sra@hactrn.net>, ietf-irnss@lists.elistx.com
Message-id: <75776902.1025256553@localhost>
MIME-version: 1.0
X-Mailer: Mulberry/3.0.0a3 (Win32)
Content-type: text/plain; charset="iso-8859-1"
Content-transfer-encoding: quoted-printable
Content-disposition: inline
References: <199812050411.UAA00462@daffy.ee.lbl.gov> <vern@ee.lbl.gov> <20020628034133.D50C518AC@thrintun.hactrn.net> <15430645.1025273478@localhost>

(Note to readers of this list other than Patrik and Rob...
Patrik has raised, in conjunction with this "transport
requirements" discussion, a rather fundamental design issue.
The explanation below reflects what I believe about the issue.
It is thinking that very strongly influences the "dns-search"
framework.  It may also be wrong: people who think it is, or
might be, should try to speak up, since, if it is, resolving it
now and looking for alternatives will save us huge amounts of
time and energy later.  And, if the argument isn't clear,
whether you agree with me or with a much more extreme form of
Patrik's position than I think he is taking, please ask now...
this may be _very_ important.)

--On Friday, 28 June, 2002 14:11 +0200 Patrik Fältström
<paf@cisco.com> wrote:

> --On 2002-06-27 23.41 -0400 Rob Austein <sra@hactrn.net> wrote:
> 
>> It's not a big deal for the server to perform the entire
>> query operation again.
> 
> ...given the matching operation is cheap. I.e. what you don't
> say explicitly is that we should use as much as possible of
> the CPU on the client side.
> 
> This is one of the reasons why, when I came up with IDNA, I
> wanted the client to do the normalization etc., so the server
> can do bitwise comparison, which makes handling of hash tables
> easier.
> 
> I am very nervous when people come up with protocols where the
> client can pass whatever constraints to the server, and
> request the server to do complicated things. Things which are
> easy when you have a MySQL database with 100 records, but,
> when you have more data in the database than what fits in
> primary memory, then you lose.
> 
>    paf -- with tons of experience regarding "hitting the wall"
>    regarding implementation of databases etc

Yes, but...  If one can keep the operations sufficiently simple
(e.g., no profiles and few or none of the constraints you are
concerned about), there are very significant advantages to
server-side operations, particularly to interoperability,
getting things right, and being able to do upgrades/changes in
a rational way. Based on some experience with very large
databases involving a high ratio of queries to updates, I think
one can go well beyond bitwise comparison and still have that be
true, at least given reasonable database design.

To be specific, I have a substantive concern about IDNA which I
have not wanted to raise in the "IDN" context because I don't
see an alternative there.  There is, as we know, some
controversy about the various normalizations and mappings.  Even
those who believe that they represent reasonable tradeoffs and
are globally correct are likely to find cases that they believe
should be different for local use (e.g., should U+00F6 [ö] and
U+00F8 [ø] map together if one knows the user is in Sweden or
Norway?).  I don't think there is ever going to be a
universally-agreed "right" answer to this sort of question,
given adequate localization.  Of course, for a German user
working in Norway, the answer is clearly "no".  

I fear, and expect, that we will see client implementations of
IDNA/Nameprep that are "improved for use in ...", i.e., that use
slightly different mappings for locally-important characters.
And that is the road to subtle and hard-to-detect
non-interoperability, one that may ultimately require exhibiting
the ACE form to users as a disambiguator.  If, by contrast, the
functions were based on servers that explicitly had to serve the
global Internet community, the incentives for non-interoperable
implementations, and the ability to implement them without being
noticed, would be much less.
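
To make that failure mode concrete, here is a minimal sketch in
Python.  Everything in it is an assumption for illustration:
standard_prepare() is only a stand-in for the agreed mapping (real
Nameprep does considerably more), localized_prepare() is a
hypothetical client "improved for use in Norway", and the registry
is a toy exact-match table of the kind Patrik describes.

```python
import unicodedata

def standard_prepare(label: str) -> str:
    """Stand-in for the globally agreed mapping (assumption:
    NFKC plus case folding; real Nameprep does more)."""
    return unicodedata.normalize("NFKC", label).casefold()

def localized_prepare(label: str) -> str:
    """Hypothetical client 'improved' for local use: it also
    folds U+00F6 into U+00F8 before sending the query."""
    return standard_prepare(label).replace("\u00f6", "\u00f8")

# Toy server: exact (bitwise) match on the already-prepared key.
registry = {standard_prepare("B\u00f6rse"): "192.0.2.1"}

def lookup(prepared_label: str):
    return registry.get(prepared_label)

typed = "B\u00f6rse"
print(lookup(standard_prepare(typed)))   # conforming client
print(lookup(localized_prepare(typed)))  # "improved" client
```

Both clients accept the same input from the user, but only the
conforming one finds the entry.  The server cannot tell that the
second client applied a different mapping; it simply sees a
different key.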

The alternative here is to be very explicit about localization
and what is localized and what is not.  Unless the servers
themselves are localized (basically impossible with the DNS,
although it is possible, as we know, to fake it for particular
subtrees), it seems to me to be a reasonable model to assume
that global services and databases, and the associated mappings,
are server-side, that localization is more of a client function,
and we had best figure out how to make it work that way.

Coming back to the U+00F6 / U+00F8 case, I can easily imagine
reasonable client-side localizations that would map one way from
Norwegian and the other way from Swedish, given knowledge of the
language of the user and that of the database entry.  I would
see the UIs for that sort of function making the
global/canonical form readily available to users, and users
becoming aware that, typing convenience aside, they are better
off using the canonical form in global interchange.  One could,
in principle, do that in IDNA as well, but the IDNA (and other
IDN WG) documents have been (probably properly) very quiet on
processing and presentation recommendations during the various
pre-Unicode coding, mapping, and canonicalization steps.  And,
with everything client-side, there may not be enough incentive
to get this right.
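
A rough sketch of what such a client-side localization might look
like, again in Python and again hypothetical throughout: the
language tags "sv", "no" and "de", the LOCAL_MAPPINGS table, and
the localize() function are all invented for this illustration and
come from no IDN document.

```python
O_DIAERESIS = "\u00f6"  # o with diaeresis
O_STROKE = "\u00f8"     # o with stroke

# Hypothetical local mappings, keyed by (language of the user,
# language of the database entry).  Pairs not listed get none.
LOCAL_MAPPINGS = {
    ("sv", "no"): {O_DIAERESIS: O_STROKE},
    ("no", "sv"): {O_STROKE: O_DIAERESIS},
}

def localize(typed: str, user_lang: str, entry_lang: str):
    """Return (canonical_form, was_changed) so that a UI can
    show the user the form that is actually sent."""
    mapping = LOCAL_MAPPINGS.get((user_lang, entry_lang), {})
    canonical = typed.translate(str.maketrans(mapping))
    return canonical, canonical != typed

print(localize("\u00f6l", "sv", "no"))  # Swedish user, Norwegian entry
print(localize("\u00f8l", "no", "sv"))  # Norwegian user, Swedish entry
print(localize("\u00f6l", "de", "no"))  # German user: no mapping applied
```

The second element of the result is what would let a UI make the
global/canonical form visible whenever a local mapping was applied.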

Because it is bound to the DNS, which has no semantics to
support anything but global functions (in
internationalization or anything else), IDNA will almost
certainly never be a completely adequate solution since it moves
global normalization and matching functions to the client,
confusing them with localization issues.  That doesn't make it
"wrong" -- I still believe it is as good as, or better than, any
other approach given the constraints -- but it becomes part of
the justification for an "above DNS" approach to these problems.

Let's not lose one of the advantages of such an approach by
shifting functionality and processing to the client side just
because it will save some CPU, at least without doing a very
careful analysis of what is actually involved.

    john