Re: [idn] Report from the ACE design team

Makoto Ishisone <ishisone@sra.co.jp> Mon, 25 June 2001 18:14 UTC

To: idn@ops.ietf.org
Subject: Re: [idn] Report from the ACE design team
From: Makoto Ishisone <ishisone@sra.co.jp>
In-Reply-To: <p0510032cb74fd055e43c@[165.227.249.18]>
References: <p0510032cb74fd055e43c@[165.227.249.18]>
X-Mailer: Mew version 1.94.1 on Emacs 20.7 / Mule 4.0 (HANANOEN)
Mime-Version: 1.0
Content-Type: Text/Plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Message-Id: <20010626013216L.ishisone@sra.co.jp>
Date: Tue, 26 Jun 2001 01:32:16 +0900
X-Dispatcher: imput version 20000228(IM140)
Lines: 55
Sender: owner-idn@ops.ietf.org
Precedence: bulk

In message <p0510032cb74fd055e43c@[165.227.249.18]>,
Paul Hoffman / IMC <phoffman@imc.org> wrote:
> Greetings again. This report from the ACE design team was turned in 
> yesterday and will appear in the official Internet Drafts directory 
> on Monday.
> ...

As draft-ietf-idn-ace-report-00 says, the current recommendation of the
ACE design team is DUDE.  However, as stated in the draft, the design
team has not yet come to a complete agreement on DUDE, and I'm one of
the members who are not convinced by the recommendation.

My opinion is certainly reflected in the draft, but I'd like to
explain here the reason why I think DUDE may not be the best choice.

What worries me about DUDE is its relative inefficiency for CJK
scripts.

DUDE's compression algorithm (variable-length differential encoding)
seems to work very efficiently when the code points of the characters
in a name are clustered in a small range.  This is the case for most
Western scripts.

However, for languages with a large number of characters (such as
CJK), the algorithm tends to work poorly.  In the worst case DUDE
encodes one Unicode character (in the Basic Multilingual Plane) as a
4-octet sequence.  This happens frequently for CJK Han or Hangul names
because the characters in these scripts are scattered across the
Unicode code point space.
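
To illustrate what "clustered" versus "scattered" means here, the
following Python sketch measures how far apart the code points of a
name lie.  It is not the DUDE algorithm and says nothing about DUDE's
actual output length; the function name and both sample names are
hypothetical, chosen only for illustration.

```python
def code_point_spread(name: str) -> int:
    """Distance between the lowest and highest code point in a name."""
    points = [ord(ch) for ch in name]
    return max(points) - min(points)

# Hypothetical sample names (not taken from any registry).
latin_name = "example"
han_name = "\u65e5\u672c\u8a9e"   # U+65E5 U+672C U+8A9E

for label, name in (("Latin", latin_name), ("Han", han_name)):
    print(label, hex(code_point_spread(name)))
```

The Latin sample stays within a few dozen code points, while the
three Han characters span several thousand.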

This means that in the worst case a name of 15 characters might not
fit into a 63-octet label (assuming a 4-octet prefix such as 'dq--'):
15 characters at 4 octets each need 60 octets, but only 59 remain
after the prefix.  We expect that DUDE can typically encode names of
up to 15 characters.
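
The worst-case budget can be checked with a few lines of Python.  This
is only a back-of-the-envelope sketch using the figures given above
(63-octet label, 4-octet prefix, 4 octets per character in the worst
case); the function name is made up for illustration and is not part
of any DUDE implementation.

```python
MAX_LABEL_OCTETS = 63  # DNS label length limit

def worst_case_capacity(prefix_octets: int, octets_per_char: int) -> int:
    """Characters guaranteed to fit if every one costs octets_per_char."""
    payload = MAX_LABEL_OCTETS - prefix_octets
    return payload // octets_per_char

# 4-octet prefix such as 'dq--', 4 octets per character (worst case)
print(worst_case_capacity(4, 4))
```

Under these assumptions only 14 characters are guaranteed to fit; a
15th fits only when at least one character encodes in fewer than
4 octets.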

The following points are my main concern:

1) Are 14-15 characters enough?
   At least for Japanese domain names, the name of a company or an
   organization is sometimes quite long.  My question is whether a
   maximum of 14-15 characters per name is enough for CJK.  If it
   is, DUDE would be fine.  But if it isn't, another ACE that is
   more efficient (in dealing with long names) but less simple might
   be better.

2) Potential migration problem
   Many NICs have already begun registering internationalized domain
   names using RACE as the ACE.  In RACE, any name of up to
   17 characters fits in a 63-octet label.  So it is possible that
   some of the registered names will suddenly become invalid when the
   migration from RACE to DUDE takes place.  Of course this is a risk
   that they have to take, but if choosing another ACE can prevent it,
   or lower the possibility, that might be a better choice.
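
A rough check of the 17-character figure for RACE, sketched in Python.
The layout here is an assumption, not something stated in this
message: a 4-octet prefix, then base32 (5 bits per output character)
over one compression-flag octet followed by two octets per character
when the name cannot be compressed.  The function name is
hypothetical.

```python
MAX_LABEL_OCTETS = 63
PREFIX_OCTETS = 4      # assumed 4-octet ACE prefix

def race_label_length(n_chars: int) -> int:
    """Label length for an uncompressible name, under the assumed layout."""
    raw_octets = 1 + 2 * n_chars          # flag octet + 2 octets per character
    base32_chars = -(-raw_octets * 8 // 5)  # ceiling of bits / 5
    return PREFIX_OCTETS + base32_chars

longest = max(n for n in range(1, 40)
              if race_label_length(n) <= MAX_LABEL_OCTETS)
print(longest, race_label_length(longest), race_label_length(longest + 1))
```

Under that assumed layout a 17-character name needs 60 octets and an
18-character name needs 64, so names of 15 to 17 characters are the
ones at risk in a move to an ACE with a 14-character worst case.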

						-- ishisone@sra.co.jp