Re: [idn] comments on IDNA-04

David Hopwood <david.hopwood@zetnet.co.uk> Tue, 20 November 2001 14:17 UTC

Message-ID: <3BFA3816.3063B762@zetnet.co.uk>
Date: Tue, 20 Nov 2001 11:01:42 +0000
From: David Hopwood <david.hopwood@zetnet.co.uk>
X-Mailer: Mozilla 4.7 [en] (WinNT; I)
X-Accept-Language: en-GB,en,fr-FR,fr,de-DE,de,ru
MIME-Version: 1.0
To: James Seng/Personal <jseng@pobox.org.sg>, idn@ops.ietf.org
Subject: Re: [idn] comments on IDNA-04
References: <022e01c1718d$4c0dba20$a4615ed3@jamessonyvaio>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
Sender: owner-idn@ops.ietf.org
Precedence: bulk

-----BEGIN PGP SIGNED MESSAGE-----

[quoting fixed]

James Seng/Personal wrote:
> draft-ietf-idn-idna-04.txt:
>
> # The ToASCII operation takes a sequence of Unicode code points and
> # transforms it into a sequence of code points in the ASCII range (0..7F).
> # The original sequence and the resulting sequence are equivalent host
> # labels.
> 
> # The ToUnicode operation takes a sequence of Unicode code points and
> # returns a sequence of Unicode code points. If the input sequence is a
> # host label in ACE form, then the result is an equivalent host label
> # that is not in ACE form, otherwise the original sequence is returned
> # unaltered.
> 
> Suggest defining ASCII Compatible Encoding (ACE) before using it,
> then s/ToASCII/ToACE/.

ToASCII doesn't convert to ACE. It converts to ASCII (in particular,
if the string contains only ASCII characters, the result will not be ACE).
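To illustrate the distinction, here is a minimal sketch using Python's built-in "idna" codec. Note the assumptions: that codec implements the later RFC 3490 rather than IDNA-04, and the "xn--" prefix it emits was chosen after this draft (IDNA-04 leaves the prefix unspecified). The helper names are hypothetical.

```python
# Assumption: "xn--" is the prefix from the later RFC 3490, not from IDNA-04.
ACE_PREFIX = "xn--"

def to_ascii_label(label: str) -> str:
    """Convert one host label to ASCII with Python's built-in "idna" codec."""
    return label.encode("idna").decode("ascii")

def is_ace(label: str) -> bool:
    """Check for the ACE prefix, matched case-insensitively."""
    return label.lower().startswith(ACE_PREFIX)

# An all-ASCII label comes back as plain ASCII, not as ACE.
ascii_result = to_ascii_label("example")
# A label with a non-ASCII character (u-umlaut) comes back in ACE form.
non_ascii_result = to_ascii_label("b\u00fccher")

print(ascii_result, is_ace(ascii_result))
print(non_ascii_result, is_ace(non_ascii_result))
```

The output of the operation is always ASCII, but it is ACE only when the input needed encoding, which is why the name ToASCII fits better than ToACE.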

> # ToASCII consists of the following steps:
> #
> #    1. If all code points in the sequence are in the ASCII range (0..7F)
> #       then skip to step 3.
> 
> Step 1 seems to be an optimization, not a required step.

It is required.

> The ACE prefix, used in the conversion operations (section 4), will
> be specified in a future revision of this document. It will be two
> alphanumeric ASCII characters followed by two hyphen-minuses. It MUST
> be recognized in a case-insensitive manner.
> 
> JS>> Suggest s/ACE prefix/ACE tag/. An "ACE tag" could be a uniquely
> defined prefix and/or suffix defined by IANA, and not necessarily in the
> form of xx--.

Is there any particular reason to reserve the possibility of this being
a suffix? The space of names with "xx--" prefixes has already been reserved,
in the sense of warning registrars that it may be used for IDN.

(Incidentally, does anyone know what happened to the process of reporting on
whether any such names had already been registered, as required by
<http://www.rfc-editor.org/internet-drafts/draft-ietf-idn-aceid-02.txt>?)

The only minor issue I can think of relating to this choice is that
using a prefix means that an ACE label can end with a digit, and using
a suffix means that it can begin with a digit. That might be relevant
for top-level IDN labels, since some existing software might only check
whether the last character is a digit in order to distinguish a domain name
from a dotted-decimal IPv4 address. (The correct test, implied but not
explicitly stated in RFC 1035, is to check whether the top label consists
*only* of ASCII digits.)
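The difference between the two tests can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, and "xx--ab9" is a made-up top-level label standing in for an ACE label that happens to end with a digit.

```python
# Explicit ASCII digit set: str.isdigit() would also accept non-ASCII digits.
ASCII_DIGITS = set("0123456789")

def strip_root_dot(name: str) -> str:
    """Drop a single trailing dot (the root label), if present."""
    return name[:-1] if name.endswith(".") else name

def naive_is_ipv4(name: str) -> bool:
    """Flawed shortcut: looks only at the last character of the name."""
    return strip_root_dot(name)[-1:] in ASCII_DIGITS

def top_label_is_numeric(name: str) -> bool:
    """Test described above: the top label consists only of ASCII digits."""
    label = strip_root_dot(name).rsplit(".", 1)[-1]
    return label != "" and all(ch in ASCII_DIGITS for ch in label)

for candidate in ("192.0.2.1", "www.example.com", "www.example.xx--ab9"):
    print(candidate, naive_is_ipv4(candidate), top_label_is_numeric(candidate))
```

Only the last candidate separates the two tests: the shortcut treats it as an address because it ends with "9", while the whole-label test treats it as a domain name.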

> # Internationalized host name data in zone files (as specified by section
> # 5 of RFC 1035) MUST be processed with ToASCII before it is entered in
> # the zone files.
> 
> I don't think we should make the mistake of defining the zone file
> format. There is no reason a zone file can't be in UTF-8, so long as
> ToASCII/ToACE is applied before using it.

I don't think this is a good way of handling zone files, but I do think
that the zone file format is entirely within the scope of an IDN proposal
(as a format for zone transfers and for interoperability between different
server implementations, not as the required representation of a zone).
This is how IDNA-04 handles it - by requiring ACE in zone files.

> # It is imperative that there be only one ASCII encoding for a particular
> # host name. ACE is an encoding for host name labels that use non-ASCII
> # characters. Thus, a primary master name server MUST NOT contain an
> # ACE-encoded label that decodes to an ASCII label. The ToASCII operation
> # assures that no such names are ever output from the operation.
> 
> An ACE label that decodes to an ASCII label should by definition be an
> invalid ACE. You don't want this to happen in any place where ACE is
> used, not just on a primary master name server.

I agree.
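A check of this kind might look like the sketch below. It rests on assumptions that are not in IDNA-04: the "xn--" prefix and the Punycode encoding were both settled later (RFC 3490 and RFC 3492), and Python's built-in "punycode" codec is used here only as a stand-in for whatever ACE is finally chosen. The function name is hypothetical.

```python
# Assumption: prefix and encoding come from the later RFCs 3490/3492.
ACE_PREFIX = "xn--"

def ace_decodes_to_ascii(label: str) -> bool:
    """Return True if the label has the ACE prefix yet decodes to pure ASCII.

    Such a label would be a second ASCII spelling of an ordinary ASCII
    label, so a validator could reject it wherever ACE is accepted.
    """
    # The prefix is recognized case-insensitively.
    if not label.lower().startswith(ACE_PREFIX):
        return False
    try:
        decoded = label[len(ACE_PREFIX):].encode("ascii").decode("punycode")
    except UnicodeError:
        # Malformed input is a different kind of error; not handled here.
        return False
    return all(ord(ch) < 0x80 for ch in decoded)

for candidate in ("xn--abc-", "XN--abc-", "xn--bcher-kva", "abc"):
    print(candidate, ace_decodes_to_ascii(candidate))
```

Here "xn--abc-" decodes to the plain ASCII label "abc" and is flagged, while "xn--bcher-kva" decodes to a label containing a non-ASCII character and is not.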

- -- 
David Hopwood <david.hopwood@zetnet.co.uk>

Home page & PGP public key: http://www.users.zetnet.co.uk/hopwood/
RSA 2048-bit; fingerprint 71 8E A6 23 0E D3 4C E5  0F 69 8C D4 FA 66 15 01
Nothing in this message is intended to be legally binding. If I revoke a
public key but refuse to specify why, it is because the private key has been
seized under the Regulation of Investigatory Powers Act; see www.fipr.org/rip


-----BEGIN PGP SIGNATURE-----
Version: 2.6.3i
Charset: noconv

iQEVAwUBO/o31zkCAxeYt5gVAQG+XQgAxfmx8cgwjlwG6KOHmOW8bbgl/xfW+YW0
5jtJNx/ctaWpMUrwU7ROwfgN0tIG2R6A3crsUJa583vQL6mpu+07fjfQW7W7c1F+
IFOJqKcLh1CUq8ruCTUopDUROE+TNjGGBt4uOTKAm4E/23g0QKwFxNSTpT7kbCbT
8nNso4TgzafOKPgQLvgRI4e3KEXW98yTJRdmAVZlVVFTMRt3M3V6gf9PRI6E46K9
flWHs/NPyxuyWd74RXjfK1oV5bqb99i6A5GfszZAl0LWuOqnUX7Rtn9rHzcVjf2i
KBK+kSLC20uKcYiru97TRPOjB+ifb1Vxg96zzgv+B//JY9Jm8H+zzA==
=NNVe
-----END PGP SIGNATURE-----