Re: [idn] Overspecifications in draft-ietf-idn-requirements-08

"James Seng/Personal" <jseng@pobox.org.sg> Fri, 02 November 2001 17:36 UTC

Received: from psg.com (exim@psg.com [147.28.0.62]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id MAA12636 for <idn-archive@lists.ietf.org>; Fri, 2 Nov 2001 12:36:09 -0500 (EST)
Received: from lserv by psg.com with local (Exim 3.33 #1) id 15zi5J-000Hhz-00 for idn-data@psg.com; Fri, 02 Nov 2001 09:26:25 -0800
Received: from smtp22.singnet.com.sg ([165.21.101.202]) by psg.com with esmtp (Exim 3.33 #1) id 15zi5I-000Hhr-00 for idn@ops.ietf.org; Fri, 02 Nov 2001 09:26:24 -0800
Received: from jamessonyvaio (bb159-238.singnet.com.sg [165.21.159.238]) by smtp22.singnet.com.sg (8.11.6/8.11.6) with SMTP id fA2HTtD19755; Sat, 3 Nov 2001 01:29:55 +0800
Message-ID: <01e001c163c3$768f9b60$0201000a@jamessonyvaio>
From: James Seng/Personal <jseng@pobox.org.sg>
To: David Hopwood <david.hopwood@zetnet.co.uk>, idn@ops.ietf.org
References: <3BC28B99.E4E7CFB0@zetnet.co.uk>
Subject: Re: [idn] Overspecifications in draft-ietf-idn-requirements-08
Date: Sat, 03 Nov 2001 01:26:09 +0800
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 5.50.4807.1700
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4807.1700
Sender: owner-idn@ops.ietf.org
Precedence: bulk

David asked me to comment on his suggestions. Sorry for not being able
to do so earlier; I have been busy with other WG work.

Some general comments:

1. It is important for requirements to be simple and clear. The
difficulty lies in striking the right balance between too much detail
(i.e. too restrictive) and too little information (i.e. too vague).

Therefore, it is always possible to "improve" a requirements doc. What
we need to ask ourselves is whether a change moves us beyond this
balance.

2. When the requirements were drafted, we worked on the premise that
*any* solution (client or server, application or infrastructure, any
character set or multiple character sets, any TES, any encoding, etc.)
is possible so long as it meets these requirements.

OTOH, your comments make one basic assumption: that IDNA-NAMEPREP-ACE
is the solution. If the IDN protocol is IDNA-NAMEPREP-ACE, then
obviously a lot of your comments would be right, but that is hindsight.

3. Incidentally, the suggestion (not yours, David) that the
requirements are biased towards IDNA is groundless. The requirements
are biased towards *any* solution that satisfies them. (I could argue
that the requirements are biased towards UDNS too.)

4. Where I have not commented, I don't find the changes significant
enough to warrant a separate comment.

> > 6. A transfer encoding syntax (TES) is a reversible transform of
> >    encoded data which may (or may not) include textual data
> >    represented in one or more character encoding schemes. Examples:
> >    8bit, Quoted-Printable, BASE64, UTF-7 (defunct), UTF-5, and RACE.
>
> This definition is never used.

ACE is a form of TES.
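
To make "reversible transform" concrete, here is a minimal sketch in
Python. It assumes Punycode (the `punycode` codec shipped with Python)
as a stand-in for "some ACE"; it is not one of the encodings listed in
the draft, and the sample label is made up for illustration.

```python
# Illustration only: Punycode stands in for "some ACE". The draft's own
# examples (UTF-5, RACE) are different encodings.
label = "b\u00fccher"            # "bücher", contains one non-ASCII code point

ace = label.encode("punycode")   # ASCII-only octets
restored = ace.decode("punycode")

print(ace)                       # the ASCII-compatible form
print(restored == label)         # the transform is reversible
```

The only property that matters here is that the encoded form is plain
ASCII and that decoding gives back the original string.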

> > HTTP use the old service, it is a matter of great concern how the
> > new and old services work together, and how other protocols can
> > take advantage of the new service.
>
> IDN is not a new service; it makes more sense to consider it as an
> extension of all the existing services. For example, in IDNA, the
> existing IP-to-hostname service can return an (ACE-encoded) IDN, or a
> non-IDN query can follow a DNAME record that points to an IDN. These
> cases wouldn't be possible if IDN was a separate service.

IDN is not a new service only if you assume it is IDNA.

There are other proposals which make it a new service, e.g. IDNE, UDNS.

> > [1] The DNS is essential to the entire Internet. Therefore, the
> > service MUST NOT damage present DNS protocol interoperability. It
> > MUST make the minimum number of changes to existing protocols on
> > all layers of the stack.
>
> Requiring the "minimum number of changes" fails to consider the cost
> or feasibility of any change; it is requiring an absolute, which is
> always a bad idea.
>
> > It MUST continue to allow any system anywhere that implements
> > the IDN specification to resolve any internationalized domain name.
>
> "continue to" should be deleted. Obviously no system can resolve an
> IDN at the moment.

Sounds reasonable to me.

> > [3] The DNS protocol (the packet formats that go on the wire) MUST
> > NOT limit the codepoints that can be used. A service defined on top
> > of the DNS, for instance the IDN-to-address function, MAY limit the
> > codepoints that can be used. The service descriptions MUST describe
> > what limitations are imposed.
>
> The packet formats that go on the wire use octet strings, not strings
> of codepoints. In order to maintain compatibility with the
> requirements of RFC 2181, it is the set of octet strings that must
> not be limited.

But the "string of codepoints" does get sent over the wire. Of course,
it is not sent across as-is, but I18N codepoints can seldom be sent
across as-is.
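
A small Python sketch of the distinction being argued here: the same
name seen as a string of codepoints and as the octets that would
actually travel. UTF-8 is chosen only as an example encoding, and the
sample name is made up; neither is something the requirements mandate.

```python
# Assumption for illustration: UTF-8 as the encoding of the codepoints.
name = "b\u00fccher"                  # "bücher"
wire = name.encode("utf-8")           # the octet string

print([hex(ord(c)) for c in name])    # codepoints
print(list(wire))                     # octets
print(len(name), len(wire))           # the two counts differ
```

The codepoints are carried, but only after some encoding step turns
them into octets.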

> > [4] The protocol MUST work for all features of DNS, IPv4, and
> > IPv6. The protocol MUST NOT allow an IDN to be returned to a
> > requestor that requests the IP-to-(old)-domain-name mapping service.
>
> This is unclear. Returning an ACE name to an "old" requestor will
> clearly not break anything, and an ACE name is an (encoded) IDN. It
> also doesn't take into account that some resolver interfaces are
> already Unicode-aware, in which case they would not require any
> distinction between old and new requests (this is true for
> InetAddress.getHostName in the Java API, for example, or for
> getipnodebyaddr, etc. in Plan-9).
>
> => [4] The proposal MUST work for all features of DNS, IPv4, and IPv6.
> => The proposal MUST ensure that the responses to requests for an IP
> => to domain name mapping will not break existing requestors.

Again, you assume IDNA as the solution. But I like your wording.

> > [11] The protocol should handle with care new revisions of the CCS.
> > Undefined codepoints should not be allowed unless a new revision of
> > the protocol can handle it. Protocol revisions should be tagged.
>
> The current version of nameprep allows unassigned code points in
> queries without revision tagging, for good reasons.
>
> => [11] The proposal should handle with care new revisions of the CCS.
> => Proposals MUST discuss how undefined codepoints are handled.

This is hindsight based on some agreements we have now; it is not a
requirement.

> The overspecification here is "at a *single* ... place". For example,
> if canonicalization is specified by nameprep, it is idempotent, i.e.
> nameprep(nameprep(x)) = nameprep(x). So doing it more than once only
> hurts efficiency, not interoperability or any other requirement. It
> doesn't even hurt efficiency very much, since the common case where a
> name is already in the correct form can be optimised.

True but not significant.
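
For readers who want to see the property in question: idempotence
means that applying the canonicalization a second time gives the same
result as applying it once. A minimal Python sketch follows. The
function `canonicalize` is a hypothetical stand-in (case folding
followed by NFKC), not nameprep itself, and the check only covers the
sample labels shown.

```python
import unicodedata

def canonicalize(label: str) -> str:
    """Hypothetical stand-in for nameprep: case-fold, then NFKC."""
    return unicodedata.normalize("NFKC", label.casefold())

# Made-up sample labels: uppercase with U-umlaut, a fullwidth "E",
# and a name that is already in canonical form.
samples = ["B\u00dcCHER", "\uff25xample", "example"]

for s in samples:
    once = canonicalize(s)
    twice = canonicalize(once)
    print(repr(s), "->", repr(once), "| unchanged by second pass:", once == twice)
```

The last sample is the "already in the correct form" case David
mentions: the second pass does no useful work, but it does no harm
either.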

> > ... The protocol MUST specify canonicalization; ...
>
> This is meaningless without specifying what the goal of
> canonicalization is. The minimum requirement is to ensure that
> characters that are indistinguishable to users are treated the same,
> and so that is what should be stated:

It was left vague so that it could be defined or argued later. But of
course, Nameprep has since made this much clearer.

> > [23] If other canonicalization is done, it MUST be done before the
> > domain name is resolved.
>
> It makes perfect sense to do canonicalization as part of resolution,
> not before it. Also, canonicalizing after resolution is certainly
> feasible, even if it is inefficient.

Again, this is hindsight; it was not apparent at the time we drafted
the requirements.

> > 3. Security Considerations
> >
> > Any solution that meets the requirements in this document MUST NOT
> > be less secure than the current DNS.
>
> That is not necessarily achievable. The main issue is name spoofing
> using look-alike characters: even if a proposal specifically tries to
> address that (by registration procedures, for example), it can't
> absolutely guarantee that there will not be cases of this that rely
> on IDNs.

Name spoofing with "look-alike" characters already exists in the DNS.
IDN introduces more of the same problem, but nothing new.
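
A rough Python sketch of what "more of the same" looks like. Both
pairs below are made-up examples: the first uses only ASCII (letter l
versus digit 1), the second swaps a Latin letter for a similar-looking
Cyrillic one. The helper `differing` is hypothetical and exists only
for this illustration.

```python
import unicodedata

ascii_pair = ("paypal", "paypa1")        # letter l vs digit 1
idn_pair = ("paypal", "p\u0430ypal")     # Latin a vs Cyrillic a (U+0430)

def differing(a, b):
    """Return (position, name_in_a, name_in_b) for each differing character."""
    return [(i, unicodedata.name(x), unicodedata.name(y))
            for i, (x, y) in enumerate(zip(a, b)) if x != y]

print(differing(*ascii_pair))
print(differing(*idn_pair))
```

In both cases the names differ at a single position while looking
similar to a user; the IDN case just has a larger set of characters to
draw from.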

-James Seng