IDNA and getnameinfo() and getaddrinfo()

Nicolas Williams <Nicolas.Williams@oracle.com> Mon, 14 June 2010 21:01 UTC

Date: Mon, 14 Jun 2010 12:26:31 -0500
From: Nicolas Williams <Nicolas.Williams@oracle.com>
To: idna-update@alvestrand.no
Subject: IDNA and getnameinfo() and getaddrinfo()
Message-ID: <20100614172631.GQ9605@oracle.com>
Cc: cheshire@apple.com, john+ietf@jck.com, dthaler@microsoft.com

Hello, I'm not subscribed to this list, so please Cc me on replies.

Over in the NFSv4 WG we're discussing how to fix NFSv4.1 to properly
handle IDNA.  In the process of doing so I ran into
draft-iab-idn-encoding, which has a cogent discussion of name service
switches (pictured in its figure 2).

draft-iab-idn-encoding aims for Informational status.  I'm wondering if
we could publish a Standards-Track document describing how getnameinfo()
and getaddrinfo() should handle IDNA.

For example, one could say that, when using DNS, getnameinfo() should:

 - perform the DNS lookup
 - apply ToUnicode() to the resulting domainname
 - attempt to convert the resulting name to the caller's locale's
   codeset if that codeset is not UTF-8
    - on failure, return the A-label as the canonical hostname
    - on success, return the U-label (in the caller's locale's codeset)
      as the canonical hostname and the A-label as an alias
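The post-lookup steps above can be sketched as follows.  This is only
an illustration under stated assumptions, not a proposed
implementation: the function name name_for_caller() is hypothetical,
the DNS lookup is assumed to have already produced dns_name, and
Python's built-in "idna" codec (which implements the IDNA2003
ToUnicode operation) stands in for ToUnicode().

```python
def name_for_caller(dns_name: str, codeset: str) -> tuple[str, list[str]]:
    """Return (canonical_hostname, aliases) for a name from a reverse lookup.

    Hypothetical helper: dns_name is whatever the DNS lookup returned,
    codeset is the caller's locale codeset (e.g. "utf-8", "latin-1").
    """
    a_label = dns_name.rstrip(".").lower()
    try:
        # ToUnicode(), applied label by label by the stdlib "idna" codec.
        u_label = a_label.encode("ascii").decode("idna")
    except UnicodeError:
        # Not a valid A-label form: hand back the name unchanged.
        return a_label, []
    if u_label == a_label:
        # Plain ASCII hostname: nothing to convert, no alias needed.
        return a_label, []
    try:
        # Check that the U-label is representable in the caller's codeset.
        u_label.encode(codeset)
    except UnicodeError:
        # Failure: the A-label is the canonical hostname.
        return a_label, []
    # Success: the U-label is canonical and the A-label is an alias.
    return u_label, [a_label]
```

With a UTF-8 or Latin-1 locale, "xn--bcher-kva.example" would come back
as the U-label with the A-label as an alias; with an ASCII-only locale
the A-label itself would be returned.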

And that, when using DNS, getaddrinfo() should:

 - convert the given host/domainname from the caller's locale's codeset
   to UTF-8 if necessary
 - apply ToASCII() and perform the DNS lookups
    - if success, return the IP address(es) found, the given name as the
      canonical hostname, the A-label form of the hostname as an alias,
      and the U-label form (converted to the caller's locale's codeset)
      as an alias if different from the given hostname.
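The pre-lookup steps above can be sketched in the same way.  Again
this is a sketch under assumptions: prepare_lookup() is a hypothetical
name, the caller's hostname is modelled as bytes in the locale codeset,
and Python's "idna" codec (IDNA2003) stands in for ToASCII().
Skipping an alias that merely duplicates the given name is a choice
made here, not something the steps above require.

```python
def prepare_lookup(given: bytes, codeset: str) -> tuple[str, str, list[str]]:
    """Return (query_name, canonical_hostname, aliases).

    Hypothetical helper: given is the hostname as typed, encoded in the
    caller's locale codeset.  query_name is what would be sent to DNS.
    """
    # Convert from the caller's codeset to Unicode (UTF-8 on the wire).
    text = given.decode(codeset)
    # ToASCII(), applied label by label by the stdlib "idna" codec.
    a_label = text.encode("idna").decode("ascii")
    # ToUnicode() of the A-label gives the normalized U-label form.
    u_label = a_label.encode("ascii").decode("idna")

    aliases = []
    if a_label != text:
        aliases.append(a_label)
    if u_label not in (text, a_label):
        try:
            # Only offer the U-label if the caller's codeset can hold it.
            u_label.encode(codeset)
            aliases.append(u_label)
        except UnicodeError:
            pass
    # The given name stays the canonical hostname.
    return a_label, text, aliases
```

The returned query_name would then be what is passed to the resolver,
for example socket.getaddrinfo(query_name, None) in Python, with the
canonical name and aliases attached to a successful result.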

Would you agree?  This would greatly simplify the application of IDNA to
various application protocols, such as, for example, NFSv4.  NFSv4 has
several domainname slots, and several more coming from ancillary
protocols in current development.  Being able to send un-pre-processed
Unicode in NFS because the peer's getaddrinfo() must handle that
correctly seems like a very good approach -- this way IDNA does not have
to interfere with non-DNS name services.

Unfortunately we probably cannot rely on getnameinfo()/getaddrinfo()
doing the Right Thing.  A Standards-Track RFC on this would probably
help.

Nico