Re: [Uri-review] [hybi] ws: and wss: schemes

"Martin J. Dürst" <duerst@it.aoyama.ac.jp> Fri, 18 September 2009 09:18 UTC

Message-ID: <4AB3507A.7050802@it.aoyama.ac.jp>
Date: Fri, 18 Sep 2009 18:18:50 +0900
From: "Martin J. Dürst" <duerst@it.aoyama.ac.jp>
Organization: Aoyama Gakuin University
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.1b3pre) Gecko/20090108 Eudora/3.0b1pre
MIME-Version: 1.0
To: Julian Reschke <julian.reschke@gmx.de>
References: <Pine.LNX.4.62.0908070531430.28566@hixie.dreamhostps.com> <1249651007.25446.8934.camel@dbooth-laptop> <0B450D619CC0486E8BD51C31FBA214AD@POCZTOWIEC> <20090812021926.GC19298@shareable.org> <AB9A0CF094F04D39BC7DC5DEAFF7FC1C@POCZTOWIEC> <4AA8A2CE.3000801@it.aoyama.ac.jp> <34660A8503164BE88641374ADF2BF1A3@POCZTOWIEC> <20090910124618.GB32178@shareable.org> <11DFA16908CB4B7D8AF0F45975DE425A@POCZTOWIEC> <20090910224151.GA17387@shareable.org> <Pine.LNX.4.62.0909170834040.14605@hixie.dreamhostps.com> <4AB21FD6.3070008@gmx.de>
In-Reply-To: <4AB21FD6.3070008@gmx.de>
Content-Type: text/plain; charset="ISO-8859-1"; format="flowed"
Content-Transfer-Encoding: 8bit
Cc: hybi@ietf.org, uri-review@ietf.org, "public-iri@w3.org" <public-iri@w3.org>, URI <uri@w3.org>, Ian Hickson <ian@hixie.ch>, "public-i18n-core@w3.org" <public-i18n-core@w3.org>
Subject: Re: [Uri-review] [hybi] ws: and wss: schemes

On 2009/09/17 20:39, Julian Reschke wrote:
> Ian Hickson wrote:

>> Encoding considerations.
>> Characters in the host component that are excluded by the syntax
>> defined above must be converted from Unicode to ASCII by applying
>> the IDNA ToASCII algorithm to the Unicode host name, with both the
>> AllowUnassigned and UseSTD3ASCIIRules flags set, and using the
>> result of this algorithm as the host in the URI.

I think this has various problems.

First, it is fixed to IDNA 2003 (I think I may have said this already).
IDNA 2008 is just around the corner, and it doesn't use terms such as
"ToASCII" or "AllowUnassigned".

Second, if this is about resolution (rather than about generic 
conversion), and because this is a new scheme, it should not exclude the 
case that some part of a domain name (reg-name) is percent-encoded, 
because both RFC 3986 and 3987 allow this.

Third, wording this as "characters" seems to say that this is a
character-by-character operation, or that it might be applied to runs
of consecutive non-ASCII characters, but ToASCII, when used, has to be
applied to whole labels, not to individual characters.
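
To illustrate the second and third points, here is a minimal Python
sketch. It assumes IDNA 2003 semantics as implemented by Python's
standard-library encodings.idna module (not IDNA 2008, and to my
understanding without the UseSTD3ASCIIRules checks the draft asks for).
The function name host_to_ascii is hypothetical and is not taken from
the draft. It first undoes any percent-encoding in the reg-name and then
applies ToASCII to each whole label:

```python
import encodings.idna
from urllib.parse import unquote


def host_to_ascii(reg_name: str) -> str:
    """Convert a reg-name to ASCII, one whole label at a time.

    Assumption: IDNA 2003 ToASCII as provided by Python's stdlib.
    """
    # RFC 3986 and RFC 3987 allow percent-encoded octets in reg-name,
    # so decode them (as UTF-8) before doing anything else.
    decoded = unquote(reg_name, encoding="utf-8", errors="strict")

    # ToASCII operates on complete labels, never on single characters
    # or on runs of non-ASCII characters inside a label.
    labels = decoded.split(".")
    return ".".join(
        encodings.idna.ToASCII(label).decode("ascii") for label in labels
    )


print(host_to_ascii("bücher.example"))       # xn--bcher-kva.example
print(host_to_ascii("b%C3%BCcher.example"))  # xn--bcher-kva.example
```

Both spellings of the host end up as the same ASCII name, which is the
behaviour a new scheme should not rule out.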

Fourth, as http://tools.ietf.org/html/draft-iab-idn-encoding-00 shows in
more detail, it seems unnecessarily restrictive to assume that DNS is
always used to resolve reg-names, and that the technology will never be
used, e.g., on intranets with other resolution services.

Ideally, all the above points should be addressed by some work on the 
IRI front (public-iri@w3.org cc'ed), but that work isn't done yet.


>> Characters in other components that are excluded by the syntax
>> defined above must be converted from Unicode to ASCII by first
>> encoding the characters as UTF-8 and then replacing the
>> corresponding bytes using their percent-encoded form as defined in
>> the URI and IRI specification. [RFC3986] [RFC3987]
>
> I think that's good, except that the mention of IRI in the last sentence
> seems to be superfluous. RFC3986 already defines everything that is
> needed here. Or is there something specific from the IRI spec you think
> is relevant? (In which case it should state that more clearly).

RFC 3986 indeed defines how to use %-encoding, but except for domain 
names (which are not involved in this case), it does not specify UTF-8, 
which is only done in RFC 3987.
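
A small Python sketch of why the character encoding has to be stated
explicitly: percent-encoding only says how to write octets, so the same
character gives different results depending on the encoding chosen
first. The helper name percent_encode is hypothetical; urllib.parse.quote
is the standard-library function.

```python
from urllib.parse import quote


def percent_encode(text: str, charset: str) -> str:
    """Encode text to octets in the given charset, then percent-encode them."""
    # safe="" so that every character outside the unreserved set is escaped.
    return quote(text, safe="", encoding=charset)


print(percent_encode("é", "utf-8"))    # %C3%A9
print(percent_encode("é", "latin-1"))  # %E9
```

Only the first form is what the quoted text asks for; the mechanics of
the "%" escapes are the same in both cases.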

Regards,    Martin.

-- 
#-# Martin J. Dürst, Professor, Aoyama Gakuin University
#-# http://www.sw.it.aoyama.ac.jp   mailto:duerst@it.aoyama.ac.jp