Re: [hybi] [Uri-review] ws: and wss: schemes

Joseph A Holsten <joseph@josephholsten.com> Fri, 04 September 2009 23:51 UTC

Sender: Joseph Holsten <josephholsten@gmail.com>
Message-Id: <27A077DD-42B6-4ABD-B633-12EB73AA4201@josephholsten.com>
From: Joseph A Holsten <joseph@josephholsten.com>
To: Ian Hickson <ian@hixie.ch>
In-Reply-To: <Pine.LNX.4.62.0909042013300.26930@hixie.dreamhostps.com>
Content-Type: text/plain; charset="US-ASCII"; format="flowed"; delsp="yes"
Content-Transfer-Encoding: 7bit
Mime-Version: 1.0 (Apple Message framework v936)
Date: Fri, 04 Sep 2009 18:51:42 -0500
References: <OF22CD1320.96C55266-ON85257610.004AB599-85257610.004BC9CA@lotus.com> <C9931C12-E123-437D-8E7D-F8C680C62397@mnot.net> <4A8CAA72.3000209@berkeley.edu> <Pine.LNX.4.62.0909040147300.6775@hixie.dreamhostps.com> <4AA14792.4020009@gmx.de> <4D25F22093241741BC1D0EEBC2DBB1DA01AD6282C2@EX-SEA5-D.ant.amazon.com> <Pine.LNX.4.62.0909041947250.26930@hixie.dreamhostps.com> <4AA17310.1090108@gmx.de> <Pine.LNX.4.62.0909042013300.26930@hixie.dreamhostps.com>
X-Mailer: Apple Mail (2.936)
X-Mailman-Approved-At: Thu, 10 Sep 2009 18:54:56 -0700
Cc: URI <uri@w3.org>, "hybi@ietf.org" <hybi@ietf.org>, "uri-review@ietf.org" <uri-review@ietf.org>, "public-i18n-core@w3.org" <public-i18n-core@w3.org>
Subject: Re: [hybi] [Uri-review] ws: and wss: schemes

On Sep 4, 2009, at 3:19 PM, Ian Hickson wrote:

> On Fri, 4 Sep 2009, Julian Reschke wrote:
>> Ian Hickson wrote:
>>>>
>>>> Because that's how URI and thus URLs are defined.
>>>
>>> The ws: and wss: URLs are IRIs; why would we limit them to URIs? I'm
>>> not especially interested in ASCII-only URIs at this point. These URLs
>>> are only ever going to be used in contexts that accept full IRIs.
>>
>> But that's not how registering a URI scheme works. Check the relevant
>> RFCs. Essentially you register the *URI* scheme, and get IRIs based on
>> the mapping rules defined in RFC 3987.
>
> That's what I thought, but then I got feedback saying I had to register
> an IRI scheme if I wanted to use IRIs.
>
> I've no interest in making ws: and wss: URIs. Only IRIs.
>
> If I define the syntax to be a subset of the full URI syntax, how does
> it ever get extended to be a subset of the full IRI syntax?
>
> What should I put in the spec to make you happy and to make the use of
> ws: and wss: IRIs fully well-defined?

The only scheme I can think of that was defined as an IRI was XMPP  
[RFC4622]. It actually makes more sense when you start with IRIs. If  
that's what you need, please just do that.

Traditionally, every other scheme defined since RFC3987 has defined
itself as a URI and defined the exact encoding considerations to
handle reserved characters that may occur given the semantics of a
particular part. You have very standard semantics: userinfo, host,
port, path segments, query. Those that might meaningfully contain
reserved characters are userinfo, reg-name segments, and query.
reg-name parts get ToASCII, everything else gets mapped with
percent-encoding. You actually have to say this because it's not
obvious. There's more than one way to do it.
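
To make the mapping above concrete, here is a minimal Python sketch. It
is only an illustration under stated assumptions, not text from any
draft: the function name ws_iri_to_uri is made up for this example, the
host is assumed to be a registered name (IP literals are not handled),
fragments are rejected, and ToASCII is taken to be what Python's
built-in "idna" codec implements (IDNA 2003).

```python
from urllib.parse import quote, urlsplit

# RFC 3986 sub-delims; "%" is added at each call site so that existing
# percent-escapes are not encoded a second time.
_SUB_DELIMS = "!$&'()*+,;="


def ws_iri_to_uri(iri: str) -> str:
    """Hypothetical helper: map a ws:/wss: IRI to a URI.

    Assumption: reg-name goes through ToASCII, every other part is
    encoded as UTF-8 octets and then percent-encoded.
    """
    parts = urlsplit(iri)
    if parts.scheme not in ("ws", "wss"):
        raise ValueError("not a ws: or wss: IRI")
    if parts.fragment:
        raise ValueError("fragments are outside the scope of this sketch")
    if not parts.hostname:
        raise ValueError("missing host")

    # reg-name: ToASCII via the built-in "idna" codec.
    netloc = parts.hostname.encode("idna").decode("ascii")

    # userinfo: UTF-8, then percent-encoding.
    if parts.username is not None:
        userinfo = quote(parts.username, safe=_SUB_DELIMS + "%")
        if parts.password is not None:
            userinfo += ":" + quote(parts.password, safe=_SUB_DELIMS + "%:")
        netloc = userinfo + "@" + netloc
    if parts.port is not None:
        netloc += ":" + str(parts.port)

    # path and query: UTF-8, then percent-encoding.
    uri = parts.scheme + "://" + netloc
    uri += quote(parts.path, safe=_SUB_DELIMS + "%/:@")
    if parts.query:
        uri += "?" + quote(parts.query, safe=_SUB_DELIMS + "%/:@?")
    return uri


print(ws_iri_to_uri("ws://b\u00fccher.example/caf\u00e9?q=\u00fc"))
```

The point of the sketch is that the host and the remaining parts take
different routes to ASCII, which is exactly the kind of choice the
scheme definition has to state.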


>>>>>> I've deferred to RFC3987 to sidestep this issue.
>>>>> A URI is not an IRI.
>>>>>
>>>>> You can refer to the mapping, but that really needs a few more words
>>>>> than "See RFC3987.".
>>>> It may not need many more words, but certainly a few more words.
>>>
>>> Could you elaborate? Which words should I add?
>>
>> You need to state how you want to encode non-ASCII characters. "See
>> RFC3987" goes in the right direction but really isn't sufficient.
>> Please see RFC 4395, Section 2.6:
>>
>> "2.6. Internationalization and Character Encoding
>>
>>   When describing URI schemes in which (some of) the elements of the
>>   URI are actually representations of human-readable text, care should
>>   be taken not to introduce unnecessary variety in the ways in which
>>   characters are encoded into octets and then into URI characters; see
>>   RFC 3987 [6] and Section 2.5 of RFC 3986 [5] for guidelines.  If URIs
>>   of a scheme contain any text fields, the scheme definition MUST
>>   describe the ways in which characters are encoded, and any
>>   compatibility issues with IRIs of the scheme."
>
> I've read this, but as far as I can tell, "Always UTF-8" and "See IRI"
> are both complete and accurate ways of addressing this.

No. There are at least two ways to encode reg-names, tons of UCS
encoding issues, and more. Pedantic, but that's the point of spec
review, no?
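
As a small illustration of the reg-name point, the Python sketch below
encodes one host name both ways: as percent-encoded UTF-8 and through
ToASCII. The function name reg_name_encodings is hypothetical, and
ToASCII is assumed to be what Python's built-in "idna" codec (IDNA
2003) provides.

```python
from urllib.parse import quote


def reg_name_encodings(host: str) -> tuple[str, str]:
    """Hypothetical helper: return (percent-encoded UTF-8, ToASCII) forms."""
    # Way 1: UTF-8 octets, then percent-encoding.
    percent_encoded = quote(host, safe="")
    # Way 2: IDNA ToASCII, as implemented by the built-in "idna" codec.
    to_ascii = host.encode("idna").decode("ascii")
    return percent_encoded, to_ascii


pct, idna = reg_name_encodings("b\u00fccher.example")
print(pct)   # b%C3%BCcher.example
print(idna)  # xn--bcher-kva.example
```

The two results are different strings for the same name, so "Always
UTF-8" alone does not tell an implementer which one to produce.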

> Since apparently neither of these options satisfies you, could you state
> exactly what literal text would satisfy you?

If you're going to define it as a URI and handle IRIs by mapping, I
believe my text[1] should satisfy.

1: http://lists.w3.org/Archives/Public/uri/2009Sep/0001.html

Joseph Holsten