Re: [pkix] [apps-discuss] character repertoire for fragment identifiers, was: Fwd: FW: New Version Notification for draft-kerwin-file-scheme-13.txt

Nico Williams <nico@cryptonector.com> Mon, 12 January 2015 18:15 UTC

Date: Mon, 12 Jan 2015 12:15:37 -0600
From: Nico Williams <nico@cryptonector.com>
To: "\"Martin J. Dürst\"" <duerst@it.aoyama.ac.jp>
Message-ID: <20150112181536.GN16323@localhost>
References: <54B1B682.3070609@intertwingly.net> <54B28E0F.8070306@gmx.de> <54B2936B.7030805@intertwingly.net> <05AD7DE2-1C54-45CD-B33A-13766D771E57@mnot.net> <54B2A2CD.5080502@gmx.de> <1A5BBD25-FEBD-49B1-9EFB-4EF8877BF0E7@mnot.net> <54B2A4F9.2070909@gmx.de> <54B2A894.4020201@intertwingly.net> <54B2F4C3.5020008@seantek.com> <54B3940A.6020308@it.aoyama.ac.jp>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Disposition: inline
In-Reply-To: <54B3940A.6020308@it.aoyama.ac.jp>
User-Agent: Mutt/1.5.21 (2010-09-15)
Content-Transfer-Encoding: quoted-printable
Archived-At: <http://mailarchive.ietf.org/arch/msg/pkix/UOJ_FMxmR_FwS0_ag_X9TVl32s0>
Cc: apps-discuss@ietf.org, Julian Reschke <julian.reschke@gmx.de>, Mark Nottingham <mnot@mnot.net>, Sean Leonard <dev+ietf@seantek.com>, "pkix@ietf.org" <pkix@ietf.org>, Sam Ruby <rubys@intertwingly.net>
Subject: Re: [pkix] [apps-discuss] character repertoire for fragment identifiers, was: Fwd: FW: New Version Notification for draft-kerwin-file-scheme-13.txt

On Mon, Jan 12, 2015 at 06:29:46PM +0900, "Martin J. Dürst" wrote:
> On 2015/01/12 07:10, Sean Leonard wrote:
> >***
> >I have empathy for what Sam/the W3C wants, since the HTML protocol slots
> >basically beg to be filled with Unicode strings like <a
> >href="http://zh.wikipedia.org/wiki/巴泰勒米·波岡達"> (instead of <a
> >href="http://zh.wikipedia.org/wiki/%E5%B7%B4%E6%B3%B0%E5%8B%92%E7%B1%B3%C2%B7%E6%B3%A2%E5%B2%A1%E9%81%94">).
> 
> Very very much so. The former is readable (although there are better
> examples than foreign names such as Barthélemy Boganda) to a
> significant part of the world's population; the latter is gibberish
> for everybody.

The href attribute in HTML is one thing.  "_All_ URI/IRI slots" is
another.

I don't object to versioning protocols and data formats to change
specific URI slots into IRI slots.

Thus I have no objection to -say- the href attribute in HTML being made
into an IRI slot (with or without versioning of HTML -- that being an
issue for W3C, though if it were the IETF we'd want to do something
better than a flag day).

I object to any proposal that implies that all existing URI slots must
now [magically] be able to carry Unicode non-ASCII character data.

> >But maybe the more interoperable approach is to define a format and
> >mechanism (e.g., IRIs, or something like IRIs v2) to map /from/ the
> >Unicode-capable protocol slots, /to/ the well-standardized RFC 3986 URI
> >format.

We already have this: RFC 3987.
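
To illustrate the mapping in question (RFC 3987, Section 3.1), here is
a minimal Python sketch.  It assumes the IRI is already a Unicode
string and simply percent-encodes the UTF-8 octets of every non-ASCII
character, leaving all ASCII (including existing %XX escapes) alone.
The function name iri_to_uri is hypothetical, and the sketch ignores
the alternative of converting the host part with IDNA ToASCII, which
the RFC also permits.

```python
def iri_to_uri(iri: str) -> str:
    """Map an IRI to a URI by percent-encoding non-ASCII characters.

    Simplified sketch: no IRI syntax validation, no IDNA handling of
    the host, and no normalization of the input string.
    """
    out = []
    for ch in iri:
        if ord(ch) < 0x80:
            # ASCII characters (including '%', '/', ':') pass through as-is.
            out.append(ch)
        else:
            # Non-ASCII: encode as UTF-8, then percent-encode each octet.
            out.extend(f"%{octet:02X}" for octet in ch.encode("utf-8"))
    return "".join(out)


if __name__ == "__main__":
    print(iri_to_uri("http://zh.wikipedia.org/wiki/巴泰勒米·波岡達"))
```

Applied to the Wikipedia example quoted above, this should yield the
percent-encoded form shown there, which is the form an RFC 3986 URI
slot can carry unchanged.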

> With IRIs, that's essentially what we have. (of course I don't want
> to imply that the IRI spec cannot be improved upon)

Exactly.

Terminology is a problem here.  In conversation and these e-mail
threads, I really want to refer to IRIs as "URIs".  But in the context
of RFCs 3986 and 3987 it is critical to be careful to say "URI" in
reference to "slots" accepting US-ASCII only, and "IRI" in reference to
slots that notionally[*] accept Unicode.

It would be very convenient if we only needed to be so careful when
pairing "URI" and "IRI" with the word "slot".

[*] I say notionally because related UI elements generally can only
    handle an equivalent subset when used in non-Unicode locales.

Nico
--