Re: [Json] On characters and code points

Carsten Bormann <cabo@tzi.org> Sat, 08 June 2013 16:07 UTC

On Jun 8, 2013, at 17:25, Paul Hoffman <paul.hoffman@vpnc.org> wrote:

> Ummm, how is that "much better"? "Code points minus THEONESWEHATE" seems a lot simpler than "characters plus ADDITIONAL1 plus ADDITIONAL2 plus ADDITIONAL3".

Well, we need a (short) word for "that thing that goes into strings".

Most people I know would expect a "string" to be composed of "characters".

So why not define the term "character" in such a way that it works for us?

RFC 4627 does not go to the length of formally defining its terms.  In particular, it is not always clear whether a term is imported from somewhere else, and if so, from where.  This leaves some room for improvement.  Once the terms are defined, there is no need to use the term "character" exactly as Unicode uses it.  (The term "Unicode character" can be used where that specific definition is actually meant.)
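
To make the distinction concrete, here is a minimal Python sketch (not taken from RFC 4627 or from this thread) of code points that can end up in a decoded JSON string without being Unicode characters in the strict sense.  The helper name `classify` and its three labels are assumptions made purely for illustration, and the fact that Python's `json` module accepts a lone surrogate escape is a property of that implementation, not a statement about what the spec should say.

```python
import json
import unicodedata

def classify(cp: str) -> str:
    """Label a single code point; the labels are illustrative only."""
    cat = unicodedata.category(cp)
    if cat == "Cs":
        # Surrogate code points (U+D800..U+DFFF) are not characters.
        return "surrogate"
    if cat == "Cn":
        # Unassigned code points and noncharacters such as U+FFFF.
        return "unassigned-or-noncharacter"
    return "character"

# A JSON text whose string holds "A", a lone surrogate, and U+FFFF.
decoded = json.loads('"A\\ud800\\uffff"')
labels = [classify(c) for c in decoded]
print(labels)
```

Under a "code points minus some" definition all three elements are simply members of the string; under a "Unicode characters plus some" definition the last two have to be added back explicitly.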

Regards, Carsten