Re: [Json] On characters and code points

Tim Bray <tbray@textuality.com> Fri, 07 June 2013 18:49 UTC

From: Tim Bray <tbray@textuality.com>
To: Bjoern Hoehrmann <derhoermi@gmx.net>
Cc: Stephen Dolan <stephen.dolan@cl.cam.ac.uk>, Paul Hoffman <paul.hoffman@vpnc.org>, "json@ietf.org" <json@ietf.org>
Subject: Re: [Json] On characters and code points

Interestingly, that doesn’t mention surrogates. -T
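For anyone following along, here is a minimal sketch of the distinction at issue, not taken from any of the cited documents. It assumes the usual fixed ranges: surrogates are U+D800..U+DFFF, and the 66 noncharacters are U+FDD0..U+FDEF plus the last two code points of each of the 17 planes. The function names are made up for illustration.

```python
def is_surrogate(cp: int) -> bool:
    """True for the 2048 surrogate code points U+D800..U+DFFF."""
    return 0xD800 <= cp <= 0xDFFF


def is_noncharacter(cp: int) -> bool:
    """True for the 66 noncharacters: U+FDD0..U+FDEF, plus U+nFFFE and
    U+nFFFF at the end of every plane (n = 0x0..0x10)."""
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


def classify(cp: int) -> str:
    """Label a code point as 'surrogate', 'noncharacter', or 'other'."""
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError(f"not a Unicode code point: {cp:#x}")
    if is_surrogate(cp):
        return "surrogate"
    if is_noncharacter(cp):
        return "noncharacter"
    return "other"


# The byte order mark U+FEFF serialized as little-endian UTF-16 ...
bom_le = "\ufeff".encode("utf-16-le")
# ... and read back with the wrong byte order yields U+FFFE.
misread = bom_le.decode("utf-16-be")
misread_class = classify(ord(misread))
```

The two sets are disjoint, so a rule about noncharacters says nothing either way about surrogates; each has to be addressed separately.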


On Fri, Jun 7, 2013 at 11:40 AM, Bjoern Hoehrmann <derhoermi@gmx.net> wrote:

> * Stephen Dolan wrote:
> >(3) includes such beasts as U+FFFE (which you can only get by reading
> >a UTF-16 byte order mark with the wrong byte order). The set (1)
> >increases with every Unicode revision to include characters from (2),
> >but (3) is stable (see
> >http://unicode.org/policies/stability_policy.html).
>
> >I think JSON should allow characters from (1) and (2) to avoid being
> >dependent on a specific Unicode revision. I do not think (3) should be
> >allowed - this would cause problems with many existing parsers which
> >represent JSON strings using another system's native Unicode
> >representation.
>
> Please see <http://www.unicode.org/versions/corrigendum9.html>.
> --
> Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de
> Am Badedeich 7 · Telefon: +49(0)160/4415681 · http://www.bjoernsworld.de
> 25899 Dagebüll · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/