Re: [Json] Proposed change: update the Unicode version

Tim Bray <tbray@textuality.com> Tue, 04 June 2013 21:39 UTC

In-Reply-To: <A723FC6ECC552A4D8C8249D9E07425A70FC27CC9@xmb-rcd-x10.cisco.com>
References: <51AE2B03.5070100@stpeter.im> <A723FC6ECC552A4D8C8249D9E07425A70FC27CC9@xmb-rcd-x10.cisco.com>
Date: Tue, 04 Jun 2013 13:31:55 -0700
Message-ID: <CAHBU6isjx7rWvDXZBRqO9h5pjtjS_BL2SeiwtOM5vXA5GPrgug@mail.gmail.com>
From: Tim Bray <tbray@textuality.com>
To: "Joe Hildebrand (jhildebr)" <jhildebr@cisco.com>
Cc: Paul Hoffman <paul.hoffman@vpnc.org>, Peter Saint-Andre <stpeter@stpeter.im>, "json@ietf.org" <json@ietf.org>
Subject: Re: [Json] Proposed change: update the Unicode version

On Tue, Jun 4, 2013 at 11:56 AM, Joe Hildebrand (jhildebr) <
jhildebr@cisco.com> wrote:

> WAS:
>
> A string is a sequence of zero or more Unicode characters [UNICODE].
>
> SUGGESTED:
>
> A string is a sequence of encoded Unicode codepoints defined in
> [UNICODE6.2] or later versions of the Unicode specification.
>

Joe, I get what you’re trying to do, and I suspect this is technically
correct, but the language is kind of clunky and I don’t think it helps
comprehension.  I think that when you say “A string is a series of Unicode
characters” and have a reference to chapter 2, especially section 2.7, of
Unicode, it’s really perfectly clear what you mean.  I think we can
probably avoid mentioning “code points”, which is a good thing.

So how about “A string is a sequence of zero or more Unicode characters
defined in version 6.2 (or any subsequent version) of [UNICODE].”
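
To make "defined in version 6.2 (or any subsequent version)" a bit more
concrete, here is a minimal Python sketch.  It assumes the runtime's
`unicodedata` module as the reference database; the helper name
`is_assigned` is hypothetical and appears in no draft.  It only shows that
whether a character counts as "defined" depends on which Unicode version
you consult.

```python
import unicodedata

# Version of the Unicode Character Database bundled with this Python runtime.
# This will usually be newer than 6.2; the exact value depends on the interpreter.
database_version = unicodedata.unidata_version


def is_assigned(ch: str) -> bool:
    """Hypothetical helper: True if `ch` has an assigned character in this database.

    General category "Cn" covers unassigned code points and noncharacters.
    """
    return unicodedata.category(ch) != "Cn"


# A string made only of assigned characters passes the check.
sample = "caf\u00e9"
all_defined = all(is_assigned(ch) for ch in sample)

# U+FFFF is a noncharacter, so it is reported as not assigned.
noncharacter_defined = is_assigned("\uffff")
```

A character added after 6.2 would fail such a check against a 6.2 database
and pass against a later one, which is what the "or any subsequent version"
wording is meant to allow.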


> WAS:
>
> JSON text SHALL be encoded in Unicode.  The default encoding is
>    UTF-8.
>
> SUGGESTED:
>
> JSON text SHALL be transmitted as encoded Unicode codepoints.  The default
> encoding is
> UTF-8.
>

Whether it’s transmitted or read out of a file is irrelevant.  How about:

“JSON text takes the form of a Unicode string [UNICODE, section 2.7]; the
default encoding is UTF-8.”
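
As a rough illustration of that distinction, the Python sketch below treats
the JSON text as a Unicode string and UTF-8 as one way of encoding it.  The
sample value and variable names are made up for the example; the only
library calls used are `json.dumps`, `json.loads`, `str.encode`, and
`bytes.decode`.

```python
import json

# Made-up sample value; U+1F600 lies outside the Basic Multilingual Plane.
value = {"greeting": "caf\u00e9 \U0001F600"}

# ensure_ascii=False keeps non-ASCII characters literal in the JSON text
# instead of writing them as \uXXXX escapes.
json_text = json.dumps(value, ensure_ascii=False)

# The JSON text itself is a Unicode string; UTF-8 is one encoding of it.
utf8_bytes = json_text.encode("utf-8")

# Decoding the bytes recovers the same text, whether the bytes came from
# a network connection or from a file.
round_tripped = json.loads(utf8_bytes.decode("utf-8"))
```

Nothing in the sketch depends on how the bytes were obtained, which is why
"transmitted" adds nothing to the definition.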