Re: [Json] A possible summary of the discussion so far on code points and characters
R S <sayrer@gmail.com> Sat, 08 June 2013 20:52 UTC
A seventh point of view, which I happen to agree with: JSON strings are a sequence of code units. This is similar to the definition of 'source text' in ECMAScript:

"ECMAScript source text is assumed to be a sequence of 16-bit code units for the purposes of this specification. Such a source text may include sequences of 16-bit code units that are not valid UTF-16 character encodings."

http://es5.github.io/x6.html

- Rob

On Sat, Jun 8, 2013 at 1:15 PM, Paul Hoffman <paul.hoffman@vpnc.org> wrote:

> No hat at all, just trying to get some focus on the current facts before
> trying to reach conclusions.
>
> 1) Some people have read the following statement from the RFC to mean
> "only Unicode characters are allowed in strings":
>    A string is a sequence of zero or more Unicode characters [UNICODE].
>
> 2) The ABNF is more liberal about what can be in a string than that
> statement:
>    char = unescaped /
>           escape ( ...
>                    %x75 4HEXDIG )  ; uXXXX  U+XXXX
>    ...
>    unescaped = %x20-21 / %x23-5B / %x5D-10FFFF
>
> 3) Some JSON parsers enforce (1), rejecting JSON texts that have strings
> that have some unallowed code points.
>
> 4) Some JSON parsers accept strings with all code points.
>
> 5) The definition of "Unicode character" has been surprising to some
> people on this list, and thus might be surprising to some developers and
> users of JSON.
>
> 6) Some people on the list consider some code points that are Unicode
> non-characters to be more objectionable than other code points.
>
> --Paul Hoffman
> _______________________________________________
> json mailing list
> json@ietf.org
> https://www.ietf.org/mailman/listinfo/json
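
To make point (2) and the code-unit reading concrete, here is a minimal Python sketch. The helper names `is_unescaped` and `utf8_encodable` are hypothetical and used only for illustration; `is_unescaped` simply transcribes the `unescaped` rule quoted above. The second half shows how CPython's standard `json` module happens to treat an escaped lone surrogate. It is one parser among many, and points (3) and (4) note that others behave differently. Python strings are sequences of code points rather than 16-bit code units, so this only approximates the ECMAScript model.

```python
import json


def is_unescaped(cp: int) -> bool:
    """Hypothetical helper: transcribes the quoted ABNF rule
    unescaped = %x20-21 / %x23-5B / %x5D-10FFFF
    """
    return 0x20 <= cp <= 0x21 or 0x23 <= cp <= 0x5B or 0x5D <= cp <= 0x10FFFF


def utf8_encodable(s: str) -> bool:
    """Hypothetical helper: can this string be serialized as UTF-8?"""
    try:
        s.encode("utf-8")
        return True
    except UnicodeEncodeError:
        return False


# The ABNF range covers surrogate code points and noncharacters alike,
# while still excluding the quotation mark (which must be escaped).
abnf_allows_surrogate = is_unescaped(0xD800)   # a high surrogate
abnf_allows_nonchar = is_unescaped(0xFFFE)     # a Unicode noncharacter
abnf_allows_quote = is_unescaped(0x22)         # '"'

# An escaped lone surrogate: CPython's json module accepts it and hands
# back a one-element string that is not a valid Unicode scalar value.
lone = json.loads('"\\ud800"')

# A well-formed surrogate pair is combined into a single code point.
pair = json.loads('"\\ud83d\\ude00"')

print(abnf_allows_surrogate, abnf_allows_nonchar, abnf_allows_quote)
print(len(lone), hex(ord(lone)), utf8_encodable(lone))
print(len(pair), hex(ord(pair)), utf8_encodable(pair))
```

In this sketch the lone surrogate survives parsing but cannot be written out as UTF-8, which is the kind of gap between "sequence of code units" and "sequence of Unicode characters" that the thread is debating.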