Re: [Json] Proposal for strings/Unicode text

Tim Bray <tbray@textuality.com> Thu, 13 June 2013 18:00 UTC

In-Reply-To: <20130613121620.GB11739@mercury.ccil.org>
References: <CAHBU6ivNjMUwN2Hsn-E8FKxjqXS6b4qz=_MeeaHahWBWqG_Hgg@mail.gmail.com> <ED62F638-C0C4-411D-BA5B-EB9BA71EDB75@lindenbergsoftware.com> <20130613003213.GA26989@mercury.ccil.org> <jr5jr85h6pig2cr9id5hf1eh586g0u09i7@hive.bjoern.hoehrmann.de> <20130613121620.GB11739@mercury.ccil.org>
Date: Thu, 13 Jun 2013 11:00:19 -0700
Message-ID: <CAHBU6ismp6HZqUQOgDnjBRYtC5jFCzhTB3RFG8Ms7qohz+w1eg@mail.gmail.com>
From: Tim Bray <tbray@textuality.com>
To: John Cowan <cowan@mercury.ccil.org>
Cc: Bjoern Hoehrmann <derhoermi@gmx.net>, "json@ietf.org" <json@ietf.org>
Subject: Re: [Json] Proposal for strings/Unicode text

On Thu, Jun 13, 2013 at 5:16 AM, John Cowan <cowan@mercury.ccil.org> wrote:

> The point is that if JSON is encoded in UTF-8, any surrogate code points
> MUST be escaped, even though the grammar does not say so.
>

Why?  UTF-8 is perfectly capable of representing those integers.  Yes, the
Unicode spec says that You Shouldn’t Do That, but it says the same thing about
unpaired surrogates in UTF-16.  For historical reasons JSON allows the
encoding of stuff that is strictly nonconforming to Unicode.  Such data will
break lots of things, not just UTF-8 decoders (most of which, I bet, will
never actually notice).  -T
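
A minimal Python sketch of the point above, not part of the original message. The helper names utf8_encode_code_point and strict_decode_ok are hypothetical, and CPython's behaviour is only one data point: it shows what a single strict decoder does and says nothing about decoders in general. The generic UTF-8 bit layout produces bytes for a surrogate code point such as U+D800 without complaint, a strict decoder rejects those bytes, and the standard json module accepts the escaped form of a lone surrogate.

```python
import json


def utf8_encode_code_point(cp: int) -> bytes:
    """Apply the generic UTF-8 bit layout to an integer up to 0x10FFFF.

    Deliberately performs no surrogate check (hypothetical helper).
    """
    if cp < 0x80:
        return bytes([cp])
    if cp < 0x800:
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp < 0x10000:
        return bytes([
            0xE0 | (cp >> 12),
            0x80 | ((cp >> 6) & 0x3F),
            0x80 | (cp & 0x3F),
        ])
    return bytes([
        0xF0 | (cp >> 18),
        0x80 | ((cp >> 12) & 0x3F),
        0x80 | ((cp >> 6) & 0x3F),
        0x80 | (cp & 0x3F),
    ])


def strict_decode_ok(data: bytes) -> bool:
    """Report whether Python's strict UTF-8 decoder accepts the bytes."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# The bit pattern for a lone high surrogate is representable...
raw = utf8_encode_code_point(0xD800)

# ...but this particular strict decoder refuses it.
accepted = strict_decode_ok(raw)

# The JSON escape for the same lone surrogate parses without error.
lone = json.loads('"\\ud800"')
```

Here raw is the three-byte sequence ED A0 80, accepted is False, and lone is a one-character string holding U+D800. Python will also round-trip those bytes if asked explicitly, via the "surrogatepass" error handler.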


