Re: [Json] Encoding Schemes

Tatu Saloranta <tsaloranta@gmail.com> Tue, 18 June 2013 19:11 UTC

In-Reply-To: <A723FC6ECC552A4D8C8249D9E07425A70FC582BF@xmb-rcd-x10.cisco.com>
References: <20130618183926.GG12085@mercury.ccil.org> <A723FC6ECC552A4D8C8249D9E07425A70FC582BF@xmb-rcd-x10.cisco.com>
Date: Tue, 18 Jun 2013 20:11:48 +0100
Message-ID: <CAGrxA278qrfTigETYuzR3yXANaKK3LLkBX-daHOkbS6NA5OYxg@mail.gmail.com>
From: Tatu Saloranta <tsaloranta@gmail.com>
To: "Joe Hildebrand (jhildebr)" <jhildebr@cisco.com>
Cc: John Cowan <cowan@mercury.ccil.org>, "json@ietf.org" <json@ietf.org>
Subject: Re: [Json] Encoding Schemes

On Tue, Jun 18, 2013 at 7:54 PM, Joe Hildebrand (jhildebr) <
jhildebr@cisco.com> wrote:

> On 6/18/13 12:39 PM, "John Cowan" <cowan@mercury.ccil.org> wrote:
>
> >Joe Hildebrand (jhildebr) scripsit:
> >
> >> "When serialized to an octet stream, JSON text SHALL be encoded in one
> >>of
> >> the following Unicode encoding schemes: UTF-8,  UTF-16BE, UTF-16LE,
> >> UTF-32BE, and UTF-32LE.  The default and RECOMMENDED encoding is UTF-8.
> >
> >Oh no, that SHALL will not fly.  Any encoding (which means "encoding
> >scheme" in a media-type context) can be used to represent JSON.
> >Including an EBCDIC variant.
>
> I assume you're talking about UTF-EBCDIC from tr16?  I don't see how you
> could auto-determine the encoding for that.  How about this:
>
> """
> Without an external mechanism that specifies encoding, when serialized to
> an octet stream, JSON text SHALL be encoded in one of the following
> Unicode encoding schemes: UTF-8,  UTF-16BE, UTF-16LE, UTF-32BE, and
> UTF-32LE.  The default and RECOMMENDED encoding is UTF-8.
>
> Note: the MIME type registered in section 6 does not specify a mechanism
> to specify the encoding scheme, so when used in a MIME context, one of the
> above encoding schemes MUST be used.
>
>
> Other Unicode encoding schemes MAY be used, but such octet streams cannot
> have their encoding scheme automatically detected and SHOULD NOT be
> assumed to interoperate with existing software.
> """
>
>
I like this wording because it is reasonably consistent with the current
state of affairs.
Unlike XML, where most encodings can be detected from the content (even
EBCDIC, given an XML declaration), JSON decoders can auto-detect only a
small subset of encodings and have to rely on external encoding
information for the rest.
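
To make the auto-detection point concrete, here is a minimal sketch of the
null-octet heuristic described in RFC 4627, section 3. It assumes the first
two characters of the JSON text are ASCII and that there is no byte order
mark; the function name detect_json_encoding is made up for illustration
and is not taken from any particular parser.

```python
import json


def detect_json_encoding(data: bytes) -> str:
    """Guess the encoding scheme from the pattern of zero octets.

    Assumption: the first two characters are ASCII and no BOM is present.
    Anything that does not match a UTF-16/UTF-32 pattern is treated as UTF-8.
    """
    head = data[:4]
    if len(head) == 4:
        nulls = tuple(b == 0 for b in head)
        if nulls == (True, True, True, False):   # 00 00 00 xx
            return "utf-32-be"
        if nulls == (False, True, True, True):   # xx 00 00 00
            return "utf-32-le"
    if len(head) >= 2:
        if head[0] == 0 and head[1] != 0:        # 00 xx ...
            return "utf-16-be"
        if head[0] != 0 and head[1] == 0:        # xx 00 ...
            return "utf-16-le"
    return "utf-8"                               # xx xx ... (default)


def load_json(data: bytes):
    """Decode with the guessed encoding, then parse."""
    return json.loads(data.decode(detect_json_encoding(data)))
```

An EBCDIC-encoded document contains no zero octets in its first characters,
so this heuristic would guess UTF-8 and decoding would go wrong; that is the
case where external encoding information is needed.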

-+ Tatu +-