Re: [Gen-art] [Json] Gen-ART and OPS-Dir review of draft-ietf-json-text-sequence-09

Patrik Fältström <paf@frobbit.se> Sun, 07 December 2014 18:55 UTC

From: Patrik Fältström <paf@frobbit.se>
In-Reply-To: <20141207180528.GA1116@mercury.ccil.org>
Date: Sun, 07 Dec 2014 19:55:41 +0100
Message-Id: <D4E95FE1-0C25-4541-8327-16313175F13A@frobbit.se>
References: <CE03DB3D7B45C245BCA0D24327794936289DC7@MX104CL02.corp.emc.com> <89601952-AA04-44EE-A6DA-E76D0AB07C21@frobbit.se> <20141207180528.GA1116@mercury.ccil.org>
To: John Cowan <cowan@mercury.ccil.org>
X-Mailer: Apple Mail (2.1993)
Archived-At: http://mailarchive.ietf.org/arch/msg/gen-art/HMoxQhjLuqTKpXUvq7nxIc-O1Yg
Cc: "ops-dir@ietf.org" <ops-dir@ietf.org>, "ietf@ietf.org" <ietf@ietf.org>, "json@ietf.org" <json@ietf.org>, Nico Williams <nico@cryptonector.com>, "General Area Review Team (gen-art@ietf.org)" <gen-art@ietf.org>
Subject: Re: [Gen-art] [Json] Gen-ART and OPS-Dir review of draft-ietf-json-text-sequence-09

> On 7 dec 2014, at 19:05, John Cowan <cowan@mercury.ccil.org> wrote:
> 
> Patrik Fältström scripsit:
> 
>> But it also reference RFC7159, which doesn't require UTF-8 but instead
>> for some weird reason also allow other encodings of Unicode text. And
>> on top of that it says Byte Order Mark is not allowed.
> 
> 7159 was meant to tighten the wording of 4627, not to impose additional
> constraints on it.  For that, see the I-JSON draft.

The problem I have is that 7159 is not tight enough: it allows encodings other than UTF-8, which in turn breaks the framing, because this draft takes for granted that each separator character is exactly one byte.

I.e. the way I read draft-ietf-json-text-sequence (and I might be wrong), you have specific octet values that act as separators. That only works if the encoding is UTF-8.

See Figure 1:

> possible-JSON = 1*(not-RS); attempt to parse as UTF-8-encoded
>                                ; JSON text (see RFC7159)

Now, if this is NOT UTF-8, then this might be a pretty bad situation.
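
To illustrate the concern, here is a minimal Python sketch (an illustration added for this point, not taken from the draft) that scans for the octet 0x1E in the same record encoded as UTF-8 and as UTF-16. The helper name `rs_byte_offsets` and the sample record are hypothetical. The record contains U+1E00, whose UTF-16 encoding happens to include a 0x1E octet.

```python
RS = "\u001e"  # RECORD SEPARATOR (U+001E), the separator character in the draft


def rs_byte_offsets(data: bytes) -> list:
    """Return the offset of every 0x1E octet in data (hypothetical helper)."""
    return [i for i, b in enumerate(data) if b == 0x1E]


# One record: RS, then a JSON string holding the single character U+1E00, then LF.
text = RS + '"\u1e00"' + "\n"

utf8 = text.encode("utf-8")        # RS is the single octet 0x1E
utf16 = text.encode("utf-16-be")   # RS is the two octets 0x00 0x1E

# In UTF-8 the only 0x1E octet is the real separator.
print(rs_byte_offsets(utf8))
# In UTF-16 the real separator sits at an odd offset, and the payload
# character U+1E00 contributes a second, spurious 0x1E octet.
print(rs_byte_offsets(utf16))
```

In UTF-8, every octet of a multi-byte character has its high bit set, so 0x1E can only ever be U+001E. UTF-16 gives no such guarantee, which is why a scanner that looks for raw octet values cannot frame UTF-16 input reliably.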

What I am saying is that I would like this draft to say explicitly that the only profile of RFC7159 that can be used is the one where UTF-8 is in use, i.e. to include something like "The encoding MUST be UTF-8, although RFC7159 also allows other encodings, like UTF-16." Then, in the Security Considerations section, something like "RFC7159 allows not only UTF-8 encoding but also, for example, UTF-16, which MIGHT create problems for a parser, depending on what data is serialized."

I.e. I want this draft to be even tighter than RFC7159.
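
What a UTF-8-only rule could look like on the parsing side is sketched below in Python. This is an assumption-laden illustration, not the draft's algorithm: the function name `parse_json_text_sequence` and the `(ok, value_or_error)` result shape are invented for the example, and the draft's handling of truncated texts is not modelled.

```python
import json

RS = b"\x1e"  # separator octet, valid as a separator only under UTF-8


def parse_json_text_sequence(data: bytes) -> list:
    """Split on RS octets, then require each chunk to be strict UTF-8 JSON.

    Returns a list of (ok, value_or_error_message) tuples (hypothetical shape).
    """
    results = []
    for chunk in data.split(RS):
        if not chunk:
            continue  # nothing before the first RS, or two RS octets in a row
        try:
            text = chunk.decode("utf-8")  # strict decoding: invalid UTF-8 raises
            results.append((True, json.loads(text)))
        except (UnicodeDecodeError, json.JSONDecodeError) as exc:
            results.append((False, str(exc)))
    return results


good = b'\x1e{"a": 1}\n\x1e[1, 2]\n'
bad_utf8 = b'\x1e"\xff"\n'                       # 0xFF is never valid in UTF-8
utf16 = '\x1e{"a": 1}\n'.encode("utf-16-be")     # same record, wrong encoding

print(parse_json_text_sequence(good))
print(parse_json_text_sequence(bad_utf8))
print(parse_json_text_sequence(utf16))
```

With this approach the UTF-16 input is not silently misread: the split lands in the middle of a code unit and every resulting chunk fails to parse as JSON.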

Let me ask it this way: is there any reason to allow encodings other than UTF-8? If so, how do you handle the encoding of the separators?

>> This together implies that first of all this draft might not lead to
>> stable implementations, secondly one can not store in JSON strings
>> that include the Byte Order Mark, and there are other unspecified
>> situations.
> 
> If by that you mean that a JSON string may not contain U+FEFF, that is
> incorrect, for U+FEFF is recognized as a BOM only when placed at the
> beginning of an entity body, whereas an entity body in JSON format can
> begin only with [ or { classically, or by extension with [0-9"tfn].

OK, so what you are saying is that a string in an attribute value in the JSON blob can still start with U+FEFF?

If so, good, and my apologies for not understanding this when I read the text.
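
The distinction between U+FEFF inside a string and U+FEFF at the start of the whole text can be checked with a short Python sketch. It uses CPython's standard `json` module, whose behaviour is one implementation's choice and not a statement of what every parser does; RFC7159 itself says parsers MAY ignore a leading byte order mark. The helper `accepts` is hypothetical.

```python
import json

BOM = "\ufeff"  # U+FEFF, ZERO WIDTH NO-BREAK SPACE / byte order mark


def accepts(text: str) -> bool:
    """Return True if json.loads parses text without error (hypothetical helper)."""
    try:
        json.loads(text)
        return True
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False


# U+FEFF as the first character of a string value: an ordinary character.
inside = '{"name": "\ufeffabc"}'
# U+FEFF before the opening brace: a byte order mark on the JSON text itself.
leading = BOM + '{"name": "abc"}'

print(accepts(inside))
print(accepts(leading))
print(json.loads(inside)["name"][0] == BOM)
```

The same string value can also be written with the escape `\ufeff` in the JSON text, which avoids carrying the raw character at all.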

   Patrik