Re: [Json] Working Group Last Call on draft-ietf-json-text-sequence

Tim Bray <tbray@textuality.com> Thu, 22 May 2014 23:24 UTC

From: Tim Bray <tbray@textuality.com>
Date: Thu, 22 May 2014 18:24:02 -0500
Message-ID: <CAHBU6itW=UQq=w_wFwYJkZLT2GotUg_J1LGs-Fhcqg_vBd4+6A@mail.gmail.com>
To: Paul Hoffman <paul.hoffman@vpnc.org>
Archived-At: http://mailarchive.ietf.org/arch/msg/json/8FAvAes2nFjpXjpSeB3ZFvcSJEU
Cc: IETF JSON WG <json@ietf.org>
Subject: Re: [Json] Working Group Last Call on draft-ietf-json-text-sequence

From section 1:


   Ideally such datasets could be parsed and processed one element at a
   time.  Even if each element must be parsed in a not-online manner due
   to local choice of parser, the result will usually be sufficiently
   online: limited by the size of the biggest element in the sequence
   rather than by the size of the sequence.


The second sentence is klunky and I don’t understand what it’s trying
to say.  I suggest attaching the first sentence to the previous para
and just losing the 2nd sentence.

=====================

Just lose section 2.1.  Section 2 says perfectly clearly what a JSON
sequence is, and requires that the material separating the texts MUST
include one newline; so this language is at best redundant.  How about
putting a one-liner after the ABNF in section 2 saying:


“The effect of the ABNF is that the JSON texts in a JSON Text Sequence
are separated by whitespace which MUST include at least one newline
(U+000A) character.”
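
To make that one-liner concrete, here is a minimal sketch in Python of a
reader that enforces it. It assumes the reading given in the sentence above
(texts separated by whitespace containing at least one newline) and is not
taken from the draft's ABNF; the name split_sequence is hypothetical. It
relies on the standard library's json.JSONDecoder.raw_decode to find where
each text ends.

```python
import json

WHITESPACE = " \t\r\n"

def split_sequence(text):
    """Yield each JSON value in `text`, requiring a newline between texts.

    Hypothetical helper: it checks only the separator rule described above.
    """
    decoder = json.JSONDecoder()
    pos = 0
    first = True
    n = len(text)
    while True:
        # Consume the whitespace that precedes the next JSON text.
        start = pos
        while pos < n and text[pos] in WHITESPACE:
            pos += 1
        if pos == n:
            return
        # Between two texts, the separator must contain a newline.
        if not first and "\n" not in text[start:pos]:
            raise ValueError("no newline before the text at offset %d" % pos)
        # raw_decode returns the parsed value and the index where it ended.
        value, pos = decoder.raw_decode(text, pos)
        yield value
        first = False
```

Newlines inside a text (a pretty-printed object, say) are fine here, because
the decoder consumes them as part of that text; only the gap between texts is
checked.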

=====================

3., first para: missing word, should be “until they are closed”.

=====================

I suggest removing section 3. It’s not hard to construct something
that could appear in the middle of a JSON text that would match
“boundary”. As you note, this will not be a very common circumstance,
and I’m thinking of a bunch of different approaches I’d use.  In
practice, the best solution is probably to arrange that records
deterministically start with "\n{".
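
A rough sketch of what I mean, in Python. It assumes every record is a JSON
object and that the writer controls serialization; write_records and
read_records are hypothetical names, not anything from the draft. Since
json.dumps without an indent never emits a raw newline, a newline followed by
"{" can only occur at the start of a record, and the reader can resynchronize
by skipping to the next line.

```python
import json

def write_records(records):
    # Frame each object as a newline followed by its compact serialization.
    out = []
    for record in records:
        if not isinstance(record, dict):
            raise TypeError("this sketch only frames JSON objects")
        # Without `indent`, json.dumps emits no raw newlines; a newline inside
        # a string value is written as the two-character escape \n.
        out.append("\n" + json.dumps(record))
    return "".join(out)

def read_records(stream_text):
    # Yield parsed objects, skipping any line that is not a complete object.
    for line in stream_text.split("\n"):
        if not line.startswith("{"):
            continue
        try:
            yield json.loads(line)
        except ValueError:
            # Truncated or corrupt record: drop it and resume at the next line.
            continue
```

The reader never has to guess where a record begins, which is the whole
point: a damaged record costs one line, not the rest of the stream.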


Having said that, this is obviously an attack surface. I wonder how
well deployed JSON parsers would survive bogus sequences constructed
with careful maliciousness to drive typical parser implementations
into severe CPU or memory burn?  I bet I could construct a byte stream
that would cause serious pain to most of the popular parsers.

=====================

Security considerations:

- Why does the lack of an end-of-seq indicator constitute a security
issue?  I can believe it might be but there needs to be some sort of
threat model.

- this paragraph...


   JSON text sequence parsers based on non-incremental, non-online JSON
   text parsers will not be able to efficiently parser JSON texts in
   which newlines appear; attempting to parse such sequences with non-
   incremental, non-online JSON text parsers creates a compute resource
   exhaustion vulnerability.

Huh? What does the word “online” mean in this context?  Not a
rhetorical question, I genuinely don’t get it.  I propose the
following rewrite for Security Considerations:


<div>

All the security considerations of JSON [RFC7159] apply to JSON Text Sequences.


An attacker could try to cause breakage in JSON Text Sequence
processing software by arranging for a JSON text in a sequence to be
very large, for example by including very large keys or values in JSON
objects.  JSON allows parsing software to impose limits on such
lengths, and implementors of receiving software SHOULD implement and
enforce such limits.


An attacker could try to cause breakage in JSON Text Sequence
processing software by including syntactically broken JSON texts which
are designed to cause misbehavior in software attempting
resynchronization.

</div>
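
For illustration only, a sketch in Python of receiving software that enforces
such a limit. It assumes one JSON text per line with no raw newlines inside a
text, which is stricter than the draft requires; read_limited and
MAX_TEXT_BYTES are hypothetical names and the limit value is arbitrary.
Oversized texts are discarded in bounded chunks, and syntactically broken
ones are skipped so that processing resumes at the next line.

```python
import io
import json

MAX_TEXT_BYTES = 1 << 20  # hypothetical limit; choose one that suits the application

def read_limited(stream, max_bytes=MAX_TEXT_BYTES):
    """Yield JSON values from a binary stream, one text per line.

    Texts longer than `max_bytes` and texts that fail to parse are skipped.
    """
    while True:
        # readline(n) returns at most n bytes, so memory use stays bounded.
        line = stream.readline(max_bytes + 1)
        if not line:
            return  # end of stream
        if len(line) > max_bytes and not line.endswith(b"\n"):
            # Oversized text: throw away the rest of the line chunk by chunk.
            while line and not line.endswith(b"\n"):
                line = stream.readline(max_bytes + 1)
            continue
        if not line.strip():
            continue  # blank separator line
        try:
            yield json.loads(line)
        except ValueError:
            continue  # broken text; resume at the next line
```

Something like list(read_limited(io.BytesIO(data), max_bytes=4096)) would
exercise it on an in-memory byte string.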



On Thu, May 22, 2014 at 4:00 PM, Paul Hoffman <paul.hoffman@vpnc.org> wrote:

> This begins a two-week Working Group Last Call on
> draft-ietf-json-text-sequence <
> http://tools.ietf.org/html/draft-ietf-json-text-sequence>. We are seeking
> any and all comments on the document in order to assess the strength of
> consensus for it. If you are on this mailing list and have not yet read the
> document, please do so now, and then comment on the list. If you have
> already read the draft and commented, please feel free to do so again.
>
> As a reminder, comments can be anywhere in the continuum of "looks fine"
> to "terrible idea"; they can include questions; they can include
> suggestions for significant editorial changes; they can include minor
> editorial notes.
>
> --Matt Miller and Paul Hoffman



-- 
- Tim Bray (If you’d like to send me a private message, see
https://keybase.io/timbray)