Gen-ART and OPS-Dir review of draft-ietf-json-text-sequence-10

"Black, David" <david.black@emc.com> Wed, 10 December 2014 15:51 UTC

From: "Black, David" <david.black@emc.com>
To: "nico@cryptonector.com" <nico@cryptonector.com>, "General Area Review Team (gen-art@ietf.org)" <gen-art@ietf.org>, "ops-dir@ietf.org" <ops-dir@ietf.org>
Subject: Gen-ART and OPS-Dir review of draft-ietf-json-text-sequence-10
Date: Wed, 10 Dec 2014 15:51:05 +0000
Message-ID: <CE03DB3D7B45C245BCA0D243277949362B18C7@MX104CL02.corp.emc.com>
Archived-At: http://mailarchive.ietf.org/arch/msg/ietf/sMjzrvshdxMZhS8XNGELnNFN2bY
Cc: "Black, David" <david.black@emc.com>, "ietf@ietf.org" <ietf@ietf.org>, "json@ietf.org" <json@ietf.org>

The -10 version of this draft resolves items [A]-[E] from the
Gen-ART and OPS-Dir review of -09, and the IESG is in the process of
resolving the (silly) idnits complaint about RFC 20 being a possible
downref.

For item [D], a different approach was taken instead of modifying
the ABNF - the resulting new Section 2.4 is a definite improvement
to the draft, and is significantly clearer than the modified ABNF
would have been.  Nicely done.

Item [F] about the <angle-bracketed> text in the IANA Considerations
(Section 4) remains open - if the intent is to not deal with replacing
that text until after IESG approval, an RFC Editor Note to that effect
should be added to Section 4.

I have an additional editorial concern - given all the discussion about
UTF-8, it would be good for the draft to make it clear early on 
that JSON text sequences are UTF-8 only.  Here are some suggested changes.

Abstract:

   This document describes the JSON text sequence format and associated
   media type, "application/json-seq".  A JSON text sequence consists of
   any number of JSON texts, each prefix by an Record Separator
   (U+001E), and each ending with a newline character (U+000A).

"any number of JSON texts" -> "any number of UTF-8 encoded JSON texts"

It also looks like ASCII names for RS and LF are being mixed w/Unicode
codepoints in the second sentence in the abstract.  I'm not sure that's
a good thing to do, especially as the body of the draft refers to RS and
LF as being ASCII.  Here are a couple of changes that would remedy this:

   "an Record Separator (U+001E)" -> "an ASCII Record Separator (0x1E)"
   "a newline character (U+000A)" -> "an ASCII newline character (0x0A)"
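
As an illustration of why the two notations coincide on the wire (a minimal sketch, not text from the draft; the variable names are hypothetical): in UTF-8, U+001E and U+000A each encode to one octet, 0x1E and 0x0A, which are the same octets ASCII assigns to RS and LF.

```python
# Hypothetical names; only Python's built-in str.encode is used.
RS_CHAR = "\u001e"  # Unicode code point U+001E
LF_CHAR = "\u000a"  # Unicode code point U+000A

rs_octets = RS_CHAR.encode("utf-8")
lf_octets = LF_CHAR.encode("utf-8")

# Both are single octets, identical to their ASCII encodings.
same_as_ascii = (
    rs_octets == RS_CHAR.encode("ascii")
    and lf_octets == LF_CHAR.encode("ascii")
)
print(rs_octets, lf_octets, same_as_ascii)
```

The suggested wording therefore changes only the naming convention used in the abstract, not the octets a sequence contains.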

Section 2 JSON Text Sequence Format:

I suggest adding this sentence as a separate paragraph at the end of this
section (i.e., just before Section 2.1):

   JSON text sequences MUST use UTF-8 encoding; other encodings of JSON
   (i.e., UTF-16 and UTF-32) MUST NOT be used.

Aside from item [F], all of the above are editorial suggestions, but I
think making this clear early in the draft will help avoid potential
implementer confusion.
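
To make the format concrete, here is a minimal encoder sketch under the reading above (each JSON text is UTF-8 encoded, prefixed by RS and followed by LF). The function name and sample values are hypothetical and are not taken from the draft.

```python
import json

RS = b"\x1e"  # ASCII Record Separator
LF = b"\x0a"  # ASCII newline (line feed)


def encode_json_text_sequence(values):
    """Hypothetical helper: emit RS + UTF-8 JSON text + LF for each value."""
    out = bytearray()
    for value in values:
        # ensure_ascii=False keeps non-ASCII characters unescaped, so the
        # UTF-8 encoding step below is visible in the output.
        text = json.dumps(value, ensure_ascii=False)
        out += RS + text.encode("utf-8") + LF
    return bytes(out)


encoded = encode_json_text_sequence([{"a": 1}, "é", 42])
print(encoded)
```

Encoding to UTF-16 or UTF-32 at the `text.encode(...)` step is what the suggested MUST NOT sentence would rule out.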

Thanks,
--David

> -----Original Message-----
> From: Black, David
> Sent: Friday, December 05, 2014 9:51 AM
> To: nico@cryptonector.com; General Area Review Team (gen-art@ietf.org);
> ops-dir@ietf.org
> Cc: ietf@ietf.org; json@ietf.org; Black, David
> Subject: Gen-ART and OPS-Dir review of draft-ietf-json-text-sequence-09
> 
> This is a combined Gen-ART and OPS-Dir review.  Boilerplate for both follows
> ...
> 
> I am the assigned Gen-ART reviewer for this draft. For background on
> Gen-ART, please see the FAQ at:
> 
> <http://wiki.tools.ietf.org/area/gen/trac/wiki/GenArtfaq>.
> 
> Please resolve these comments along with any other Last Call comments
> you may receive.
> 
> I have reviewed this document as part of the Operational directorate's ongoing
> effort to review all IETF documents being processed by the IESG.  These
> comments were written primarily for the benefit of the operational area
> directors.  Document editors and WG chairs should treat these comments just
> like any other last call comments.
> 
> Document: draft-ietf-json-text-sequence-09
> Reviewer: David Black
> Review Date: Dec 5, 2014
> IETF LC End Date: Dec 5, 2014
> IESG Telechat date: Dec 18, 2014
> 
> Summary: This draft is on the right track, but has open issues
> described in the review.
> 
> This draft specifies a format that packs multiple JSON texts into a
> single string.  The ASCII RS (0x1E) character is used to separate texts,
> and a linefeed is appended to each text to ensure that a complete text
> always ends with a whitespace character.
> 
> All of the open issues are minor - the most important ones center on
> treatment of incomplete JSON texts - that appears to be an afterthought
> in this draft and needs more attention.  I also found a couple of
> minor issues in the Security and IANA Considerations sections, both of
> which are almost nits.
> 
> Major issues: None.
> 
> Minor issues:
> 
> [A] Section 2.1:
> 
>    If parsing of such an octet string as a JSON text fails, the parser
>    SHOULD nonetheless continue parsing the remainder of the sequence.
> 
> That's not quite right - there are two levels of parsing, JSON
> sequence parsing and JSON text parsing of each text in the sequence,
> both of which might be implemented in a single-pass parser.  For such an
> implementation, the above sentence could be (mis-)read to imply that the
> JSON text parse should resume from the point at which it failed, which
> would be silly (although I've seen heroic PL/1 parsers do exactly that).
> Instead, the parse needs to skip ahead to the next RS, ignoring the rest
> of the JSON text that failed to parse.  I suggest:
> 
>    If parsing of such an octet string as a JSON text fails, and the
>    octet string is followed by an RS octet, the parser
>    SHOULD nonetheless skip ahead to that RS octet and continue parsing
>    the remainder of the sequence from there.
> 
> That also covers the case where there is nothing more to parse after the
> JSON text that caused the parse failure.
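
The skip-ahead behavior suggested in [A] can be sketched as follows. This is an illustrative two-level parser under simplifying assumptions (the whole sequence is in memory, and the function name is hypothetical); it is not the draft's algorithm.

```python
import json

RS = b"\x1e"


def parse_json_text_sequence(data):
    """Hypothetical helper: return a list of (ok, value_or_raw_octets).

    Sequence level: split on RS. Text level: parse each octet string on
    its own. A failed text is reported and parsing resumes at the next RS,
    never from the point inside the text where the failure occurred.
    """
    results = []
    for chunk in data.split(RS):
        if not chunk:
            continue  # nothing before the first RS, or consecutive RS octets
        try:
            results.append((True, json.loads(chunk.decode("utf-8"))))
        except ValueError:  # covers JSONDecodeError and UnicodeDecodeError
            results.append((False, chunk))
    return results


# The second text is malformed; the third is still recovered.
parsed = parse_json_text_sequence(b'\x1e{"a": 1}\n\x1e{"b": \n\x1e[2, 3]\n')
print(parsed)
```

If the failing text is the last one, the loop simply ends, which matches the case where nothing remains to parse after the failure.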
> 
> [B] Section 2.3:
> 
> Is incremental parsing of a JSON text within a sequence allowed, or
> is the parser required to not produce any results until the parse of
> the entire text is successful?  I'd expect that incremental parsing
> is ok (so results may be produced from a text that ultimately fails
> to parse), and I think that's worth stating.
> 
> [C] Section 2.4:
> 
>    Parsers MUST check that any JSON texts that are a top-level number
>    include JSON whitespace ("ws" ABNF rule from [RFC7159]) after the
>    number, otherwise the JSON-text may have been truncated.
> 
> That reference to the "ws" rule doesn't get the job done because that
> rule allows a match to no characters - it's of the form ws = *( ... )
> where ... is the list of whitespace characters.  What's needed here is
> a rule of the form vws = 1*( ...) to force there to be at least one
> whitespace character, but see the next issue for a better way to deal
> with this topic by pulling the appended LF into the sequence parse
> instead of the text parse.
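
A small sketch of the point in [C], using regular expressions as stand-ins for the ABNF rules (the names `WS`, `VWS` and `parse_text_checked` are hypothetical): a zero-or-more rule also matches the empty string, so only the one-or-more form can reveal a truncated top-level number.

```python
import json
import re

WS = re.compile(rb"[ \t\n\r]*")   # "ws" in RFC 7159: zero or more characters
VWS = re.compile(rb"[ \t\n\r]+")  # the "vws = 1*( ... )" form: at least one

ws_matches_empty = WS.fullmatch(b"") is not None    # True: no whitespace needed
vws_matches_empty = VWS.fullmatch(b"") is not None  # False


def parse_text_checked(chunk):
    """Hypothetical helper: reject a top-level number with no trailing whitespace."""
    value = json.loads(chunk.decode("utf-8"))
    is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
    if is_number and chunk[-1:] not in (b" ", b"\t", b"\n", b"\r"):
        raise ValueError("top-level number may have been truncated")
    return value


complete = parse_text_checked(b"1234\n")  # whitespace present: accepted
try:
    parse_text_checked(b"123")  # could be a truncated "1234\n"
    truncated_rejected = False
except ValueError:
    truncated_rejected = True
```

Without the trailing-whitespace check, `b"123"` parses cleanly as the number 123 and the truncation goes unnoticed.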
> 
> [D] I wonder whether the possibility of incomplete texts ought to be
> encoded into the parsing rules to directly catch JSON texts that must
> be incomplete because the last character is not LF, e.g.:
> 
>      JSON-sequence = *(1*RS (possible-JSON / truncated-JSON / empty-JSON))
>      RS = %x1E ; "record separator" (RS), see RFC20
>      possible-JSON = 1*(not-RS) LF ; attempt to parse as UTF-8-encoded
>                                    ; JSON text (see RFC7159)
>      truncated-JSON = *(not-RS) not-LFRS ; truncated, don't attempt
>                                          ; to parse as JSON text
>      empty-JSON = LF ; only the LF appended by the encoder, nothing to parse
> 
>      not-RS = %x00-1D / %x1F-FF ; any octet other than RS
>      not-LFRS = %x00-09 / %x0B-1D / %x1F-FF ; any octet other than RS or LF
> 
> Note that this won't detect all incomplete JSON texts, because LF is allowed
> within a JSON text (and this should be stated).
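
The classification in [D] can be sketched at the sequence level without parsing any JSON. This is a minimal illustration assuming the whole sequence is in memory; the function names are hypothetical, and the labels mirror the rule names in the suggested ABNF.

```python
RS = b"\x1e"
LF = b"\x0a"


def classify(chunk):
    """Hypothetical helper: label one non-empty octet string found after an RS."""
    if chunk == LF:
        return "empty-JSON"      # only the LF appended by the encoder
    if chunk.endswith(LF):
        return "possible-JSON"   # worth handing to a JSON text parser
    return "truncated-JSON"      # last octet is not LF: do not parse


def classify_sequence(data):
    return [(classify(c), c) for c in data.split(RS) if c]


labels = classify_sequence(b'\x1e{"a": 1}\n\x1e\n\x1e[1, 2')
print(labels)

# LF is allowed inside a JSON text, so a text cut off just after an
# internal LF still looks complete at this level.
undetected = classify(b"[1,\n")
```

The last line shows the limitation noted above: `b"[1,\n"` is labeled `possible-JSON` even though it is incomplete, so the JSON text parser must still handle failures.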
> 
> [E] Section 3 - Security Considerations
> 
> Incomplete and malformed JSON texts can be used to attack JSON parsers -
> that should be pointed out, as I don't see that in RFC 7159's security
> considerations and incomplete texts are a relevant consideration for
> this draft.
> 
> [F] Section 4 - IANA Considerations
> 
>    Security considerations: See <this document, once published>,
>    Section 3.
> 
>    Interoperability considerations: Described herein.
> 
>    Published specification: <this document, once published>.
> 
>    Applications that use this media type: <by publication time
>    <https://stedolan.github.io/jq> is likely to support this format>.
> 
> Replace all three instances of the angle bracketed text.  The first two
> instances should be RFC references (e.g., RFC XXXX) w/a note to the RFC
> Editor to insert the number of the RFC when published.  The third instance
> should be resolved now, or could have an RFC Editor note added indicating
> that the author will resolve that during the Authors' 48 hours (AUTH48).
> 
> Nits/editorial comments:
> 
> idnits didn't like the reference to RFC 20 for ASCII:
> 
>   ** Downref: Normative reference to an Unknown state RFC: RFC   20
> 
> RFC 5234 (ABNF) uses this, which looks like a better reference:
> 
>    [US-ASCII]  American National Standards Institute, "Coded Character
>                Set -- 7-bit American Standard Code for Information
>                Interchange", ANSI X3.4, 1986.
> 
> --- Selected RFC 5706 Appendix A Q&A for OPS-Dir review ---
> 
> Most of these questions are n/a because this draft describes a format
> that will be used in other protocols to which RFC 5706's concerns would apply.
> 
> A.1.4   Have the Requirements on other protocols and functional
>        components been discussed?
> 
> The specification of the interaction of the JSON sequence parser with the
> JSON text parser is not as clear as it should be for incomplete or malformed
> JSON texts.  See Minor Issues [A]-[E] above.
> 
> A.1.8   Are there fault or threshold conditions that should be reported?
> 
> Yes, incomplete JSON texts - this is covered in sections 2.3 and 2.4.
> 
> Thanks,
> --David
> ----------------------------------------------------
> David L. Black, Distinguished Engineer
> EMC Corporation, 176 South St., Hopkinton, MA  01748
> +1 (508) 293-7953             FAX: +1 (508) 293-7786
> david.black@emc.com        Mobile: +1 (978) 394-7754
> ----------------------------------------------------