Re: [secdir] secdir review of draft-ietf-json-text-sequence-11

Nico Williams <nico@cryptonector.com> Tue, 16 December 2014 19:37 UTC

Date: Tue, 16 Dec 2014 13:37:09 -0600
From: Nico Williams <nico@cryptonector.com>
To: Carl Wallace <carl@redhoundsoftware.com>
Message-ID: <20141216193707.GE3241@localhost>
References: <D0B1EECD.29290%carl@redhoundsoftware.com> <20141216000109.GP3241@localhost> <D0B587AB.2948E%carl@redhoundsoftware.com> <20141216163238.GT3241@localhost> <D0B5C964.2954A%carl@redhoundsoftware.com> <20141216174829.GZ3241@localhost> <D0B5DC2E.295DB%carl@redhoundsoftware.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Disposition: inline
In-Reply-To: <D0B5DC2E.295DB%carl@redhoundsoftware.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
Content-Transfer-Encoding: quoted-printable
Archived-At: http://mailarchive.ietf.org/arch/msg/secdir/3fNJf1IHwNwMuFvBJwSHZQnLyjw
Cc: draft-ietf-json-text-sequence@tools.ietf.org, iesg@ietf.org, secdir@ietf.org
Subject: Re: [secdir] secdir review of draft-ietf-json-text-sequence-11

On Tue, Dec 16, 2014 at 01:46:52PM -0500, Carl Wallace wrote:
> On 12/16/14, 12:48 PM, "Nico Williams" <nico@cryptonector.com> wrote:
> >Supporting validation of signed sequences by first re-encoding the
> >sequence is absolutely not a goal.
> 
> Re-encoding is not the point. The value parsed from the sequence has an
> extra <LF> relative to what was passed in and would thus break a signature
> generated on the original value. I don’t think there’s any lack of clarity
> on this point. You already noted that in your view either the entire
> sequence must be signed or individual elements signed with a wrapper.  I
> should have written “if supporting detached signing is not important to
> anyone, OK”.

The answer is clear: either wrap-to-sign, or remove the LF prior to
validation (since it must have been added by the encoder; if the text
was truncated, then the signature should fail).  Or, if you're chaining
encoding, parsing, encoding, parsing, you must canonicalize (except that
there is no JSON canonical form), or wrap-to-sign, or not sign, or not
re-encode.  These choices fall out from the fact that JSON has no
canonical encoding form.
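
To illustrate the "remove the LF prior to validation" option, here is a
minimal Python sketch.  HMAC stands in for whatever signature scheme the
application really uses; KEY, sign() and verify_element() are hypothetical
names, not anything from the I-D.

```python
import hashlib
import hmac

KEY = b"demo-key"  # hypothetical shared key, for illustration only


def sign(text: bytes) -> bytes:
    # HMAC-SHA256 is only a stand-in for the application's signature scheme.
    return hmac.new(KEY, text, hashlib.sha256).digest()


def verify_element(element: bytes, signature: bytes) -> bool:
    # Drop exactly one trailing LF: the one the sequence encoder appended.
    if element.endswith(b"\n"):
        element = element[:-1]
    return hmac.compare_digest(sign(element), signature)


original = b'{"a": 1}'
sig = sign(original)
parsed_element = original + b"\n"  # what a sequence parser might hand back
truncated_element = b'{"a": 1'     # a truncated text must not validate
```

Here verify_element(parsed_element, sig) succeeds, while the truncated text
fails validation, as it should.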

If the encoder should have written <RS>123<LF><SP><LF><RS>4<LF> but
wrote <RS>123<LF><RS>4<LF>, then this would parse to the same two
elements (123 and 4), but any signatures of the first element taken over
123<LF><SP> would fail to validate.  That's too bad.

> >The whole section is about JSON text parse errors not being fatal for
> >sequence parsing.  I don't understand the objection.  Perhaps if you
> >propose text I will?
> 
> I think given the lending of <LF>s to the JSON text, there is no such
> thing as a JSON text sequence parsing error. You find an RS to terminate

Sure there is.  For example: <RS>{<LF><RS> (the LF might have been part
of the truncated JSON text).  Complying encoders shouldn't produce this,
but if they are writing log-style to a file, they might.  Which is why
the parse ABNF is more tolerant than the encode ABNF.
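
A rough Python sketch of a tolerant sequence parser, assuming the whole
sequence is already in memory: a bad element is reported but does not abort
the sequence.  parse_sequence() is a hypothetical name.

```python
import json

RS = b"\x1e"
WS = b" \t\r\n"


def parse_sequence(data: bytes):
    """Yield (value, error) pairs; one bad element does not stop the rest."""
    for chunk in data.split(RS):
        if not chunk.strip(WS):
            continue  # nothing between two RS bytes, or before the first RS
        try:
            yield json.loads(chunk.decode("utf-8")), None
        except ValueError as exc:  # covers JSON and UTF-8 decode errors
            yield None, exc


# <RS>{<LF><RS>{"ok": true}<LF> : the first element is truncated.
results = list(parse_sequence(b'\x1e{\n\x1e{"ok": true}\n'))
```

The first result carries an error and the second carries the parsed object.
An application that wants a bad element to fail the whole sequence can simply
stop at the first error.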

> an element or you run out of bytes and terminate the element - [...]

Pete Resnick, for example, wants the SHOULD NOT in section 2.3 to be
changed to something less strong.  My interpretation is that some people
want to parse JSON text sequences in contexts where failure to parse a
JSON text in the sequence should yield a failure to parse the sequence.

Therefore there is such a thing as a JSON text sequence parsing error.

>                                                        [...] - no failures
> at the sequence parsing level (though there may be errors that percolate
> to the JSON parser).

The application is on top, the JSON text parser at the bottom.  No
errors percolate to the JSON text parser :)

> >> >> [extensive discussion of the LF elided]
> >> 
> >> How can a decoder know that <RS>123<LF><RS> was what the originator
> >> intended and not something that was terminated by the text sequence
> >> encoder? The originator may have intended <RS>1234<ws><LF><RS>. There
> >> seems to be some assumption that the supplier of JSON text may fail to
> >> self-delimit but would not fail to supply the full value. It’s a
> >> contrived example, but how should an incremental JSON parser handle
> >> texts returned from a parser operating on the sequence:
> >> <RS>123<LF><RS>4<ws><LF><RS>?

I think we're miscommunicating.

The correct way to parse <RS>123<LF><RS>4<ws><LF><RS> is as two octet
strings containing putative JSON texts: 123 and 4<ws> -- keeping
the LF or not is irrelevant (except to your signature validation
concern).  Each of those then parses to the numbers 123 and 4,
respectively.

Therefore the JSON parser should not see <RS>123<LF><RS>4<ws><LF><RS>,
it should see 123 (or 123<LF>) and then 4<ws> (or 4<ws><LF>).  Feeding
the LF, or not, to the JSON text parser is not important, since it
shouldn't require it.

One might feed the entire sequence to the incremental JSON text parser
and rely on it complaining about <RS>, skipping past it, resetting the
parser's state, and restarting it.  That's a perfectly legitimate
implementation design.

One might also scan for <RS>, split on <RS>, and feed the resulting
possible-JSON texts to the JSON text parser.
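
The split-on-RS design can be sketched in a few lines of Python.  This is
only an illustration under the assumption that the sequence fits in memory;
split_elements() is a hypothetical helper, and <ws> is taken to be a single
space.

```python
import json

RS = b"\x1e"


def split_elements(data: bytes):
    # Everything between RS bytes is a putative JSON text; drop empty pieces.
    return [piece for piece in data.split(RS) if piece]


# <RS>123<LF><RS>4<ws><LF><RS>
sequence = b"\x1e123\n\x1e4 \n\x1e"
texts = split_elements(sequence)
values = [json.loads(text) for text in texts]
```

The JSON text parser is handed 123<LF> and 4<ws><LF>, never the RS, and the
trailing whitespace makes no difference to the values it returns.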

> >> Would it be two values 123 and 4 or one value 1234? Why would it not be

123 and 4.

> >> preferable to report an error here <RS>123<LF><RS> instead of trying to
> >> auto-terminate it when encoding the sequence?

I don't understand this question.

> You have avoided this question in a couple of forms now. An answer here
> would probably clarify things tremendously with regard to <LF> additions
> and how incremental parsing or detecting incomplete encoding is supposed
> to work.

LF is there for several reasons, one of them being to unambiguously
terminate non-self-delimiting top-level values.

> >The assumption is that the "process" writing the sequence will properly
> >encode the sequence elements,
> 
> My assumption was “a" process encoding a “sequence" may receive an
> "encoded sequence element” from “another” process, possibly as an API call
> from a different box.

I answered that earlier (right?) and then lost track of this concern.  A
process that merely adds RS/LF bracketing without parsing the sequence
elements... can produce invalid sequences, and might be encoding
sequence elements that are themselves sequences (though this is easy to
check for: just scan for RS in the input).
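
A minimal Python sketch of that check, for a bracketing-only encoder that
does not parse its input.  bracket() is a hypothetical name.  Note that this
catches embedded sequences only; it cannot tell whether the input is a
complete JSON text.

```python
RS = b"\x1e"
LF = b"\n"


def bracket(raw_text: bytes) -> bytes:
    # A valid JSON text never contains a raw RS (control characters must be
    # escaped inside strings), so an RS here means the input is suspect.
    if RS in raw_text:
        raise ValueError("input contains RS; it may itself be a sequence")
    return RS + raw_text + LF


ok = bracket(b"123")
try:
    bracket(b"\x1e1\n\x1e2\n")
    rejected = False
except ValueError:
    rejected = True
```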

Although such an encoder is not considered in the I-D, we could add some
text about it.  I'd rather say: don't do that, always encode the JSON
texts to encode the JSON text sequence.

> >and will write the <RS><element><LF>
> >sequence correctly.
> 
> The point is what if element is incomplete, either due to failure or
> incremental contribution to the sequence?

Section 2.1 says it's a possible JSON text.  Section 2.3 tells you what
to do if it's incomplete.  Section 2.4 explains how to determine if
<element> is complete when it's not a self-delimiting value (and, of
course, if it is self-delimiting, then no such check is needed, but the LF
is still useful for the other reasons that I gave).
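
As a sketch of the completeness check for non-self-delimiting values, on one
reading of Section 2.4: a top-level number, true, false, or null counts as
complete only if at least one JSON whitespace byte follows it.
looks_complete() is a hypothetical name, and the element is assumed to be the
bytes found between two RS bytes.

```python
import json

WS = b" \t\r\n"  # JSON whitespace bytes


def looks_complete(element: bytes) -> bool:
    try:
        value = json.loads(element)
    except ValueError:
        return False  # not a JSON text at all (e.g. truncated object)
    if isinstance(value, (str, list, dict)):
        return True  # strings, arrays and objects are self-delimiting
    # Number, true, false or null: require trailing whitespace, otherwise
    # the text may have been cut short (123 could have been 1234).
    return len(element.rstrip(WS)) < len(element)
```

With this check 123<LF> is accepted, while a bare 123 is flagged as possibly
truncated.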

> ><POSIX discussion elided>
> > This too is out of scope.
> 
> Which is fine because it is not the point.

If I still haven't answered your question then please propose some text.

> ><snip>
> >> 
> >> I guess we just disagree on whether the text sequence encoder is
> >> necessarily in a position to terminate data that may be incrementally
> >> supplied or incompletely supplied by a caller and whether or not this
> >> important function should be allocated to the caller instead of to the
> >> JSON text sequence encoder.  One alternative would be to add a <ws> only
> >
> >Of course a properly functioning encoder on a properly functioning system
> >is in a position to terminate each element.  How can this be in doubt?
> 
> The JSON text encoder may be on one system and the JSON text sequence
> encoder may be on another system (one of the examples in the draft is for
> logs, so this must already be assumed as possible). In the example above,
> if the text encoder fails after sending 123, the text sequence encoder
> will add an <LF> and a decoder will not detect truncation. How can this be
> in doubt?

It's not.  I just don't think one should encode JSON text sequences this
way.  After lunch I'll propose text explicitly requiring sequence
encoders to also invoke the JSON text encoder.
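
What that looks like in practice can be sketched in Python: the sequence
encoder takes values, not pre-encoded texts, and calls the JSON text encoder
itself, so the LF is only ever written after a complete text.
write_sequence() is a hypothetical name, and this version is not incremental.

```python
import io
import json

RS = b"\x1e"
LF = b"\n"


def write_sequence(out, values):
    for value in values:
        # json.dumps either returns a complete JSON text or raises, so a
        # partial text is never bracketed with RS and LF.
        text = json.dumps(value).encode("utf-8")
        out.write(RS + text + LF)


buf = io.BytesIO()
write_sequence(buf, [123, 4, {"a": [1, 2]}])
encoded = buf.getvalue()
```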

> >A sequence encoder might write() RS, then invoke an incremental JSON
> >text encoder to encode and write() the JSON text, then finally when the
> >JSON text encoder completes its task, the sequence encoder write()s the
> >LF.
> 
> This may be the source of confusion. Does incremental parsing encompass a
> single text sequence element or multiple text sequence elements?  Or can
> it be either way?

See above.

> I never associated re-encoded sequences with signature verification. I
> only asked about <LF> accumulation during re-encoding.

If you parse a sequence and its elements, and re-encode, you'll get more
differences than different amounts of LFs.
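
A small Python illustration, using the standard json module as an arbitrary
example of a parser/encoder pair: the value survives the round trip, but
whitespace, number formatting and string escapes do not.

```python
import json

original = b'{"b" :1.0e2,\n "a":"\\u0041"}'
reencoded = json.dumps(json.loads(original)).encode("utf-8")

same_value = json.loads(reencoded) == json.loads(original)
same_bytes = reencoded == original
```

Here same_value is True and same_bytes is False, so a signature taken over
the original octets would not validate against the re-encoded text.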

> >-- it is commonly
> >accepted and strongly recommended practice that signatures should be
> >validated over what is signed, then and only then (after the signature
> >is validated successfully) should the payload be parsed.  If you have
> >any other security concerns relating to the LF, let's hear them.
> 
> No additional security concerns relating to the LF other than what has
> been stated (non-support for detached signatures, potential to alter
> interpretation of elements in some circumstances). As noted above,
> re-encoding is not and has not been the point re: signature verification.

OK.

Nico
--