[mile] Benjamin Kaduk's Discuss on draft-ietf-mile-jsoniodef-10: (with DISCUSS and COMMENT)

Benjamin Kaduk via Datatracker <noreply@ietf.org> Wed, 04 September 2019 23:08 UTC


Benjamin Kaduk has entered the following ballot position for
draft-ietf-mile-jsoniodef-10: Discuss

When responding, please keep the subject line intact and reply to all
email addresses included in the To and CC lines. (Feel free to cut this
introductory paragraph, however.)


Please refer to https://www.ietf.org/iesg/statement/discuss-criteria.html
for more information about IESG DISCUSS and COMMENT positions.


The document, along with other ballot positions, can be found here:
https://datatracker.ietf.org/doc/draft-ietf-mile-jsoniodef/



----------------------------------------------------------------------
DISCUSS:
----------------------------------------------------------------------

We use a subset of the JSON "number" type to represent integers, which
inherits JSON's range limits on numbers.  My understanding is that such
limits are not present in IODEF XML (e.g., we do not specify a
totalDigits value), so this is a new limitation of the JSON format that
needs to be documented (and, technically, drops us out of full parity
with the XML form).
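
To make the concern concrete, here is a minimal Python sketch.  It
assumes a JSON implementation that stores every number as an IEEE 754
binary64 double, which RFC 8259 describes as common; the helper name
is hypothetical and nothing here comes from the draft itself.

```python
# Sketch: what happens to an integer when a JSON parser stores all
# numbers as IEEE 754 binary64 doubles (an assumed implementation choice).

MAX_SAFE = 2**53 - 1  # top of the interoperable integer range in RFC 8259


def through_double(n: int) -> int:
    """Round-trip an integer through a binary64 float."""
    return int(float(n))


small = MAX_SAFE
large = 2**53 + 1

print(through_double(small) == small)  # inside the range: value preserved
print(through_double(large) == large)  # outside the range: value changed
```

An XML integer with no totalDigits restriction has no such ceiling,
which is why the limit seems worth stating explicitly.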

The JSON "examples" seem to use a "//" notation for comments, which is
neither valid JSON nor described by draft-zyp-json-schema, and so
appears to make the examples malformed (absent some other disclaimer of
the commenting convention).
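
A quick way to see the problem is to feed one of the snippets to a
strict parser.  The sketch below uses Python's standard json module
and the body of the MLStringType snippet from Section 2.2.2, wrapped
as a bare object; the helper name and the comment-stripping step are
my own illustration.

```python
import json

# Body of the draft's MLStringType snippet, "//" comments included.
snippet = '''{
  "value": "free-form text",      //STRING
  "lang": "en",                   //ENUM
  "translation-id": "jp2en0023"   //STRING
}'''


def is_valid_json(text: str) -> bool:
    """Return True if a strict JSON parser accepts the text."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


# Same content with everything from "//" to end of line removed.
# Naive on purpose: only safe because no string value here contains "//".
stripped = "\n".join(line.split("//")[0] for line in snippet.splitlines())

print(is_valid_json(snippet))   # the parser rejects the commented form
print(is_valid_json(stripped))  # the parser accepts the stripped form
```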

How does STRUCTUREDINFO relate to EXTENSION?  What makes one vs the
other appropriate for a given piece of information?  Since the former is
only in RFC 7203 and not 7970, we do not have an easy reference for
their interplay, given 7970's minimal discussion of the use of 7203.
(It sounds like STRUCTUREDINFO is for structures from other published
specifications and EXTENSION is for more local/custom things, but I'm
not entirely sure if that's exactly the intended split.)

Can the shepherd please report on what level of validation has occurred
on the CDDL syntax, the mappings between RFC 7970's content and this
document's content, and the consistency between the formal syntax and
the body text (e.g., listings of enum values, member fields of each
type, etc.)?


----------------------------------------------------------------------
COMMENT:
----------------------------------------------------------------------

It's somewhat surprising to see CBOR used but with CBOR maps required to
use string form for representing map keys (i.e., no short integer key
values are defined).  Some of the strings that are map keys in the JSON
objects are fairly long; is this extra space not a concern for the
overall CBOR encoded document (e.g., due to containing a large quantity
of binary data such that encoding overhead is a small relative portion
of the encoded document)?
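
For a sense of scale, the following Python sketch hand-encodes CBOR
map keys per the CBOR initial-byte rules and compares a text key with
a small integer key.  The integer assignment (3) is purely
hypothetical, since the draft defines no integer keys, and the helper
names are mine.

```python
def cbor_head(major: int, n: int) -> bytes:
    """Encode a CBOR initial byte plus argument, for arguments < 2**16."""
    if n < 24:
        return bytes([(major << 5) | n])
    if n < 256:
        return bytes([(major << 5) | 24, n])
    if n < 65536:
        return bytes([(major << 5) | 25]) + n.to_bytes(2, "big")
    raise ValueError("argument too large for this sketch")


def text_key(name: str) -> bytes:
    """Encode a map key as a CBOR text string (major type 3)."""
    data = name.encode("utf-8")
    return cbor_head(3, len(data)) + data


def int_key(n: int) -> bytes:
    """Encode a map key as a CBOR unsigned integer (major type 0)."""
    return cbor_head(0, n)


print(len(text_key("translation-id")))  # 1 head byte + 14 characters
print(len(int_key(3)))                  # hypothetical integer key: 1 byte
```

The per-key difference is paid once for every occurrence of the key,
so it only becomes negligible when the values dominate the document.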

I guess that since it seems to only be used in (non-normative) Appendix
B, [jsonschema] can remain as an informative reference, though it would
be nice to have a citation where it is actually used in the document, as
opposed to just in the Introduction.  Since it is an informative
reference, the following point is not Discuss-worthy: The current
citation gives no absolute locator, leaving me somewhat unclear about
whether to consult draft-zyp-json-schema or something on json-schema.org
or some other source.  The contents of Appendix B suggest it is the
second of those...

Section 1

   processing is JSON.  To facilitate the automation of incident
   response operations, IODEF documents should support JSON
   representation.

Is it documents or implementations that should support the JSON
representation?

Section 2.1

The string "STRUCTUREDINFO" does not appear in RFC 7203, so I think we
need some additional locator information to indicate what behavior we're
referring to.

Using CBOR major type 2 for HEXBIN implies that actual binary values
are recorded directly, i.e., without any "hex" encoding.  We should be clear
about this one way or the other, and I didn't really see anything in the
CDDL schema that called this out.
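
The two readings produce different bytes on the wire.  The Python
sketch below hand-encodes a short value both ways; the value
"deadbeef" and the helper name are made up for illustration, and the
helper only handles payloads shorter than 24 bytes.

```python
def cbor_short(major: int, payload: bytes) -> bytes:
    """Encode a CBOR string of the given major type (payload < 24 bytes)."""
    if len(payload) >= 24:
        raise ValueError("payload too long for this sketch")
    return bytes([(major << 5) | len(payload)]) + payload


hex_text = "deadbeef"          # hypothetical HEXBIN value in textual form
raw = bytes.fromhex(hex_text)  # the 4 octets that the text denotes

# Reading 1: major type 2 carries the raw octets.
as_bstr = cbor_short(2, raw)
# Reading 2: the hex characters themselves are carried, as text.
as_tstr = cbor_short(3, hex_text.encode("ascii"))

print(as_bstr.hex(), len(as_bstr))
print(as_tstr.hex(), len(as_tstr))
```

A receiver that expects one reading and gets the other will recover
the wrong value without any decoding error, which is why I would like
the text to pick one explicitly.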

Section 2.2.2

It's not clear to me why using a plain text string is allowed for
representing an ML_STRING, as on the face of it that could lose language
information.  Is the idea that this is supposed to inherit from some
higher-level element, or just to reflect an efficiency of encoding when
neither of the optional language/translation-id fields is present?
Regardless, we should be more clear about that, since neither here nor
Section 5 includes any discussion thereof.  I see that Section 3.2 does
have a brief statement about this being a change from RFC 7970, but I'd
still like to see a little more clarity on this point.
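
If the second reading is the intended one (the bare string is just a
shorthand), a consumer would presumably normalize both forms along
these lines.  This Python sketch is only my guess at the semantics;
the function name is hypothetical and the draft does not say this.

```python
def normalize_ml_string(item):
    """Return an ML_STRING as a dict, accepting either encoding.

    Assumes a bare string means "value only, no lang and no
    translation-id", which is an interpretation, not draft text.
    """
    if isinstance(item, str):
        return {"value": item}
    if isinstance(item, dict) and "value" in item:
        return dict(item)  # copy so the caller's object is untouched
    raise TypeError("not a recognizable ML_STRING encoding")


print(normalize_ml_string("free-form text"))
print(normalize_ml_string({"value": "free-form text", "lang": "en"}))
```

Under the other reading (inheritance from an enclosing element), the
bare-string branch would instead need the parent's language passed
in, so the two interpretations are not interchangeable.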

   Examples are shown below.

   "MLStringType": {
     "value": "free-form text",                              //STRING
     "lang": "en",                                             //ENUM
     "translation-id": "jp2en0023"                           //STRING
   }

That looks more like a schema than an example (especially with those
//-comments!).  Also, nit-level, but there's only one, so "examples"
plural does not apply.

Section 2.2.4

   SoftwareReference class is a reference to a particular version of
   software.  Examples are shown below.

"class" seems to be the prevailing RFC 7970 terminology but is not used
much in this document; we seem to use "type" instead.

   "SoftwareReference": {
     "value": "cpe:/a:google:chrome:59.0.3071.115",    //STRING
     "spec-name": "cpe",                                 //ENUM

RFC 7970 suggests that the value portion is interpreted solely within
the context defined by the "spec-name", so it's unclear to me if the
initial "cpe:" prefix in this example is representative.

Section 2.2.5

I'd suggest adding a sentence along the lines of "Note that the
structure of this information is not interpreted in the IODEF JSON, and
the word 'structured' indicates that the data item has internal
structure that is intended to be processed outside of the IODEF
framework."

   When embedding the raw data, base64 encoding defined in Section 4 of
   [RFC4648] SHOULD be used for encoding the data, as shown below.

Does this apply just to JSON or to CBOR as well?  I'm not sure if the
CBOR HEX encoding actually uses raw binary or not, which would be more
compact (and more recommendable?).
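
For reference, the size difference is easy to quantify.  The Python
sketch below uses the standard base64 module with the RFC 4648
Section 4 alphabet; the 30-octet payload is arbitrary sample data,
not anything from the draft.

```python
import base64

raw = bytes(range(30))  # arbitrary 30-octet sample payload

# What a JSON document would carry: base64 text.
b64_text = base64.b64encode(raw).decode("ascii")

print(len(raw))       # octets if carried directly as binary
print(len(b64_text))  # characters once base64-encoded (4 per 3 octets)
print(base64.b64decode(b64_text) == raw)  # encoding is lossless
```

If the base64 SHOULD also applies to CBOR, that one-third expansion is
paid on top of a format that could have carried the octets as-is.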

Section 3.1

[[I stopped verifying mappings at DetectionPattern]]

Section 3.2

   o  Attributes and elements of each class in XML IODEF document are
      both presented as JSON attributes in JSON IODEF document, and the
      order of their appearances is ignored.

Are there any practical consequences of this loss of information that we
should discuss?

   o  The elements of ML_STRING type in XML IODEF document are presented
      as either STRING type or ML_STRING type in JSON IODEF document.

Why?

Section 5

   SpecID = "urn:ietf:params:xml:ns:mile:mmdef:1.2" /  "private"

This enum is managed by IANA; shouldn't we have some sort of signal in
the CDDL to indicate it is extensible?  (Applies to any enum maintained
by IANA.)

   PortlistType = text .regexp "\\d+(\\-\\d+)?(,\\d+(\\-\\d+)?)*"

\d matches by Unicode properties; I think that we want just [0-9] here?
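
The effect can be demonstrated with Python's re module, whose \d also
matches any Unicode decimal digit for str patterns.  This is only an
analogy: CDDL's .regexp uses the XSD regular expression flavor, not
Python's, and the ASCII-only pattern below is my suggested rewrite
rather than text from the draft.

```python
import re

# Pattern as written in the draft's CDDL (with CDDL string escaping undone).
draft_pattern = r"\d+(\-\d+)?(,\d+(\-\d+)?)*"
# Suggested ASCII-only alternative.
ascii_pattern = r"[0-9]+(-[0-9]+)?(,[0-9]+(-[0-9]+)?)*"

ordinary = "22,80-443"
arabic_indic = "\u0662\u0662"  # two ARABIC-INDIC DIGIT TWO characters

for name, pattern in (("draft", draft_pattern), ("ascii", ascii_pattern)):
    print(
        name,
        bool(re.fullmatch(pattern, ordinary)),
        bool(re.fullmatch(pattern, arabic_indic)),
    )
```

Both patterns accept the ordinary port list, but only the \d form
also accepts the non-ASCII digits.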

Section 6

I think we are making use of some fields whose content is controlled by
IANA, so we may wish to consider asking IANA to update the description
of the registry(ies) in question to include the JSON and CBOR usage.

Appendix B

Is it more useful to have a non-normative JSON schema in the document,
or a pointer to a tool that can generate one from the CDDL?
(Alternately, how was this one generated -- were there any manual
modifications needed from the output of some tool?)