Re: [mile] Benjamin Kaduk's Discuss on draft-ietf-mile-jsoniodef-10: (with DISCUSS and COMMENT)

Benjamin Kaduk <kaduk@mit.edu> Tue, 14 January 2020 22:34 UTC

Date: Tue, 14 Jan 2020 14:34:02 -0800
From: Benjamin Kaduk <kaduk@mit.edu>
To: Takeshi Takahashi <takeshi_takahashi@nict.go.jp>
Cc: "'The IESG'" <iesg@ietf.org>, mile@ietf.org, mile-chairs@ietf.org, draft-ietf-mile-jsoniodef@ietf.org
Message-ID: <20200114223402.GR66991@kduck.mit.edu>
References: <156763852105.22719.3719785399652487432.idtracker@ietfa.amsl.com> <006101d5aeb9$71c275d0$55476170$@nict.go.jp>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <006101d5aeb9$71c275d0$55476170$@nict.go.jp>
User-Agent: Mutt/1.12.1 (2019-06-15)
Archived-At: <https://mailarchive.ietf.org/arch/msg/mile/qVUL4hD5LtdxQixPH2xTdp_t8jE>
Subject: Re: [mile] Benjamin Kaduk's Discuss on draft-ietf-mile-jsoniodef-10: (with DISCUSS and COMMENT)

Hi Take,

Sorry for the slow response.

On Mon, Dec 09, 2019 at 09:52:37AM -0800, Takeshi Takahashi wrote:
> Hi Benjamin,
> 
> Thank you very much for your kind replies.
> Let us reply your comments as follows.
> 
> Below are the current version of the draft that reflected your kind comments.
> https://github.com/milewg/draft-ietf-mile-jsoniodef
> 
> > We use a subset of the JSON "number" type to represent integers, which inherits JSON's range limits on numbers.  My understanding is that such limits are not present in IODEF XML (e.g., we do not specify a totalDigits value), so this is a new limitation of the JSON format that needs to be documented (and, technically, drops us out of full parity with the XML form).
> 
> We were not aware of these limitations.
> So we have checked the web: http://json.org/, but we could not find any sentences that limit the limitation of numbers.

Sorry for being terse; I thought it was more well-known.
It stems from the common JSON usage of IEEE 754 floating-point (binary64)
numbers for the JSON number type, which only has full precision for integer
values between (basically) -2**53 and 2**53.  Not all integers outside that
range are representable as binary64 floats, so strange behaviors can result
if attempts are made to use them.
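
To make that concrete, here is a minimal sketch in Python (my choice of
language for illustration; nothing in the draft depends on it).  The helper
name survives_binary64 is made up for this example, and parse_int=float is
used only to mimic a parser that stores every JSON number as binary64.

```python
import json

LIMIT = 2**53  # binary64 carries a 53-bit significand


def survives_binary64(n: int) -> bool:
    """Return True if n is unchanged by a round trip through a binary64 float."""
    return int(float(n)) == n


# Python's json module keeps integers exact by default; parse_int=float
# mimics a parser that stores every JSON number as binary64.
as_float = json.loads(str(LIMIT + 1), parse_int=float)

print(survives_binary64(LIMIT))      # True
print(survives_binary64(LIMIT + 1))  # False
print(int(as_float) == LIMIT + 1)    # False
```

The last line is the "strange behavior" in miniature: the value that comes
back out of the parser is not the value that was written.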

> In CBOR, we cope with bignum in our draft, and the byte length of bignum is unlimited, in our understanding.

I believe that's correct.
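
As a rough sketch of what the tag 2 encoding from RFC 7049 Section 2.4.2
looks like in practice (Python, illustration only; encode_bignum is a
made-up helper that only handles non-negative values whose byte string is
shorter than 24 bytes):

```python
def encode_bignum(n: int) -> bytes:
    """Encode a non-negative integer as a CBOR tag 2 bignum.

    Sketch only: the payload must be shorter than 24 bytes so that its
    length fits in the initial byte of the byte string.
    """
    if n < 0:
        raise ValueError("tag 3 (negative bignum) is not covered here")
    payload = n.to_bytes((n.bit_length() + 7) // 8 or 1, "big")
    if len(payload) >= 24:
        raise ValueError("longer length encodings are not covered here")
    tag = bytes([0b110_00010])                  # major type 6, tag 2
    head = bytes([0b010_00000 | len(payload)])  # major type 2, length
    return tag + head + payload


print(encode_bignum(2**64).hex())
```

For 2**64 the byte-string head comes out as 0x49, which is the bit pattern
0b010_01001 (major type 2, length 9).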

> ---------------------------
> 2.4.2.  Bignums
> 
>    Bignums are integers that do not fit into the basic integer
>    representations provided by major types 0 and 1.  They are encoded as
>    a byte string data item, which is interpreted as an unsigned integer
>    n in network byte order.  For tag value 2, the value of the bignum is
>    n.  For tag value 3, the value of the bignum is -1 - n.  Decoders
>    that understand these tags MUST be able to decode bignums that have
>    leading zeroes.
> 
>    For example, the number 18446744073709551616 (2**64) is represented
>    as 0b110_00010 (major type 6, tag 2), followed by 0b010_01001 (major
>    type 2, length 9), followed by 0x010000000000000000 (one byte 0x01
>    and eight bytes 0x00).  In hexadecimal:
> 
>    C2                        -- Tag 2
>       29                     -- Byte string of length 9
>          010000000000000000  -- Bytes content
> ---------------------------
> 
> So, we are not sure whether we are pushing any limitation on the numbers in our draft.
> We would appreciate any further information on this matter.

Per the above, JSON integers with magnitude greater than 2**53 should not
be assumed to be portable.  At a minimum, we should note this limitation
explicitly (though other workarounds are possible, e.g., "JSON bignums" aka
string representations of decimal integers).
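
A sketch of the string-representation workaround, in Python for
illustration.  The convention shown here (numbers inside the safe range,
decimal strings outside it) and the names encode_int/decode_int are
assumptions of mine, not something the draft specifies.

```python
import json

SAFE_MAX = 2**53  # largest magnitude assumed portable as a JSON number


def encode_int(n: int):
    """Emit a JSON number when portable, else a decimal string."""
    return n if abs(n) <= SAFE_MAX else str(n)


def decode_int(v) -> int:
    """Accept either form produced by encode_int."""
    return int(v) if isinstance(v, str) else v


doc = json.dumps({"small": encode_int(42), "large": encode_int(2**64)})
parsed = json.loads(doc)
print(doc)
print(decode_int(parsed["large"]) == 2**64)  # True
```

The cost is that receivers have to know which members may carry the string
form, so the schema would need to say so.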

> > The JSON "examples" seem to be using a "//" notation for comments, that is not valid JSON nor described by draft-zyp-json-schema, thus appearing to make the examples malformed (absent some other disclaimer of the commenting convention).
> 
> Agreed. The latest version of the draft has the following notes to cope with this comment.
> 'Note that in figures throughout this document, some supplementary information follows "#", but these are not valid syntax in JSON, but are intended to facilitate reader understanding.'

Thanks, that works for me.
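
For anyone wondering why the disclaimer matters: a strict parser rejects
the annotated figures outright.  A minimal sketch using Python's standard
json module (is_strict_json is a made-up helper; the sample text is
modeled on the MLStringType figure):

```python
import json


def is_strict_json(text: str) -> bool:
    """Return True if text parses as JSON under Python's json module."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


slash_annotated = '{"value": "free-form text", //STRING\n "lang": "en" //ENUM\n}'
hash_annotated = '{"lang": "en" #ENUM\n}'
plain = '{"value": "free-form text", "lang": "en"}'

print(is_strict_json(slash_annotated))  # False
print(is_strict_json(hash_annotated))   # False
print(is_strict_json(plain))            # True
```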

> > How does STRUCTUREDINFO relate to EXTENSION?  What makes one vs the other appropriate for a given piece of information?  Since the former is only in  RFC 7203 and not 7970, we do not have an easy reference for their interplay, given 7970's minimal discussion of the use of 7203. (It sounds like STRUCTUREDINFO is for structures from other published specifications and EXTENSION is for more local/custom things, but I'm not entirely sure if that's exactly the intended split.)
> 
> IODEF version 1 had several extension mechanisms, using AdditionalData class (EXTENSION type).
> RFC 7203 specifies one structured use of AdditionalData so that it facilitates to embed XML data.
> RFC 7970 defines version 2 of the extension, but we thought defining next (independent) version of RFC7203 does not make sense and included the context inside the draft.
> In the xml version of the STRUCTUREDINFO indicates the schema/structure of the content, thus the receiver may validate the content, while AdditionalData could be used to fit any types of data.
> If receiver does not care about the structure of the data, one may not have to use STRUCTUREDINFO.

Thanks for the extra explanation.  I see that the -12 has a note in Section
2.2.5 clarifying that "structured" is not processed by the IODEF framework
but carries data that has structure to be interpreted externally.  Could
we also have a similar note in Section 2.2.6 to clarify the usage of
EXTENSION (e.g., for data whose structure is to be interpreted by IODEF)?


I'd also still like to see a response from the shepherd to:

% Can the shepherd please report on what level of validation has occurred
% on the CDDL syntax, the mappings between RFC 7970's content and this
% document's content, and the consistency between the formal syntax and
% the body text (e.g., listings of enum values, member fields of each
% type, etc.)?

> > It's somewhat surprising to see CBOR used but with CBOR maps required to use string form for representing map keys (i.e., no short integer key values are defined).
> > Some of the strings that are map keys in the JSON objects are fairly long; is this extra space not a concern for the overall CBOR encoded document (e.g., due to containing a large quantity of binary data such that encoding overhead is a small relative portion of the encoded document)?
> 
> Thank you for pointing this issue.
> We agree. Currently, we have not been sticking to shortening the byte sizes.
> If it is necessary, we can define the map keys in the JSON objects, but we are not sure whether we want to define it here.

It would be highly surprising to me to see CBOR usage that does not attempt
to minimize the encoding size -- the 'C' is for "Concise", after all.  What
is the motivation for defining/allowing CBOR encoding in the first place?
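
To put rough numbers on the key overhead, here is a small Python sketch.
The key names are taken from the examples in the draft; the integer
assignments (0, 1, 2, ...) are purely hypothetical, since no such mapping
is defined anywhere.

```python
def head_size(n: int) -> int:
    """Bytes needed for a CBOR initial byte plus its length/value argument."""
    if n < 24:
        return 1
    if n < 2**8:
        return 2
    if n < 2**16:
        return 3
    if n < 2**32:
        return 5
    return 9


def text_key_size(key: str) -> int:
    """Encoded size of a text-string (major type 3) map key."""
    body = key.encode("utf-8")
    return head_size(len(body)) + len(body)


def uint_key_size(key: int) -> int:
    """Encoded size of an unsigned-integer (major type 0) map key."""
    return head_size(key)


names = ["value", "lang", "translation-id", "spec-name"]
text_total = sum(text_key_size(k) for k in names)
uint_total = sum(uint_key_size(i) for i, _ in enumerate(names))  # hypothetical ids
print(text_total, uint_total)  # 36 4
```

Whether that difference matters depends on how much of a typical document
is keys versus payload, which is the question I was asking.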

> > I guess that since it seems to only be used in (non-normative) Appendix B, [jsonschema] can remain as an informative reference, though it would be nice to have a citation where it is actually used in the document, as opposed to just in the Introduction. 
> > Since it is an informative reference, the following point is not Discuss-worthy: The current citation gives no absolute locator, leaving me somewhat unclear about whether to consult draft-zyp-json-schema or something on json-schema.org or some other source. 
> > The contents of Appendix B suggest it is the second of those...
> 
> We agree, we should put the reference here.
> As you guessed, we followed json-schema.org.
> Therefore we put the reference to json-schema.org in Appendix B.

I only see the "[jsonschema]" tag but no explicit mention of
json-schema.org in the corresponding entry in the references section.

> > Section 1
> 
> >    processing is JSON.  To facilitate the automation of incident
> >    response operations, IODEF documents should support JSON
> >    representation.
> >
> > Is it documents or implementations that should support the JSON representation?
> 
> We hope IODEF as a specification should support JSON representation.
> Therefore, we hope to say that IODEF documents should support the JSON representation.
> Having said that, implementation also needs to support JSON representation.
> Therefore, let us rephrase so that IODEF documents and implementations should support the JSON representation.

Thanks!

> > Section 2.1
> >
> > The string "STRUCTUREDINFO" does not appear in RFC 7203, so I think we need some additional locator information to indicate what behavior we're referring to.
> 
> Thank you very much for pointing this here.
> We have added the following sentence: "Note that this type was originally specified in Section 4.4 of <xref target="RFC7203" /> as a basic structure of its extension classes"
> 
> > Using CBOR major type 2 for HEXBIN implies that actual binary values are recorded directly, i.e., without any "hex" encoding.  We should be clear about this one way or the other, and I didn't really see anything in the CDDL schema that called this out.
> 
> Figure 1 of the draft has the following information.
> 
> ```txt
>  | BYTE            | Section 2.5.1     | "string" per [RFC8259]        |
>  | BYTE[]          | Section 2.5.1     | "string" per [RFC8259]        |
>  | HEXBIN          | Section 2.5.2     | "string" per [RFC8259]        |
>  | HEXBIN[]        | Section 2.5.2     | "string" per [RFC8259]        |
> ```
> 
> If we see the reference, it says as follows.
> 
> ```txt
> 2.5.2.  Hexadecimal Bytes
>
>    A binary octet encoded as a character tuple consistent of two
>    hexadecimal digits is represented in the information model by the
>    HEXBIN data type.  A sequence of these octets is of the HEXBIN[] data
>    type.
>
>    The HEXBIN and HEXBIN[] data types are implemented in the data model
>    as an "xs:hexBinary" type per Section 3.2.15 of [W3C.SCHEMA.DTYPES].
> ```
> 
> So, if readers follow the reference, readers can reach the information.
> We are not sure whether we should put more information here.
> If you think it is better, we will add some sentences here, we rather prefer to keep it simple (because the draft is already too long).

I'm referring to Figure 2, not Figure 1:

| HEXBIN          | 2                | bytes                           |
| HEXBIN[]        | 2                | bytes                           |

Following RFC 7049, CBOR major type 2 is:

   Major type 2:  a byte string.  The string's length in bytes is
      represented following the rules for positive integers (major type
      0).  For example, a byte string whose length is 5 would have an
      initial byte of 0b010_00101 (major type 2, additional information
      5 for the length), followed by 5 bytes of binary content.  A byte
      string whose length is 500 would have 3 initial bytes of
      0b010_11001 (major type 2, additional information 25 to indicate a
      two-byte length) followed by the two bytes 0x01f4 for a length of
      500, followed by 500 bytes of binary content.

as distinct from

   Major type 3:  a text string, specifically a string of Unicode
      characters that is encoded as UTF-8 [RFC3629].  The format of this
      type is identical to that of byte strings (major type 2), that is,
      as with major type 2, the length gives the number of bytes.  This
      type is provided for systems that need to interpret or display
      human-readable text, and allows the differentiation between
      unstructured bytes and text that has a specified repertoire and
      encoding.  In contrast to formats such as JSON, the Unicode
      characters in this type are never escaped.  Thus, a newline
      character (U+000A) is always represented in a string as the byte
      0x0a, and never as the bytes 0x5c6e (the characters "\" and "n")
      or as 0x5c7530303061 (the characters "\", "u", "0", "0", "0", and
      "a").

If we want to keep the "hex digit" encoding for the "HEXBIN" types, it
seems like a text string (major type 3) is more appropriate than a
byte-string type.

On the other hand, there's not necessarily a reason to need to keep the
"hex digit" encoding in a binary protocol like CBOR, so perhaps putting the
decoded octets that the hex string represents into a binary string makes
the most sense.

My point is that the document is unclear here; reasonable people might
disagree on whether to be faithful to HEXBIN and use hex digits or to be
faithful to CBOR and pack binary data.
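
The two readings produce different bytes on the wire, which is why I think
the document needs to pick one.  A sketch in Python (the helper names and
the sample value "deadbeef" are mine, for illustration only):

```python
def cbor_head(major: int, length: int) -> bytes:
    """Initial byte(s) for a CBOR string of the given major type and length."""
    if length < 24:
        return bytes([(major << 5) | length])
    if length < 2**8:
        return bytes([(major << 5) | 24, length])
    if length < 2**16:
        return bytes([(major << 5) | 25]) + length.to_bytes(2, "big")
    raise ValueError("longer lengths are not covered by this sketch")


def hexbin_as_bytes(hex_digits: str) -> bytes:
    """Reading 1: decode the hex digits and pack the octets (major type 2)."""
    raw = bytes.fromhex(hex_digits)
    return cbor_head(2, len(raw)) + raw


def hexbin_as_text(hex_digits: str) -> bytes:
    """Reading 2: keep the hex digits as characters (major type 3)."""
    text = hex_digits.encode("ascii")
    return cbor_head(3, len(text)) + text


print(hexbin_as_bytes("deadbeef").hex())  # 44deadbeef
print(hexbin_as_text("deadbeef").hex())   # 686465616462656566
```

A decoder written for one reading will misinterpret data produced under
the other, and the packed form is roughly half the size.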
 

> > Section 2.2.2
> >
> > It's not clear to me why using a plain text string is allowed for representing a ML_STRING, as on the face of it that could lose language information.
> > Is the idea that this is supposed to inherit from some higher-level element, or just to reflect an efficiency of encoding when neither of the optional language/translation-id fields are present?
> > Regardless, we should be more clear about that, since neither here nor Section 5 includes any discussion thereof.  I see that Section 3.2 does have a brief statement about this being a change from RFC 7970, but I'd still like to see a little more clarity on this point.
> 
> Yes, as you pointed out, it is just for efficiency of encoding.
> Though multilingual support is very appreciated and necessary, many information will still use only alphabets.
> So, for those information that require multilingual support can use ML_STRING while those that does not require can use STRING.
> 
> In the XML version (RFC7970), it is easy to cope with alphabet-only string efficiently because xml:lang and translation-id fields are optional in the iodef:MLStringType (See section 2.4 of RFC7970).
> "<Description lang=en>This is a sample<Description>" could be written as "<Description>This is a sample<Description>"
> To realize the same efficiency for those alphabet-only string, we prepared ML_STRING and STRING.
> 
> To describe this issue, we have the following sentence in Section 3.2: "The elements of ML_STRING type in XML IODEF document are presented as either STRING type or ML_STRING type in JSON IODEF document."

Can we say something about how an implementation would choose to use STRING
vs. ML_STRING?  Right now it seems fairly ambiguous.

> >    Examples are shown below.
> >    "MLStringType": {
> >      "value": "free-form text",                              //STRING
> >      "lang": "en",                                             //ENUM
> >      "translation-id": "jp2en0023"                           //STRING
> >    }
> 
> > That looks more like a schema than an example (especially with those //-comments!).  Also, nit-level, but there's only one, so "examples"
> > plural does not apply.
> 
> Thank you, we've fixed this point: "An example is shown below"
> 
> > Section 2.2.4
> 
> >    SoftwareReference class is a reference to a particular version of
> >    software.  Examples are shown below.
> >
> > "class" seems to be the prevailing RFC 7970 terminology but is not used much in this document; we seem to use "type" instead.
> 
> "type" would work as well.
> However, since the caption of Figure 3 is "IODEF classes", and since we refer to RFC7970, we prefer to keep using the term "class" here, if it is not a big problem for you.

My personal preference is for consistency within a single document over
consistency across documents, when the two are in conflict.  But it is your
preference that takes precedence here, so it is not a big problem for me to
use "class".

> >    "SoftwareReference": {
> >      "value": "cpe:/a:google:chrome:59.0.3071.115",    //STRING
> >      "spec-name": "cpe",                                 //ENUM
> 
> > RFC 7970 suggests that the value portion is interpreted solely within the context defined by the "spec-name", so it's unclear to me if the initial "cpe:" prefix in this example is representative.
> 
> As you pointed out, "cpe:/" could not be necessary since the spec-name indicates that it will be a CPE-ID.
> However, the CPE spec defines that the URI form of CPE-ID begins with cpe:/.
> https://cpe.mitre.org/specification/
> 
> > Section 2.2.5
> 
> > I'd suggest to add a sentence along the lines of "Note that the structure of this information is not interpreted in the IODEF JSON, and the word 'structured' indicates that the data item has internal structure that is intended to be processed outside of the IODEF framework."
> 
> Thank you very much. We'll have added the sentence.
> 
> >    When embedding the raw data, base64 encoding defined in Section 4 of
> >    [RFC4648] SHOULD be used for encoding the data, as shown below.
> 
> > Does this apply just to JSON or to CBOR as well?  I'm not sure if the CBOR HEX encoding actually uses raw binary or not, which would be more compact (and more recommendable?).
> 
> In Section 3.2, we have the following sentence.
> ```txt
>    o  Signature, X509Data, and RawData are encoded with base64 and are
>       represented as string (BYTE type) in JSON IODEF documents.
> ```
> So, basically, yes, this applies to both JSON and CBOR.

Ah, I see that the STRUCTUREDINFO contents are indeed inside the RawData
element; I did not make that connection originally.

That said, going back to the point of CBOR being "Concise", it is highly
unusual for it to contain base64-encoded data as opposed to the binary
representation thereof.  [Further discussion trimmed, as this is basically
the same conversation as above.]
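
For a sense of scale, a quick Python sketch of the base64 overhead on a
placeholder 1024-byte payload (the payload is arbitrary filler, not real
RawData content):

```python
import base64

raw = bytes(range(256)) * 4          # 1024 bytes of placeholder data
encoded = base64.b64encode(raw)      # RFC 4648 Section 4 alphabet

print(len(raw))                      # 1024
print(len(encoded))                  # 1368
print(base64.b64decode(encoded) == raw)  # True
```

That is the usual 4/3 expansion, paid on every embedded blob, where a CBOR
byte string could carry the 1024 octets directly.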

> > Section 3.2
> 
> >    o  Attributes and elements of each class in XML IODEF document are
> >       both presented as JSON attributes in JSON IODEF document, and the
> >       order of their appearances is ignored.
> 
> > Are there any practical consequences of this loss of information that we should discuss?
> 
> We do not think so, and we believe this loss is not important.
> Indeed, I am not that sure whether the order of the appearance of attributes and elements is important even for RFC7970.

I am not sure, either, whether RFC 7970 assigns any importance to the order
of appearance.
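
On the JSON side at least, RFC 8259 describes an object as an unordered
collection of name/value pairs, so consumers cannot rely on order anyway.
A minimal Python illustration (the member names are borrowed from the
MLStringType example):

```python
import json

a = json.loads('{"lang": "en", "value": "free-form text"}')
b = json.loads('{"value": "free-form text", "lang": "en"}')

print(a == b)              # True: same members, order not compared
print(list(a) == list(b))  # False: this parser happens to preserve order
```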

> >    o  The elements of ML_STRING type in XML IODEF document are presented
> >       as either STRING type or ML_STRING type in JSON IODEF document.
> > Why?
> 
> As we discussed above, many of the documents still uses only alphabets, and RFC7970 can concisely represent those (thanks to the nature of XML). Because forcing to encode everything in ML_STRING in JSON is not that efficient, we allowed to describe information in STRING as well.

Allowing it is okay, but I think more should be said about when it is safe
or allowed to do so, and what language (English?) is assumed if ML_STRING
is not used.

> > Section 5
> >    SpecID = "urn:ietf:params:xml:ns:mile:mmdef:1.2" /  "private"
> > This enum is managed by IANA; shouldn't we have some sort of signal in the CDDL to indicate it is extensible?  (Applies to any enum maintained by IANA.)
> 
> Yes, we agree.
> We have put the following sentence in Section 3.2: "ENUM values in this document is extensible and is managed by IANA, as with <xref target="RFC7970" />."
> (Since this document defines only the mapping to JSON, the value itself should be managed by RFC7970. Therefore, this draft does not have any IANA section.)

Thanks!

> >    PortlistType = text .regexp "\\d+(\\-\\d+)?(,\\d+(\\-\\d+)?)*"
> > \d matches by unicode properties; I think that we want just [0-9] here?
> 
> Thank you for pointing this issue.
> We agree, we will change it accordingly.
> (we haven't reflected this part yet, let us spent a bit more time to double check the data model.)

Okay.  (Sponsoring AD take note.)
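
To illustrate the concern with a regex engine whose \d is Unicode-aware:
Python's re module behaves that way for str patterns, so it serves as a
stand-in here.  (The .regexp control in CDDL uses XSD regular expressions,
where my understanding is that \d likewise covers all Unicode decimal
digits.)  The pattern names and sample strings are mine.

```python
import re

# Pattern as in the draft (after undoing the CDDL string escaping).
UNICODE_DIGITS = re.compile(r"\d+(\-\d+)?(,\d+(\-\d+)?)*")
# Proposed replacement restricted to ASCII digits.
ASCII_DIGITS = re.compile(r"[0-9]+(\-[0-9]+)?(,[0-9]+(\-[0-9]+)?)*")

ascii_ports = "22,80-443"
arabic_indic = "\u0668\u0660"  # the digits 8 and 0 in Arabic-Indic script

print(bool(UNICODE_DIGITS.fullmatch(ascii_ports)))   # True
print(bool(ASCII_DIGITS.fullmatch(ascii_ports)))     # True
print(bool(UNICODE_DIGITS.fullmatch(arabic_indic)))  # True
print(bool(ASCII_DIGITS.fullmatch(arabic_indic)))    # False
```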

> > Section 6
> > I think we are making use of some fields whose content is controlled by IANA, so we may wish to consider asking IANA to update the description of the registry(ies) in question to include the JSON and CBOR usage.
> 
> That is one way, but we rather hope to keep it as a mapping to the RFC7970.
> Therefore, if RFC7970 requests to change IANA tables, the users of this document can also access to the revised IANA table.

I agree that users of this document should also be able to access updated
IANA tables.  I think that the description associated with the IANA tables
can say that the values in the table are used both by RFC 7970
implementations and by their JSON (and CBOR) bindings as specified by this
document.

> > Appendix B
> 
> > Is it more useful to have a non-normative JSON schema in the document, or a pointer to a tool that can generate one from the CDDL?
> > (Alternately, how was this one generated -- were there any manual modifications needed from the output of some tool?)
> 
> These schema is manually made this time.
> Currently, the non-normative JSON schema for IODEF is in the appendix.
> Would you mean to move it to the main document?

I do not propose to move it to the main document.
Since this schema is made manually, I do not propose any change here.

Thank you,

Ben

> Thank you very much, and best regards,
> Takeshi Takahashi
> 
> 
> -----Original Message-----
> From: Takeshi Takahashi <takeshi_takahashi@nict.go.jp> 
> Sent: Monday, November 18, 2019 8:41 PM
> To: 'Benjamin Kaduk' <kaduk@mit.edu>; 'The IESG' <iesg@ietf.org>
> Cc: 'mile@ietf.org' <mile@ietf.org>; 'mile-chairs@ietf.org' <mile-chairs@ietf.org>; 'draft-ietf-mile-jsoniodef@ietf.org' <draft-ietf-mile-jsoniodef@ietf.org>
> Subject: RE: [mile] Benjamin Kaduk's Discuss on draft-ietf-mile-jsoniodef-10: (with DISCUSS and COMMENT)
> 
> Hi Benjamin,
> 
> Thank you very much for your kind review, and I am sorry for not being able to reply you earlier.
> Though we have submitted the revised version some time ago, I was unable to cope with your comments yet.
> Let us reply to your points when I submit the next version.
> 
> Thank you, and best regards,
> Take
> 
> PS: note that the latest version is available here: https://datatracker.ietf.org/doc/draft-ietf-mile-jsoniodef/
> 
> 
> -----Original Message-----
> From: mile <mile-bounces@ietf.org> On Behalf Of Benjamin Kaduk via Datatracker
> Sent: Wednesday, September 4, 2019 4:09 PM
> To: The IESG <iesg@ietf.org>
> Cc: mile@ietf.org; mile-chairs@ietf.org; draft-ietf-mile-jsoniodef@ietf.org
> Subject: [mile] Benjamin Kaduk's Discuss on draft-ietf-mile-jsoniodef-10: (with DISCUSS and COMMENT)
> 
> Benjamin Kaduk has entered the following ballot position for
> draft-ietf-mile-jsoniodef-10: Discuss
> 
> When responding, please keep the subject line intact and reply to all email addresses included in the To and CC lines. (Feel free to cut this introductory paragraph, however.)
> 
> 
> Please refer to https://www.ietf.org/iesg/statement/discuss-criteria.html
> for more information about IESG DISCUSS and COMMENT positions.
> 
> 
> The document, along with other ballot positions, can be found here:
> https://datatracker.ietf.org/doc/draft-ietf-mile-jsoniodef/
> 
> 
> 
> ----------------------------------------------------------------------
> DISCUSS:
> ----------------------------------------------------------------------
> 
> We use a subset of the JSON "number" type to represent integers, which inherits JSON's range limits on numbers.  My understanding is that such limits are not present in IODEF XML (e.g., we do not specify a totalDigits value), so this is a new limitation of the JSON format that needs to be documented (and, technically, drops us out of full parity with the XML form).
> 
> The JSON "examples" seem to be using a "//" notation for comments, that is not valid JSON nor described by draft-zyp-json-schema, thus appearing to make the examples malformed (absent some other disclaimer of the commenting convention).
> 
> How does STRUCTUREDINFO relate to EXTENSION?  What makes one vs the other appropriate for a given piece of information?  Since the former is only in  RFC 7203 and not 7970, we do not have an easy reference for their interplay, given 7970's minimal discussion of the use of 7203.
> (It sounds like STRUCTUREDINFO is for structures from other published specifications and EXTENSION is for more local/custom things, but I'm not entirely sure if that's exactly the intended split.)
> 
> Can the shepherd please report on what level of validation has occurred on the CDDL syntax, the mappings between RFC 7970's content and this document's content, and the consistency between the formal syntax and the body text (e.g., listings of enum values, member fields of each type, etc.)?
> 
> 
> ----------------------------------------------------------------------
> COMMENT:
> ----------------------------------------------------------------------
> 
> It's somewhat surprising to see CBOR used but with CBOR maps required to use string form for representing map keys (i.e., no short integer key values are defined).  Some of the strings that are map keys in the JSON objects are fairly long; is this extra space not a concern for the overall CBOR encoded document (e.g., due to containing a large quantity of binary data such that encoding overhead is a small relative portion of the encoded document)?
> 
> I guess that since it seems to only be used in (non-normative) Appendix B, [jsonschema] can remain as an informative reference, though it would be nice to have a citation where it is actually used in the document, as opposed to just in the Introduction.  Since it is an informative reference, the following point is not Discuss-worthy: The current citation gives no absolute locator, leaving me somewhat unclear about whether to consult draft-zyp-json-schema or something on json-schema.org or some other source.  The contents of Appendix B suggest it is the second of those...
> 
> Section 1
> 
>    processing is JSON.  To facilitate the automation of incident
>    response operations, IODEF documents should support JSON
>    representation.
> 
> Is it documents or implementations that should support the JSON representation?
> 
> Section 2.1
> 
> The string "STRUCTUREDINFO" does not appear in RFC 7203, so I think we need some additional locator information to indicate what behavior we're referring to.
> 
> Using CBOR major type 2 for HEXBIN implies that actual binary values are recorded directly, i.e., without any "hex" encoding.  We should be clear about this one way or the other, and I didn't really see anything in the CDDL schema that called this out.
> 
> Section 2.2.2
> 
> It's not clear to me why using a plain text string is allowed for representing a ML_STRING, as on the face of it that could lose language information.  Is the idea that this is supposed to inherit from some higher-level element, or just to reflect an efficiency of encoding when neither of the optional language/translation-id fields are present?
> Regardless, we should be more clear about that, since neither here nor Section 5 includes any discussion thereof.  I see that Section 3.2 does have a brief statement about this being a change from RFC 7970, but I'd still like to see a little more clarity on this point.
> 
>    Examples are shown below.
> 
>    "MLStringType": {
>      "value": "free-form text",                              //STRING
>      "lang": "en",                                             //ENUM
>      "translation-id": "jp2en0023"                           //STRING
>    }
> 
> That looks more like a schema than an example (especially with those //-comments!).  Also, nit-level, but there's only one, so "examples"
> plural does not apply.
> 
> Section 2.2.4
> 
>    SoftwareReference class is a reference to a particular version of
>    software.  Examples are shown below.
> 
> "class" seems to be the prevailing RFC 7970 terminology but is not used much in this document; we seem to use "type" instead.
> 
>    "SoftwareReference": {
>      "value": "cpe:/a:google:chrome:59.0.3071.115",    //STRING
>      "spec-name": "cpe",                                 //ENUM
> 
> RFC 7970 suggests that the value portion is interpreted solely within the context defined by the "spec-name", so it's unclear to me if the initial "cpe:" prefix in this example is representative.
> 
> Section 2.2.5
> 
> I'd suggest to add a sentence along the lines of "Note that the structure of this information is not interpreted in the IODEF JSON, and the word 'structured' indicates that the data item has internal structure that is intended to be processed outside of the IODEF framework."
> 
>    When embedding the raw data, base64 encoding defined in Section 4 of
>    [RFC4648] SHOULD be used for encoding the data, as shown below.
> 
> Does this apply just to JSON or to CBOR as well?  I'm not sure if the CBOR HEX encoding actually uses raw binary or not, which would be more compact (and more recommendable?).
> 
> Section 3.1
> 
> [[I stopped verifying mappings at DetectionPattern]]
> 
> Section 3.2
> 
>    o  Attributes and elements of each class in XML IODEF document are
>       both presented as JSON attributes in JSON IODEF document, and the
>       order of their appearances is ignored.
> 
> Are there any practical consequences of this loss of information that we should discuss?
> 
>    o  The elements of ML_STRING type in XML IODEF document are presented
>       as either STRING type or ML_STRING type in JSON IODEF document.
> 
> Why?
> 
> Section 5
> 
>    SpecID = "urn:ietf:params:xml:ns:mile:mmdef:1.2" /  "private"
> 
> This enum is managed by IANA; shouldn't we have some sort of signal in the CDDL to indicate it is extensible?  (Applies to any enum maintained by IANA.)
> 
>    PortlistType = text .regexp "\\d+(\\-\\d+)?(,\\d+(\\-\\d+)?)*"
> 
> \d matches by unicode properties; I think that we want just [0-9] here?
> 
> Section 6
> 
> I think we are making use of some fields whose content is controlled by IANA, so we may wish to consider asking IANA to update the description of the registry(ies) in question to include the JSON and CBOR usage.
> 
> Appendix B
> 
> Is it more useful to have a non-normative JSON schema in the document, or a pointer to a tool that can generate one from the CDDL?
> (Alternately, how was this one generated -- were there any manual modifications needed from the output of some tool?)
> 
> 
> _______________________________________________
> mile mailing list
> mile@ietf.org
> https://www.ietf.org/mailman/listinfo/mile
>