Re: [core] Benjamin Kaduk's Discuss on draft-ietf-core-senml-data-ct-05: (with DISCUSS and COMMENT)

Carsten Bormann <cabo@tzi.org> Tue, 12 October 2021 13:37 UTC

From: Carsten Bormann <cabo@tzi.org>
In-Reply-To: <163233240502.20840.5498014177264082102@ietfa.amsl.com>
Date: Tue, 12 Oct 2021 15:37:16 +0200
Cc: The IESG <iesg@ietf.org>, draft-ietf-core-senml-data-ct@ietf.org, core-chairs@ietf.org, core@ietf.org
To: Benjamin Kaduk <kaduk@mit.edu>
Archived-At: <https://mailarchive.ietf.org/arch/msg/core/jQAsCebwtDKYnxNiIOshDOsvFKU>

Hi Ben,

thank you for bringing up a number of important issues.

> ----------------------------------------------------------------------
> DISCUSS:
> ----------------------------------------------------------------------
> 
> I have a couple points for discussion, essentially relating to how much
> we're diverging from HTTP and to what extent the specifics of the
> divergence should be specifically mentioned in the document.
> 
> (1) I'd like to dig a little more into the analogy with HTTP and
> whether we are artificially limiting ourselves: currently we only allow
> 0 or 1 content-codings to be specified, but per
> https://www.ietf.org/archive/id/draft-ietf-httpbis-semantics-19.html#name-content-encoding
> the HTTP ecosystem permits multiple codings to be applied in turn to the
> same representation.  While the sensor data values are likely to be
> relatively small and applying multiple content-codings is not likely to
> be useful in such a scenario, this seems like something where we should
> only consciously diverge from HTTP, rather than inadvertently doing so.

I agree that this should be handled (if only for application/json@deflate@aes128gcm and similar cases).
We discussed this in the previous interim and it seems we like the syntax I proposed in the previous sentence.
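
To make the proposed syntax concrete, here is a minimal sketch (not taken from the draft) of how a receiver might split such a string into the media type and its list of content-codings.  The function name is hypothetical, and the sketch assumes that "@" never occurs inside a quoted parameter value, which a real parser would have to handle.

```python
from typing import List, Tuple


def split_content_format_string(cfs: str) -> Tuple[str, List[str]]:
    """Split 'type/subtype[;params]@coding1@coding2' into its parts.

    Assumption: '@' only ever separates content-codings; an '@' inside a
    quoted-string parameter value is NOT handled by this sketch.
    """
    media_type, *codings = cfs.split("@")
    # Reject an empty media type or an empty coding (e.g. a trailing '@').
    if not media_type or any(not coding for coding in codings):
        raise ValueError(f"malformed Content-Format-String: {cfs!r}")
    return media_type, codings


media_type, codings = split_content_format_string(
    "application/json@deflate@aes128gcm"
)
print(media_type, codings)
```

With no "@" present, the codings list is simply empty, so the zero-or-one case remains a subset of this form.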

Combining the numerous changes needed with addressing Christer’s comments on the Abstract:

https://github.com/core-wg/senml-data-ct/pull/8

> (2) Let's also discuss whether we want to reuse ABNF rule names from
> HTTP while having the rule content diverge, without specific enumeration
> of the divergence.  So far I found instances where this document does
> not allow HTAB or obs-text in places that draft-ietf-httpbis-semantics
> does, which may well be the right way to spell the rule, but seems to
> merit a little discussion.

Very good point.  I’m not sure the ABNF behind “ABNF rule names from HTTP” is stable enough that we need to stick with it in all cases.
I was certainly surprised by the recent change to

   parameters = *( OWS ";" OWS [ parameter ] )

(which makes sequences of ";" legal; see also Section B.2 in -semantics) and wonder whether we should follow this change (and why!?).
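
As an illustration of what that rule admits, the following sketch transcribes it into a Python regular expression.  The token character set follows the HTTP "tchar" definition; the quoted-string pattern is a simplified assumption, and all names are hypothetical.

```python
import re

# token = 1*tchar, with tchar as defined for HTTP.
TOKEN = r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+"
# Simplified quoted-string (assumption): any character except '"' and '\',
# or a backslash followed by any single character.
QUOTED = r'"(?:[^"\\]|\\.)*"'
PARAMETER = rf"{TOKEN}=(?:{TOKEN}|{QUOTED})"
# OWS = *( SP / HTAB )
OWS = r"[ \t]*"
# parameters = *( OWS ";" OWS [ parameter ] )
PARAMETERS = re.compile(rf"(?:{OWS};{OWS}(?:{PARAMETER})?)*")


def matches_parameters(text: str) -> bool:
    """Return True if the whole string matches the 'parameters' rule."""
    return PARAMETERS.fullmatch(text) is not None


print(matches_parameters("; charset=utf-8"))    # ordinary case
print(matches_parameters(";;; charset=utf-8"))  # empty list elements
```

Because the parameter is optional in each repetition, a bare run of ";" characters matches as well.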

Httpbis-semantics uses modified ABNF anyway, so there is no way to have exactly the same ABNF.  But it is worth pointing out that legacy 8-bit text and HTAB have no place in the strings used in this protocol and therefore are not allowed.

I added a comment:
 
-; Cleaned up from RFC 7231:
+; Cleaned up from RFC 7231, only leaving SP as blank space, and
+; removing legacy 8-bit characters:

The other difference from RFC 2616 and its descendants is the absence of “quoted-pair” from the “quoted-string” production, which indeed can be discussed.

(May need to discuss this in the interim tomorrow for final resolution.)

> ----------------------------------------------------------------------
> COMMENT:
> ----------------------------------------------------------------------
> 
> Do we want to comment anywhere about the situation where an
> implementation receives a message using an IANA-registered numeric
> content-format that is "too new" for that implementation to know about?

(Should have a discussion about error handling in the interim tomorrow.
But generally, the important thing is that the implementation *know* that it needs to be updated to understand.)

> It also feels a little weird that we have to end up using the
> text-string encoding of a decimal number for the Content-Format, even
> for the CBOR representation of SenML structures, but I guess that's what
> RFC 8428 intended and not worth trying to change.

We started out with a numeric encoding, but then noticed that SenML is usually very specific about only allowing a single data type per field.

> Abstract
> 
> I'm somewhat sympathetic to the gen-art reviewer's contention that the
> new field is not indicating the "Content-Format" of the binary data
> values (since Content-Format is a defined term in CoAP and SenML is
> claimed to not be limited to CoAP usage).  Perhaps we could switch
> around the order of description, i.e., "for indicating the Internet
> media type (including parameters) of these binary data values (i.e., the
> CoAP Content-Format that would apply when CoAP is used), as well as any
> content-coding that is applied"?

This is addressed in https://github.com/core-wg/senml-data-ct/pull/8, but not exactly following your blueprint.
The term “content format” is still used both for a “Content-Format number” and a “Content-Format-String”.

> 
> Section 3
> 
>   *  a CoAP Content-Format identifier in decimal form with no leading
>      zeros (except for the value "0" itself).  This value represents an
>      unsigned integer in the range of 0-65535, similar to the CoRE Link
>      Format [RFC6690] "ct" attribute).
> 
> Should we also reference
> https://datatracker.ietf.org/doc/html/rfc7252#section-7.2.1 which is
> where the "ct" link attribute is actually defined?  I spent a bit of
> time looking for it in 6690 itself only to discover that it was removed
> in draft-ietf-core-link-format-07 with remark "Moved [...] to the base
> CoAP specification".

Good point.  YYY
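
For reference, the constraint in the quoted bullet (decimal, no leading zeros except "0" itself, range 0-65535) can be checked as in the following sketch.  The function name is hypothetical and this is only one possible way to implement the check.

```python
import re

# "0" alone, or a non-zero digit followed by up to four more ASCII digits.
_CF_NUMBER = re.compile(r"0|[1-9][0-9]{0,4}")


def parse_content_format_number(text: str) -> int:
    """Parse a Content-Format number given as a decimal text string."""
    if not _CF_NUMBER.fullmatch(text):
        raise ValueError(f"not a valid Content-Format number: {text!r}")
    value = int(text)
    # Five digits can still exceed the 16-bit range, so check explicitly.
    if value > 65535:
        raise ValueError(f"Content-Format number out of range: {value}")
    return value


print(parse_content_format_number("11050"))
```

The explicit ASCII digit class is used instead of `\d` so that non-ASCII decimal digits are rejected.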

> 
>   The CoAP Content-Format number provides a simple and efficient way to
>   indicate the type of the data.  [...]
> 
> If we're limited to string representation, is it really "efficient" in a
> CBOR context?

That’s not what the sentence is about :-), but “11050” is still more efficient than “application/json@deflate”.
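
To put rough numbers on this, the sketch below computes the encoded size of a CBOR text string and of a CBOR unsigned integer from the standard CBOR head-length rules.  The helper names are hypothetical; no CBOR library is used.

```python
def cbor_head_len(argument: int) -> int:
    """Bytes needed for a CBOR head (initial byte plus argument bytes)."""
    if argument < 24:
        return 1  # argument fits in the initial byte
    if argument < 2**8:
        return 2
    if argument < 2**16:
        return 3
    if argument < 2**32:
        return 5
    return 9


def cbor_text_len(text: str) -> int:
    """Encoded size of a CBOR text string: head plus UTF-8 payload."""
    payload = len(text.encode("utf-8"))
    return cbor_head_len(payload) + payload


def cbor_uint_len(value: int) -> int:
    """Encoded size of a CBOR unsigned integer: just the head."""
    return cbor_head_len(value)


print(cbor_text_len("11050"))                     # 6
print(cbor_text_len("application/json@deflate"))  # 26
print(cbor_uint_len(11050))                       # 3
```

So the text-string form of the number costs 6 bytes against 26 for the spelled-out string, while a native unsigned integer would have taken 3.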

> Section 6
> 
>   ; Cleaned up from RFC 7231:
> 
> (per the DISCUSS,) I'm a bit anti-enthused about saying "cleaned up" without saying what
> changed (i.e., whether it's just refactoring, or actual changes to the
> rule like requiring specifically *SP instead of OWS that allows HTAB as
> well).

See above.

> Section 7
> 
> These security considerations are well-thought-out and nicely written.
> Thank you!
> 
> I think there are some (rare) situations where individual media-type
> (specifications) have their own security considerations, but I'm not
> really convinced that we need to mention that/incorporate them by
> reference, here.
> 
> NITS
> 
> Section 1
> 
>   The receiver is expected to know how to interpret the data in the
>   "vd" field based on the context, e.g., name of the data source and
>   out-of-band knowledge of the application.  However, this context may
>   not always be easily available to entities processing the SenML pack.
> 
> I'd consider adding ", especially if the pack is propagated to multiple
> entities".

That is a bit implied in the context of SenML, but it doesn’t hurt to say it again.

 field based on the context, e.g., name of the data source and out-of-band
 knowledge of the application. However, this context may not always be
-easily available to entities processing the SenML pack. To facilitate
+easily available to entities processing the SenML pack, especially if
+the pack is propagated over time and via multiple entities. To facilitate
 automatic interpretation it is useful to be able to indicate an Internet

> Section 2
> 
>   Content-Coding:  A name registered in the HTTP Content Coding
>      registry [IANA.http-parameters] as specified by Section 8.5 of
>      [RFC7230], indicating an encoding transformation with semantics
>      further specified in Section 3.1.2.1 of [RFC7231].  [...]
> 
> (I expect that the RFC Editor will be able to replace the references to
> point to draft-ietf-httpbis-semantics if it has been published before this
> document.)

(With the actual changes in this document, I’m not sure that is a mechanical operation.  But let’s wait for httpbis-semantics to emerge from EDIT.)

> Section 4
> 
>   up to the end of the pack otherwise.  Resolution (Section 4.6 of
>   [RFC8428]) of this base field is performed by adding its value with
>   the label "ct" to all Records in this range that carry a "vd" field
>   but do not already contain a Content-Format ("ct") field.
> 
> The conjugation "resolution" does not actually appear in RFC 8428
> itself, just discussion of "resolved records" and "to resolve the
> records".  It might be helpful to tweak things so that we don't rely on
> the reader knowing the irregular conjugation (but I don't have any good
> ideas off the top of my head)...

-pack otherwise.  Resolution ({{Section 4.6 of -senml}}) of this base
+pack otherwise.  The process of resolving ({{Section 4.6 of -senml}}) this base
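
A minimal sketch of that resolving step, for illustration only: it assumes the base field uses the label "bct", that its range runs from the Record carrying it up to the next Record carrying one (or the end of the pack), and that the base field itself is dropped from the resolved Records.  The function name and the sample pack are hypothetical.

```python
from typing import Any, Dict, List

Record = Dict[str, Any]


def resolve_base_content_format(pack: List[Record]) -> List[Record]:
    """Copy the base Content-Format into Records that need it."""
    resolved = []
    current = None  # base value in effect for the current range, if any
    for record in pack:
        record = dict(record)  # work on a copy, leave the input untouched
        if "bct" in record:
            # A new base field starts a new range (assumed label "bct").
            current = record.pop("bct")
        # Only Records with "vd" and without their own "ct" get the value.
        if current is not None and "vd" in record and "ct" not in record:
            record["ct"] = current
        resolved.append(record)
    return resolved


pack = [
    {"bct": "60", "n": "a", "vd": "gmE"},
    {"n": "b", "vd": "AA", "ct": "text/plain"},
    {"n": "c", "v": 1},
]
print(resolve_base_content_format(pack))
```

The second Record keeps its own "ct", and the third is left alone because it carries no "vd" field.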

The changes that are not already accepted in PR 8 are in
https://github.com/core-wg/senml-data-ct/pull/9

Grüße, Carsten