Re: [Cbor] Robert Wilton's Discuss on draft-ietf-cbor-tags-oid-06: (with DISCUSS and COMMENT)

Carsten Bormann <cabo@tzi.org> Wed, 19 May 2021 14:43 UTC

From: Carsten Bormann <cabo@tzi.org>
In-Reply-To: <161788356811.31539.2139615008210880278@ietfa.amsl.com>
Date: Wed, 19 May 2021 16:43:07 +0200
Cc: The IESG <iesg@ietf.org>, draft-ietf-cbor-tags-oid@ietf.org, cbor-chairs@ietf.org, cbor@ietf.org, Christian Amsüss <christian@amsuess.com>
Message-Id: <6EE767E5-9F97-4D74-B154-C8333D5E8778@tzi.org>
References: <161788356811.31539.2139615008210880278@ietfa.amsl.com>
To: Robert Wilton <rwilton@cisco.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/Ow7KiAx61ohnDVM4h6HsFa51y0c>

Hi Rob,

Apologies for taking so long to reply to this; you raised a very good point that needed some discussion in the WG before we could resolve it.

## Question that took us a while to decide in the WG

> I would like to please see some more clarity or guidance about when TAG TBD112
> should be used, given that there are two possible encodings of absolute OIDs
> below "1.3.6.1.4.1".
> 
> Specifically, the questions that I have, that probably need to be clarified are:
> - is a CBOR encoder allowed to optimize a TBD110 tag into a TBD112 tag?
> - Should CBOR decoder clients always expect to be able to handle both TBD110
> and TBD112 tags? - Or, it the decision over whether to use TBD110 or TBD112
> down to the application and the application needs to agree which is use.

This is indeed a good question (about TBD111 vs. TBD112, actually).

We went for handling this as an issue about preferred serializations, and added Sections 2.2 and 4.1.
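
As a rough illustration of what a preferred-serialization rule could look like in an encoder, here is a minimal Python sketch. It rests on assumptions not stated in this message: that the TBD111 and TBD112 placeholders end up as tag numbers 111 and 112, that the preferred form is the shorter TBD112 one whenever the OID lies strictly below 1.3.6.1.4.1, and that the function name `choose_oid_tag` is purely hypothetical. Sections 2.2 and 4.1 of the draft remain the authoritative text.

```python
from typing import Tuple

# BER content bytes of the prefix 1.3.6.1.4.1 (40*1+3 = 0x2b, then 6, 1, 4, 1).
PEN_PREFIX = bytes.fromhex("2b06010401")

# Assumed tag numbers, taken from the TBD111/TBD112 placeholders.
TAG_OID = 111
TAG_PEN_RELATIVE_OID = 112


def choose_oid_tag(ber_content: bytes) -> Tuple[int, bytes]:
    """Pick a tag and byte string content for an absolute OID.

    Every prefix byte has its high bit clear, so a byte-level prefix
    match is also a match on whole arcs.
    """
    if ber_content.startswith(PEN_PREFIX) and len(ber_content) > len(PEN_PREFIX):
        # Assumption: the shorter form is preferred when it applies.
        return TAG_PEN_RELATIVE_OID, ber_content[len(PEN_PREFIX):]
    return TAG_OID, ber_content


# 1.3.6.1.4.1.9 lies below the prefix; 1.2.840.113549.1.1.1 does not.
print(choose_oid_tag(bytes.fromhex("2b0601040109")))
print(choose_oid_tag(bytes.fromhex("2a864886f70d010101")))
```

A decoder that wants to be generic would still need to accept both forms; the rule above only describes what an encoder emits by default.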

## Major items in the COMMENTs

> I found this document to be interesting because I knew from the title that it
> was going to only be 4 pages long and say that OIDs are obviously encoded as a
> tagged array, hence I was surprised to see that was not the solution and it
> uses BER encoded OIDs instead.
> 
> The document explains, and I think that I understand why this has been done,
> but I question whether the title of the document and name of the tags is right.
> Is it really a CBOR representation of OIDs, or is it actually a CBOR
> representation of BER encoded OIDs?  

The first, using the technical approach named in the second.

> I.e., it is plausible that there would
> ever be a requirement for non BER encoded OIDs.  E.g., I'm not an ASN.1 expert,
> but say if somewhat wanted to do a CBOR encoding of ASN.1, then it is not
> obvious to me that they would use a BER encoding for OIDs.  

One example for a CBOR encoding of something that previously has been encoded in ASN.1 is draft-mattsson-cose-cbor-cert-compress, which normatively references the tags-oid spec.  The certificate spec actually goes ahead and registers translation tables between likely OIDs and small integers.  But if the full range of OIDs is needed, the byte string derived from the BER content indeed is the representation chosen.  (As the C509 certificates are schema-driven, they do not need the actual tags defined here.)

> Hence the
> suggestion is to make the title, abstract, and name of the tags clear that it
> is about the CBOR encoding or BER encoded OIDs.

The specification proposes to use BER in a number of tags for binary encoding of OIDs.
The BERness is not a feature that the OIDs already need to have before these tags apply.
So a title "CBOR encoding of BER encoded OIDs" (which is what I think Rob wanted to say) would be a restriction beyond what this spec is about.

## More COMMENTs

> In the introduction:
>  Since the semantics of absolute and relative object identifiers
>  differ, this specification defines two tags, collectively called the
>  "OID tags" here:
> 
> I presume that this should be three tags?

(Fixed by Ben’s PR, https://github.com/cbor-wg/cbor-oid/pull/9 .)

> In section 4.1.  Tag Factoring Example: X.500 Distinguished Name:
> 
> The diagram uses a mix of single letters (e.g. c for country), and a full name
> "street".  Is this how the X.500 attributes are defined?  

This uses the naming defined e.g. in RFC 4519, section 2.2 and section 2.34, which has been derived from X.520 and is quite ubiquitous, going back to Table 1 of RFC 1779.

>   The country and street RDNs are single-valued. The second and fourth RDNs
>   are multi-valued.
> 
> Perhaps:  "The country (first) and street (third) RDNs are single-valued. The
> second and fourth RDNs are multi-valued.”

Included in -07 via https://github.com/cbor-wg/cbor-oid/commit/3d0fd8b .

>   h'550407': "Los Angeles", h'550408': "CA",
> 
> I think that the example would be more clear by splitting the city and county
> onto separate lines.

Included in -07 via https://github.com/cbor-wg/cbor-oid/commit/3d0fd8b .

> Finally, the document contains these two sentences that seem to somewhat
> conflict with each other:
> 
> "While these sequences can easily be represented in CBOR arrays of unsigned
> integers, a more compact representation can often be achieved by adopting the
> widely used representation of object identifiers defined in BER; this
> representation may also be more amenable to processing by other software that
> makes use of object identifiers."
> 
> compared to:
> 
> "Staying close to the way object identifiers are encoded in ASN.1 BER makes
> back-and-forth translation easy; otherwise we would choose a more efficient
> encoding."


While the BER form is reasonably efficient, a more efficient representation of sequences of unsigned integers that roughly follow Zipf’s law in distribution has been described in <https://mailarchive.ietf.org/arch/msg/cbor/9owRyOcXdsK7Ooc1S3D9-AImDdU>. That representation would, for instance, encode 1.3.6.1.4.1 as 0x136141, and it does not need a special case to make absolute OIDs efficient.

The WG instead decided to follow the existing BER representation because it is so widely implemented and is compatible with SDNVs, which are used in additional applications and often have good platform support (e.g., Ruby pack/unpack("w*")).
For OID-style applications, BER content is often still more compact than a CBOR array of unsigned integers.
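
To make the BER content form concrete, the following is a minimal Python sketch of the arc encoding: the first two arcs are combined as 40*X + Y, and every resulting value is written in base 128 with a continuation bit on all but the last byte. The helper names `base128` and `oid_to_ber_content` are illustrative only and do not come from the draft.

```python
def base128(n: int) -> bytes:
    """Big-endian base-128 digits; high bit set on all but the last byte."""
    if n < 0:
        raise ValueError("arcs must be non-negative")
    out = [n & 0x7F]
    n >>= 7
    while n:
        out.append((n & 0x7F) | 0x80)
        n >>= 7
    return bytes(reversed(out))


def oid_to_ber_content(oid: str) -> bytes:
    """Return the BER content bytes (no type/length octets) of a dotted OID."""
    arcs = [int(a) for a in oid.split(".")]
    if len(arcs) < 2:
        raise ValueError("an absolute OID needs at least two arcs")
    first, second = arcs[0], arcs[1]
    if first > 2 or (first < 2 and second > 39):
        raise ValueError("invalid first or second arc")
    # The first two arcs share one value: 40 * first + second.
    values = [40 * first + second] + arcs[2:]
    return b"".join(base128(v) for v in values)


print(oid_to_ber_content("1.2.840.113549.1.1.1").hex())  # 2a864886f70d010101
print(oid_to_ber_content("1.3.6.1.4.1").hex())           # 2b06010401
```

The first result is nine bytes long, and the second is the five-byte prefix that the TBD112 form omits.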

Tagged (homogeneous) arrays of arcs don’t help, as the OID arcs vary widely in their range; e.g., the common OID 1.2.840.113549.1.1.1 (*, 2+9 bytes in BER) would need to be an array of seven 32-bit integers (2 tag + 2 head + 28 bytes), compared with 14 bytes for a basic CBOR array of unsigned integers.
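
The byte counts for 1.2.840.113549.1.1.1 can be checked with a short Python sketch. It assumes shortest-form CBOR heads, and it assumes that the homogeneous array would use tag 66 (big-endian uint32 typed array), which is not named in this message; the helper names are hypothetical.

```python
def cbor_head(major: int, arg: int) -> bytes:
    """Encode a CBOR head (major type plus argument) in its shortest form."""
    if arg < 24:
        return bytes([(major << 5) | arg])
    for info, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if arg < 1 << (8 * size):
            return bytes([(major << 5) | info]) + arg.to_bytes(size, "big")
    raise ValueError("argument too large for a CBOR head")


def cbor_array_of_uints(values) -> bytes:
    """Plain CBOR array (major type 4) of unsigned integers (major type 0)."""
    return cbor_head(4, len(values)) + b"".join(cbor_head(0, v) for v in values)


ARCS = [1, 2, 840, 113549, 1, 1, 1]

plain_array = cbor_array_of_uints(ARCS)

# Typed array: tag head + byte string head + seven 4-byte integers.
# Tag 66 is an assumption made for this sketch.
typed_array_len = len(cbor_head(6, 66)) + len(cbor_head(2, 4 * len(ARCS))) + 4 * len(ARCS)

ber_content_len = len(bytes.fromhex("2a864886f70d010101"))

print(plain_array.hex(), len(plain_array))  # 14 bytes
print(typed_array_len)                      # 2 + 2 + 28 = 32 bytes
print(ber_content_len)                      # 9 bytes
```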

In general, I don’t think we need to discuss an extensive set of alternative approaches (paths not taken) in this specification.

Grüße, Carsten

(*) {iso(1) member-body(2) us(840) rsadsi(113549) pkcs(1) pkcs-1(1) rsaEncryption(1)}