[Cbor] Bignums and the generic data models (Re: 🔔 WGLC on draft-ietf-cbor-7049bis-09)

Carsten Bormann <cabo@tzi.org> Tue, 14 January 2020 21:52 UTC

From: Carsten Bormann <cabo@tzi.org>
In-Reply-To: <CANh-dXnPRd7w_z2LA0gYD0GHVbmych4BGA5_-vmJz+Zn1qBh_w@mail.gmail.com>
Date: Tue, 14 Jan 2020 22:52:40 +0100
Cc: Francesca Palombini <francesca.palombini=40ericsson.com@dmarc.ietf.org>, "cbor@ietf.org" <cbor@ietf.org>, "draft-ietf-cbor-7049bis@ietf.org" <draft-ietf-cbor-7049bis@ietf.org>, "cbor-chairs@ietf.org" <cbor-chairs@ietf.org>
Message-Id: <A4756143-2E47-474C-8EBD-0632DD3B659D@tzi.org>
References: <293AFF31-D0EF-45D6-9B9D-E8136481C404@ericsson.com> <CANh-dXnPRd7w_z2LA0gYD0GHVbmych4BGA5_-vmJz+Zn1qBh_w@mail.gmail.com>
To: Jeffrey Yasskin <jyasskin@chromium.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/0EZ6bsABUi8xIGFKbZbkJxi3qiQ>
Subject: [Cbor] Bignums and the generic data models (Re: 🔔 WGLC on draft-ietf-cbor-7049bis-09)

On 2019-11-24, at 12:23, Jeffrey Yasskin <jyasskin@chromium.org> wrote:
> 
> 3.4.4.  Bignums
> 	[…]
> 	• "and preferred encoding never makes use of bignums that also can be expressed as basic integers (see below)." <- This seems inconsistent with "In the generic data model, bignum values are not equal to integers from the basic data model". If they're not the same value at the data model level, they can't be alternate encodings of each other.

Hi Jeffrey,

of your many useful comments in that message, let me pick this one because answering it is a prerequisite to getting the map validity text right.

Indeed, in the basic (unextended) generic data model, bignums are different from (mt0/1) integers.

The extended generic data model for tag 2/3 changes this.  It has to.  Consider:

2(h'00112233445566778899')
and
2(h'112233445566778899')

These are different in the basic generic data model, but both mean 316059037807746189465 in the extended generic data model, so the extended model has to differ from the basic one.
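
As a rough illustration of that point (a sketch in Python, not text from the draft; the helper name bignum_value is made up for this example), the two tag contents compare unequal as byte strings, yet tag 2 semantics map both to the same integer:

```python
def bignum_value(content: bytes) -> int:
    """Interpret tag 2 content as an unsigned big-endian integer.

    Hypothetical helper: leading zero bytes do not change the result.
    """
    return int.from_bytes(content, "big")


padded = bytes.fromhex("00112233445566778899")   # content of 2(h'0011...99')
minimal = bytes.fromhex("112233445566778899")    # content of 2(h'11...99')

# Basic generic data model: the tag contents are different byte strings.
contents_differ = padded != minimal

# Extended generic data model: both denote the same number.
same_number = bignum_value(padded) == bignum_value(minimal)
```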

So given that we already had to extend the generic data model with the values (types) specific to the tag semantics, we also can (and should!) say that

2(h'12')

means 18 in the extended generic data model, the preferred encoding of which is 0x12 and not 0xc24112.
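
A minimal sketch of what this could look like on the encoding side, assuming a codec that only handles non-negative integers (negative values, i.e., major type 1 and tag 3, are left out for brevity). The function names encode_head, encode_int_preferred and encode_tag2_bignum are hypothetical and not taken from any particular library:

```python
def encode_head(major: int, arg: int) -> bytes:
    """Encode a CBOR initial byte plus the shortest argument for arg."""
    if arg < 24:
        return bytes([major << 5 | arg])
    for additional_info, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if arg < 1 << (8 * size):
            return bytes([major << 5 | additional_info]) + arg.to_bytes(size, "big")
    raise ValueError("argument does not fit in 64 bits")


def encode_tag2_bignum(n: int) -> bytes:
    """Always wrap n as tag 2 around a byte string (no leading zeros)."""
    if n < 0:
        raise ValueError("sketch covers non-negative integers only")
    content = n.to_bytes(max(1, (n.bit_length() + 7) // 8), "big")
    return encode_head(6, 2) + encode_head(2, len(content)) + content


def encode_int_preferred(n: int) -> bytes:
    """Use major type 0 whenever n fits in 64 bits, tag 2 only beyond that."""
    if n < 0:
        raise ValueError("sketch covers non-negative integers only")
    if n < 1 << 64:
        return encode_head(0, n)
    return encode_tag2_bignum(n)
```

With these definitions, encode_int_preferred(18) yields the single byte 0x12, while encode_tag2_bignum(18) yields 0xc24112, which is valid but not preferred.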

For a generic codec, two options are available for a specific tag number:

- offer the basic generic data model.  This leaves it to the application to build the tag content in the way that fits the tag definition and to interpret it accordingly.  It is then also up to the application to take care of map validity (e.g., by not using both 2(h'00112233445566778899') and 2(h'112233445566778899') as keys in the same map), and to emit preferred serialization if desired.
- process the tag, i.e., offer the extended generic data model.  Now the generic codec can do all that work; for the example at hand, you only see integers at the codec API.
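
The difference between the two options for map validity could be sketched as follows. This is an assumption-laden illustration, not a real codec API: Tagged, extended_key and has_duplicate_keys are hypothetical names, and only tags 2 and 3 are processed.

```python
from typing import Callable, Hashable, Iterable, NamedTuple


class Tagged(NamedTuple):
    """A tag as seen in the basic generic data model: number plus content."""
    tag: int
    content: bytes


def extended_key(key: Hashable) -> Hashable:
    """Map tag 2/3 bignums to plain integers; pass everything else through."""
    if isinstance(key, Tagged):
        n = int.from_bytes(key.content, "big")
        if key.tag == 2:
            return n
        if key.tag == 3:
            return -1 - n  # tag 3 denotes -1 minus the unsigned value
    return key


def has_duplicate_keys(
    keys: Iterable[Hashable],
    normalize: Callable[[Hashable], Hashable] = lambda k: k,
) -> bool:
    """Report whether two keys coincide after applying normalize."""
    seen = set()
    for key in keys:
        value = normalize(key)
        if value in seen:
            return True
        seen.add(value)
    return False


keys = [
    Tagged(2, bytes.fromhex("00112233445566778899")),
    Tagged(2, bytes.fromhex("112233445566778899")),
]
basic_view = has_duplicate_keys(keys)                    # distinct byte strings
extended_view = has_duplicate_keys(keys, extended_key)   # same integer twice
```

In the basic view the codec sees no duplicate and the application has to catch it; in the extended view the codec itself can reject the map.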

This is the direction in which I think we need to go if we want to enable tag processing in generic codecs.

(The cognitive dissonance here is maybe with ECMAScript bigints, which are a separate type from numbers.  But sending a bignum is not on its own a good way to mark a number as an ES bigint; a language-specific tag is probably a better way to make that language-specific distinction.)

Grüße, Carsten