Re: [Cbor] Interactions of packed CBOR and tags

Brendan Moran <Brendan.Moran@arm.com> Fri, 04 September 2020 12:09 UTC

From: Brendan Moran <Brendan.Moran@arm.com>
To: Carsten Bormann <cabo@tzi.org>
CC: "cbor@ietf.org" <cbor@ietf.org>, Jim Schaad <ietf@augustcellars.com>
Date: Fri, 4 Sep 2020 12:08:30 +0000
Message-ID: <4FCFB117-726F-4F8A-98E4-E51B1098ACC4@arm.com>
References: <00c101d67cb5$2588b790$709a26b0$@augustcellars.com> <E30F54B6-1A63-48AC-89AE-61983654B5A9@tzi.org> <00cc01d67cc9$766c7b60$63457220$@augustcellars.com> <4AE9B2FA-EEB3-4B45-96E4-9DC85118567D@arm.com> <016f01d6820b$bc7d7cc0$35787640$@augustcellars.com> <62FEE35D-75F3-422E-A6C0-FFE86ACBD9A5@tzi.org> <6ED2C94E-CA4D-4D11-87FA-0E8010690A69@arm.com> <313AA38B-7501-42C4-AD1A-0CA302AB8A23@tzi.org>
In-Reply-To: <313AA38B-7501-42C4-AD1A-0CA302AB8A23@tzi.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/mzaP2ly_FZkEoaZtrwD1rfSmSV4>
Subject: Re: [Cbor] Interactions of packed CBOR and tags


> On 4 Sep 2020, at 05:53, Carsten Bormann <cabo@tzi.org> wrote:
>
> On 2020-09-03, at 19:05, Brendan Moran <Brendan.Moran@arm.com> wrote:
>>
>>> Adding suffix packing, we could have a third table and a third set of referents (also with content).  We also could have just separate referents, sharing the table/number space, if that makes sense.
>>
>> [BJM] Unless we have a way to obtain a whole new referent space with small values, I think that sharing the table is optimal. This also allows reuse of individual values as either prefixes or postfixes, which should yield some additional compression opportunities. This will be even more interesting in the array space.
>
> Yes, but how do you know that a referent asks you to do prepend/append?
>
> (1) encode prefix vs. suffix in the referent — but that does mean that we use more code points for the referents
> (2) encode prefix vs. suffix in the table entry — but that means such an entry cannot be used for both (ok, minor problem), and it also requires to represent another bit in the table (which in the end might cost two bytes per entry).
> (3) encode prefix vs. suffix in the position of the table entry — but that is exactly equivalent to (1) on the referent side but does create the need to manage holes.
>
> So I think we need to bite the bullet and provide more referent space (1).

Well, if you want to encode prefix vs. postfix in the table, I think the best option is likely a bitfield for the whole table:

#6.6([rump, [bstr .bits prepost-flags, *prepostfix], *shared])

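That’s 2 bytes per entry only in the worst case of a single-entry table: one flag byte covers up to eight entries, so the per-entry cost falls quickly as the table grows.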
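
To make the cost concrete, here is a minimal sketch of the flag bitfield. The function names are made up for illustration, and the bit order (bit i in byte i >> 3, mask 1 << (i & 7)) is my assumption about how the .bits flags would be laid out, not something fixed by the proposal above.

```python
def pack_flags(is_prefix):
    """Pack one flag per table entry into bytes (True = prefix, False = postfix).

    Assumed layout: bit i lives in byte i >> 3 under mask 1 << (i & 7).
    """
    out = bytearray((len(is_prefix) + 7) // 8)
    for i, flag in enumerate(is_prefix):
        if flag:
            out[i >> 3] |= 1 << (i & 7)
    return bytes(out)


def entry_is_prefix(flags, i):
    """Read the flag for table entry i back out of the packed bytes."""
    return bool(flags[i >> 3] & (1 << (i & 7)))


def bstr_encoded_len(payload_len):
    """Encoded size of a CBOR byte string: head plus payload."""
    if payload_len < 24:
        head = 1
    elif payload_len < 2**8:
        head = 2
    elif payload_len < 2**16:
        head = 3
    elif payload_len < 2**32:
        head = 5
    else:
        head = 9
    return head + payload_len


def flag_overhead(entries):
    """Total bytes the flag bstr adds to a table with this many entries."""
    return bstr_encoded_len((entries + 7) // 8)
```

With this layout a table of 1 to 8 entries pays 2 bytes in total for the flags, and a table of 64 entries pays 9.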

Another, more obscure option would be to encode several integers:

1: The number of 2-byte references that are prefixes
2: The number of 3-byte references that are prefixes
3: The number of 5-byte references that are prefixes

Then, sort table entries so that prefixes come before postfixes in each size grouping. I doubt it’s possible to get smaller than that encoding, since it has an absolute maximum size of 10 bytes and can be substantially smaller.
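
A rough sketch of the three-count scheme, to check the size claim. The names are hypothetical, and it assumes each count fits in the same integer width as the references in its group (1, 2 and 4 bytes of argument respectively), which is what gives the 2 + 3 + 5 = 10 byte ceiling.

```python
def uint_encoded_len(n):
    """Encoded size in bytes of a CBOR unsigned integer."""
    if n < 0:
        raise ValueError("unsigned integers only")
    if n < 24:
        return 1
    if n < 2**8:
        return 2
    if n < 2**16:
        return 3
    if n < 2**32:
        return 5
    if n < 2**64:
        return 9
    raise ValueError("does not fit a CBOR unsigned integer")


def counts_overhead(prefix_counts):
    """Total bytes needed for the per-group prefix counts."""
    return sum(uint_encoded_len(c) for c in prefix_counts)


def is_prefix(position_in_group, prefix_count):
    """Within one size group, prefixes are sorted ahead of postfixes,
    so an entry is a prefix exactly when its position is below the count."""
    return position_in_group < prefix_count
```

For example, counts of (255, 65535, 2**32 - 1) cost 10 bytes, while a small table with counts (3, 0, 0) costs 3.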

There’s one more mechanism to improve this slightly: use positive or negative integers. If there are more prefixes than postfixes, encode the number of postfixes as a negative integer. If there are more postfixes than prefixes, encode the number of prefixes as a positive integer.
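
A sketch of the signed variant for a single size group. Two details are my assumptions and not settled above: a count of n postfixes is carried as the CBOR negative integer -1 - n (so that zero postfixes is representable), and a tie is encoded as the positive prefix count. The decoder needs the total number of entries in the group, which it already knows from the table.

```python
def encode_split(n_prefix, n_postfix):
    """Pick the smaller count and use the sign to say which one it is."""
    if n_prefix > n_postfix:
        # Negative value: carries the number of postfixes as -1 - n.
        return -1 - n_postfix
    # Non-negative value: carries the number of prefixes directly.
    return n_prefix


def decode_split(value, total):
    """Recover (n_prefix, n_postfix) from the signed value and group size."""
    if value < 0:
        n_postfix = -1 - value
        return total - n_postfix, n_postfix
    return value, total - value
```

A group with 200 prefixes and 3 postfixes is then described by -4, which fits in one byte, instead of 200, which needs two.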


While I’d rather references could declare prefix or postfix usage on a per-reference basis without inflating the table, I don’t think this is necessarily the right answer. Using the same entry as both a prefix and a postfix would need to be a very common use case for per-reference declaration to yield a better result on average than the schemes above. Reuse can already be accommodated by referencing an existing prefix or postfix within the prefix/postfix table.

Brendan