Re: [Cbor] Self Described CBOR - Little Endian

Laurence Lundblade <lgl@island-resort.com> Thu, 25 March 2021 17:49 UTC

From: Laurence Lundblade <lgl@island-resort.com>
In-Reply-To: <31C92F28-636E-429D-9265-3248509B60F7@cursive.net>
Date: Thu, 25 Mar 2021 10:48:53 -0700
Cc: alex thompson <pierogitus@hotmail.com>, "cbor@ietf.org" <cbor@ietf.org>
Message-Id: <6CEDEB7C-56C3-444F-B56B-8AE62747E2EA@island-resort.com>
References: <BY5PR20MB2898CBA491A4C0AED983C81CD2649@BY5PR20MB2898.namprd20.prod.outlook.com> <31C92F28-636E-429D-9265-3248509B60F7@cursive.net>
To: Joe Hildebrand <hildjj@cursive.net>
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/8VrwA6Y2wfFcaVfG2wxkJ8vdCDs>
Subject: Re: [Cbor] Self Described CBOR - Little Endian

For a native implementation, the extra object code needed for endian conversion seems pretty small; my guess is 100 to 200 bytes. My encoding function, which handles endian conversion, preferred encoding, and alignment, is 150 bytes of object code. Once you start encoding and decoding complicated structures like nested maps with optional items, and add duplicate label detection, the share of the overall implementation that a little-endian option would save is small.
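
To give a sense of why the conversion cost is small, here is a minimal sketch in C of a head encoder that does preferred (shortest-form) encoding and big-endian output in one place. The name `encode_head` and its signature are hypothetical; this is not the function measured above, and no object-code size is claimed for it. It assumes the caller supplies a buffer of at least 9 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper for illustration only: writes a CBOR head (major
 * type plus argument) in preferred (shortest) form and returns the
 * number of bytes written (1 to 9). `out` must have room for 9 bytes. */
static size_t encode_head(uint8_t major, uint64_t arg, uint8_t *out)
{
    assert(major < 8);                    /* CBOR has major types 0..7 */
    uint8_t ib = (uint8_t)(major << 5);   /* major type in the top 3 bits */
    size_t n;                             /* argument bytes after the initial byte */

    if (arg < 24) {
        out[0] = (uint8_t)(ib | arg);     /* argument fits in the initial byte */
        return 1;
    } else if (arg <= 0xff) {
        out[0] = (uint8_t)(ib | 24); n = 1;
    } else if (arg <= 0xffff) {
        out[0] = (uint8_t)(ib | 25); n = 2;
    } else if (arg <= 0xffffffff) {
        out[0] = (uint8_t)(ib | 26); n = 4;
    } else {
        out[0] = (uint8_t)(ib | 27); n = 8;
    }

    /* Shifting emits the most significant byte first regardless of the
     * host's byte order, and byte-wise stores avoid alignment issues. */
    for (size_t i = 0; i < n; i++) {
        out[1 + i] = (uint8_t)(arg >> (8 * (n - 1 - i)));
    }
    return 1 + n;
}
```

With this sketch, tag 55799 (major type 6) comes out as the bytes d9 d9 f7, and the proposed tag 55798 as d9 d9 f6, matching the 0xd9d9f6 mentioned in the quoted proposal. A little-endian variant would only change the direction of the final loop, which is the part of the code a little-endian option could simplify.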

It might have been better for CBOR to choose little endian, since most CPUs are little endian, but at this point any unnecessary divergence would create interoperability issues.

LL

> On Mar 24, 2021, at 1:06 PM, Joe Hildebrand <hildjj@cursive.net> wrote:
> 
> Coincidentally, I just got a very small generic WASM decoder working:
> 
> https://github.com/hildjj/cbor-wasm
> 
> Inspired by https://github.com/quartzjer/cb0r/, but works synchronously in streaming mode, like expat.
> 
> — 
> Joe Hildebrand
> 
>> On Mar 22, 2021, at 10:50 PM, alex thompson <pierogitus@hotmail.com> wrote:
>> 
>> Given the prevalence of little endian hardware and the existence of the LE typed array tags it would be useful to opt-in to LE encoding of the additional information bytes as well. This would simplify decoders that are specific to LE platforms, particularly WebAssembly which is all LE and hasn’t defined a swap instruction.
>> 
>> The opt-in could be self described with tag 55798 (has the same non-unicode magic number characteristic as 55799). The beginning of a stream would default to BE so tag 55798 itself would appear as 0xd9d9f6. Then any following tags and descendant items would be LE.
>> 
>> For completeness, 55798 and 55799 could be nested within one another, switching the endianness as the decoder traverses a tree of items. Nested tags 24 and 63 would pass on their endianness to their embedded items. No effect on typed array tags 64-87.
>> 
>> It would be a breaking change for generic decoders since ignoring 55798 wouldn’t work as expected but I think the impact is very manageable and fits with CBOR’s stated goals of small code size and extensibility.
>> 
>> Alex
>> _______________________________________________
>> CBOR mailing list
>> CBOR@ietf.org
>> https://www.ietf.org/mailman/listinfo/cbor