Re: [Cbor] 🔔 WGLC on draft-ietf-cbor-array-tags-03

Laurence Lundblade <lgl@island-resort.com> Sat, 09 March 2019 20:46 UTC

From: Laurence Lundblade <lgl@island-resort.com>
In-Reply-To: <033d01d4d6af$c48be3b0$4da3ab10$@augustcellars.com>
Date: Sat, 9 Mar 2019 12:45:59 -0800
Cc: Francesca Palombini <francesca.palombini@ericsson.com>, cbor@ietf.org, draft-ietf-cbor-array-tags@ietf.org
Content-Transfer-Encoding: quoted-printable
Message-Id: <E3735603-8744-4686-9FA7-459DDD8F57AE@island-resort.com>
References: <426CD514-B174-4CE7-B467-2727C6B5B354@ericsson.com> <6F7C83DD-E98C-44EF-A315-194E31759518@island-resort.com> <72F7B17A-2684-4591-8D70-01DE32BFA03B@island-resort.com> <033d01d4d6af$c48be3b0$4da3ab10$@augustcellars.com>
To: Jim Schaad <ietf@augustcellars.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/XJW0PFwE733wLAUMc6QtDte8ZQY>

Hi Jim,

> On Mar 9, 2019, at 11:39 AM, Jim Schaad <ietf@augustcellars.com> wrote:
> 
> 
> 
>> -----Original Message-----
>> From: CBOR <cbor-bounces@ietf.org> On Behalf Of Laurence Lundblade
>> Sent: Friday, March 8, 2019 7:44 AM
>> To: Francesca Palombini <francesca.palombini@ericsson.com>
>> Cc: cbor@ietf.org; draft-ietf-cbor-array-tags@ietf.org
>> Subject: Re: [Cbor] 🔔 WGLC on draft-ietf-cbor-array-tags-03
>> 
>> In addition to what I said before, a deterministic encoding should be
>> specified (formerly known as canonical) in the specification and that should
>> probably be network / big endian.
> 
> I don't know that I agree with this.  Most of the machines that I am working with today are little endian machines; from that point of view, if such a thing is done, then little endian makes more sense.  However, the fact that these are different tags implies that there is no need to have a deterministic encoding for this.  The encoding is going to be deterministic based on which endianness is chosen, and that is machine or program specific.  I would expect that many implementations would not bother with the big endian encode and might not do the decode, as big endian arrays would be very rarely found.

My understanding of deterministic encoding is something like this: for a given input, say the array [100,200,300,400], two independently written encoders should produce *exactly* the same encoded bits. A general test vector for this array would work correctly for both an encoder written in the Foo language on a big endian machine and one written in the Bar language on a little endian machine.

There is also the COSE Sig_structure and similar use cases. Some encoded CBOR that is never transmitted must be constructed independently by both the sender and the receiver, and it has to be byte-for-byte identical on both sides because it is being hashed.

If the implementor gets to choose big or little endian as they wish, we don’t get deterministic encoding like that.
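
To make that concrete, here is a rough sketch in Python (purely illustrative, not taken from the draft) that encodes the same array as a uint16 typed array twice, once per byte order. I'm assuming tag numbers 65 (big endian) and 69 (little endian) from the IANA allocations, and `encode_u16_array` is a helper name I made up. The two outputs differ beyond the tag byte, so any hash computed over them differs as well.

```python
import hashlib
import struct

TAG_U16_BE = 65  # assumed: uint16 typed array, big endian
TAG_U16_LE = 69  # assumed: uint16 typed array, little endian


def encode_u16_array(values, tag):
    """Encode values as tag(byte string) with 16-bit elements (hypothetical helper)."""
    order = ">" if tag == TAG_U16_BE else "<"
    payload = struct.pack(f"{order}{len(values)}H", *values)
    if len(payload) > 23:
        raise ValueError("sketch only handles payloads up to 23 bytes")
    # Tags 24..255 take a two-byte head: 0xd8 followed by the tag number.
    tag_head = bytes([0xD8, tag])
    # Byte strings up to 23 bytes take a one-byte head: 0x40 + length.
    bstr_head = bytes([0x40 + len(payload)])
    return tag_head + bstr_head + payload


values = [100, 200, 300, 400]
big = encode_u16_array(values, TAG_U16_BE)
little = encode_u16_array(values, TAG_U16_LE)

print(big.hex())     # d84148006400c8012c0190
print(little.hex())  # d845486400c8002c019001
print(hashlib.sha256(big).digest() == hashlib.sha256(little).digest())  # False
```

A single test vector can only match one of these, which is the whole problem.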

Personally, I don’t think deterministic encoding is that useful and would probably be happy seeing it go away completely. I suggested that in Bangkok, and the response was that some people (who weren’t in the room) really care about it. Here, I’m just pointing out that if we’re doing deterministic encoding, we should do it for array tags too.


I think interop is the bigger issue here (and deterministic encoding != interop). If everyone does what is natural for their CPU, we mostly have interop because little endian is the most common, but interop is not guaranteed because big endian CPUs exist. I think the spec should be tighter and should spell out what a generic encoder/decoder is to do so that interop is guaranteed. Maybe frame it up as three options:

1) Specify rules for preferred encoders and decoders like this: decoders should support both big and little endian. Encoders should use network byte order, but may choose as they wish.

2) Just outright say tagged arrays are always little endian. Drop the tagging of little vs big.

3) Just outright say tagged arrays are always network big endian. Drop the tagging of little vs big.

If most CPUs were big endian, the obvious choice would be 3). The problem with 2) is the mixed message where some parts of CBOR are big endian and others are little. The problem with 1) is that it is more complicated and implementors are more likely to get it wrong.

Personally, I kind of like 2).
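
For what it's worth, the decoder half of option 1) might look roughly like the Python sketch below. It is only an illustration: the tag numbers (65 and 69 for uint16 arrays) are my assumption from the IANA allocations, and `decode_u16_array` and `BYTE_ORDER_BY_TAG` are names I invented. Under option 2) or 3) the table would shrink to a single entry and the branch on tag would disappear.

```python
import struct

# Assumed tag numbers for uint16 typed arrays; the mapping is illustrative.
BYTE_ORDER_BY_TAG = {65: ">", 69: "<"}


def decode_u16_array(tag, payload):
    """Return a list of ints from a tagged byte string (hypothetical helper)."""
    order = BYTE_ORDER_BY_TAG.get(tag)
    if order is None:
        raise ValueError(f"unsupported typed-array tag {tag}")
    if len(payload) % 2:
        raise ValueError("payload length is not a multiple of the element size")
    # struct swaps bytes as needed, whatever the CPU's native order is.
    return list(struct.unpack(f"{order}{len(payload) // 2}H", payload))


from_big = decode_u16_array(65, bytes.fromhex("006400c8012c0190"))
from_little = decode_u16_array(69, bytes.fromhex("6400c8002c019001"))
print(from_big)     # [100, 200, 300, 400]
print(from_little)  # [100, 200, 300, 400]
```

A decoder that only carried one of the two table entries is the interop failure I'm worried about.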


> 
>> 
>> To go on a bit more about detailed implementation...
>> 
>> In a native implementation there is also a memory access alignment issue on
>> the decode side, but not the encode side. Memory alignment is an issue on
>> some Arm CPUs, though not on Intel CPUs as far as I know. So on some Arm
>> CPUs, it won’t be possible to just return pointers to the bytes that came off
>> the wire. Because non-string and non-aggregate CBOR data items can be 1, 2,
>> 3, 5 and 9 bytes long, there can be no guarantee of alignment of the incoming
>> encoded CBOR bytes.
>> A portable decoder implementation will have to allocate aligned memory for
>> the array and copy it. The decoder can byte-swap while it is copying so it lines
>> up OK with the decoder having to do most of the work to achieve interop.
> 
> I am confused about what you are saying here.  I think you mean that returning a pointer to a data block in the encoded structure may be a problem if it is not aligned, but if that is what you are saying, I really don't understand that from the above text.

Yes, that is what I’m saying. Sorry I wasn’t more clear.
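
Here is a small Python sketch of why the payload lands at arbitrary offsets. It is my own illustration: `encode_head` is a made-up helper that builds a CBOR head (initial byte plus argument), and tag 70 for a uint32 little endian array is assumed from the IANA allocations. The heads come out 1, 2, 3, 5 or 9 bytes long, so in this message the array payload starts at offset 5, which is not a multiple of 4.

```python
def encode_head(major_type, argument):
    """Encode a CBOR head: initial byte plus 0, 1, 2, 4 or 8 argument bytes."""
    if argument < 24:
        return bytes([(major_type << 5) | argument])
    for additional_info, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if argument < (1 << (8 * size)):
            initial = bytes([(major_type << 5) | additional_info])
            return initial + argument.to_bytes(size, "big")
    raise ValueError("argument does not fit in 64 bits")


head_lengths = [len(encode_head(0, n)) for n in (10, 100, 1000, 100_000, 2**40)]
print(head_lengths)  # [1, 2, 3, 5, 9]

message = (
    encode_head(4, 2)      # array of two items
    + encode_head(0, 10)   # unsigned integer 10
    + encode_head(6, 70)   # tag 70 (assumed: uint32 typed array, little endian)
    + encode_head(2, 8)    # byte string of 8 bytes
)
payload_offset = len(message)
message += (1).to_bytes(4, "little") + (2).to_bytes(4, "little")
print(payload_offset, payload_offset % 4)  # 5 1
```

Even if the receive buffer itself is aligned, a pointer to byte 5 of it is not usable as a uint32 pointer on a CPU that faults on unaligned access.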

> 
> Jim
> 
>> 
>> There’s no alignment issue on the encode side.
>> 
>> The only other thing I’ve encountered in CBOR that needs copying of data is
>> indefinite length strings.
> 
> That depends on the internal structure of binary data as well.  It may also need to be copied to get correct alignment.

For a rich, complete and true generic decoder, the obvious thing to do with array tags is return an array of integers in the native CPU format. If the CPU doesn’t support unaligned access, then that array of integers must be aligned.
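
A rough Python sketch of that copy-and-swap step follows. It is illustrative only: `to_native_u16` is a name I made up, and I'm assuming the platform's unsigned short is 2 bytes. The standard `array` module allocates its own storage, so the copy takes care of alignment, and the byte swap happens only when the wire order differs from the CPU's.

```python
import sys
from array import array


def to_native_u16(buffer, offset, count, wire_order):
    """Copy count 16-bit elements out of buffer into a native array (hypothetical helper)."""
    native = array("H")
    if native.itemsize != 2:
        raise RuntimeError("sketch assumes a 2-byte unsigned short")
    chunk = bytes(buffer[offset:offset + 2 * count])
    if len(chunk) != 2 * count:
        raise ValueError("buffer is too short for the requested element count")
    # frombytes copies, so the result lives in its own storage, not in the wire buffer.
    native.frombytes(chunk)
    if wire_order != sys.byteorder:
        native.byteswap()  # swap in place only when wire and CPU order differ
    return native


# The payload starts at the odd offset 1 in both buffers.
wire_le = bytes.fromhex("0a6400c8002c019001")
wire_be = bytes.fromhex("0a006400c8012c0190")
print(to_native_u16(wire_le, 1, 4, "little").tolist())  # [100, 200, 300, 400]
print(to_native_u16(wire_be, 1, 4, "big").tolist())     # [100, 200, 300, 400]
```

In C the same thing would be a memcpy into an aligned allocation followed by a conditional swap loop; the point is that the decoder ends up copying either way.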

I think the contents of a binary string are different. First of all, if it has internal structure, they should have encoded the internals with CBOR in the first place. :-) Second, there is nothing in the CBOR type information that tells a generic CBOR decoder whether it needs to be aligned or not. I would think most of the time it does not, so aligning by default would be wasteful.


All that said, I don’t see that alignment has any effect on the text in the draft.  The main issue for me is the interop issue that arises from the option to transmit either big or little endian.

LL


> 
> 
>> 
>> LL
>> 
>> 
>> 
>>> On Mar 7, 2019, at 7:15 PM, Laurence Lundblade <lgl@island-resort.com>
>> wrote:
>>> 
>>> Comments in my usual vein…
>>> 
>>> Because of the big/little endian option, it seems like this needs a discussion
>> about designing protocols with it, about preferred encoding / decoding and
>> about interop.
>>> 
>>> Probably it should say decoders should support both big and little and the
>> encoder can do what is natural. It seems like there might be an option for
>> both sides to support only one endianness, which will likely often be little
>> endian because it is common.
>>> 
>>> LL
>>> 
>>> 
>>>> On Mar 6, 2019, at 4:53 AM, Francesca Palombini
>> <francesca.palombini@ericsson.com> wrote:
>>>> 
>>>> CBOR WG,
>>>> 
>>>> The chairs believe the array-tags document is ready for WGLC:
>>>> https://tools.ietf.org/html/draft-ietf-cbor-array-tags-03
>>>> 
>>>> The WGLC will end by *20th of March*, please make sure to send your
>> comments to the list before then.
>>>> 
>>>> Best regards,
>>>> Francesca & Barry
>>>> 
>>>> On 05/03/2019, 23:58, "CBOR on behalf of Carsten Bormann" <cbor-bounces@ietf.org on behalf of cabo@tzi.org> wrote:
>>>> 
>>>>  -03 reflects the fact that IANA has made the allocations.
>>>> 
>>>>  From this author’s point of view, we are ready for WGLC.
>>>> 
>>>>  Grüße, Carsten
>>>> 
>>>> 
>>>>> On Mar 5, 2019, at 23:55, internet-drafts@ietf.org wrote:
>>>>> 
>>>>> 
>>>>> A New Internet-Draft is available from the on-line Internet-Drafts
>> directories.
>>>>> This draft is a work item of the Concise Binary Object Representation
>> Maintenance and Extensions WG of the IETF.
>>>>> 
>>>>>     Title           : Concise Binary Object Representation (CBOR) Tags for
>> Typed Arrays
>>>>>     Authors         : Johnathan Roatch
>>>>>                       Carsten Bormann
>>>>> 	Filename        : draft-ietf-cbor-array-tags-03.txt
>>>>> 	Pages           : 14
>>>>> 	Date            : 2019-03-05
>>>>> 
>>>>> Abstract:
>>>>> The Concise Binary Object Representation (CBOR, RFC 7049) is a data
>>>>> format whose design goals include the possibility of extremely small
>>>>> code size, fairly small message size, and extensibility without the
>>>>> need for version negotiation.
>>>>> 
>>>>> The present document makes use of this extensibility to define a
>>>>> number of CBOR tags for typed arrays of numeric data, as well as two
>>>>> additional tags for multi-dimensional and homogeneous arrays.  It is
>>>>> intended as the reference document for the IANA registration of the
>>>>> CBOR tags defined.
>>>>> 
>>>>> 
>>>>> The IETF datatracker status page for this draft is:
>>>>> https://datatracker.ietf.org/doc/draft-ietf-cbor-array-tags/
>>>>> 
>>>>> There are also htmlized versions available at:
>>>>> https://tools.ietf.org/html/draft-ietf-cbor-array-tags-03
>>>>> https://datatracker.ietf.org/doc/html/draft-ietf-cbor-array-tags-03
>>>>> 
>>>>> A diff from the previous version is available at:
>>>>> https://www.ietf.org/rfcdiff?url2=draft-ietf-cbor-array-tags-03
>>>>> 
>>>>> 
>>>>> Please note that it may take a couple of minutes from the time of
>>>>> submission until the htmlized version and diff are available at
>> tools.ietf.org.
>>>>> 
>>>>> Internet-Drafts are also available by anonymous FTP at:
>>>>> ftp://ftp.ietf.org/internet-drafts/
>>>>> 
>>>>> _______________________________________________
>>>>> I-D-Announce mailing list
>>>>> I-D-Announce@ietf.org
>>>>> https://www.ietf.org/mailman/listinfo/i-d-announce
>>>>> Internet-Draft directories: http://www.ietf.org/shadow.html or
>>>>> ftp://ftp.ietf.org/ietf/1shadow-sites.txt
>>>>> 
>>>> 
>>>>  _______________________________________________
>>>>  CBOR mailing list
>>>>  CBOR@ietf.org
>>>>  https://www.ietf.org/mailman/listinfo/cbor