Re: [Cbor] 🔔 WGLC on draft-ietf-cbor-array-tags-03

Jim Schaad <ietf@augustcellars.com> Sat, 09 March 2019 22:07 UTC

From: Jim Schaad <ietf@augustcellars.com>
To: 'Laurence Lundblade' <lgl@island-resort.com>
CC: 'Francesca Palombini' <francesca.palombini@ericsson.com>, <cbor@ietf.org>, <draft-ietf-cbor-array-tags@ietf.org>
Date: Sat, 9 Mar 2019 14:07:22 -0800
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/j1TroTuf9i6GvfnDYTRZCg6kXYY>


> -----Original Message-----
> From: Laurence Lundblade <lgl@island-resort.com>
> Sent: Saturday, March 9, 2019 12:46 PM
> To: Jim Schaad <ietf@augustcellars.com>
> Cc: Francesca Palombini <francesca.palombini@ericsson.com>;
> cbor@ietf.org; draft-ietf-cbor-array-tags@ietf.org
> Subject: Re: [Cbor] 🔔 WGLC on draft-ietf-cbor-array-tags-03
> 
> Hi Jim,
> 
> > On Mar 9, 2019, at 11:39 AM, Jim Schaad <ietf@augustcellars.com> wrote:
> >
> >
> >
> >> -----Original Message-----
> >> From: CBOR <cbor-bounces@ietf.org> On Behalf Of Laurence Lundblade
> >> Sent: Friday, March 8, 2019 7:44 AM
> >> To: Francesca Palombini <francesca.palombini@ericsson.com>
> >> Cc: cbor@ietf.org; draft-ietf-cbor-array-tags@ietf.org
> >> Subject: Re: [Cbor] 🔔 WGLC on draft-ietf-cbor-array-tags-03
> >>
> >> In addition to what I said before, the specification should define a
> >> deterministic encoding (formerly known as canonical), and that
> >> should probably be network / big endian.
> >
> > I don't know that I agree with this.  Most of the machines that I am
> > working with today are little endian machines; from that point of view,
> > if such a thing is done then little endian makes more sense.  However,
> > the fact that these are different tags implies that there is no need to
> > have a deterministic encoding for this.  The encoding is going to be
> > deterministic based on which endianness is chosen, and that is machine
> > or program specific.  I would expect that many implementations would not
> > bother with the big endian encode and might not do the decode, as they
> > would be very rarely found.
> 
> My understanding of deterministic encoding is something like this — For a
> given input, say the array [100,200,300,400], two independently written
> encoders should produce *exactly* the same encoded bits. A general test
> vector for this array would work correctly for both an encoder written in the
> Foo language on a big endian machine and the Bar language on a little endian
> machine.
> 
> There is also the COSE Sig_structure and similar use cases. Some encoded
> CBOR that is not transmitted must be constructed by the sender and receiver
> and it has to be exactly correct because it is being hashed.
> 
> If the implementor gets to choose big / little endian as they wish, we don’t
> get deterministic encoding like that.
> 
> Personally, I don’t think deterministic encoding is that useful and would
> probably be happy seeing it go away completely. I suggested that in Bangkok,
> and the response was that some people (who weren’t in the room) really
> care about it. Here, I’m just pointing out that if we’re doing deterministic
> encoding, we should do it for array tags too.
> 
> 
> I think interop is the bigger issue here (and deterministic encoding !=
> interop). If everyone does what is natural for their CPU, we mostly have
> interop because little endian is the most common, but interop is not
> guaranteed because big endian CPUs exist. I think the spec should be tighter
> and should spell out what is to be done by a generic encoder/decoder so that
> interop is guaranteed. Maybe frame it up in three options:
> 
> 1) Specify rules for preferred encoders and decoders like this: decoders
> should support both big and little endian. Encoders should do network byte
> order, but can choose as they wish.
> 
> 2) Just outright say tagged arrays are always little endian. Drop the tagging of
> little vs big.
> 
> 3) Just outright say tagged arrays are always network big endian. Drop the
> tagging of little vs big.
> 
> If most CPUs were big endian, the obvious choice would be 3). The problem
> with 2) is the mixed message where some parts of CBOR are big endian and
> others are little. The problem with 1) is that it is more complicated and
> implementors are more likely to get it wrong.
> 
> Personally, I kind of like 2).

Given that I don't care about deterministic mapping from data model to CBOR encoding, I don't see an issue here.  I will always get the same CBOR bytes given the same binary data. 
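
To make the two positions concrete, here is a rough Python sketch (not taken from the draft) that encodes the example array [100,200,300,400] under both uint16 tags. The tag numbers 65 (uint16 big endian) and 69 (uint16 little endian) are my reading of the IANA allocations the draft refers to, and the function name encode_uint16_array is made up for illustration. It only handles payloads short enough for a one-byte byte string head.

```python
import struct

# Assumed tag numbers for uint16 typed arrays (from the IANA allocations
# referenced by the draft): 65 = big endian, 69 = little endian.
TAG_UINT16_BE = 65
TAG_UINT16_LE = 69


def encode_uint16_array(values, tag):
    """Encode values as a tagged CBOR byte string (short payloads only)."""
    if tag == TAG_UINT16_BE:
        order = ">"
    elif tag == TAG_UINT16_LE:
        order = "<"
    else:
        raise ValueError("not a uint16 typed array tag")
    payload = struct.pack(f"{order}{len(values)}H", *values)
    if len(payload) > 23:
        raise ValueError("sketch handles payloads of at most 23 bytes")
    # 0xD8 nn is a tag with a one-byte tag number;
    # 0x40 | length is the head of a short byte string.
    return bytes([0xD8, tag, 0x40 | len(payload)]) + payload


data = [100, 200, 300, 400]
print(encode_uint16_array(data, TAG_UINT16_BE).hex())  # d84148006400c8012c0190
print(encode_uint16_array(data, TAG_UINT16_LE).hex())  # d845486400c8002c019001
```

Each tag on its own always yields the same bytes for the same binary data, but two encoders that pick different tags produce different bytes for the same array of numbers.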

> 
> 
> >
> >>
> >> To go on a bit more about detailed implementation...
> >>
> >> In a native implementation there is also a memory access alignment
> >> issue on the decode side, but not the encode side. Memory alignment
> >> is an issue on some Arm CPUs, though not on Intel CPUs as far as I
> >> know. So on some Arm CPUs, it won’t be possible to just return
> >> pointers to the bytes that came off the wire. Because non-string and
> >> non-aggregate CBOR data items can be 1, 2, 3, 5 and 9 bytes long,
> >> there can be no guarantee of alignment of the incoming encoded CBOR
> >> bytes.
> >> A portable decoder implementation will have to allocate aligned
> >> memory for the array and copy it. The decoder can byte-swap while it
> >> is copying, so it lines up OK with the decoder having to do most of
> >> the work to achieve interop.
> >
> > I am confused about what you are saying here.  I think you mean that
> > returning a pointer to a data block in the encoded structure may be a
> > problem if it is not aligned, but if that is what you are saying I
> > really don't understand that from the above text.
> 
> Yes, that is what I’m saying. Sorry I wasn’t more clear.

I would agree this is a valid issue to highlight.
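
As an illustration of the copy-and-swap approach described in the quoted text, here is a rough Python sketch. Python itself has no alignment constraint, so the freshly allocated array stands in for the aligned buffer a C decoder would allocate. The function name decode_uint16_array is hypothetical, the tag numbers 65 and 69 are assumed from the IANA allocations, and only short byte strings are handled.

```python
import sys
from array import array

# Assumed tag numbers: 65 = uint16 big endian, 69 = uint16 little endian.
TAG_UINT16_BE = 65
TAG_UINT16_LE = 69


def decode_uint16_array(encoded, offset):
    """Decode a tagged uint16 typed array that starts at `offset`.

    Returns a newly allocated array in native byte order, so the result
    never points into the (possibly unaligned) wire buffer.
    """
    if encoded[offset] != 0xD8:
        raise ValueError("expected a tag with a one-byte tag number")
    tag = encoded[offset + 1]
    if tag not in (TAG_UINT16_BE, TAG_UINT16_LE):
        raise ValueError("not a uint16 typed array tag")
    head = encoded[offset + 2]
    length = head & 0x1F
    if head >> 5 != 2 or length > 23:
        raise ValueError("sketch handles short byte strings only")
    start = offset + 3
    payload = encoded[start:start + length]
    if len(payload) != length or length % 2:
        raise ValueError("truncated or odd-length payload")
    out = array("H")
    if out.itemsize != 2:
        raise RuntimeError("platform unsigned short is not 16 bits")
    out.frombytes(payload)  # copy into storage owned by the array
    wire_order = "big" if tag == TAG_UINT16_BE else "little"
    if wire_order != sys.byteorder:
        out.byteswap()  # convert to native order after the copy
    return out


# A two-byte item (unsigned 100) precedes the tag, so the payload
# starts at odd offset 5 in the message.
message = bytes.fromhex("1864" "d84148006400c8012c0190")
print(list(decode_uint16_array(message, 2)))  # [100, 200, 300, 400]
```

A decoder written this way accepts either endianness and always hands back native integers, at the cost of one copy per array.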

Jim

> 
> >
> > Jim
> >
> >>
> >> There’s no alignment issue on the encode side.
> >>
> >> The only other thing I’ve encountered in CBOR that needs copying of
> >> data is indefinite length strings.
> >
> > That depends on the internal structure of binary data as well.  It may
> > also need to be copied to get correct alignment.
> 
> For a rich, complete and true generic decoder, the obvious thing to do with
> array tags is return an array of integers in the native CPU format. If the CPU
> doesn’t support unaligned access, then that array of integers must be
> aligned.
> 
> I think the contents of a binary string are different. First of all, if it has internal
> structure, they should have encoded the internals with CBOR in the first
> place. :-). Second, there is nothing in the CBOR type information that tells a
> generic CBOR decoder if it needs to be aligned or not. I would think most of
> the time it does not, so aligning by default would be wasteful.
> 
> 
> All that said, I don’t see that alignment has any effect on the text in the draft.
> The main issue for me is the interop issue that arises from the option to
> transmit either big or little endian.
> 
> LL
> 
> 
> >
> >
> >>
> >> LL
> >>
> >>
> >>
> >>> On Mar 7, 2019, at 7:15 PM, Laurence Lundblade
> >>> <lgl@island-resort.com> wrote:
> >>>
> >>> Comments in my usual vein…
> >>>
> >>> Because of the big/little endian option, it seems like this needs a
> >>> discussion about designing protocols with it, about preferred
> >>> encoding / decoding, and about interop.
> >>>
> >>> Probably it should say decoders should support both big and little
> >>> and the encoder can do what is natural. It seems like there might be
> >>> an option for both sides to support only one endianness, which will
> >>> likely often be little endian because it is common.
> >>>
> >>> LL
> >>>
> >>>
> >>>> On Mar 6, 2019, at 4:53 AM, Francesca Palombini
> >>>> <francesca.palombini@ericsson.com> wrote:
> >>>>
> >>>> CBOR WG,
> >>>>
> >>>> The chairs believe the array-tags document is ready for WGLC:
> >>>> https://tools.ietf.org/html/draft-ietf-cbor-array-tags-03
> >>>>
> >>>> The WGLC will end by *20th of March*; please make sure to send your
> >>>> comments to the list before then.
> >>>>
> >>>> Best regards,
> >>>> Francesca & Barry
> >>>>
> >>>> On 05/03/2019, 23:58, "CBOR on behalf of Carsten Bormann"
> >>>> <cbor-bounces@ietf.org on behalf of cabo@tzi.org> wrote:
> >>>>
> >>>>  -03 reflects the fact that IANA has made the allocations.
> >>>>
> >>>>  From this author’s point of view, we are ready for WGLC.
> >>>>
> >>>>  Grüße, Carsten
> >>>>
> >>>>
> >>>>> On Mar 5, 2019, at 23:55, internet-drafts@ietf.org wrote:
> >>>>>
> >>>>>
> >>>>> A New Internet-Draft is available from the on-line Internet-Drafts
> >>>>> directories.
> >>>>> This draft is a work item of the Concise Binary Object
> >>>>> Representation Maintenance and Extensions WG of the IETF.
> >>>>>
> >>>>>     Title           : Concise Binary Object Representation (CBOR) Tags for
> >>>>>                       Typed Arrays
> >>>>>     Authors         : Johnathan Roatch
> >>>>>                       Carsten Bormann
> >>>>>     Filename        : draft-ietf-cbor-array-tags-03.txt
> >>>>>     Pages           : 14
> >>>>>     Date            : 2019-03-05
> >>>>>
> >>>>> Abstract:
> >>>>> The Concise Binary Object Representation (CBOR, RFC 7049) is a
> >>>>> data format whose design goals include the possibility of
> >>>>> extremely small code size, fairly small message size, and
> >>>>> extensibility without the need for version negotiation.
> >>>>>
> >>>>> The present document makes use of this extensibility to define a
> >>>>> number of CBOR tags for typed arrays of numeric data, as well as
> >>>>> two additional tags for multi-dimensional and homogeneous arrays.
> >>>>> It is intended as the reference document for the IANA registration
> >>>>> of the CBOR tags defined.
> >>>>>
> >>>>>
> >>>>> The IETF datatracker status page for this draft is:
> >>>>> https://datatracker.ietf.org/doc/draft-ietf-cbor-array-tags/
> >>>>>
> >>>>> There are also htmlized versions available at:
> >>>>> https://tools.ietf.org/html/draft-ietf-cbor-array-tags-03
> >>>>> https://datatracker.ietf.org/doc/html/draft-ietf-cbor-array-tags-03
> >>>>>
> >>>>> A diff from the previous version is available at:
> >>>>> https://www.ietf.org/rfcdiff?url2=draft-ietf-cbor-array-tags-03
> >>>>>
> >>>>>
> >>>>> Please note that it may take a couple of minutes from the time of
> >>>>> submission until the htmlized version and diff are available at
> >>>>> tools.ietf.org.
> >>>>>
> >>>>> Internet-Drafts are also available by anonymous FTP at:
> >>>>> ftp://ftp.ietf.org/internet-drafts/
> >>>>>
> >>>>> _______________________________________________
> >>>>> I-D-Announce mailing list
> >>>>> I-D-Announce@ietf.org
> >>>>> https://www.ietf.org/mailman/listinfo/i-d-announce
> >>>>> Internet-Draft directories: http://www.ietf.org/shadow.html or
> >>>>> ftp://ftp.ietf.org/ietf/1shadow-sites.txt
> >>>>>
> >>>>
> >>>>  _______________________________________________
> >>>>  CBOR mailing list
> >>>>  CBOR@ietf.org
> >>>>  https://www.ietf.org/mailman/listinfo/cbor
> >>>>
> >>>>
> >>>
> >>
> >
> >