Re: [ogpx] type-system : eliminate closing tags for collections in binary serialization?

Joshua Bell <josh@lindenlab.com> Mon, 29 March 2010 05:36 UTC

I actually like having the extra characters in there, but I'm not picky.

However, having just implemented a binary serializer, it occurred to me that
the 'k' indicator for map keys is also unnecessary baggage. If we are truly
trying to minimize byte count, that could go too.
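
To put a number on that, here is a minimal Python sketch of a map key encoded
with and without the 'k' tag. The helper name encode_key and its "tagged" flag
are made up for illustration; the layout (tag, 32-bit network-order length,
UTF-8 octets) follows the draft text quoted below. It assumes a decoder can tell
keys from values by position, since a map body alternates key, value, key,
value.

```python
import struct

def encode_key(key: str, tagged: bool = True) -> bytes:
    """Encode a map key as [optional 'k' tag] + 32-bit length + UTF-8 octets.

    Hypothetical helper: 'tagged=False' models the proposal to drop 'k'.
    """
    data = key.encode("utf-8")
    prefix = b"k" if tagged else b""
    # ">I" is an unsigned 32-bit integer in network (big-endian) order.
    return prefix + struct.pack(">I", len(data)) + data

tagged = encode_key("foo")
untagged = encode_key("foo", tagged=False)
print(len(tagged), len(untagged))
```

The saving is one octet per key, independent of the key's length.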

Also, the I-D calls out that "characters" are emitted into the stream, which
implies an encoding. Although UTF-8 is dictated for strings (and map keys),
for clarity we might want to describe the octets, e.g. "Undefined values are
serialized as a single 0x21 octet (ASCII '!')" and so forth.
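
As a rough illustration of what an octet-level description could look like, the
Python snippet below lists the tag characters named in the draft text quoted
further down and prints each as a hex octet. The TAGS table and its key names
are my own shorthand, not anything from the I-D.

```python
# Tag characters as named in the quoted draft text, held as single octets.
# The dictionary and its key names are illustrative only.
TAGS = {
    "undefined": b"!",
    "true": b"1",
    "false": b"0",
    "integer": b"i",
    "real": b"r",
    "string": b"s",
    "uuid": b"u",
    "date": b"d",
    "uri": b"l",
    "binary": b"b",
    "array": b"[",
    "map": b"{",
    "map key": b"k",
}

for name, tag in TAGS.items():
    # tag[0] is the integer value of the single octet.
    print(f"{name:9s} 0x{tag[0]:02x} (ASCII {tag.decode('ascii')!r})")
```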

On Sun, Mar 28, 2010 at 9:29 PM, Robert G. Jakabosky
<bobby@sharedrealm.com> wrote:

>
> I vote to remove the closing tags, since they are redundant.
>
> If the closing tags are kept in the spec, then they should only be used for
> validation (i.e. the number of objects contained between the opening &
> closing tag MUST match the "object count").
>
> Also I vote that the spec require malformed binary/JSON/XML LLSD messages
> to be rejected.  This is very important for security reasons, and it will
> make it easier to write new implementations, since there will be fewer
> conversion/corner case rules.
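
For what it's worth, strict rejection is not much code. Below is a minimal
Python sketch of a decoder for an array of integers that keeps the closing tag
purely as a validation check, as suggested above. The names MalformedLLSD,
decode_int_array and rejects are hypothetical, and the sketch assumes integers
are signed 32-bit values; it is not taken from any existing implementation.

```python
import struct

class MalformedLLSD(ValueError):
    """Raised when a binary message does not match the expected layout."""

def decode_int_array(buf: bytes) -> list:
    """Strictly decode '[' + count + ('i' + int32)* + ']'."""
    if len(buf) < 5 or buf[0:1] != b"[":
        raise MalformedLLSD("missing array opening tag or count")
    (count,) = struct.unpack(">I", buf[1:5])
    pos = 5
    items = []
    for _ in range(count):
        # Each element must be an 'i' tag followed by four octets.
        if buf[pos:pos + 1] != b"i" or len(buf) < pos + 5:
            raise MalformedLLSD("expected an integer element")
        (value,) = struct.unpack(">i", buf[pos + 1:pos + 5])
        items.append(value)
        pos += 5
    # The closing tag must come right after the last counted element.
    if buf[pos:] != b"]":
        raise MalformedLLSD("closing tag missing, misplaced, or trailing data")
    return items

def rejects(buf: bytes) -> bool:
    """Return True if the decoder refuses the message."""
    try:
        decode_int_array(buf)
    except MalformedLLSD:
        return True
    return False

good = (b"[" + struct.pack(">I", 2)
        + b"i" + struct.pack(">i", 1)
        + b"i" + struct.pack(">i", 2) + b"]")
wrong_count = b"[" + struct.pack(">I", 3) + good[5:]
```

Here decode_int_array(good) yields the two integers, while a message whose
count disagrees with its contents (wrong_count) or whose closing tag is missing
(good[:-1]) is refused rather than repaired.
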
>
> On Sunday 28, Meadhbh Hamrick wrote:
> > so... just an idea...
> >
> > the current type system draft, draft-hamrick-vwrap-type-system-00,
> > requires that maps or arrays that are serialized using the binary
> > serialization scheme have a closing tag in addition to an opening tag.
> > for example, to serialize the map:
> >
> > {
> >   foo : string,
> >   bar : integer
> > }
> >
> > you would use an opening tag of a left curly brace ('{') followed by a
> > 32 bit count of the number of elements in the map, followed by the two
> > elements, followed by a closing tag of a right curly brace.
> >
> > i have a vague concern that the closing tag is unneeded, may confuse
> > implementers and could lead to security concerns (at worst) or generic
> > errors (at best.) specifically, what happens when you don't have a
> > closing tag following the last element in the collection?
> >
> > i don't think this is a "big deal," and we can certainly provide
> > guidance for implementers with properly worded statements about how we
> > think such situations should be handled.
> >
> > and i think that making this change would require both opensim and LL
> > to modify their code, so i can totally understand if peeps would
> > rather keep the closing tag for collections.
> >
> > but.. i think it would be a good idea to either a) eliminate the
> > closing tags or b) add some guidance to implementers about handling
> > encoding errors. i have a separate email coming up for guidance.
> >
> > so, assuming we wanted to eliminate closing tags, we might want to
> > change section 4.3 to read thusly.
> >
> > 4.3. Binary Serialization
> >
> >    The LLSD Binary Serialization is an encoding syntax appropriate for
> >    situations where high message entropy is required or where only
> >    limited processing power is available for parsing messages.
> >
> >    Encoding LLSD structured data using the binary serialization scheme
> >    involves generating tags, (optional) size values, and serializations
> >    of simple values.  Composite types are serialized by iterating across
> >    all members of the collection, serializing each simple or composite
> >    member in turn.  For each element in an LLSD structured data object,
> >    the following process is used to generate a binary output stream of
> >    serialized data:
> >
> >    o  A one octet type tag is emitted to the output stream.  See the
> >       table below for tag octets.
> >
> >    o  If the size of the element being serialized is variable (as it
> >       will be for strings, URIs, arrays and maps), the size or length of
> >       the element is output to the stream as a network-order 32 bit
> >       value.  Elements of types with fixed lengths such as undefined
> >       values, booleans, integers, reals, UUIDs and dates will not
> >       include size information in the output stream.
> >
> >    o  Finally, the binary representation of the element is appended to
> >       the output stream.
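
The three steps above can be sketched in a few lines of Python. The function
name emit and its variable_size parameter are hypothetical; the sketch only
covers the octet-length case (strings, URIs, binary), since for arrays and maps
the 32-bit value is a count of elements rather than a length in octets.

```python
import struct

def emit(tag: bytes, payload: bytes = b"", variable_size: bool = False) -> bytes:
    """Apply the three steps: tag, optional 32-bit size, then the payload."""
    out = bytearray()
    out += tag                                   # step 1: one-octet type tag
    if variable_size:
        out += struct.pack(">I", len(payload))   # step 2: network-order size
    out += payload                               # step 3: binary representation
    return bytes(out)

# A string is variable-size: its length is counted in UTF-8 octets.
encoded_string = emit(b"s", "héllo".encode("utf-8"), variable_size=True)

# An undefined value has a tag only: no size and no payload.
encoded_undef = emit(b"!")
```

Note that "héllo" is five characters but six octets in UTF-8, so the size field
holds 6.
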
> >
> >    Undefined  Undefined values are serialized with a single exclamation
> >        point character ('!').  Undefined values append neither size
> >        information nor data to the output stream.
> >
> >    Boolean  True values are serialized with a single '1' character.
> >        False values are serialized with a single '0' character.
> >        Booleans append neither size information nor data to the output
> >        stream.
> >
> >    Integer  Integer values are serialized by emitting the 'i' character
> >        to the output stream followed by the four octets representing the
> >        integer's 32 bits in network order.
> >
> >    Real  Real values are serialized by emitting the 'r' character to the
> >        output stream followed by the eight octets representing the real
> >        value's 64 bits in network order.
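
A minimal Python sketch of the integer and real rules, using the standard
struct module. Two things here are assumptions not stated in the text above:
that integers are signed two's complement, and that reals are IEEE 754
double-precision values. The function names are made up for illustration.

```python
import struct

def encode_integer(value: int) -> bytes:
    # ">i" packs a signed 32-bit integer in network (big-endian) order.
    # Signedness is an assumption; the quoted text only says "32 bits".
    return b"i" + struct.pack(">i", value)

def encode_real(value: float) -> bytes:
    # ">d" packs a 64-bit IEEE 754 double in network order (assumed format).
    return b"r" + struct.pack(">d", value)

print(encode_integer(1).hex())
print(encode_real(1.0).hex())
```

struct.pack raises struct.error for integers outside the 32-bit range, which
fits the idea of refusing values that cannot be represented.
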
> >
> >    String  String values are serialized by emitting the 's' character to
> >        the output stream followed by the string's length in octets
> >        represented as a network-order 32 bit integer, followed by the
> >        string's UTF-8 encoding.
> >
> >    UUID  UUID values are serialized by emitting the 'u' character to the
> >        output stream followed by the sixteen octets representing the
> >        UUID's 128 bits, with the most significant byte coming first.
> >
> >    Date  Date values are serialized by emitting the 'd' character to the
> >        output stream followed by the number of seconds since the start
> >        of the epoch, represented as a 64-bit real value.
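
The UUID and date rules might be sketched in Python as follows. The text above
does not say which epoch is meant or what byte order the date's real value
uses; this sketch assumes the Unix epoch (1970-01-01 UTC) and network order, to
match the Real type. Both are assumptions, as are the function names.

```python
import struct
import uuid
from datetime import datetime, timezone

def encode_uuid(value: uuid.UUID) -> bytes:
    # UUID.bytes is the 16-octet form with the most significant byte first.
    return b"u" + value.bytes

def encode_date(value: datetime) -> bytes:
    # Seconds since the (assumed) Unix epoch as a 64-bit real,
    # in (assumed) network order.
    return b"d" + struct.pack(">d", value.timestamp())

epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
print(encode_uuid(uuid.UUID(int=1)).hex())
print(encode_date(epoch).hex())
```
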
> >
> >    URI  URI values are serialized by emitting the 'l' character to the
> >        output stream followed by the URI's length in octets represented
> >        as a network-order 32 bit integer, followed by the binary
> >        representation of the URI.
> >
> >    Binary  Binary values are serialized by emitting the 'b' character to
> >        the output stream followed by the binary array's length in octets
> >        represented as a network-order 32 bit integer, followed by the
> >        octets of the binary array.
> >
> >    Array  Arrays are serialized by emitting the left square bracket
> >        ('[') character, followed by the count of objects in the array
> >        represented as a network-order 32 bit integer, followed by each
> >        array element in order.  Note that compliant implementations MUST
> >        preserve the order of array elements.
> >
> >    Map  Maps are serialized by emitting the left curly brace ('{')
> >        character, followed by the count of objects in the map
> >        represented as a network-order 32 bit integer, followed by each
> >        key-value element.  Map keys are represented as strings except
> >        that they use the character 'k' instead of the character 's' as a
> >        tag.  Note that preserving the order of maps is not REQUIRED.
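
Putting the proposed text together, a recursive encoder without closing tags
could look like the Python sketch below. It maps Python values onto LLSD types
in an ad hoc way (None for undefined, list for array, dict for map), keeps the
'k' tag on keys as the proposed text does, and assumes signed integers and
IEEE 754 reals. It is an illustration only, not the OpenSim or Linden Lab code.

```python
import struct

def u32(n: int) -> bytes:
    """Network-order unsigned 32-bit size or count."""
    return struct.pack(">I", n)

def encode(value) -> bytes:
    """Serialize a value per the proposed section 4.3 (no closing tags)."""
    if value is None:
        return b"!"
    if isinstance(value, bool):          # check bool before int
        return b"1" if value else b"0"
    if isinstance(value, int):
        return b"i" + struct.pack(">i", value)
    if isinstance(value, float):
        return b"r" + struct.pack(">d", value)
    if isinstance(value, str):
        data = value.encode("utf-8")
        return b"s" + u32(len(data)) + data
    if isinstance(value, list):
        # Opening tag, element count, then each element in order.
        return b"[" + u32(len(value)) + b"".join(encode(v) for v in value)
    if isinstance(value, dict):
        out = b"{" + u32(len(value))
        for key, item in value.items():
            k = key.encode("utf-8")
            out += b"k" + u32(len(k)) + k + encode(item)
        return out
    raise TypeError(f"unsupported type: {type(value).__name__}")

encoded_map = encode({"foo": "hi", "bar": 7})
encoded_array = encode([1, [True]])
```

Because every collection carries its element count up front, a decoder knows
where a nested array or map ends without looking for a closing tag.
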
> >
> > -cheers
> > -meadhbh
> >
> > --
> > meadhbh hamrick * it's pronounced "maeve"
> > @OhMeadhbh * http://meadhbh.org/ * OhMeadhbh@gmail.com
> > _______________________________________________
> > ogpx mailing list (VWRAP working group)
> > ogpx@ietf.org
> > https://www.ietf.org/mailman/listinfo/ogpx
>
>
>
> --
> Robert G. Jakabosky