Re: [mmox] XML serialization

"Hurliman, John" <john.hurliman@intel.com> Tue, 24 February 2009 00:48 UTC

From: "Hurliman, John" <john.hurliman@intel.com>
To: "mmox@ietf.org" <mmox@ietf.org>
Date: Mon, 23 Feb 2009 17:48:50 -0700
Thread-Topic: [mmox] XML serialization
Thread-Index: AcmWErMRK86K2hmoTx2RqOSpKlVTxQAA98fA
Message-ID: <62BFE5680C037E4DA0B0A08946C0933D50263B34@rrsmsx506.amr.corp.intel.com>
References: <ebe4d1860902230239q207d4c0ar5b0582ad7ca855bf@mail.gmail.com> <49A30D7A.3040003@gmail.com> <2F80CC37-A5BD-4EAD-8E12-D31A21912A8B@lindenlab.com> <62BFE5680C037E4DA0B0A08946C0933D502639CF@rrsmsx506.amr.corp.intel.com> <E687852F-8091-43BF-961F-215036069846@lindenlab.com> <49A33830.1060509@gmail.com>
In-Reply-To: <49A33830.1060509@gmail.com>

>-----Original Message-----
>From: Jon Watte [mailto:jwatte@gmail.com]
>Sent: Monday, February 23, 2009 3:59 PM
>To: Meadhbh Hamrick (Infinity)
>Cc: Hurliman, John; mmox@ietf.org
>Subject: Re: [mmox] XML serialization
>
>Meadhbh Hamrick (Infinity) wrote:
>> our system works fine with the new serialization. this would require a
>> non-zero amount of resources be spent on deploying the change. ergo,
>> we would need some justification for making the change.
>>
>
>Do you agree that any kind of generally suitable interoperability will
>require engineering effort, sometimes significant, on the part of all
>possible implementors, including Linden Lab or the Open Sim developers?
>
>You do not have to update your internal serialization at all just
>because we, collectively, design a serialization scheme with
>differences. You don't need to do any work until and unless such time
>that
>1) there is an actual standard, or at least a concrete proposal
>2) you decide that supporting this standard is in the best interests of
>your company or group
>
>>
>> but may i ask? why do we need to change something that we know works,
>> is already deployed, and seems to work without flaw?
>>
>
>Because it can work better. It can be more compact, it can better suit a
>variety of parser styles, it can be easier read and written by human
>beings, it can be easier expanded or adapted, it can more easily be
>processed using fast hashing-based (order independent) processing
>methods.
>
>If your argument is that "while I understand that some other people feel
>that direction X is better, we are currently using direction Y in the
>Second Life implementation, so I think Second Life's way should prevail"
>then you're not actually signed on to the generally applicable, vendor
>neutral interoperability standard bandwagon.
>
>> * can we do a bake off?
>
>Once we're done with a serialization format that fulfills all the
>necessary requirements, I believe there won't be anything "else" to
>bake-off against.
>See here for some requirements I would put on a standardized format:
>http://www.interopworld.com/node/32
>
>I will copy and paste for the convenience of those who don't like
>out-of-line references:
>
>I have looked at the proposal, and see a number of issues that I think
>need to be addressed before the proposal can be used as a general
>interoperability standard.
>
>! Proposal Name
>
>The name probably should change to something like "VWSD" to avoid any
>accusation of vendor bias. I trust we can just get this done without too
>much argument about specific naming.
>

No opinion here, other than that the name should not be tied to virtual worlds. All we are doing is defining yet another IDL, because the group asserts that ASN.1, Apache Thrift, Google Protocol Buffers, and any other proposed solutions will not work as a virtual world interop base for technical or political reasons. However, that doesn't mean that we have to create an IDL that only works for virtual worlds.

>! Large and Small Integers
>
>The "integer" data type needs to be more flexible. 64-bit integers are
>important, and you can even view UUIDs as 128-bit integers. I propose
>that integers can be specified as a specific bit width, and signed or
>unsigned. If you want to stay byte aligned, the set of allowable sizes
>may be limited to 8, 16, 32 or 64. I additionally propose that a
>Variable Length Integer Encoding be specified.
>

Agreed on supporting multiple widths of both signed and unsigned integers. Personally, I would really like to see int32, int64, uint32, and uint64, and could take or leave the rest. Variable-length integer encoding (such as the base-128 varints used in Google Protocol Buffers: http://code.google.com/apis/protocolbuffers/docs/encoding.html) would be a nice fix for the binary encoding.
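
For reference, a minimal sketch of base-128 varint encoding in Python, covering unsigned values only. The function names are mine and not taken from any spec; signed values would need an extra mapping step (Protocol Buffers uses ZigZag for that), which is left out here.

```python
def encode_varint(n):
    """Encode a non-negative integer as a base-128 varint, low 7 bits first."""
    if n < 0:
        raise ValueError("this sketch handles unsigned values only")
    out = bytearray()
    while True:
        byte = n & 0x7F          # take the low 7 bits
        n >>= 7
        if n:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)         # high bit clear: last byte
            return bytes(out)


def decode_varint(data, pos=0):
    """Decode one varint starting at pos; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7
```

Small values cost one byte and large ones grow as needed, so a single integer type on the wire can cover both the 32-bit and 64-bit cases.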

>! Float32 and Float16
>
>Reals should come in at least two forms: 64-bit and 32-bit. The reason
>is that localized position relative to some center usually is best
>described as a 32-bit float. Additionally, mesh data is generally
>described as 32-bit float vertices, rather than 64-bit. There may be
>some additional benefit in supporting 16-bit floats, for things like
>normals, color values or direction vectors.
>

Just adding 32-bit float support is probably fine for most use cases. I don't understand the reasoning behind only supporting 64-bit floats in LLSD at all, especially considering the vast majority of floating point data in Second Life is 32-bit.
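
As a rough illustration of the size difference, here is a sketch using Python's struct module. The position values are made up for the example and were chosen to be exactly representable in 32 bits.

```python
import struct

# Hypothetical region-local position; all three values are exact in float32.
position = (128.25, 64.5, 21.125)

as_f32 = struct.pack("<3f", *position)  # three little-endian 32-bit floats
as_f64 = struct.pack("<3d", *position)  # three little-endian 64-bit floats


def roundtrip_f32(value):
    """Return value after being squeezed through a 32-bit float."""
    return struct.unpack("<f", struct.pack("<f", value))[0]
```

Here as_f32 is 12 bytes and as_f64 is 24 bytes. The cost is precision: a value such as 0.1 does not survive roundtrip_f32 unchanged, which is fine for data that was 32-bit to begin with.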

>! Compact Binary Serialization
>
>There needs to be support for a binary serialization that does not
>contain embedded type or key references, but instead use explicit
>external schema. Thus, to describe a "quaternion" you would simply
>specify four fp16 (or fp32) values in sequence, with no specific type
>information. This can extend to describing general entity property
>values: the type information can be carried by the schema for the
>entity, and does not need to be encoded in the actual data. Allowing
>external schema leads to significant bandwidth savings during
>transmission.
>

Maybe one of the existing binary serializations used by Apache Thrift or Google Protocol Buffers could be supported? Even if we don't want to tie the LLSD spec to third party references, something that draws on the techniques from those IDLs would be much more efficient than the current LLSD binary. I think the need to tightly couple the IDL and type system for this kind of serialization highlights the fact that LLSD/LLIDL is not doing anything new; the same dependencies are still present.
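
To make the external-schema idea concrete, here is a small sketch. The schema layout and helper names are hypothetical and are not part of LLSD, Thrift, or Protocol Buffers; the point is only that the field names and types live in the schema, so the payload is nothing but the four values.

```python
import struct

# Hypothetical schema agreed on out of band: (field name, struct format code).
QUATERNION_SCHEMA = (("x", "f"), ("y", "f"), ("z", "f"), ("w", "f"))


def schema_format(schema):
    """Build a little-endian struct format string from the schema."""
    return "<" + "".join(code for _, code in schema)


def pack_with_schema(schema, values):
    """Serialize values in schema order, with no keys or type tags."""
    return struct.pack(schema_format(schema),
                       *(values[name] for name, _ in schema))


def unpack_with_schema(schema, payload):
    """Rebuild the named fields using only the schema and the raw bytes."""
    fields = struct.unpack(schema_format(schema), payload)
    return {name: value for (name, _), value in zip(schema, fields)}
```

Packing the identity quaternion {"x": 0.0, "y": 0.0, "z": 0.0, "w": 1.0} this way yields 16 bytes. The payload is useless without the schema, which is exactly the IDL/type-system coupling mentioned above.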

>! XPath Friendly XML
>
>The XML encoding needs to be XPath friendly, and friendly to introducing
>or adapting new types in a negotiated higher-version schema. Generally,
>this means that it looks something like:
><map>
><value key="success" type="boolean">true</value>
><value key="cpu_temp" type="float32">67.0</value>
></map>
>The currently proposed serialization is poor because it's not easy or
>efficient to find specific values you care about by key; it's also
>inefficient because it requires any new data types to update the XSD
>schema, because each type is a tag name.
>
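
For anyone who wants to try the by-key lookup described above, a minimal sketch using the limited XPath support in Python's standard xml.etree.ElementTree. The document is the example from the quoted text; the helper name is hypothetical.

```python
import xml.etree.ElementTree as ET

DOC = """<map>
<value key="success" type="boolean">true</value>
<value key="cpu_temp" type="float32">67.0</value>
</map>"""


def lookup(root, key):
    """Find the <value> child with the given key attribute, or None."""
    return root.find("./value[@key='%s']" % key)


root = ET.fromstring(DOC)
node = lookup(root, "cpu_temp")
```

With the key in an attribute, one path expression selects the entry whatever its type, and the type attribute can take new names without adding new element names to the schema.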
>! Machine Parsable Descriptions
>
>It is likely that the language describing the serialization structures
>will actually be read by machines at runtime, rather than used to
>generate code. This is definitely true for things like proxy servers and
>systems that treat entities as "generic collections of data" but want to
>have some understanding of the underlying data for display, translation,
>or other processing. While it would be possible to write a parser for
>the current LLSD syntax, introducing a new language syntax into the
>world doesn't seem warranted. I propose that the data structure
>definition language be moved to some XML schema.
>

-1 on this. The LLIDL syntax is very easy to parse, and I posted a simple grammar file to generate a lexer/parser. Introducing a new XML schema creates just as much implementation work. However, LLSD/LLIDL is still encouraging code that is difficult to read and easy to break. Data should enter and exit in the form of a struct, not a map of string/value pairs where anything can be inserted but only a fixed subset of possibilities will be valid.
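
A quick sketch of what I mean by struct versus map, in Python. The StatusReport type is hypothetical and just reuses the keys from the XML example in the quoted text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StatusReport:
    """Hypothetical message type: the valid fields are fixed and visible."""
    success: bool
    cpu_temp: float


def from_map(raw):
    """Convert a string/value map into the struct at the boundary.

    Unknown or missing keys raise TypeError here, instead of surfacing
    later wherever the map happens to be read.
    """
    return StatusReport(**raw)
```

A plain dict would happily accept a misspelled key such as "cpu_tmp" and fail somewhere far from the mistake, whereas from_map rejects it immediately.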

>
>Sincerely,
>
>jw

John