Re: [vwrap] [ogpx] type-system : version tag and handling unknown tags

Mark Lentczner <markl@lindenlab.com> Tue, 06 April 2010 17:05 UTC

On Apr 6, 2010, at 7:14 AM, Meadhbh Hamrick wrote:
> what do systems that only understand xml 1.0 do if they encounter xml 1.1?

In the original XML spec, systems were to signal an error and reject the entire document. In the fifth edition, they are allowed to accept a document with version "1.1" if and only if the document doesn't use anything that isn't in 1.0. (See "Extensible Markup Language (XML) 1.0 (Fifth Edition)", §2.8.)

> the purpose of the version tag and the rule that you ignore but
> preserve tags you don't understand is to provide the hope for
> extension for situations we do not currently foresee.

There is no such provision in XML. DTDs and schemas built with XML Schema or RELAX NG generally don't have the facility to handle unexpected elements and attributes (with the notable exception of XML applications that expect namespace-delimited elements in very particular places). XSLT processing provides facilities that match any element or attribute and reproduce it in the output, though that is entirely at the discretion of the XSLT writer: they can just as easily drop such unknown fragments.

----

Within the context of LLSD, the problem is that the array and map structures carry semantics. For example, what does the application reading the following binary sequence see:

	"[" %x00.00.00.03 "1" "z" %x00.00.00.01 "!" "0" "]"

("z" here represents an unknown tag, and follows the recommendation that all new unknown tags have a four-byte length so that they can be parsed.)

Is that an array of three items, or two? What does "skipping but preserving" mean? It is tempting to treat unknown tags as values. Under that reading, the middle value is seen by the application as "undef". (Or, perhaps it is seen as yet some other type we'd need to introduce into LLSD, like "foreign", with a value similar to binary: "1 byte of %x21".)
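
As a rough illustration of the "value" interpretation, here is a minimal Python sketch. It is not taken from any LLSD implementation: the names `Foreign` and `parse_value` are hypothetical, only the tags used in this message are handled, and it assumes every unknown tag is followed by a four-byte big-endian length.

```python
import struct

class Foreign:
    """Hypothetical placeholder for an unknown tag and its raw payload."""
    def __init__(self, tag, payload):
        self.tag = tag
        self.payload = payload

    def __repr__(self):
        return f"Foreign({self.tag!r}, {self.payload!r})"

def parse_value(buf, pos=0):
    """Parse one value at pos; return (value, position after the value)."""
    tag = buf[pos:pos + 1]
    if not tag:
        raise ValueError("unexpected end of input")
    pos += 1
    if tag == b"1":
        return True, pos
    if tag == b"0":
        return False, pos
    if tag == b"!":
        return None, pos  # undef
    if tag == b"i":
        (n,) = struct.unpack_from(">i", buf, pos)
        return n, pos + 4
    if tag == b"[":
        (count,) = struct.unpack_from(">I", buf, pos)
        pos += 4
        items = []
        for _ in range(count):
            item, pos = parse_value(buf, pos)
            items.append(item)
        if buf[pos:pos + 1] != b"]":
            raise ValueError("expected ']'")
        return items, pos + 1
    # Assumption: any other tag is "unknown" and carries a four-byte length.
    (length,) = struct.unpack_from(">I", buf, pos)
    pos += 4
    return Foreign(tag, buf[pos:pos + length]), pos + length

# The array example from the text: "1", unknown "z" (1 byte of %x21), "0".
data = b"[" + b"\x00\x00\x00\x03" + b"1" + b"z" + b"\x00\x00\x00\x01" + b"!" + b"0" + b"]"
items, end = parse_value(data)
print(items)
```

Under this reading the unknown tag occupies a slot, so the array has three items. An application that has no "foreign" type would have to map the middle item to "undef" instead.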

However, this leads to questions in maps:

	"{" %x00.00.00.01 "k" %x00.00.00.03 "abc" "z" %x00.00.00.01 "!" "}"

This would be clear under the "value" interpretation, but what about:

	"{" %x00.00.00.01 "z" %x00.00.00.01 "!" "1" "}"

Is that invalid (as the unknown tag appears in a non-value location), or is it a "new key type" and hence seen as the key "undef", which isn't a valid key? Is the whole key/value pair dropped? Does that make this equivalent to the empty map?
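
To make the two candidate answers concrete, here is a hedged Python sketch of a map reader with a switchable policy for an unknown tag in key position. The function names and the `unknown_key` parameter are hypothetical, values are limited to booleans and undef, and unknown value tags are surfaced as undef (`None`).

```python
import struct

def read_u32(buf, pos):
    """Read a four-byte big-endian count or length."""
    (n,) = struct.unpack_from(">I", buf, pos)
    return n, pos + 4

def parse_value(buf, pos):
    tag = buf[pos:pos + 1]
    pos += 1
    if tag == b"1":
        return True, pos
    if tag == b"0":
        return False, pos
    if tag == b"!":
        return None, pos
    # Unknown tag in value position: skip its payload and report undef.
    length, pos = read_u32(buf, pos)
    return None, pos + length

def parse_map(buf, pos=0, unknown_key="reject"):
    """unknown_key is "reject" (raise) or "drop" (discard the whole pair)."""
    if buf[pos:pos + 1] != b"{":
        raise ValueError("expected '{'")
    count, pos = read_u32(buf, pos + 1)
    result = {}
    for _ in range(count):
        tag = buf[pos:pos + 1]
        pos += 1
        # Both "k" and an unknown tag are followed by a four-byte length.
        length, pos = read_u32(buf, pos)
        raw = buf[pos:pos + length]
        pos += length
        if tag != b"k" and unknown_key == "reject":
            raise ValueError(f"unknown tag {tag!r} in key position")
        value, pos = parse_value(buf, pos)
        if tag == b"k":
            result[raw.decode("utf-8")] = value
        # Otherwise the policy is "drop": the pair is silently discarded.
    if buf[pos:pos + 1] != b"}":
        raise ValueError("expected '}'")
    return result

unknown_as_value = (b"{" + b"\x00\x00\x00\x01" + b"k" + b"\x00\x00\x00\x03" + b"abc"
                    + b"z" + b"\x00\x00\x00\x01" + b"!" + b"}")
unknown_as_key = b"{" + b"\x00\x00\x00\x01" + b"z" + b"\x00\x00\x00\x01" + b"!" + b"1" + b"}"

print(parse_map(unknown_as_value))
print(parse_map(unknown_as_key, unknown_key="drop"))
```

With the "drop" policy the second sequence parses to an empty map even though its count says one entry, which is exactly the kind of inconsistency the questions above point at.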

If we are to implement new structures, presumably they'd be wrapped up inside one of these length-prefixed tags, with the count extending past all the values. For example, an unordered set might be introduced by:

	"w" %x00.00.00.0E %x00.00.00.02 "i" %x00.00.01.3A "i" %x00.00.00.2A

The first count includes all the bytes of the construct, the second count is the number of elements of the set, the two elements are the numbers 314 and 42. To an older application, this would be seen as the "undef" value.
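
The byte counts in this example can be checked with a short Python sketch. The "w" tag is the hypothetical set tag from the example above; `encode_set` and `skip_unknown` are illustrative names, not part of any LLSD library.

```python
import struct

def encode_int(n):
    """The "i" tag followed by a four-byte big-endian integer."""
    return b"i" + struct.pack(">i", n)

def encode_set(values):
    """Hypothetical "w" construct: byte length, element count, elements."""
    body = struct.pack(">I", len(values)) + b"".join(encode_int(v) for v in values)
    return b"w" + struct.pack(">I", len(body)) + body

def skip_unknown(buf, pos=0):
    """What an older reader does: read the length and jump past the payload."""
    (length,) = struct.unpack_from(">I", buf, pos + 1)
    return pos + 1 + 4 + length

encoded = encode_set([314, 42])
print(encoded.hex())
print(skip_unknown(encoded) == len(encoded))
```

The body is 4 bytes of element count plus two 5-byte integers, 14 bytes in total, which matches the %x00.00.00.0E length. An older reader that only knows the length rule lands exactly at the end of the construct without understanding its contents.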

----

A wholly different choice for handling unknown tags is to simply ignore them. This allows the older app to see a "substitute" value, and requires that newer tags that want to be a value have the semantic of "skip the next value you read". Hence, the above examples might become:

	"[" %x00.00.00.03 "1" "z" %x00.00.00.01 "!" "1" "0" "]"
	-- this is an array of three booleans, True, True, False,
	-- where there is a newer value type that substitutes for the middle True

	"{" %x00.00.00.01 "k" %x00.00.00.03 "abc" "z" %x00.00.00.01 "!" "1" "}"
	-- this is a map of "abc" to True
	-- where there is a newer value type that substitutes for the True

	"{" %x00.00.00.01 "z" %x00.00.00.01 "!" "k" %x00.00.00.03 "abc" "1" "}"
	-- this is also a map of "abc" to True
	-- where there is some newer tag that may have some unknown effect on the map or the key

	"w" %x00.00.00.00 "[" %x00.00.00.02 "i" %x00.00.01.3A "i" %x00.00.00.2A "]"
	-- an array of two integers, 314 and 42
	-- where there is a tag in front that indicates to treat the array as a set

In these cases, the newer tags are simply ignored, and the remaining tags form a valid binary serialization. This brings into question what it would mean to "preserve" these. Perhaps, with a very delimited set of future semantics ("all new tags must 'apply' to the next value or key/value pair"), LLSD implementations could attach these tags as auxiliary data to existing LLSD types and constructs.
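
One way the "ignore, but attach as auxiliary data" idea could look is sketched below in Python. This is only an assumption about how an implementation might do it: `Node`, `parse`, and `plain` are hypothetical names, maps are omitted for brevity, and unknown tags are again assumed to carry a four-byte length.

```python
import struct
from dataclasses import dataclass, field

@dataclass
class Node:
    value: object
    # Unknown (tag, payload) pairs that immediately preceded this value.
    aux: list = field(default_factory=list)

def parse(buf, pos=0):
    """Parse one value, collecting any unknown tags in front of it."""
    aux = []
    while True:
        tag = buf[pos:pos + 1]
        if not tag or tag in (b"]", b"}"):
            raise ValueError("expected a value")
        pos += 1
        if tag == b"1":
            return Node(True, aux), pos
        if tag == b"0":
            return Node(False, aux), pos
        if tag == b"!":
            return Node(None, aux), pos
        if tag == b"i":
            (n,) = struct.unpack_from(">i", buf, pos)
            return Node(n, aux), pos + 4
        if tag == b"[":
            (count,) = struct.unpack_from(">I", buf, pos)
            pos += 4
            items = []
            for _ in range(count):
                item, pos = parse(buf, pos)
                items.append(item)
            if buf[pos:pos + 1] != b"]":
                raise ValueError("expected ']'")
            return Node(items, aux), pos + 1
        # Unknown tag: remember it, skip its payload, keep looking for a value.
        (length,) = struct.unpack_from(">I", buf, pos)
        pos += 4
        aux.append((tag, buf[pos:pos + length]))
        pos += length

def plain(node):
    """The view an older application sees: values only, no auxiliary tags."""
    if isinstance(node.value, list):
        return [plain(child) for child in node.value]
    return node.value

array = b"[" + b"\x00\x00\x00\x03" + b"1" + b"z" + b"\x00\x00\x00\x01" + b"!" + b"1" + b"0" + b"]"
tagged = (b"w" + b"\x00\x00\x00\x00" + b"[" + b"\x00\x00\x00\x02"
          + b"i" + b"\x00\x00\x01\x3a" + b"i" + b"\x00\x00\x00\x2a" + b"]")

array_node, _ = parse(array)
tagged_node, _ = parse(tagged)
print(plain(array_node), array_node.value[1].aux)
print(plain(tagged_node), tagged_node.aux)
```

The older application works with `plain(...)`, while a re-serializer could write each node's `aux` entries back out in front of the value, which is one possible meaning of "preserve".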

----

But for all the effort such definitions would take, I think it is better not to allow them at all, since I don't see a scenario where this is preferable to content negotiation or protocol negotiation.

In any such scenario, the "extension" would have to be minor enough to not warrant protocol negotiation; the "extension" couldn't provide any semantically important information. If newer code wants to pass an unordered set rather than an array, then presumably the existing semantics of the data block are that the array is to be treated as an unordered set anyway. (If not, then we've got a new protocol.) If there are new data types, whose values will be taken as "undef" in older code, either that data is optional (in which case it should be under new map keys) or it is crucial and hence needs to be under a new protocol.

In short, it seems to me that ad hoc extension of LLSD at the layer of elements within a serialization isn't that useful. Such extension should occur at the content or protocol levels, both of which have existing negotiation systems.

	- Mark



Mark Lentczner
Sr. Systems Architect
Technology Integration
Linden Lab

markl@lindenlab.com

Zero Linden
zero.linden@secondlife.com