Re: [vwrap] [ogpx] type-system : version tag and handling unknown tags

Meadhbh Hamrick <ohmeadhbh@gmail.com> Tue, 06 April 2010 18:42 UTC

In-Reply-To: <A6C95D10-C26F-4F90-95F5-F6A16AA78B2C@lindenlab.com>
References: <b325928b1003281033j1ccaa3dend06ebbce29a13359@mail.gmail.com> <201003282229.21448.bobby@sharedrealm.com> <80CE754B-4BF5-4DA1-A63C-DD18695BF4D0@lindenlab.com> <o2kb325928b1004060714p355cd27axff5b33ddc2551ded@mail.gmail.com> <A6C95D10-C26F-4F90-95F5-F6A16AA78B2C@lindenlab.com>
From: Meadhbh Hamrick <ohmeadhbh@gmail.com>
Date: Tue, 6 Apr 2010 11:39:39 -0700
Message-ID: <l2wb325928b1004061139scf72cb72r28f6fe7765d62253@mail.gmail.com>
To: Mark Lentczner <markl@lindenlab.com>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Cc: vwrap <vwrap@ietf.org>
Subject: Re: [vwrap] [ogpx] type-system : version tag and handling unknown tags

mark,

XML defines "well formed" as being distinct from being "valid". up till
now we have been talking on the list about what to do with documents
that are not "well formed."

well-formedness checks do not refer to a schema and, as i am sure you
are aware, do not check or enforce a sequence of tags. that is
something that's done to check the document's validity.
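
to make the distinction concrete, here is a minimal python sketch. it is
an illustration only: KNOWN_TAGS is my assumed list of LLSD XML element
names standing in for a real schema, and the <set> element and both
sample documents are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: element names of the LLSD XML serialization, used here as a
# stand-in for a real schema. Treat the list as illustrative.
KNOWN_TAGS = {
    "llsd", "map", "array", "key", "undef", "boolean", "integer",
    "real", "string", "uuid", "date", "uri", "binary",
}


def is_well_formed(text):
    """True if the text parses as XML at all (no schema involved)."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


def unknown_tags(text):
    """Validity-style check: names of elements outside KNOWN_TAGS."""
    root = ET.fromstring(text)
    return sorted({el.tag for el in root.iter() if el.tag not in KNOWN_TAGS})


# Hypothetical document using an extension element: well formed, not "valid".
doc_extended = (
    "<llsd><array><integer>1</integer>"
    "<set><integer>2</integer></set></array></llsd>"
)
# Mismatched tags: not even well formed.
doc_broken = "<llsd><array><integer>1</array></llsd>"

print(is_well_formed(doc_extended), unknown_tags(doc_extended))
print(is_well_formed(doc_broken))
```

the first document passes the well-formedness check and only fails the
tag-list check, which is the case an extension rule would have to cover.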

are we suggesting that the two should be conflated (or well-formedness
be ignored) for the purposes of LLSD serialization?

should we NEVER allow for the extension of the LLSD XML schema?

it just seems a bit short sighted to say the equivalent of "we know of
all uses for this serialization scheme at the current moment and do
not need to add the capability for expansion."

-cheers
-meadhbh
--
meadhbh hamrick * it's pronounced "maeve"
@OhMeadhbh * http://meadhbh.org/ * OhMeadhbh@gmail.com



On Tue, Apr 6, 2010 at 10:05 AM, Mark Lentczner <markl@lindenlab.com> wrote:
> On Apr 6, 2010, at 7:14 AM, Meadhbh Hamrick wrote:
>> what do systems that only understand xml 1.0 do if they encounter xml 1.1?
>
> In the original XML spec, systems were to signal an error and reject the entire document. In the 5th revision, they are allowed to accept a document with version "1.1" if and only if the document doesn't use anything that isn't in 1.0. (see "Extensible Markup Language (XML)" §2.8)
>
>> the purpose of the version tag and the rule that you ignore but
>> preserve tags you don't understand is to provide the hope for
>> extension for situations we do not currently foresee.
>
> There is no such provision in XML. DTDs and schemas built with XMLSchema or Relax generally don't have the facility to handle unexpected elements and attributes (with the notable exceptions of XML applications that expect namespace delimited elements in very particular places). XSLT processing provides facilities that match any element or attribute, and reproduce it in the output, though that is thoroughly at the discretion of the XSLT writer: They can just as easily drop such unknown fragments.
>
> ----
>
> Within the context of LLSD, the problem is that there are semantics to the structures of array and map. For example, what does the application reading the following binary sequence see:
>
>        "[" %x00.00.00.03 "1" "z" %x00.00.00.01 "!" "0" "]"
>
> ("z" here represents an unknown tag, and follows the recommendation that all new unknown tags have a four byte length so that they can be parsed.)
>
> Is that an array of three items, or two? What does "skipping but preserving" mean? It is tempting to treat unknown tags as values. The middle value is seen by the application as "undef". (Or, perhaps it is seen as yet some other type we'd need to introduce into LLSD like "foreign" with a value similar to binary: "1 byte of %x21")
>
> However, this leads to questions in maps:
>
>        "{" %x00.00.00.01 "k" %x00.00.00.03 "abc" "z" %x00.00.00.01 "!" "}"
>
> This would be clear under the "value" interpretation, but what about:
>
>        "{" %x00.00.00.01 "z" %x00.00.00.01 "!" "1" "}"
>
> Is that invalid (as the unknown tag appears in a non-value location), or is it a "new key type" and hence seen as the key "undef", which isn't a valid key? Is the whole key/value pair dropped? Does that make this equivalent to the empty map?
>
> If we are to implement new structures, presumably they'd be wrapped up inside one of these things with the count extending over past all the values. For example, an unordered set might be introduced by:
>
>        "w" %x00.00.00.0E %x00.00.00.02 "i" %x00.00.01.3A "i" %x00.00.00.2A
>
> The first count includes all the bytes of the construct, the second count is the number of elements of the set, the two elements are the numbers 314 and 42. To an older application, this would be seen as the "undef" value.
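
A minimal Python sketch of the "value" interpretation described above, restricted to booleans, integers and arrays. It assumes the notation used in the quoted examples (one-byte tags, big-endian four-byte counts, every unknown tag followed by a four-byte length). The names Foreign, parse_value and u32 are hypothetical and not part of any LLSD library.

```python
import struct
from dataclasses import dataclass


@dataclass(frozen=True)
class Foreign:
    """Hypothetical placeholder value for a tag this reader does not know."""
    tag: bytes
    payload: bytes


def u32(n):
    """Big-endian four-byte count, written %x00.00.00.NN in the examples."""
    return struct.pack(">I", n)


def parse_value(buf, pos=0):
    """Parse one value starting at pos; return (value, next position)."""
    tag = buf[pos:pos + 1]
    if not tag:
        raise ValueError("unexpected end of input")
    pos += 1
    if tag == b"1":
        return True, pos
    if tag == b"0":
        return False, pos
    if tag == b"i":
        (n,) = struct.unpack_from(">i", buf, pos)
        return n, pos + 4
    if tag == b"[":
        (count,) = struct.unpack_from(">I", buf, pos)
        pos += 4
        items = []
        for _ in range(count):
            item, pos = parse_value(buf, pos)
            items.append(item)
        if buf[pos:pos + 1] != b"]":
            raise ValueError("expected ']'")
        return items, pos + 1
    # Unknown tag: assume a four-byte length, then that many payload bytes.
    (length,) = struct.unpack_from(">I", buf, pos)
    pos += 4
    return Foreign(tag, buf[pos:pos + length]), pos + length


# "[" %x00.00.00.03 "1" "z" %x00.00.00.01 "!" "0" "]"
array_bytes = b"[" + u32(3) + b"1" + b"z" + u32(1) + b"!" + b"0" + b"]"
array_value, _ = parse_value(array_bytes)

# "w" %x00.00.00.0E %x00.00.00.02 "i" %x00.00.01.3A "i" %x00.00.00.2A
set_bytes = b"w" + u32(14) + u32(2) + b"i" + u32(314) + b"i" + u32(42)
set_value, set_end = parse_value(set_bytes)

print(array_value)
print(set_value)
```

Under this reading the array has three items with a Foreign placeholder in the middle, and the whole "w" construct collapses into a single Foreign value. An implementation could equally hand the application "undef" in place of Foreign. Maps are left out because the key-position case is the open question raised above.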
>
> ----
>
> A wholly different choice for handling unknown tags is to simply ignore them. This allows the older app to see a "substitute" value, and requires that newer tags that want to be a value have the semantic of "skip the next value you read".  Hence, the above examples might become:
>
>        "[" %x00.00.00.03 "1" "z" %x00.00.00.01 "!" "1" "0" "]"
>        -- this is an array of three booleans, True, True, False,
>        -- where there is a newer value type that substitutes for the middle True
>
>        "{" %x00.00.00.01 "k" %x00.00.00.03 "abc" "z" %x00.00.00.01 "!" "1" "}"
>        -- this is a map of "abc" to True
>        -- where there is a newer value type that substitutes for the True
>
>        "{" %x00.00.00.01 "z" %x00.00.00.01 "!" "k" %x00.00.00.03 "abc" "1" "}"
>        -- this is also a map of "abc" to True
>        -- where there is some newer tag that may have some unknown effect on the map or the key
>
>        "w" %x00.00.00.00 "[" %x00.00.00.02 "i" %x00.00.01.3A "i" %x00.00.00.2A "]"
>        -- an array of two integers, 314 and 42
>        -- where there is a tag in front that indicates to treat the array as a set
>
> In these cases, the newer tags are simply ignored, and the remaining tags form a valid binary serialization. It brings into question what it would mean to "preserve" these. Perhaps, with a very delimited set of future semantics ("all new tags must 'apply' to the next value or key/value pair"), LLSD implementations could attach these tags as auxiliary data to existing LLSD types and constructs.
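
The "ignore" interpretation can be sketched the same way. This is a rough Python illustration under the same assumed notation (one-byte tags, big-endian four-byte counts, a four-byte length after every unknown tag). SkippingParser and its skipped list are hypothetical names. The list records what was passed over, as one possible reading of "preserve".

```python
import struct


def u32(n):
    """Big-endian four-byte count, written %x00.00.00.NN in the examples."""
    return struct.pack(">I", n)


class SkippingParser:
    KNOWN = (b"1", b"0", b"i", b"[", b"{", b"k")

    def __init__(self, buf):
        self.buf = buf
        self.pos = 0
        self.skipped = []  # (tag, payload) pairs that were ignored

    def _u32(self):
        (n,) = struct.unpack_from(">I", self.buf, self.pos)
        self.pos += 4
        return n

    def _tag(self):
        """Return the next known tag, skipping any unknown ones before it."""
        while True:
            if self.pos >= len(self.buf):
                raise ValueError("unexpected end of input")
            tag = self.buf[self.pos:self.pos + 1]
            self.pos += 1
            if tag in self.KNOWN:
                return tag
            length = self._u32()
            self.skipped.append((tag, self.buf[self.pos:self.pos + length]))
            self.pos += length

    def _expect(self, closer):
        if self.buf[self.pos:self.pos + 1] != closer:
            raise ValueError("expected %r" % closer)
        self.pos += 1

    def value(self):
        tag = self._tag()
        if tag == b"1":
            return True
        if tag == b"0":
            return False
        if tag == b"i":
            (n,) = struct.unpack_from(">i", self.buf, self.pos)
            self.pos += 4
            return n
        if tag == b"[":
            items = [self.value() for _ in range(self._u32())]
            self._expect(b"]")
            return items
        if tag == b"{":
            result = {}
            for _ in range(self._u32()):
                if self._tag() != b"k":
                    raise ValueError("expected a key")
                n = self._u32()
                key = self.buf[self.pos:self.pos + n].decode("utf-8")
                self.pos += n
                result[key] = self.value()
            self._expect(b"}")
            return result
        raise ValueError("key tag outside a map")


z = b"z" + u32(1) + b"!"       # the unknown tag from the examples
key = b"k" + u32(3) + b"abc"

array_parser = SkippingParser(b"[" + u32(3) + b"1" + z + b"1" + b"0" + b"]")
array_value = array_parser.value()
map_a = SkippingParser(b"{" + u32(1) + key + z + b"1" + b"}").value()
map_b = SkippingParser(b"{" + u32(1) + z + key + b"1" + b"}").value()
set_value = SkippingParser(
    b"w" + u32(0) + b"[" + u32(2) + b"i" + u32(314) + b"i" + u32(42) + b"]"
).value()

print(array_value, map_a, map_b, set_value)
```

Here the counts cover only the values an older reader understands, so the four examples parse to a three-boolean array, the same one-entry map twice, and a two-integer array. The sketch does not handle an unknown tag placed directly before a closing "]" or "}".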
>
> ----
>
> But for all the effort such definitions would take, I think it is better to not allow them at all, since I don't see a scenario where this is favorable to content negotiation or protocol negotiation.
>
> In any such scenario, the "extension" would have to be minor enough to not warrant protocol negotiation - the "extension" couldn't provide any semantically important information: If newer code wants to pass an unordered set rather than an array, then presumably the existing semantics of the data block are that the array is to be treated as an unordered set anyway. (If not, then we've got a new protocol.) If there are new data types, whose values will be taken as "undef" in older code, either that data is optional (in which case they should be under new map keys) or it is crucial and hence needs to be under a new protocol.
>
> In short, it seems to me that ad hoc extension of LLSD at the layer of elements within a serialization isn't that useful. Such extension should occur at the content or protocol levels, both of which have existing negotiation systems.
>
>        - Mark
>
>
>
> Mark Lentczner
> Sr. Systems Architect
> Technology Integration
> Linden Lab
>
> markl@lindenlab.com
>
> Zero Linden
> zero.linden@secondlife.com