[ogpx] type-system : guidance for binary serialization implementers
Meadhbh Hamrick <ohmeadhbh@gmail.com> Sun, 28 March 2010 17:33 UTC
From: Meadhbh Hamrick <ohmeadhbh@gmail.com>
Date: Sun, 28 Mar 2010 10:34:02 -0700
To: ogpx <ogpx@ietf.org>
so this has stuck in my craw for a couple weeks, and i'm finally remembering to post it to the list.

the JSON and XML serialization schemes use well known grammars that have wide adoption. not so for the binary serialization. if i wanted to write an XML parser, i would have a lot of 3rd party resources and discussion to guide me, specifically when it comes to handling errors.

as i was writing my own implementation of the LLSD binary serialization scheme, a couple issues came up. i think we should address them *somewhere*, either as a section of the type-system document or as an informational RFC. i would like to have clear guidance for how implementers should handle the exceptional events listed below.

one of the things we were trying to do was make a type system and serialization schemes that were resilient in the face of error. so i think there's a strong preference that we recover as best we can from an error instead of throwing up our hands and just saying "error!"

i would actually go and look at how LL implemented the binary serialization (i'm sure it's in the viewer code somewhere) but i'm in the middle of my gnu/bsd cool down period, and i don't want to reset the timer. i think there may be other people in the same situation, which is why we should have a document describing what to do in the face of parsing errors.

so... if someone out there who's more of a viewer person wants to look at the linden implementation and describe its behavior in a textual format, and then someone wants to look at the opensim implementation and describe its behavior in a textual format, we could then document how things work and avoid this situation for other people.

here's a list of encoding / decoding errors i think about, along with suggestions for how to handle them.

1. the object count following the opening tag ('{' or '[') is less than the actual number of objects in the collection. (that is, the closing tag is not found after iterating through "object count" number of elements, but is found later in the stream.)

   the "simple" thing to do would be to say, "the array ends after $(OBJECT_COUNT) number of elements, even if there's not a closing tag."

   the "smart" thing to do would be to look at the LLIDL definition of the array you're parsing and try to come up with a "best fit" for what's supposed to be there vs. what's actually there. in other words, you look at the LLIDL and if it says you're supposed to have three elements in the array and you have three elements in the array, then you're done. or maybe you're supposed to have one additional element in the array.

   the "smart" thing could get quite complicated, while the "simple" thing is sure to introduce failures the "smart" thing could recover from.

2. the object count following the opening tag ('{' or '[') is greater than the actual number of objects in the collection. (that is, you get a closing tag before "object count" number of elements.)

   the "simple" thing to do is to say, "a closing tag ends the collection."

   combining this with the previous error case, we could say, "a collection continues until the closing tag is reached or $(OBJECT_COUNT) elements are found in the collection."

3. there is no closing tag. (that is, after "object count" number of elements, you find a tag that is not a '}' or a ']'.)

   this is more or less the same as error case number 1 above, though i wonder if it makes sense to differentiate the two cases. if we did, we would need enough smarts to look forward in the stream, count the number of opening tags, then count the number of closing tags and see if they're unbalanced. ugh.

4. lack of 'k' elements in a map. (that is, you haven't reached the "object count" number of elements and are expecting a 'k' but get something else.)

   my druthers would be to say, if you encounter a type that can be converted to a string, you just do the conversion and use the result as the key. another way to handle it would be to ignore it.

5. you encounter a bare 'k'. (that is, you encounter a 'k' tag outside the context of a map.)

   i vote for "convert to string."

6. duplicate keys in a map. (actually this applies to all serializations.)

   i vote for "the last key in the map with the same name wins." that is, if you have two or more keys with the same name, the one found furthest in the stream is the one that's used.

ideas? comments?

--
meadhbh hamrick * it's pronounced "maeve"
@OhMeadhbh * http://meadhbh.org/ * OhMeadhbh@gmail.com
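
to make the "simple" rules above concrete, here is a minimal python sketch of a lenient decoder. it is only an illustration under stated assumptions, not the linden or opensim behavior: it handles a toy subset of tags ('i', 's', 'k', '[', '{'), assumes counts and lengths are 32-bit big-endian integers, and the names parse_value, enc_int, enc_str and enc_open are made up for this example. it applies "a collection continues until the closing tag is reached or the object count is reached" (cases 1 to 3), converts non-'k' keys and bare 'k' tags to strings (cases 4 and 5), and lets the last duplicate key win (case 6).

```python
import struct

# hypothetical encoders, only here to build test input for the sketch
def enc_int(n):
    return b"i" + struct.pack(">i", n)

def enc_str(text, tag=b"s"):
    data = text.encode("utf-8")
    return tag + struct.pack(">I", len(data)) + data

def enc_open(tag, count):
    return tag + struct.pack(">I", count)

def _u32(buf, pos):
    # assumption: counts and lengths are 32-bit big-endian
    return struct.unpack_from(">I", buf, pos)[0], pos + 4

def parse_value(buf, pos=0):
    """Decode one value at pos; return (value, position after it)."""
    tag = buf[pos:pos + 1]
    pos += 1
    if tag == b"i":
        return struct.unpack_from(">i", buf, pos)[0], pos + 4
    if tag in (b"s", b"k"):
        # case 5: a bare 'k' outside a map is read as a plain string
        n, pos = _u32(buf, pos)
        return buf[pos:pos + n].decode("utf-8"), pos + n
    if tag == b"[":
        count, pos = _u32(buf, pos)
        items = []
        # cases 1-3: stop at the closing tag or after `count` elements
        while len(items) < count and buf[pos:pos + 1] != b"]":
            item, pos = parse_value(buf, pos)
            items.append(item)
        if buf[pos:pos + 1] == b"]":
            pos += 1  # consume the closing tag only if it is really there
        return items, pos
    if tag == b"{":
        count, pos = _u32(buf, pos)
        result, seen = {}, 0
        while seen < count and buf[pos:pos + 1] != b"}":
            key, pos = parse_value(buf, pos)
            if not isinstance(key, (str, int)):
                raise ValueError("key cannot be converted to a string")
            value, pos = parse_value(buf, pos)
            # case 4: non-'k' keys become strings; case 6: last key wins
            result[str(key)] = value
            seen += 1
        if buf[pos:pos + 1] == b"}":
            pos += 1
        return result, pos
    raise ValueError("unknown or missing tag at offset %d" % (pos - 1))

# count says 3, but the closing tag arrives after 2 elements (case 2)
short = enc_open(b"[", 3) + enc_int(1) + enc_int(2) + b"]"
print(parse_value(short)[0])  # [1, 2]

# duplicate key 'a' (case 6)
dup = (enc_open(b"{", 2) + enc_str("a", b"k") + enc_int(1)
       + enc_str("a", b"k") + enc_int(2) + b"}")
print(parse_value(dup)[0])  # {'a': 2}
```

note that with the "simple" rule for case 1, any elements past the object count are left in the stream for the enclosing collection to deal with. that is exactly the kind of failure the "smart" LLIDL-driven approach could recover from.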