Re: [vwrap] Is 'Data Z' immutable (like a git snapshot)?

Morgaine <morgaine.dinova@googlemail.com> Sat, 09 April 2011 03:27 UTC

In-Reply-To: <20110409035837.0d324940@hikaru.localdomain>
References: <BANLkTint6CiMRZWj59sEYM2j7VoKgz4-Bw@mail.gmail.com> <AANLkTimuVubm5Becx8cg_Uq2Gdj8EjHL7maMyqWOeYCJ@mail.gmail.com> <AANLkTi=0iBKxo0_yv2LWsExzrKUjJLqP5Ua2uHB=M_7d@mail.gmail.com> <AANLkTi=QH+c-19PvavnXU+pgWyaqpAA0F5G5SMd6h4JR@mail.gmail.com> <5365485D-FFAE-46CA-B04E-D413E85FB1D1@gmail.com> <4D97E7FE.7010104@gmail.com> <4D97EEC1.7020207@gmail.com> <BANLkTi=9CXCtb=ryFtMuyG2w9ifb-2urkA@mail.gmail.com> <4D98AC5F.70501@gmail.com> <BANLkTikci18U3S-fz6k4doVTdtUig7j=zw@mail.gmail.com> <BANLkTim8uUNmGU91mYmXQX6_Eqqp92--WQ@mail.gmail.com> <20110408223402.36ae68a9@hikaru.localdomain> <BANLkTi=__DRJ-FGvVwsQWyiDkZgz_ekg0g@mail.gmail.com> <20110409035837.0d324940@hikaru.localdomain>
Date: Sat, 09 Apr 2011 04:29:03 +0100
Message-ID: <BANLkTikYULx48ELO_pkWq_6pVnSPuvksqA@mail.gmail.com>
From: Morgaine <morgaine.dinova@googlemail.com>
To: vwrap@ietf.org
Subject: Re: [vwrap] Is 'Data Z' immutable (like a git snapshot)?

I agree with virtually everything you've written here, Carlo.

(And your observations about how this even applied to your experience with
IRC were interesting too, and unexpected.)

I'll just add a couple of small decorations.

First, on the issue of "authorities" that control mutation, there is
another way of looking at this, a way which any practitioner of Functional
Programming would find very natural:  a "mutating" object can be thought of
as an infinite sequence of immutable timestamped object states.  And while
traditional imperative computing tends to consider FP as "odd", there is
nothing at all odd about an incoming stream of events.  There's your
"mutation authority" for you --- it's merely the event stream generator,
each event being immutable. :P  This fits in perfectly with event-driven
programming of course.
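
To make that concrete, here is a minimal sketch of the idea, not anything taken from VWRAP or SL: the names `ObjectState`, `Event` and `states` and the fields they carry are all made up for illustration. The "mutating" object is just the sequence of frozen states produced by folding an event stream over an initial state.

```python
from dataclasses import dataclass, replace
from typing import Any, Iterable, Iterator, Tuple


@dataclass(frozen=True)
class ObjectState:
    """One immutable, timestamped snapshot of an object (illustrative fields)."""
    timestamp: float
    position: Tuple[float, float, float]
    colour: str


@dataclass(frozen=True)
class Event:
    """One immutable event: at `timestamp`, set `field` to `value`."""
    timestamp: float
    field: str
    value: Any


def states(initial: ObjectState, events: Iterable[Event]) -> Iterator[ObjectState]:
    """Yield the initial state, then one new state per incoming event."""
    current = initial
    yield current
    for event in events:
        # replace() builds a new frozen instance; the old one is untouched.
        current = replace(current, timestamp=event.timestamp,
                          **{event.field: event.value})
        yield current
```

Nothing is ever overwritten here: every earlier state stays valid and cacheable, and the only "authority" is whatever generates the events.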

The second point I'd like to make refers to your section about assets versus
objects.  As you say, there is a distinction to be made, and the semantic
gap seems to open very wide between them when objects are changing rapidly.

However, there is also an important way in which the two are related, which
again could be considered the "FP viewpoint".  Assets are really just
immutable *STATES* of an object.  In SL and in Opensim currently, assets
provide the *initial* state of an object when it is brought into a region
(rezzing), and new assets can subsequently be *generated* from it by taking
a snapshot of the live object ("Take Copy"), or by saving the object on
finally removing it from the world (derezzing).  The "asset as an immutable
object state" paradigm is very easy to see.
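
A rough sketch of that paradigm, under my own assumptions: the `AssetStore`, `LiveObject`, `rez` and `take_copy` names are hypothetical, the state is a plain JSON-serialisable dict, and SHA-256 stands in for whatever ID scheme is eventually chosen. It is not how SL or Opensim actually implement it.

```python
import hashlib
import json


class AssetStore:
    """Holds immutable snapshots, addressed by the hash of their content."""

    def __init__(self) -> None:
        self._blobs = {}

    def put(self, state: dict) -> str:
        data = json.dumps(state, sort_keys=True).encode("utf-8")
        asset_id = hashlib.sha256(data).hexdigest()
        # Existing entries are never overwritten.
        self._blobs.setdefault(asset_id, data)
        return asset_id

    def get(self, asset_id: str) -> dict:
        return json.loads(self._blobs[asset_id])


class LiveObject:
    """A rezzed object: mutable world data owned by the region."""

    def __init__(self, state: dict) -> None:
        self.state = dict(state)


def rez(store: AssetStore, asset_id: str) -> LiveObject:
    """The asset provides the initial state of the live object."""
    return LiveObject(store.get(asset_id))


def take_copy(store: AssetStore, obj: LiveObject) -> str:
    """Snapshot the live object as a new immutable asset and return its ID."""
    return store.put(obj.state)
```

Rezzing, editing and then taking a copy yields a new ID, while the original asset is still there unchanged under its old ID.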

All this seems to fit together rather nicely, and since it is the model that
underlies Second Life, I think Linden Lab deserve kudos for a core data
model that makes sense both conceptually and also for practical
cacheability.  Of course one can't just rest there, but it's a good starting
place. :-)

Going beyond where we are currently, the ability of active objects to
deliver events to viewers is extremely weak in our current model, with an
almost total lack of usable communications other than via repurposed chat
streams.

Other open world systems have a much stronger object communication model.
For example objects in OpenWonderLand have two parts, one in-world and one
in-client, and these two halves can talk to each other directly to implement
arbitrarily complex behavior.  And in OpenCobalt, although there are no
"servers" in the peer-to-peer implementation of islands (but there are
"router" hosts), the states of the fully programmable objects in one
client's scenegraph automatically propagate to the scenegraphs in all other
clients present.  So again, there is a very powerful model of interaction
between participants.

Although we haven't placed object communications within our scope, if VWRAP
succeeds then one day someone will need to extend it to region-client and
inter-VW object communications as well.


Morgaine.




=======================

On Sat, Apr 9, 2011 at 2:58 AM, Carlo Wood <carlo@alinoe.com> wrote:

> On Sat, 9 Apr 2011 02:11:00 +0100
> Morgaine <morgaine.dinova@googlemail.com> wrote:
>
> > First of all, mutability breaks caching entirely, so it needs to be
> > approached with great caution.  Caching can make the difference
> > between a great service and a totally unusable one (or at least one
> > that doesn't work very well), so if mutability is allowed then things
> > need to be designed in such a way that caching still works *most of
> > the time*, namely in between mutations.
>
> Yup, I totally agree. As you might know, I worked a lot
> on improving the IRC protocol at one time. IRC works with handles
> (nicks and channel names) and everything else is mutable (i.e.,
> the channel topic, the channel modes, etc.). This turned out to
> be an unsolvable horror (and trust me on that please, I worked
> 7 years on this topic). What I did was change nick names into
> numerics that are *immutable*, so at least the bloody handle
> wasn't changing all the time, heheh (channel names are already
> immutable) and then assigned authorities for the nick names
> (namely the servers to which the users are connected). Authorities
> have the nice property that their message stream flows exclusively
> outwards, away from the authority, and is therefore streamable:
> if every mutating operation is kept in order then everything
> remains synched in the end. This requires that one always ASKS
> the authority to change something rather than telling it that you
> changed something. For example, I changed it so that when you KICK
> someone from a channel, a request is sent to the server
> to which that person is connected, and that server actually
> sends out the message that this user is removed from the channel.
> An authority for channels was never created; there I
> tried to solve the problem with timestamps.
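
The "ask, don't tell" rule can be sketched as follows. This is only an illustration under stated assumptions: `Authority`, `Replica`, the `JOIN`/`KICK` operation strings and the sequence numbers are hypothetical and are not the real IRC server protocol.

```python
from typing import Callable, List, Set, Tuple

Message = Tuple[int, str, str]  # (sequence number, operation, nick)


class Authority:
    """The single source of mutating messages for the state it owns."""

    def __init__(self) -> None:
        self._seq = 0
        self._subscribers: List[Callable[[Message], None]] = []

    def subscribe(self, deliver: Callable[[Message], None]) -> None:
        self._subscribers.append(deliver)

    def request(self, op: str, nick: str) -> None:
        """Others ASK for a change; only the authority announces it."""
        self._seq += 1
        message = (self._seq, op, nick)
        for deliver in self._subscribers:
            deliver(message)


class Replica:
    """A server that applies the authority's messages strictly in order."""

    def __init__(self) -> None:
        self.members: Set[str] = set()
        self._last_seq = 0

    def deliver(self, message: Message) -> None:
        seq, op, nick = message
        if seq != self._last_seq + 1:
            raise ValueError("message out of order")
        self._last_seq = seq
        if op == "JOIN":
            self.members.add(nick)
        elif op == "KICK":
            self.members.discard(nick)
```

Because every change leaves the authority in one ordered stream, all replicas that see the whole stream end up in the same state. The sketch deliberately ignores rerouting, which is where the real trouble starts.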
>
> I don't think we should go this route (using timestamps).
> Also, the authority "solution" has the major disadvantage
> that it only works in a non-cyclic routing tree, and
> as soon as any rerouting takes place you get into MAJOR problems
> for messages that were still under way.
>
> The ONLY way to really avoid all those nightmares is to adopt
> how git and mercurial work: immutable data that some hash ID
> refers to, which, as you said, is "never" deleted.
>
> > This is an issue which has occupied many minds for decades, and I
> > think it can be distilled into the widely accepted notion that cached
> > objects should never be overwritten, but only joined by updated
> > versions.  At a stroke this lets highly concurrent asset services
> > avoid the thorny issue of writing in the presence of concurrent
> > readers, and at the same time it lets caches have very simple update
> > semantics since nothing is mutable from their perspective.  The only
> > cost is some loss of disk space to hold old versions, which is rarely
> > significant given disk sizes and costs today.
>
> You totally convinced me :). And for the others: note that I tried
> it in other ways for YEARS. So it means something that I'm now
> convinced that approach wasn't right and that we should avoid it.
>
> > While that is good for asset services and caches, it does place the
> > burden of achieving mutability on the parties who require it.  And
> > that of course is how burdens should be borne.
>
> Also very true.
>
> > What this means for virtual worlds is that a mutator becomes
> > responsible for notifying endpoints that something has mutated, so
> > that they can fetch the new versions.
>
> This sounds like an 'authority' however: one entity and one alone
> can issue the outgoing message that something has changed...
> That is not how it should work though. I can't put my finger
> on the difference yet though :/... If we go for immutability
> then why suddenly are we talking about having to notify others
> that something changed?
>
> Of course, things DO change in-world. For example, someone could
> detach something - and attach something else. Then the Agent of
> that avatar is the authority I'd think: the viewer requests the
> Agent to make a change, and if approved then the Agent tells the
> viewer and everyone else that the change was made.
>
> I guess that the big difference is that those outgoing messages
> contain no large 'data'.. no ASSET data.  The Asset servers themselves
> would never do this. Only world-tied (location-tied) things will do
> this: avatars and rezzed objects, with respectively the Agent
> and the Region as authority/source of the mutating messages.
>
> Routing probably goes all through the Region server: if someone
> changes an attachment while they are 4000 meters away (in the same
> sim) then that STILL has to be routed to everyone else in the sim,
> and not later. Basically at least, the Agent *tells* the Region
> server that the avatar's appearance has changed and the Region sends
> outward messages of that fact to everyone who is connected.
>
> As long as all mutating messages go in one direction, away from
> the authority (the Agent in this case), and there are no RE-routing
> issues (and there are none here), then there are no problems
> with this model.
>
> >  This needn't be an onerous
> > requirement, and it may not even need to be wrapped up in security
> > tape, since having had access once probably qualifies you for an
> > unconditional update.  In any case, the original item is still in its
> > asset service (as well as in users' terabyte caches), so one very
> > useful property of this approach is that a simulation can't be broken
> > for long by a poor update since reverting is always possible.  That's
> > an engineering plus point.
>
> If all things are well, then the outgoing 'mutation' message from
> the Agent/Region only contains new asset IDs, not the data itself.
> And people will get the new data using the new ID.
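
A minimal sketch of such a message and of the viewer side, assuming a hypothetical `MutationNotice` type and a viewer cache keyed by asset ID (neither is defined by VWRAP): the notice carries IDs only, and bytes are fetched only for IDs the viewer has not seen before.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class MutationNotice:
    """Outgoing 'something changed' message: IDs only, never asset data."""
    object_id: str
    new_asset_ids: Tuple[str, ...]


class ViewerCache:
    """Caches immutable assets by ID, so an entry never needs invalidating."""

    def __init__(self, fetch: Callable[[str], bytes]) -> None:
        self._fetch = fetch          # e.g. a request to an asset service
        self._data: Dict[str, bytes] = {}
        self.fetch_count = 0

    def get(self, asset_id: str) -> bytes:
        if asset_id not in self._data:
            self._data[asset_id] = self._fetch(asset_id)
            self.fetch_count += 1
        return self._data[asset_id]

    def apply(self, notice: MutationNotice) -> Dict[str, bytes]:
        """Resolve every ID named in the notice, fetching only the new ones."""
        return {aid: self.get(aid) for aid in notice.new_asset_ids}
```

An avatar that swaps one attachment therefore costs one small message to everyone plus one fetch per viewer for the single new ID.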
>
> This sounds perfectly OK for textures (which aren't even mutable
> in-world), but what about a little change to the shape of a prim
> of an object?  An object (consisting of many prims) is an asset:
> you can store it in an asset server and later retrieve it
> again.  Hence, we have to realize that such objects are NOT
> mutable. Only once they are rezzed are they mutable. This is a
> requirement for things to work. A rezzed object is therefore
> not an asset: it's world-data that can change (with the Region
> as authority? or the owner/viewer maybe, when they are online).
> Only once an object is taken back into inventory is the data
> sent to an asset server and a new ID is created. Until that point
> all the data for such objects is exclusively stored in the region
> and people obtain the data for the shape of objects (not the textures,
> but for which texture ID is used on what face) from the region,
> not from an asset server.
>
> This follows from the mutability argument (not from the fact that
> this is how it works in SL too).
>
> > In respect of addressing, you mentioned using hashes as item
> > addresses, and this is of course my preferred strategy, which I have
> > described and advocated several times this week, and back in
> > September.  Hash-based addressing has numerous excellent engineering
> > properties that put it head and shoulders above other schemes.  I say
> > go with that approach.  In any event, when we pit alternative schemes
> > against each other on merit, I bet nothing will come close to
> > hash-based.
>
> I can't think of any disadvantages. The hash has to have a large size
> of course, comparable to UUIDs. Still, I'm willing to assume that
> with a lot of effort it would be possible to create an asset with
> a given hash that in fact is different... Would that be a problem?
>
> Once you know the data, you can reconstruct the item anyway. You
> also know the ID (hash or not). You can only know the hash if you
> already know the data anyway, or when you have access to it (of course).
> Being able to then create an asset that has the same hash, but in
> fact is white noise (I definitely can't think of any other way
> to construct a known hash) should simply result in the white noise
> being discarded: someone "uploads" data that has an already existing
> hash, then discard the uploaded data and use the old data. The result
> is the uploader/hacker didn't gain anything at all.
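
The "first data wins" rule can be sketched like this. The `ContentStore` class is hypothetical, and the pluggable `hash_fn` exists only so that a collision can be demonstrated with a deliberately weak hash; a real service would use a strong one such as SHA-256.

```python
import hashlib
from typing import Callable, Dict


def sha256_id(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


class ContentStore:
    """Hash-addressed store in which an existing entry is never replaced."""

    def __init__(self, hash_fn: Callable[[bytes], str] = sha256_id) -> None:
        self._hash_fn = hash_fn
        self._blobs: Dict[str, bytes] = {}

    def upload(self, data: bytes) -> str:
        asset_id = self._hash_fn(data)
        # If the ID already exists, the uploaded bytes are simply discarded.
        self._blobs.setdefault(asset_id, data)
        return asset_id

    def fetch(self, asset_id: str) -> bytes:
        return self._blobs[asset_id]
```

With a toy hash such as `lambda d: str(sum(d) % 256)`, uploading `b"ab"` and then `b"ba"` gives the same ID both times, and a later fetch still returns `b"ab"`: the second uploader gained nothing.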
>
> > On the issue of caps, I think that keeping their semantics simple has
> > great merit, because heavyweight schemes are less likely to be
> > accepted, and complex ones are unlikely to be implemented uniformly.
> > But in any case, as I described to Vaughn, caps should be *optional*
> > anyway (ie. asset dependent).  Having to acquire a cap in order to
> > fetch a Creative Commons licensed asset is unnecessary, and indeed it
> > is rather comic.  A cap only needs to be requested when the asset
> > requires it, and only those assets that need it should bear the
> > burden.
>
> Hear hear! I like this idea very much :).
> But it is unrelated to the ID / immutable data of course.
> This is just about how easy it is to get a cap for something.
> I guess that what you mean is that a free asset should have a
> cap that consists of its hash. So that if you know the ID/hash
> (ie 'Z') you know immediately where to get it.
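
One way to picture asset-dependent caps, as a sketch only: the `requires_cap` property, the `AssetService` class and the random token format are my assumptions, not an existing caps API. The asset carries the property, and the service demands a cap only when the asset asks for one.

```python
import secrets
from dataclasses import dataclass
from typing import Dict, Optional, Set, Tuple


@dataclass(frozen=True)
class Asset:
    data: bytes
    requires_cap: bool = False   # property carried by the asset itself


class AssetService:
    def __init__(self) -> None:
        self._assets: Dict[str, Asset] = {}
        self._caps: Set[Tuple[str, str]] = set()   # (cap token, asset ID)

    def add(self, asset_id: str, asset: Asset) -> None:
        self._assets[asset_id] = asset

    def grant(self, asset_id: str) -> str:
        """Issue an unguessable cap token for one asset."""
        cap = secrets.token_hex(16)
        self._caps.add((cap, asset_id))
        return cap

    def fetch(self, asset_id: str, cap: Optional[str] = None) -> bytes:
        asset = self._assets[asset_id]
        if asset.requires_cap and (cap, asset_id) not in self._caps:
            raise PermissionError("this asset requires a cap")
        return asset.data
```

A freely licensed asset is then fetched with its ID alone, and only the assets that need protection pay for the extra round trip.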
>
> > The issue of flag bits (or more generally, property fields) for
> > assets is one that interests me a lot, as it relates to the above.
> > As I described to Vaughn, it is the assets that impose requirements
> > on the protocol, not vice versa, so it is the assets that should
> > carry the properties that control the protocol.
>
> I've often desired mutable bits in no-modify objects in SL.
> There are numerous applications for them! The idea is that
> you have a no-modify object but as owner can still change
> certain bits that define how it is used AND that are
> stored when you take it back into your inventory (and are preserved
> the next time you rez it). However, 'no modify' and
> the immutability that we talked about are different things
> of course! If we assume that such bits are simple changes
> to the object that are allowed by the owner at all times,
> then the only disadvantage is that (apparently) all data
> of the object needs to be stored multiple times: if there
> are 4 bits and the user "plays" with them until they have had
> all 16 possibilities once in their inventory then we'd have
> 16 times the same data in the asset server.
>
> This however is just a way to look at it (compare with
> the way we look at how a git server works: that view
> is highly inefficient too, though easy to grasp).
>
> The real implementation can of course make it so that it
> doesn't store that data 16 times...
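
One possible way to avoid the 16 copies, loosely modelled on how git separates blobs from the trees that point at them. The `Store` class, the `save_object` helper and the record layout are assumptions for illustration only: the bulky object data is stored once under its own hash, and each flag combination becomes a tiny record that refers to it.

```python
import hashlib
import json
from typing import Dict


class Store:
    """Hash-addressed blob store; identical content is stored only once."""

    def __init__(self) -> None:
        self.blobs: Dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()
        self.blobs.setdefault(key, data)
        return key


def save_object(store: Store, body: bytes, flags: int) -> str:
    """Save an object as a small record pointing at the shared body blob."""
    body_id = store.put(body)   # deduplicated across all flag settings
    record = json.dumps({"body": body_id, "flags": flags}, sort_keys=True)
    return store.put(record.encode("utf-8"))
```

Saving the same body with all 16 settings of 4 flag bits produces 16 distinct asset IDs but only 17 stored blobs: one body and 16 small records.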
>
> --
> Carlo Wood <carlo@alinoe.com>