Re: [vwrap] Client-side caching, URIs, and scene scalability
Nexii Malthus <nexiim@gmail.com> Thu, 30 September 2010 04:48 UTC
In-Reply-To: <AANLkTimEBbz5zCtRU8BcO+o65hCwhSxE_R8HM9UyCh40@mail.gmail.com>
References: <AANLkTin5GF7=qPXYTOFyB0T-2C4JrS2=xaDKo0wZC+fH@mail.gmail.com> <62BFE5680C037E4DA0B0A08946C0933D012AD7E419@rrsmsx506.amr.corp.intel.com> <AANLkTimEBbz5zCtRU8BcO+o65hCwhSxE_R8HM9UyCh40@mail.gmail.com>
Date: Thu, 30 Sep 2010 05:49:24 +0100
Message-ID: <AANLkTi=-xDoHVmnA=xQo9mnEzmffyxeKyuyKmqrQ8ss0@mail.gmail.com>
From: Nexii Malthus <nexiim@gmail.com>
To: Morgaine <morgaine.dinova@googlemail.com>
Cc: "vwrap@ietf.org" <vwrap@ietf.org>
Subject: Re: [vwrap] Client-side caching, URIs, and scene scalability

Sounds awesome and like a nicely evaluated approach.

- Nexii

On Mon, Sep 27, 2010 at 8:23 PM, Morgaine <morgaine.dinova@googlemail.com> wrote:

> That's good to hear John, but it can be made far more effective and
> efficient.
>
> Your approach allows a client to avoid requesting the data, but only at the
> cost of requesting the ETag header, which entails making a network request.
> The approach that I am advocating would eliminate the network requests
> altogether, because the hash is part of the asset identifier that is held by
> the region and is handed to the client. Requesting hashes with another
> round trip doesn't scale as scenes grow massively.
>
> Avoiding unnecessary network accesses will become crucial as new worlds
> expand the asset pool with millions of replicated assets in seconds.
> Hash-based URIs would also provide virtual worlds with asset resilience,
> since fallback services can be queried automatically for the known asset
> hash when retrieval from the initial asset URI fails. You can't do that
> when the hash is held by an asset service that is now inaccessible.
>
> Morgaine.
>
> ==============================
>
> On Mon, Sep 27, 2010 at 6:26 PM, Hurliman, John <john.hurliman@intel.com> wrote:
>
>> Agreed. We are already doing this in the SimianGrid asset server by using
>> the ETag HTTP header to deliver a SHA256 hash of asset data. This allows a
>> client to do a HEAD request before fetching data and is compatible with
>> existing web caching systems.
>>
>> John
>>
>> *From:* vwrap-bounces@ietf.org [mailto:vwrap-bounces@ietf.org] *On Behalf Of* Morgaine
>> *Sent:* Monday, September 27, 2010 9:53 AM
>> *To:* vwrap@ietf.org
>> *Subject:* [vwrap] Client-side caching, URIs, and scene scalability
>>
>> We've discussed the structure of caps, URIs and asset addressing here many
>> times in the past.
>> I would like us to examine this issue in the specific
>> context of *client-side caching* and *scene growth*, which we have not
>> previously addressed. Scalability is a matter of huge importance (as well
>> as being part of the IETF mission statement), and I'm particularly
>> interested in making sure that VWRAP standards are scalable in key
>> dimensions.
>>
>> Scenes will inevitably rise in size and complexity with the passage of
>> time. In quite a short while we can expect millions of assets in a scene
>> within the field of view of an agent, and further orders of magnitude not
>> long after. While some may be tempted to call this "sci fi", observing the
>> increase in memory, disk, and other computing resources over time suggests
>> otherwise. From kilo to mega, giga and tera, it's only when we look back
>> that we realize that our inability to visualize exponential growth is epic.
>>
>> This becomes relevant when deciding on URI formats. It's no use defining
>> an elegant URI format if it doesn't scale as scene complexity rises.
>>
>> What this means for us when we are designing the structure of URIs is that
>> we need to focus on what the URI is for, namely data access, both local and
>> remote. When designing for scalability through good use of caching, our
>> goal is to avoid a client having to perform each remote access if at all
>> possible. If our elegant URI format results in clients needing to access
>> data remotely despite it already being cached locally, then our elegant
>> addressing scheme has failed. "Elegant but non-scalable" is not the mark of
>> success, so let's check against this requirement.
>>
>> When a region tells the client about the items it currently holds (narrowed
>> down by an interest list in an optimized implementation), it does so by
>> listing the items in the scene using item identifiers of some kind.
>> The client can then use each identifier as an index into its local cache, and
>> then request from the relevant asset services only those items that are not
>> already cached. This is easy in an isolated world where identifiers can be
>> world-global. Where it breaks down is when worlds interoperate, and those
>> arbitrary identifiers (e.g. UUIDs or URIs based on them) become useless for
>> deciding whether an item common to multiple worlds is actually in the cache
>> or not. Done wrongly, it can easily result in repeat downloading on a
>> massive scale.
>>
>> Local or global identifiers will work poorly unless they're an intrinsic
>> property of the actual data being indexed. The reason they won't work well
>> is because the same data used in two different worlds won't have a common
>> URI-based cache index unless it happens to be supplied by the same asset
>> service. The same item replicated in thousands of worlds would end up being
>> stored thousands of times in the cache. While the storage cost may be of
>> little consequence, the repeated access cost is not, because round-trip
>> times have very limited downward scalability.
>>
>> The engineering solution to this is pretty obvious: scene component
>> identifiers should include a *hash or digest over the data*, this
>> information being separable from other parts of the identifier/URI so that
>> it can be used as a key into the cache. The cache is king, and terabyte
>> caches should be regarded as normal now, with petabyte caches not so many
>> years down the line. The goal of "Never download the same thing twice" is
>> already reasonable with terabyte drives today, never mind tomorrow. VWRAP
>> needs to embrace this, if it is to be a scalable interop standard.
>>
>> Note that in the above, my reference to "data" *excludes metadata* by
>> intent.
>> Two objects may be quite separate, with totally different metadata,
>> yet denote exactly the same data, which would give them the same hash digest
>> and hence share a cache index. This situation is likely to be extremely
>> common, especially for environmental items such as trees, vegetation and
>> other natural elements. We can easily foresee a situation in which people
>> create their brand new world by unpacking a region archive and releasing
>> another few million items into the metaverse. If those items were cached
>> the first time that they were seen in one world, they would not need to be
>> loaded again from this new world, if we design our URIs with good
>> engineering properties and foresight.
>>
>> Cache scalability as the number of worlds with common assets rises is one
>> issue, but there is also another related one on the horizon. As we move
>> away from SL's primitive assets and 1-level linksets towards *hierarchical
>> objects* that allow object composition, the number of virtual objects made
>> from reusable components will skyrocket, because builders will be riding on
>> the shoulders of giants, just like in RL engineering. This again will
>> result in massive cross-world sharing of replicated components.
>>
>> In summary: The asset identifiers supplied by a region to a client should
>> contain an *explicit hash/digest over the data* (calculated *ONCE* by the
>> relevant asset service of course, not by each region), to allow client-side
>> caches to be highly effective at eliminating unnecessary network traffic.
>> This will be very important in a metaverse of countless worlds, huge amounts
>> of shared data, and massive scenes.
>>
>> Morgaine.
>>
>> PS. Hash digests in asset URIs deliver two other important benefits as
>> well, beyond scalability:
>>
>> - They provide the interesting property of *near-universal asset
>>   addressing*.
>>   This may appeal to those who focus on social aspects of
>>   digital content such as imposing property semantics, in which case using a
>>   URI format that almost uniquely identifies assets can kill several birds
>>   with one stone.
>>
>> - They provide isolation from host and network outages.
>>   Near-universal asset addressing means that when an asset service fails to
>>   respond or returns an error code, a new URI containing the same hash digest
>>   could be manufactured and sent to a second asset service as fallback. The
>>   benefits of this for *virtual world resilience* are of course
>>   immense. Resilience is so important that I suggest it should be a protocol
>>   requirement. The fact that we would gain resilience automatically as a mere
>>   side-effect of digest-based addressing highlights the rather nice properties
>>   of this design.
>>
>> -- End.
>>
>> _______________________________________________
>> vwrap mailing list
>> vwrap@ietf.org
>> https://www.ietf.org/mailman/listinfo/vwrap
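
For anyone who wants to see the idea from the quoted thread in concrete form, here is a minimal Python sketch of a client cache keyed on the digest carried in the asset URI, with fallback to other asset services when the first one fails. Nothing here is VWRAP or SimianGrid API. The "?sha256=" query parameter, the AssetCache class, the fetch callable and the fallback base URLs are all hypothetical names chosen for illustration, and SHA-256 is assumed only because John mentioned it for the ETag approach.

```python
import hashlib
from urllib.parse import urlparse, parse_qs


def digest_of(data: bytes) -> str:
    """Hex SHA-256 over the asset data only (metadata is deliberately excluded)."""
    return hashlib.sha256(data).hexdigest()


def cache_key(asset_uri: str) -> str:
    """Pull the digest out of the URI.

    Assumption: the digest travels in a hypothetical '?sha256=<hex>' query
    parameter, so it is separable from the host and path parts of the URI.
    """
    values = parse_qs(urlparse(asset_uri).query).get("sha256")
    if not values:
        raise ValueError(f"no sha256 digest in {asset_uri!r}")
    return values[0].lower()


class AssetCache:
    """Client-side cache indexed by data digest rather than by full URI."""

    def __init__(self, fetch, fallback_services=()):
        # fetch: hypothetical callable(uri) -> bytes that raises OSError
        # when the asset service is unreachable or returns an error.
        self._fetch = fetch
        self._fallbacks = list(fallback_services)
        self._store = {}            # digest -> asset bytes
        self.network_requests = 0   # counted so the savings are observable

    def get(self, asset_uri: str) -> bytes:
        key = cache_key(asset_uri)
        if key in self._store:
            # Same data seen before, possibly via another world: no round trip.
            return self._store[key]
        # Try the URI handed over by the region first, then manufacture
        # URIs carrying the same digest for each fallback service.
        candidates = [asset_uri] + [f"{base}?sha256={key}" for base in self._fallbacks]
        for uri in candidates:
            self.network_requests += 1
            try:
                data = self._fetch(uri)
            except OSError:
                continue
            if digest_of(data) == key:   # accept only data matching the digest
                self._store[key] = data
                return data
        raise LookupError(f"asset {key} unavailable from all services")
```

With this shape, two regions that hand out different URIs for the same tree mesh hit the network once between them, and a dead asset service costs one failed request before the fallback is tried. A real client would persist the store on disk and would need a policy for choosing fallback services, neither of which is covered here.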