Re: [CDNi] Some questions regarding draft-ma-cdni-metadata

Ben Niven-Jenkins <ben@niven-jenkins.co.uk> Mon, 24 October 2011 12:44 UTC

Mime-Version: 1.0 (Apple Message framework v1084)
Content-Type: text/plain; charset="us-ascii"
From: Ben Niven-Jenkins <ben@niven-jenkins.co.uk>
In-Reply-To: <291CC3F9E50E7641901A54E85D0977C651B6682920@MAILR002.mail.lan>
Date: Mon, 24 Oct 2011 13:44:24 +0100
Content-Transfer-Encoding: quoted-printable
Message-Id: <943A317A-AE56-464A-8CDB-A806332368BC@niven-jenkins.co.uk>
References: <842E1148-9AFD-4A39-BBB3-89CAAB9E833A@niven-jenkins.co.uk> <291CC3F9E50E7641901A54E85D0977C651B668260C@MAILR002.mail.lan> <30D279D6-3F64-4E1F-A221-D293EE0261D7@niven-jenkins.co.uk> <291CC3F9E50E7641901A54E85D0977C651B6682920@MAILR002.mail.lan>
To: Kevin J Ma <kevin.ma@azukisystems.com>
Cc: "cdni@ietf.org" <cdni@ietf.org>
Subject: Re: [CDNi] Some questions regarding draft-ma-cdni-metadata
Precedence: list

Kevin,

On 18 Oct 2011, at 17:03, Kevin J Ma wrote:
>> From: Ben Niven-Jenkins [mailto:ben@niven-jenkins.co.uk]
>> On 17 Oct 2011, at 19:26, Kevin J Ma wrote:
>>> I attempted to create an extensible representation for metadata and a
>>> complete protocol for manipulating metadata, though, not all users of
>>> the interface will need all aspects.  Do you feel that the protocol is
>>> deficient for non-authoritative CDN use, or that authoritative CDNs do
>>> not need the metadata interface?
>> 
>> I think that we should stick to the scope of work we have been chartered
>> to address, in the case of CDNI Metadata that is an interface that "will
>> allow the CDNs to exchange content distribution metadata of inter-CDN
>> scope".
>> 
>> If the interface we specify is also usable in other scenarios then that's
>> great but I don't think we should be actively looking to design an
>> interface that is usable across a multitude of scenarios that are not in
>> scope of CDNI because it's a distraction and likely to defocus our
>> efforts. Our charter is not to define a new CP->CDN provisioning interface
>> (they already exist) so we shouldn't be burning cycles working on things
>> that are out of charter.
> 
> I personally do not see a difference between uCDN->dCDN and CP->uCDN.
> If in supporting uCDN->dCDN you get CP->uCDN, I think that is a bonus.
> I do not think we should ignore uCDN->dCDN just because it might be
> useful to CPs.  Is the argument that a uCDN should never need to push
> metadata to a dCDN and therefore we do not need uCDN->dCDN support?

My argument is that CP->uCDN and uCDN->dCDN are different beasts and in supporting uCDN->dCDN you do not get a sufficient interface for CP->uCDN.

We thought about this when writing draft-jenkins-cdni-metadata and came to the conclusion that CP->uCDN is a provisioning interface, whereas for CDNI no provisioning is required: the dCDN only needs to know how to execute the already-provisioned (in the uCDN) CDNI metadata.

A multi-tenanted CP->uCDN provisioning interface requires a different feature set and a richer data model than a uCDN->dCDN interface that just needs to transfer already-provisioned CDNI metadata for a dCDN to execute it.

For example, a multi-tenanted CP->uCDN provisioning interface needs to support features such as:
- User & Role based access control and a user account hierarchy that allows different business entities to control different levels of that hierarchy.
- Metadata access permissions as different users (at different levels in the account hierarchy) may not have read or read/write access to all types of metadata record and because users at different levels in the account hierarchy have the requirement to add additional metadata properties/policies (e.g. a Reseller may want to always have their off-shore QA team added to any Delivery ACL to enable them to troubleshoot).
- Service/Account templates to simplify provisioning.
etc.

When it comes to executing the CDN metadata (i.e. what a surrogate needs to know to perform the actual content delivery) none of the above features really matter. The CDN metadata can be flattened into a simpler CDNI Metadata data model that removes some of the complexities required in a multi-tenanted provisioning data model. Knowledge of how the CDNI Metadata got into the system (and who is allowed to do what at which level of the account hierarchy) is not needed when all the Surrogate cares about is if and how it should deliver the actual content.
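
To illustrate the flattening, here is a minimal sketch in Python. The Account class, its field names, the flatten_acl helper and the addresses are all hypothetical and are not taken from either draft; the point is only that the flattened result carries no trace of the account hierarchy.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Account:
    """Hypothetical node in a provisioning account hierarchy."""
    name: str
    acl_additions: List[str] = field(default_factory=list)
    parent: Optional["Account"] = None


def flatten_acl(account: Account) -> List[str]:
    """Collect ACL entries from the account and all of its ancestors.

    The result does not record who added which entry; it is only the
    list a surrogate would need in order to decide whether to deliver.
    """
    entries: List[str] = []
    node: Optional[Account] = account
    while node is not None:
        for entry in node.acl_additions:
            if entry not in entries:
                entries.append(entry)
        node = node.parent
    return entries


# A Reseller adds its QA team; the CP below it adds its own network.
reseller = Account("reseller", ["198.51.100.0/24"])
cp = Account("content-provider", ["203.0.113.0/24"], parent=reseller)
flat = flatten_acl(cp)
```

Here flat holds both prefixes, the CP's first and the Reseller's second, with nothing left of the access-control or templating machinery that produced them.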

So your assertion that a uCDN->dCDN interface automatically gives you a sufficient CP->uCDN interface does not IMO hold true. 

It doesn't matter whether the CDNI Metadata interface is push from uCDN to dCDN or pull from dCDN to uCDN: what is required to build a sufficient CP->uCDN provisioning interface is much more than what is required purely for a sufficient CDNI Metadata interface. My opinion is that we should concentrate on what we're chartered to do (which is a CDNI Metadata interface) and not try to expand that scope to be much wider than it is currently.

<snip>

>>> My concern with defining all the semantics of metadata and designing a
>>> protocol around that, is you typically end up with a less extensible
>>> design.  I would not want to have to update the metadata interface rfc
>>> just to support a new piece of metadata in the future.  I am not saying
>>> that we should not define metadata, I just think it should be separate
>>> from the data model and the protocol, so I concentrated on those first.
>>> I definitely feel that we should define a base set of metadata that all
>>> CDNs should support.  I do not think the draft prevents that.
>> 
>> We agree on the fact that the CDNI metadata protocol specification doesn't
>> need to define all possible properties/semantics for all possible delivery
>> protocols and that properties and semantics for different delivery
>> protocols can be defined in separate documents.
>> 
>> The CDNI metadata draft that Grant/David/I authored doesn't preclude
>> semantics/properties being defined in other documents (and implies that is
>> likely to be the case - see my response to Spencer just now). The base set
>> of properties that are described in our draft could easily be broken out
>> into a separate document if the consensus of the group is to do so. We
>> included the base properties and interface specification in a single
>> document so that readers could see how the interface could be constructed
>> and applied using a set of common base semantics/properties for HTTP
>> delivery supported by deployed CDNs today.
>> 
>> So if we put aside the question of where to define the actual properties
>> and their semantics we're left with the data model and interface/protocol
>> itself. So let's compare and contrast some key aspects of flexibility
>> between the different proposals.
>> 
>> Cache-ability & Latency:
>> With the CDNI Metadata model we've proposed we think we've managed to
>> define a data model that is as simple as possible but no simpler. It has a
>> minimum number of data objects to allow objects that are likely to be
>> large or reused to be broken out into separate resources to provide
>> efficiency and at the same time objects that may well have different
>> cache-ability lifetimes can be broken out into separate resources. The
>> interface specification allows inlining so that all related objects for a
>> Site can be included in a single request/response so we provide the
>> ability to have a simple implementation but also the flexibility to
>> optimise the implementation to balance object size and cache-ability
>> against number of requests/re-validations required.
> 
> The protocol in my draft also supports bulk retrieval, but if there is
> a nuance of inlining that is missing, I could certainly look at it.
> Though I did not address cacheability, I think it could be applied at
> the bulk response level as you describe?

My point was not that your proposal couldn't do bulk retrieval but that, by having a data model that is "too simple", rather than achieving your aim of a "more extensible" model/interface you actually have a model/interface that is less flexible and less extensible than ours.

Your model essentially ignores the reality that there are different types of CDNI Metadata which are likely to have different "lifetimes" (TTLs), different levels of reusability across CDN services & CPs and will contain varying amounts of data.

To give just one example: think about the situation I mentioned earlier of a Reseller (or CDN provider) wanting to add their off-shore QA team to all Delivery ACLs to enable them to perform QA/troubleshooting/etc. Your model provides no ability (that I can see) to avoid having to list all the QA IP addresses against every Metadata record (which are defined per URI) in every Domain that will be interconnected. That's a hell of a lot of unnecessary duplication.
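
A rough sketch of the difference, in Python. The record layout, the "acl_ref" field, the paths and the counts (20 QA addresses, 100 URIs) are assumptions made up for illustration and do not reflect the syntax of either draft.

```python
# 20 hypothetical QA addresses and 100 hypothetical per-URI records.
QA_ADDRESSES = [f"192.0.2.{i}" for i in range(1, 21)]
URIS = [f"/video/{n}" for n in range(100)]

# Flat model: every per-URI record repeats the full QA list.
flat_records = {uri: {"acl": list(QA_ADDRESSES)} for uri in URIS}

# Linked model: the QA list is one resource; records only reference it.
shared = {"/acl/qa": {"acl": list(QA_ADDRESSES)}}
linked_records = {uri: {"acl_ref": "/acl/qa"} for uri in URIS}

# Count how many address entries each model has to carry in total.
flat_copies = sum(len(r["acl"]) for r in flat_records.values())
linked_copies = sum(len(r["acl"]) for r in shared.values())


def resolve_acl(uri):
    """Follow the record's reference to obtain its effective ACL."""
    return shared[linked_records[uri]["acl_ref"]]["acl"]
```

With these numbers the flat form carries 2000 address entries and the linked form carries 20, and a change to the QA list touches one resource rather than every record.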

>> By keeping your model basically flat you force an implementation to
>> repeatedly have to transfer data that is common across a number of
>> Hostnames/Domains (e.g. the definition of Locations) and you force the
>> cache-ability for all data within a hostname/domain to be the same whether
>> it is specific to a hostname/domain or common across multiple
>> hostnames/domains. You therefore do not provide any ability for
>> implementations to optimise based on their knowledge (or configured
>> policies)  of the different types of data that combine to describe the
>> CDNI Metadata for a Site/domain/hostname.
>> 
>> Self-description:
>> The links between objects in our model and the representations of the
>> different objects in our model are self-describing through the use of
>> relationships and media-types within the interface so a client can easily
>> determine what data objects are being referenced and what data objects it
>> is receiving and use the appropriate processing function to interpret
>> their contents. It is easy to add new data objects to our model if
>> required by defining new relationships and/or media types. In contrast
>> your representations aren't self-describing and so force a tighter
>> coupling between the client and server and they require some out of band
>> process to ensure that coupling is maintained.
> 
> The individual pieces of metadata have a defined type.  If the CDN
> supports a given piece of metadata, it will know what the type is.  The
> metadata distribution protocol, imo, does not need to know the type
> as long as the CDN who receives the metadata understands it?  Knowing
> the type is not necessarily enough information to know what to do with
> it, the full definition of the metadata is required either way?

I don't see anything in your proposal that indicates what type (or protocol version) a particular piece of metadata has, so I assume it must be known by some other method (e.g. the protocol version is agreed out of band, the type is inferred from something), which has the consequence of requiring tighter coupling between client & server.

As an example, think through how, if your proposal were deployed, you would migrate to a newer version of the protocol, and the level of explicit knowledge and coupling between clients & servers required to do so.

With self-describing objects and standard HTTP content negotiation to negotiate a version of the object's media-type that both client & server support it is, for example, possible to deploy a new protocol version of our proposal on the server without requiring any changes on the clients while retaining full backward compatibility. Clients can then be upgraded independently of one another over a period of time while still retaining backwards compatibility.
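
As a sketch of the server side of that negotiation, in Python. The media-type names are invented for illustration (they are not the ones in our draft), and the Accept parsing is deliberately simplified: it handles only exact types and q values, not wildcards.

```python
from typing import List, Optional, Tuple

# Hypothetical versioned media types, in the server's order of preference.
SERVER_SUPPORTED = [
    "application/vnd.example.cdni.site.v2+json",
    "application/vnd.example.cdni.site.v1+json",
]


def parse_accept(header: str) -> List[Tuple[str, float]]:
    """Return (media_type, q) pairs from a simplified Accept header."""
    pairs = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        if not parts[0]:
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in parts[1:]:
            if param.startswith("q="):
                q = float(param[2:])
        pairs.append((parts[0], q))
    return pairs


def negotiate(accept_header: str) -> Optional[str]:
    """Pick the representation the client prefers among those we serve."""
    accepted = dict(parse_accept(accept_header))
    candidates = [m for m in SERVER_SUPPORTED if accepted.get(m, 0) > 0]
    if not candidates:
        return None  # a server could answer 406 Not Acceptable here
    # max() keeps the first maximal element, so server order breaks ties.
    return max(candidates, key=lambda m: accepted[m])


# An old client that only knows v1, and an upgraded one that prefers v2.
old_client = negotiate("application/vnd.example.cdni.site.v1+json")
new_client = negotiate(
    "application/vnd.example.cdni.site.v2+json, "
    "application/vnd.example.cdni.site.v1+json;q=0.5"
)
```

The old client keeps receiving v1 and the upgraded client receives v2 from the same server, with no per-client configuration on the server.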

I don't see any way to do that with your proposal without requiring configuration on the server to tell it what version to expect from each client, and updating that configuration in lock-step with updating the client. That means tight client/server coupling and coupled out of band configuration.

>> Bootstrapping & Coupling:
>> Similarly, with the CDNI Metadata interface/protocol we have proposed, the
>> configuration/bootstrapping of the interface is as simple as possible but
>> no simpler (a single URI) from which the dCDN can discover the location of
>> all the CDNI metadata records they have access to. It also allows the uCDN
>> to define the URI format (and hostname it's provided by) as they see fit.
>> The uCDN has maximum flexibility to organise the different CDNI metadata
>> objects and complete control over how they extend their implementation to
>> support additional data objects or how they support mapping different
>> dCDNs to different sets of (or versions of) CDNI metadata.
>> 
>> Your proposal imposes a greater configuration/bootstrapping burden on the
>> dCDN as there is no way for it to discover the domains it has access to,
>> it needs to know the URI/hostname of the uCDN's Metadata interface and the
>> domains to query against that interface. The use of fixed paths in the URI
>> that map to the different data object enforces rigidity on the uCDN's CDNI
>> Metadata implementation (and tight coupling between the client and server)
>> and means that if the uCDN wishes to extend its implementation to support
>> additional objects it risks picking paths that may later clash with paths
>> defined by newer versions of your protocol.
> 
> Your SiteFeed object provides Site information (which is fairly analogous
> to domain information).  Domain information could also be put into a feed, 
> however, I assume that business processes drive what goes into the SiteFeed.

Of course it's possible to modify your proposal to look more like ours :-) but my overall argument is that our proposal does everything yours does in a more flexible & extensible way with the exception of defining an Agent object which isn't relevant to CDNI Metadata anyway.

> Is the SiteFeed configured offline, as part of some business process?
> If so, then is it better to configure Site information as a feed in
> the uCDN, or as domains in the dCDN.  I do not have a strong feeling
> either way, but I do not see our schemes as being so different.

They're not the same thing though. The SiteFeed enables a dCDN to bootstrap itself. Whether it is configured offline or automatically generated by the uCDN based on some configured policy and/or other data it has automatically obtained (e.g. another SiteFeed when the uCDN is acting as a dCDN for another CDN) is an implementation decision for each individual CDN.
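
A minimal sketch of that bootstrapping, in Python, with an in-memory dictionary standing in for the uCDN's server. Every URI, relationship name and field name here is hypothetical; the only thing the dCDN is configured with is the single bootstrap URI.

```python
# In-memory stand-in for the uCDN's metadata server. The uCDN chooses the
# URI layout freely; the dCDN never constructs these URIs itself.
RESOURCES = {
    "https://ucdn.example/feed": {
        "links": [
            {"rel": "site", "href": "https://ucdn.example/x/91"},
            {"rel": "site", "href": "https://meta.ucdn.example/s?id=7"},
        ]
    },
    "https://ucdn.example/x/91": {
        "hostnames": ["video.example.com"],
        "links": [],
    },
    "https://meta.ucdn.example/s?id=7": {
        "hostnames": ["img.example.net"],
        "links": [],
    },
}


def fetch(uri):
    """Stand-in for an HTTP GET against the uCDN."""
    return RESOURCES[uri]


def discover(bootstrap_uri):
    """Walk links from the bootstrap URI and map hostname -> metadata URI."""
    found = {}
    pending = [bootstrap_uri]
    seen = set()
    while pending:
        uri = pending.pop()
        if uri in seen:
            continue  # guard against link cycles
        seen.add(uri)
        resource = fetch(uri)
        for hostname in resource.get("hostnames", []):
            found[hostname] = uri
        pending.extend(link["href"] for link in resource.get("links", []))
    return found


sites = discover("https://ucdn.example/feed")
```

The dCDN ends up knowing which hostnames it has metadata for and where that metadata lives, without being configured with any domain list or path convention.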

Your proposal doesn't include a proposal for bootstrapping unless you're suggesting the uCDN POST Domains into the dCDN in advance of the dCDN requiring the associated CDNI Metadata (but your draft doesn't propose that so I'm just guessing)?

Ben

> 
> thanx.
> 
> --  Kevin J. Ma
> 
>> Taking the above factors into consideration I believe that our proposed
>> data model and interface specification is flexible and extensible enough
>> to address the current and future needs of CDNI.
>> 
>> Ben
>> 
>> 
>> 
>
