Re: [alto] A unified approach to value schemas and ALTO maps

Wendy Roome <wendy.roome@nokia-bell-labs.com> Wed, 27 July 2016 13:38 UTC

Date: Wed, 27 Jul 2016 09:38:38 -0400
From: Wendy Roome <wendy.roome@nokia-bell-labs.com>
To: Jensen Zhang <jingxuan.n.zhang@gmail.com>, Wendy Roome <wendy.roome@nokia-bell-labs.com>
Message-ID: <D3BE26A8.81086D%w.roome@alcatel-lucent.com>
Thread-Topic: [alto] A unified approach to value schemas and ALTO maps
References: <D3BD183E.80D4FC%w.roome@alcatel-lucent.com> <CAAbpuyr8QemDbT+gvytkyWVJrnmK3XtocDfhkCwPi3bNFEQN3A@mail.gmail.com>
In-Reply-To: <CAAbpuyr8QemDbT+gvytkyWVJrnmK3XtocDfhkCwPi3bNFEQN3A@mail.gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/alto/uje7_3zH5eyGW7e-BoltHIrY6FU>
Cc: IETF ALTO <alto@ietf.org>
Subject: Re: [alto] A unified approach to value schemas and ALTO maps

Jensen,

Thanks for commenting!  My responses are in-line.

> From:  Jensen Zhang <jingxuan.n.zhang@gmail.com>
> Date:  Tue, July 26, 2016 at 21:18
> Subject:  Re: [alto] A unified approach to value schemas and ALTO maps
> 
> Hi Wendy,
> 
> I want to support this idea. Please see comments inline.
> 
> On Wed, Jul 27, 2016 at 1:38 AM, Wendy Roome <wendy.roome@nokia-bell-labs.com>
> wrote:
>> Recently we talked about using "value schemas" so that we can add new
>> value types without defining new media types and message formats.
>> 
>> Here is another take on that approach. It unifies *everything* into one
>> common message format. Extensions simply require defining new entity
>> domains and (simple) value types.
>> 
>> So here's my idea. Every ALTO resource boils down to a mapping from 1-,
>> 2-, 3- or 4-tuples to values. The tuple elements are names of entities in
>> a domain.
>> 
>> Examples:
>>    Cost Map: (pid, pid)  =>  value
>>    Endpoint Costs:  (addr, addr) =>  value
>>    Endpoint Props:  (addr, prop-name)  =>  value
>>    MultiCost Map:   (pid, pid, cost-type)  =>  value
>>    MultiCost Calendar Map: (pid, pid, date-range, cost-type)  =>  value
>>    Network Map:  (cidr)  =>  pid
>> 
>> Note that I flipped the network map around. I think cidr => pid is cleaner
>> than pid => cidr-array, and it enforces the rule that a cidr is in only
>> one pid.
> 
> Agree. And maybe another benefit from cidr => pid is for incremental update.
> Once a cidr changes its pid, pid => cidr-array will require an update of the
> whole cidr-array. But cidr => pid only requires an update of one pid item.

Exactly! 
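
To illustrate the point, here is a minimal Python sketch of flipping a pid => cidr-array map into cidr => pid. The PID names and CIDRs are made up, and flip() is a hypothetical helper, not anything defined in RFC 7285. It rejects a CIDR that appears under two PIDs, and an incremental update then touches a single entry.

```python
# Hypothetical data: PID names and CIDRs are invented for illustration.
pid_to_cidrs = {
    "pid1": ["10.0.0.0/8", "192.168.1.0/24"],
    "pid2": ["172.16.0.0/12"],
}


def flip(pid_to_cidrs):
    """Turn pid => cidr-array into cidr => pid, rejecting duplicate CIDRs."""
    cidr_to_pid = {}
    for pid, cidrs in pid_to_cidrs.items():
        for cidr in cidrs:
            if cidr in cidr_to_pid:
                raise ValueError(f"{cidr} appears in more than one pid")
            cidr_to_pid[cidr] = pid
    return cidr_to_pid


cidr_to_pid = flip(pid_to_cidrs)

# Moving one CIDR to another PID is a one-entry update in the flipped form.
update = {"192.168.1.0/24": "pid2"}
cidr_to_pid.update(update)
```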

>> 
>> With this approach, the meta section of a response would give the domain
>> for each tuple element, and the value type. For example, here is a
>> conventional cost map:
>> 
>>    meta:
>>       tuple-domains: [pid, pid]
>>       value-type:
>>          specification: rfc7285
>>          format: cost-type
>>          parameters:
>>              metric: routingcost, mode: numerical
>>    map:
>>       pid1:
>>          pid1: ###,  pid2: ###, ...
>> 
>> The value field gives the document that specifies this value format (or a
>> registered name), a format name that is defined in that document, and any
>> additional parameters necessary to understand this value.
> 
> The specification metadata is very interesting and I like this design. But I
> don't think an RFC is a good solution for the specification citation. An RFC is
> human-readable, but it is hard for programs to parse RFC documents. Maybe we
> can find a schema language (or JSON schema) to specify the value format.

Okay, here is where I respectfully disagree. I do not believe a
program-readable schema will help.

Why not? First, JSON is self-describing. An ALTO client does not need a
schema to parse JSON. JSON libraries do that on their own. The JSON
libraries I've used build a DOM from the JSON and allow client programs to
explore that DOM.
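
As a small illustration (a sketch only, using Python's standard json module and a made-up response body), a client can parse a response and walk the resulting tree without any schema:

```python
import json

# Made-up response text, loosely shaped like the examples in this thread.
text = (
    '{"meta": {"tuple-domains": ["pid", "pid"]},'
    ' "map": {"pid1": {"pid1": 1, "pid2": 5}}}'
)
doc = json.loads(text)


def walk(node, path=()):
    """Yield (path, leaf) pairs for every leaf value in the parsed tree."""
    if isinstance(node, dict):
        for key, child in node.items():
            yield from walk(child, path + (key,))
    elif isinstance(node, list):
        for index, child in enumerate(node):
            yield from walk(child, path + (index,))
    else:
        yield path, node


leaves = list(walk(doc))
```

The parser recovers the structure on its own. What it cannot tell the client is what the number at map/pid1/pid2 means.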

IMHO, formal schemas have two uses. One is to allow a program to verify that
a given JSON tree matches the schema. However, if the server's response does
not match the schema, what can a client do about it, other than give up on
that ALTO server?

The other is to document the layout, and (maybe!) facilitate automatically
creating a program, or at least a skeleton of a program, to work with
that data. But no ALTO client is going to do that on-the-fly. That happens
off-line, when the programmer decides how the client will use an ALTO
server.

I believe the important thing is to specify the semantics. E.g., what do
these values mean? For example, take ordinal vs numerical mode. You cannot
capture the difference in a schema.
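
A rough Python sketch of that point, with invented cost values: both maps pass the same structural check, and only the mode tells the client which operations make sense. The function names are hypothetical.

```python
def structurally_valid(costs):
    """The kind of check a schema can express: every value is a number."""
    return all(
        isinstance(v, (int, float)) and not isinstance(v, bool)
        for v in costs.values()
    )


# Invented values. Both maps have exactly the same shape.
numerical = {"pid2": 10, "pid3": 40}
ordinal = {"pid2": 1, "pid3": 2}


def describe(mode, costs, a, b):
    """Interpret two costs according to the cost mode."""
    if mode == "numerical":
        # Arithmetic on the values is meaningful.
        return f"{b} costs {costs[b] / costs[a]:g}x as much as {a}"
    if mode == "ordinal":
        # Only the ranking is meaningful; lower is preferred.
        better = a if costs[a] < costs[b] else b
        return f"{better} is preferred; ratio is not meaningful"
    raise ValueError(f"unknown mode: {mode}")
```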

As another example, consider duration values. Frequently durations have a
suffix with the units: 10s, 1.23ms, 15us, etc. Or they have colons, as in
1:23.4. A grammar can define the legal values, but unless the programmer
knows "ms" means milliseconds, the grammar does not help.
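
A minimal sketch of the split, in Python. The regular expression plays the role of the grammar; the unit table is the semantic knowledge that has to come from a document. Both are assumptions for illustration (no duration format has been specified yet), and the colon form is deliberately not handled.

```python
import re

# The "grammar": a number followed by a unit suffix.
DURATION = re.compile(r"(\d+(?:\.\d+)?)(s|ms|us)")

# The "semantics": what each suffix means. Assumed here, not specified anywhere.
UNIT_SECONDS = {"s": 1.0, "ms": 1e-3, "us": 1e-6}


def to_seconds(text):
    """Convert a suffixed duration such as '1.23ms' to seconds."""
    match = DURATION.fullmatch(text)
    if match is None:
        raise ValueError(f"not a suffixed duration: {text!r}")
    number, unit = match.groups()
    return float(number) * UNIT_SECONDS[unit]
```

The regex alone would accept "10s" just as happily if "s" meant something else entirely; only the table gives the suffix a meaning.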

So I believe values will have to be described by some document, which may
have a formal schema. The document will define some unique name to identify
values of this kind, and ALTO clients will switch on that name. The name
could be rfc####, or it could be yet another IANA registry, or whatever.
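
Roughly, the client side could look like the following Python sketch. The handler functions and the DECODERS table are hypothetical, "rfc####" is the same placeholder used in the example in this thread, and the costs are assumed to be JSON numbers.

```python
def decode_cost(raw, parameters):
    """Costs are assumed to arrive as JSON numbers."""
    return float(raw)


def decode_duration(raw, parameters):
    """Durations are assumed to be strings with a unit suffix, in seconds out."""
    units = {"ms": 1e-3, "us": 1e-6, "s": 1.0}  # longer suffixes first
    for suffix, scale in units.items():
        if raw.endswith(suffix):
            return float(raw[: -len(suffix)]) * scale
    raise ValueError(f"unknown duration: {raw!r}")


# The client switches on the (specification, format) name pair.
DECODERS = {
    ("rfc7285", "cost-type"): decode_cost,
    ("rfc####", "duration"): decode_duration,
}


def decode(value_type, raw):
    """Decode one raw value according to its declared value type."""
    key = (value_type["specification"], value_type["format"])
    if key not in DECODERS:
        raise LookupError(f"no decoder for value type {key}")
    return DECODERS[key](raw, value_type.get("parameters", {}))
```

A client that meets a name it does not know can skip that value or that resource, which is about all it could do after a failed schema check as well.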
>  
>> 
>> Here is a multi-cost map:
>>    meta:
>>       tuple-domains: [pid, pid, cost-type-name]
>>       value-types:
>>          3:
> 
> Is this a typo? Or what's the meaning of '3:' here?

The 3 means the value-type depends on the value of the third tuple element -
the cost-type. But now that I look at it again, that is overly general. It
is probably sufficient to say that either all values are the same type, or
else the value type depends on the value of the last element in the tuple.
E.g., cost-type, property-name, etc.
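
A sketch of that simplified rule in Python, assuming (hypothetically) that "value-types" is keyed directly by the last tuple element, with no "3:" level:

```python
# Hypothetical meta section following the simplified rule.
meta = {
    "tuple-domains": ["pid", "pid", "cost-type-name"],
    "value-types": {
        "cost": {"specification": "rfc7285", "format": "cost-type"},
        "delay": {"specification": "rfc####", "format": "duration"},
    },
}


def value_type_for(meta, tuple_key):
    """Return the value type that applies to one tuple.

    Either all values share a single "value-type", or the type is
    selected by the last element of the tuple.
    """
    if "value-type" in meta:
        return meta["value-type"]
    return meta["value-types"][tuple_key[-1]]
```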
>  
>>             cost:
>>                specification: rfc7285
>>                format: cost-type
>>                parameters:
>>                  metric: routingcost, mode: numerical
>>             delay:
>>                specification: rfc####
>>                format: duration
>>                parameters:
>>                   metric: propagation-time
>>    map:
>>       pid1:
>>          pid2:
>>             cost: 123, delay: 2ms
>> 
>> Here the different cost types have different value types. meta.value-types
>> means that the value depends on the 3rd element of the tuple - the
>> cost-type-name - and gives the value format for each cost-type-name.
> 
> It is also more flexible for incremental updates.
>  
>> 
>> 
>> So are people interested in pursuing this approach? The good news is that it
>> will unify everything. The bad news is it will replace everything in rfc
>> 7285.
>> 
>>     - Wendy Roome
>> 
> 
> I find this approach is also compatible with FCS, since FCS returns a fid =>
> value map currently. I'd like to push it forward.
> 
> One more question: I think (src, dst) => cost map and (dst, src) => cost map
> will have the same "tuple-domains" value, but they are different for clients.
> How to differentiate them?

By position: the domains in the tuples are ordered. E.g., in a cost map, the
first domain is always the source and the second is the destination.
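
As a sketch (Python, with invented PIDs and costs, and a hypothetical flatten() helper), a client can turn the nested map into ordered tuples, so that (src, dst) and (dst, src) are simply different keys:

```python
def flatten(node, depth, prefix=()):
    """Yield (tuple, value) pairs from a map nested `depth` levels deep."""
    if depth == 0:
        yield prefix, node
        return
    for name, child in node.items():
        yield from flatten(child, depth - 1, prefix + (name,))


# Invented example: the outer key is the source, the inner key the destination.
meta = {"tuple-domains": ["pid", "pid"]}
cost_map = {"pid1": {"pid2": 5}, "pid2": {"pid1": 9}}

entries = dict(flatten(cost_map, len(meta["tuple-domains"])))
```

Here entries[("pid1", "pid2")] is the cost from pid1 to pid2, and entries[("pid2", "pid1")] is the cost in the other direction.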
> 
> Best,
> Jensen

Thanks! Let's keep the discussion going. Anyone else like to comment?
Richard??

- Wendy Roome