Re: [dsii] Potential IETF Work Items

Andrew Maffei <amaffei@whoi.edu> Wed, 22 August 2012 14:08 UTC

On Aug 20, 2012, at 3:32 PM, Melinda Shore wrote:

> I'm still trying to figure out what's being proposed here and I
> realized that my mental model might be considerably different from
> that being used by the work's proponents.  Where I'm coming from,
> someone who needs a chunk of data and isn't sure where it is (or,
> in some cases, whether or not it exists) does a search, and the
> search returns a set of stuff, where "stuff" includes descriptive
> information (metadata) and an identifier that's actually a
> locator.  The locator is used to access the data.
> 
> Is that consistent with what proponents have in mind?

Hi Melinda.

The above is the primary use case. I think the "stuff" is all metadata (attribute/value pairs about the dataset) that includes the "locator" you mention. 

I'd like to comment on some of the things I saw of value at the Vancouver meeting. I don't claim to be an identifier or metadata expert so perhaps some of these ideas were derived outside of IETF. But they were new to me.

One idea would be to consider working together to agree on the "core metadata" that would be returned about scientific datasets for data object access and delivery, etc.

One of the more interesting IETF WG docs I found was the CDN Interconnect Metadata I-D (draft-cjlmw-cdni-metadata-00). The sections on ACLs, ACLRules, and Delivery seemed directly applicable to delivery of science datasets, some of which are proprietary and some of which are not. There are all sorts of issues related to delivery of proprietary scientific data and very large datasets (or their subsets) that seem applicable.

I noticed in another I-D (can't find it right now) the practice of allowing an attribute value of type "URI" to be either an explicit, fully qualified URI *or* a regular-expression substitution that can be applied to a previously defined URI attribute.

So, for example, if the URI for my identity as a WHOI employee was "http://www.whoi.edu/1912/241.11" the URI for a picture of me might be specified in the metadata associated with this URI as "s/$/.jpg/", indicating that adding .jpg to the end of the locator URI derives a picture of the person.
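
A minimal sketch of the substitution idea above, in Python. It assumes the rule uses a sed-style "s/pattern/replacement/" form with "/" as the delimiter and no escaped delimiters; the function names are hypothetical and not taken from any I-D.

```python
import re


def apply_substitution(base_uri: str, rule: str) -> str:
    """Apply a sed-style 's/pattern/replacement/' rule to base_uri.

    Assumption: '/' is the delimiter and neither the pattern nor the
    replacement contains an escaped '/'.
    """
    parts = rule.split("/")
    if len(parts) != 4 or parts[0] != "s" or parts[3] != "":
        raise ValueError(f"unsupported substitution rule: {rule!r}")
    pattern, replacement = parts[1], parts[2]
    # Replace only the first match, as sed does without the 'g' flag.
    return re.sub(pattern, replacement, base_uri, count=1)


def resolve_uri_attribute(base_uri: str, value: str) -> str:
    """Return value as-is if it is an explicit URI, else derive it from base_uri."""
    if value.startswith("s/"):
        return apply_substitution(base_uri, value)
    return value


identity_uri = "http://www.whoi.edu/1912/241.11"
picture_uri = resolve_uri_attribute(identity_uri, "s/$/.jpg/")
print(picture_uri)
```

Here the rule anchors on the end of the identity URI and appends ".jpg", so the picture URI never has to be stored in full.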

Another example might be a way to express a substitution string for receiving metadata about a timeslice of a video that is pointed to by a locator for a scientific data object of type "Video". If the original locator was http://www.whoi.edu/1912/2342.234 there might be metadata that declares how to modify this URI into one that specifies a start time and end time.
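
A rough sketch of how such a timeslice rule might look, again in Python. Everything beyond the locator itself is an assumption made for illustration: the "timeslice" attribute name, the "start" and "end" query parameters, the "{start}"/"{end}" placeholders, and times given in seconds.

```python
import re

# Hypothetical metadata for a data object of type "Video".
video_metadata = {
    "type": "Video",
    "locator": "http://www.whoi.edu/1912/2342.234",
    # Assumed rule: append a query string carrying the requested time range.
    "timeslice": "s/$/?start={start}&end={end}/",
}


def timeslice_uri(metadata: dict, start: int, end: int) -> str:
    """Derive a URI for the [start, end] slice from the object's locator."""
    # Fill in the placeholders, then apply the sed-style substitution.
    rule = metadata["timeslice"].format(start=start, end=end)
    _, pattern, replacement, _ = rule.split("/")
    return re.sub(pattern, replacement, metadata["locator"], count=1)


slice_uri = timeslice_uri(video_metadata, start=10, end=25)
print(slice_uri)
```

The point is only that the metadata carries the rule, so a client can build the timeslice URI without knowing anything specific about the server that hosts the video.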

I'm interested in finding "lessons-learned" by the IETF that would be worth considering in the realm of dataset identifier interoperability. Information in the I-Ds represents hours and hours of discussion/argument and trial in past meetings about what works and what does not work.

It would be a shame if we could not find some way to take advantage of this work done in the past to help with dataset identifiers and certain types of metadata that would sit behind them.

--Andy