Re: [icnrg] Some thoughts on architectural choices for Manifests

Marc Mosko <mmosko@parc.com> Sun, 09 August 2020 21:08 UTC

From: Marc Mosko <mmosko@parc.com>
To: "David R. Oran" <daveoran@orandom.net>
CC: ICNRG <icnrg@irtf.org>, "christian.tschudin@unibas.ch" <christian.tschudin@unibas.ch>
Date: Sun, 09 Aug 2020 21:08:02 +0000
Archived-At: <https://mailarchive.ietf.org/arch/msg/icnrg/NzA77xEnVkPwGHhho7JgtnFa8FM>

On 8/7/20, 6:56 AM, "icnrg on behalf of David R. Oran" <icnrg-bounces@irtf.org on behalf of daveoran@orandom.net> wrote:

    On 6 Aug 2020, at 16:37, Marc Mosko wrote:

    > > We choose to make those descriptions use the same data structure. We 
    > > also choose that
    > > when reading a manifest, a reader does not know which pointers belong 
    > > to which sub-tree until it actually fetches a pointer (unless there 
    > > are annotations).
    > I think this is right, however isn’t it a good idea to enforce a 
    > canonical ordering so that you don’t need application knowledge to 
    > re-assemble the pieces of an object in the right order (or are you only 
    > referring to non-leaf manifest pieces)?

Yes, the manifest has an implicit traversal order that reads the Data pointers in the proper order.

    > > The manifest sub-tree is really just a
    > > FLIC representation of the manifest file and the data sub-tree (it's 
    > > not so much a tree as the in-order traversal of leaf pointers) is a 
    > > representation of the data file.
    > So, looking at my above comment in the context of this sentence we are 
    > saying the manifest ensures that there is a canonical ordering of leaf 
    > manifests, and in leaf manifests, the order of the pointers is congruent 
    > with the order of the pointed-to data bytes, right?

Any manifest may have both manifest pointers and data pointers, not just leaf manifests.  As it is now, there is only one class of pointer, and each pointer may point to a Manifest or to Data.  There is only one traversal order, and it visits both the manifest tree and the data pointers such that the data pointers are read in order.

The only way a leaf manifest differs from an interior manifest is that it has only data pointers.  But a consumer has no easy way to know that it is looking at a leaf manifest (maybe it could guess based on subtree sizes).
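
To make the single traversal order concrete, here is a rough sketch in Python.  The names (Data, Manifest, traverse, is_leaf) are illustrative only and do not come from the FLIC draft, and direct object references stand in for hash pointers, so nothing is actually fetched.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Union


@dataclass
class Data:
    # Stand-in for a Data object; a real pointer would be a hash to fetch.
    payload: bytes


@dataclass
class Manifest:
    # One class of pointer: each entry may reference a Manifest or Data.
    pointers: List[Union["Manifest", Data]] = field(default_factory=list)


def traverse(node: Manifest) -> Iterator[bytes]:
    """Walk pointers in list order, descending into a manifest when one is met."""
    for target in node.pointers:
        if isinstance(target, Manifest):
            yield from traverse(target)   # descend before moving on
        else:
            yield target.payload          # data comes out in file order


def is_leaf(node: Manifest) -> bool:
    # Only answerable here because the targets are already in memory; a real
    # consumer learns a pointer's type only after fetching it.
    return all(isinstance(t, Data) for t in node.pointers)


# An interior manifest mixing both kinds of pointers.
leaf = Manifest([Data(b"b"), Data(b"c")])
root = Manifest([Data(b"a"), leaf, Data(b"d")])
original = b"".join(traverse(root))
```

In this toy model the root is an interior manifest that still carries data pointers of its own, and joining the traversal output reproduces the original byte order.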

    > >
    > > In some FLIC arrangements, this can be made explicit.  For example, my 
    > > root manifest has two hash groups.  One group points to the manifest 
    > > tree and the second points to the data leaves.  Or, as is more common, 
    > > they are just mixed in one hash group.
    > >
    > Seems fine. Might there be advantages (either in flexibility or 
    > simplicity) in separating them?

The only reason to separate them is to have different Name Creators for the manifest subtree versus the data pointers.

The options are things like:

Mixed hashgroup: [MDMDDDMMDDMMDDD]
Separate: [MMMMMM] [DDDDDDDDD]
Interleaved: [DDD] [MMMM] [DDDDDD] [MMM]

Note that the M pointers are not the same between these options, as the traversal order is always pre-order DFS.  This means if you want to read data at the current node first, you want to put Data pointers before Manifest pointers.
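
A small sketch of why the M pointers differ between layouts, again in Python.  The model is an assumption made for illustration: a manifest is a list of hash groups, a hash group is a list of pointers, and a pointer is either a bytes value (Data) or a nested manifest.  The helper names read and layout are made up here.

```python
from typing import Iterator


def read(manifest) -> Iterator[bytes]:
    """Pre-order DFS: hash groups in order, pointers in order within a group."""
    for group in manifest:
        for ptr in group:
            if isinstance(ptr, bytes):
                yield ptr                 # Data pointer
            else:
                yield from read(ptr)      # Manifest pointer: descend now


def layout(manifest) -> str:
    """Render the top-level hash groups in the [MD...] notation used above."""
    return " ".join(
        "[" + "".join("D" if isinstance(p, bytes) else "M" for p in g) + "]"
        for g in manifest
    )


# Three arrangements of the same six-chunk file "abcdef".
ab = [[b"a", b"b"]]          # child manifests, each with one hash group
bc = [[b"b", b"c"]]
cd = [[b"c", b"d"]]
ef = [[b"e", b"f"]]

mixed = [[b"a", bc, b"d", ef]]            # one group, pointers mixed
data_first = [[b"a", b"b"], [cd, ef]]     # current node's data read first
manifest_first = [[ab, cd], [b"e", b"f"]] # subtrees read before local data

results = {
    layout(m): b"".join(read(m))
    for m in (mixed, data_first, manifest_first)
}
```

All three arrangements read back the same bytes, but the child manifests have to cover different byte ranges depending on where the Data pointers sit relative to them.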

    > > This would allow the NameCreator
    > > extension to specify different rules for different HashGroup types, or 
    > > a private producer could have their own secret rules for different 
    > > HashGroup types.  This allows an app to use different NameCreator 
    > > approaches for the manifest file and the data file.   It's like being 
    > > able to use typed name components so an app can have its own rules.

    > There is a certain danger in this if we expect to have manifests widely 
    > used by “third parties” other than the producer and consumer of a 
    > particular app. I also would like to stress the comment I made earlier 
    > that permitting large random amounts of application “metadata” to 
    > find their way into Manifests can be deleterious for both security and 
    > rational namespace design. I am fairly wedded to the notion that the 
    > only thing that belongs in Manifests are things that allow elements that 
    > fetch, store, or cache content to correctly fetch large objects (and in 
    > my view also be able to iterate over collections), and to do so without 
    > any private knowledge of application semantics.

I agree; I'd prefer not to have secret application data determine how to decode a manifest.  Annotations could give hints about optimized ways of reading things (e.g. a video player wants to skip ahead), but they should never change the canonical "if you traverse it like normal you get the original file" paradigm.  I was throwing out a few options to evaluate; not all of them are good.

Marc