Re: [icnrg] Some thoughts on architectural choices for Manifests

Marc Mosko <mmosko@parc.com> Thu, 06 August 2020 20:37 UTC

From: Marc Mosko <mmosko@parc.com>
To: "christian.tschudin@unibas.ch" <christian.tschudin@unibas.ch>, "David R. Oran" <daveoran@orandom.net>
CC: ICNRG <icnrg@irtf.org>
Thread-Topic: [icnrg] Some thoughts on architectural choices for Manifests
Date: Thu, 06 Aug 2020 20:37:15 +0000
Message-ID: <4B2E8A31-4889-4A10-833F-09DB1B0A8BFC@parc.com>
References: <C56B63E0-444A-409B-A68C-D3B0FF491E42@orandom.net> <alpine.OSX.2.21.2008051903040.98272@uusi>
In-Reply-To: <alpine.OSX.2.21.2008051903040.98272@uusi>
Archived-At: <https://mailarchive.ietf.org/arch/msg/icnrg/yrrRMQ6brDc4SSKynXXJR8M8BFc>
Subject: Re: [icnrg] Some thoughts on architectural choices for Manifests

Here are a few comments responding to topics from both Dave and Christian.

FLIC describes a single file.  No.  By definition, FLIC must always describe two files: the manifest and the data.  We choose to make those descriptions use the same data structure. We also choose that when reading a manifest, a reader does not know which pointers belong to which sub-tree until it actually fetches a pointer (unless there are annotations).  We choose that a node in the tree must be either a manifest node or a data node, not a hybrid node.  The manifest sub-tree is really just a FLIC representation of the manifest file and the data sub-tree (it's not so much a tree as the in-order traversal of leaf pointers) is a representation of the data file.  

In some FLIC arrangements, this can be made explicit.  For example, my root manifest has two hash groups: one points to the manifest tree and the second points to the data leaves.  Or, as is more common, they are just mixed in one hash group.

Inlining: Yes!  One use case we had for this was when someone wants to re-publish a manifest object (e.g. a content distributor wants to point its subscribers to local caches via locators). It would encapsulate the original root manifest in its re-written one, so the consumer can verify in one step that (a) they got the thing they were asking for (from the interior manifest) and (b) someone they trust for such things (e.g. their ISP) tells them a good place to fetch it from.
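
A minimal sketch of that re-publishing case, assuming manifests are opaque byte strings named by their SHA-256 digest. The names `WrapperManifest`, `republish` and `verify_wrapper` are hypothetical and not part of FLIC, and the distributor's signature check (step b) is left out.

```python
import hashlib
from dataclasses import dataclass
from typing import List, Tuple


def hash_name(data: bytes) -> bytes:
    """Hash name of an encoded object (SHA-256 is an assumption)."""
    return hashlib.sha256(data).digest()


@dataclass(frozen=True)
class WrapperManifest:
    """Hypothetical re-written manifest published by a distributor."""
    locators: Tuple[str, ...]   # where the distributor suggests fetching from
    inlined_root: bytes         # the original root manifest, byte for byte


def republish(original_root: bytes, locators: List[str]) -> WrapperManifest:
    # The distributor does not alter the original; it only wraps it.
    return WrapperManifest(tuple(locators), original_root)


def verify_wrapper(wrapper: WrapperManifest, wanted: bytes) -> bool:
    # Check (a): the interior manifest is the object the consumer asked for.
    # Check (b), the distributor's signature over the wrapper, is omitted here.
    return hash_name(wrapper.inlined_root) == wanted
```

Because the original root is carried unmodified, the consumer needs no extra round trip to confirm that the wrapper refers to the object it requested.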

Simplifying: I would be OK taking out all the namespace stuff, but allowing extensions at the Node level (i.e. top level of each manifest inside the security context) so a publisher could put in a NameCreator element (or any other valid extension) if it so chooses.  Otherwise, the process of forming an Interest is implicit in the application or someone just has to guess.  I would then suggest we immediately publish two extensions, one for hash-based naming (ccnx) and one for segmented naming (ndn).

To go along with the simplification, maybe we could define a range of Ts that define hash groups.  There's nothing technically different about the hash groups; we would just have, say, 16 T values that all mean HashGroup, which lets the app assign its own meanings to them.  This would allow the NameCreator extension to specify different rules for different HashGroup types, or a private producer could have their own secret rules for different HashGroup types.  It also allows an app to use different NameCreator approaches for the manifest file and the data file.  It's like being able to use typed name components so an app can have its own rules.
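
A rough sketch of the typed hash group idea, under loud assumptions: the T base value `0x40`, the name formats, and all function names are invented for illustration and come from no draft.

```python
from typing import Callable, Dict

# Hypothetical TLV type range: 16 consecutive T values that all mean "HashGroup".
HASH_GROUP_T_BASE = 0x40      # assumed value, not taken from any draft
HASH_GROUP_T_COUNT = 16


def hash_group_subtype(t: int) -> int:
    """Return the app-defined subtype (0..15) of a HashGroup T value."""
    if not HASH_GROUP_T_BASE <= t < HASH_GROUP_T_BASE + HASH_GROUP_T_COUNT:
        raise ValueError(f"T={t:#x} is not a HashGroup type")
    return t - HASH_GROUP_T_BASE


NameRule = Callable[[bytes, int], str]


def hash_based_name(digest: bytes, index: int) -> str:
    # Hash-based style: the name is a locator, the hash identifies the object.
    return f"/example/locator/hash={digest.hex()}"


def segmented_name(digest: bytes, index: int) -> str:
    # Segmented style: the position in the group becomes a name component.
    return f"/example/file/seg={index}"


# A NameCreator-like table: one rule per HashGroup subtype.
NAME_RULES: Dict[int, NameRule] = {
    0: hash_based_name,   # e.g. pointers into the manifest file
    1: segmented_name,    # e.g. pointers into the data file
}


def interest_name(t: int, digest: bytes, index: int) -> str:
    """Pick the naming rule from the HashGroup subtype and apply it."""
    return NAME_RULES[hash_group_subtype(t)](digest, index)
```

The parser treats all 16 values identically; only the rule table, which an extension or a private producer would supply, gives them meaning.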

Marc

On 8/5/20, 7:33 PM, "icnrg on behalf of christian.tschudin@unibas.ch" <icnrg-bounces@irtf.org on behalf of christian.tschudin@unibas.ch> wrote:

    Hi Dave and Marc,

    great to have your "unpacking"! After the ICNRG interim meeting I had
    started to write down my viewpoint, yesterday I revised it and I was
    about to revise it again - maybe I should just blast it out, same
    TL;DR disclaimer.

    Summary: I distinguish between
    - packet (data object)
    - real blob (a packet's content)
    - virtual blob (content as defined by a manifest packet)
    Manifests are the mechanism to define virtual blobs - full stop. The
    "full stop" excludes object composition.

    Relating to Dave's enumeration of options, my writeup would belong to #2
    ("Keep the current FLIC design, but build in some extensibility
    features"). It would be a very lean FLIC, though.

    Regarding "direct embedding" - I also think that this is essential to
    have (I had called it inlining).

    Regarding Marc's comments on annotation (thanks, too!): my writeup
    belongs to your group 3 ("Separate Data/Content objects within the
    FLIC world") in a strict way, the reason being that the encoding size
    of a manifest's annotations in general cannot be bound, hence would be
    the first candidate to benefit from virtual blobs as offered by the
    manifest construct. If the encoding is small enough, the "separate
    data structure" can be "directly embedded". I guess from your text
    that such embedding is not equal to your "option 2" (parallel groups
    in one manifest), but it could be close as you wrote yourself.

    So here it goes, perhaps not a full manifest manifesto, but laying out
    an argument for keeping FLIC minimal and version-resistant.

    best, c

    ---

    First, I re-tell the FLIC story in new terms (no reference to inodes)
    for positioning FLIC and what it achieves. Then I will look into the
    "parallel datastructure" concern and how directory tree organization
    should be handled. I also include a grammar for such a lean FLIC.

    As a preparation I introduce "derivation functions" which take some
    packets (signed data objects) and produce new packets.  The first
    derivation function of interest is "make_virtual_blob(h_1,..h_n)"
    whose result is exactly one packet of type manifest that contains the
    given sequence of packet hash values, each of these packets being either a
    normal data object ("real blob") or a manifest spec.

    A packet with a manifest spec defines a "virtual blob" which is data
    of any size. The hash name of the manifest packet becomes the name of
    the virtual blob; the virtual blob inherits the signature of the
    manifest packet.

       real_blob_pkt.content() = just tap into the named data object
       virtual_blob_pkt.content() = gather all content by traversing the manifest tree

       note that for real blobs, real_blob_pkt.content() equals real_blob_pkt.payload
       (introducing the notation of field access vs a procedure)

    A manifest is a claim, namely that there exists a virtual blob which
    is the concatenation of all referenced content: it's a claim
    (attribution to a given manifest) that can only be verified by
    materializing the virtual blob. The claim made by a manifest can fail
    for several reasons, one being that some of the included hash values
    never had a corresponding data packet or that some of the referenced
    packets have been forgotten, or that the manifest spec is broken.

    A second derivation function is "get_virtual_blob_content_size(h)".
    Again, it is a claim about a manifest that can only be verified by
    materializing the virtual blob. But unlike the first function, this
    function is not essential for defining the virtual blob, it is mere
    decoration (meta data). There are many more decorations:

    vblob_size_in_records
    vblob_size_in_UTF8_chars
    vblob_tree_depth
    vblob_number_of_leaf_blobs
    creation_time_in_sec

    The last example stands for a decoration that cannot be verified.

    My suggestion is to leave decorations outside the FLIC draft, except
    for the possibility to have decorations. That is, it's ok to use a
    manifest for defining resource forks. I will come back to size
    limitations in just a moment.

    Resource forks are a well known pattern that also map well to the
    networking world, for example layered video encodings, labeling
    information for a video etc. These additional resources are optional
    and not needed to build the virtual blob abstraction - they enrich
    it. And these additional (parallel) resources all have the problem of
    potentially being desynchronized - after all they are all claims that
    the consuming end has to handle with caution.

    Size is one among many of such decorations. Yes, one could request
    that all manifests MUST include the size resource fork. But then the
    FLIC draft must explain how a virtual blob consumer should handle
    manifests that lack that resource fork, or what to do when traversal
    reveals that a size value was wrong. The answer will probably include
    a sentence like "if one of the correctness checks fails you must
    revert to deriving that property yourself", which is another way of
    saying that handling decoration data is decoration-specific. So better
    have a separate "FLIC decoration draft" explaining the various ways of
    adding decorations, including size: do it per-manifest (= store the
    sum of all referenced subtree sizes), or as a single table at the
    root manifest that keeps the (size) labels for all nodes of the tree
    etc.
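
A small sketch of the "revert to deriving that property yourself" rule for the size decoration. It assumes the tree has already been fetched and is modeled as nested Python lists with byte-string leaves; `derived_size` and `usable_size` are hypothetical names.

```python
from typing import List, Optional, Tuple, Union

# A fetched tree: leaf content, or the list of a manifest's children.
Tree = Union[bytes, List["Tree"]]


def derived_size(node: Tree) -> int:
    """Size obtained by traversal, i.e. the only verified value."""
    if isinstance(node, bytes):
        return len(node)
    return sum(derived_size(child) for child in node)


def usable_size(claimed: Optional[int], node: Tree) -> Tuple[int, bool]:
    """Return (size, claim_ok).

    `claimed` is the size decoration, or None when the manifest has none.
    The derived size always wins; the flag only reports whether the claim held.
    """
    derived = derived_size(node)
    return derived, claimed == derived
```

A missing decoration and a wrong decoration end up on the same path: the consumer falls back to the traversal result.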

    My other argument (beside diminishing decorations as claims) for not
    being concerned about potentially non-synchronous data is that code
    for producing and consuming manifests itself is constantly evolving
    and is not synchronized, due to bugs or partial implementations. Even
    when resource forks themselves would be synchronized with a manifest's
    content, the softwares (plural) will not - the next version of a
    producer software may introduce wrong decorations expected by all
    existing consumer software. Said differently: Synchronization loops
    pass through software - making some decorations mandatory will not
    prevent desynchronization.

    Note that decorations can be useful for applications _and_ for the
    manifest traversal algorithms - I did not keep these two usages apart,
    there is only one decorations field. As a side effect, manifests thus
    provide a "decorated virtual blob" abstraction.

    When it comes to the question of using manifests to express
    directories or similar trees of components, we should not confound
    decoration with object composition. It looks as if one could use the
    (hash) table as a table for sub-component data. One argument here is
    that we will bump into the size limitation of (manifest) packets,
    wherefore we should compose objects out of virtual blobs _and_ the
    composition spec should also be stored in a separate virtual blob. For
    example, what if the file name is larger than fits in one packet?
    It is much cleaner to store the mapping table (that maps from file
    attributes like names to virtual blobs) in a virtual blob.

    Potential confusion can arise when we build a directory tree where
    files and directories are all stored in virtual blobs. Isn't this
    a manifest tree of manifest trees? No, it is not because they operate
    at different levels. When traversing a manifest tree, we extract hash
    references from the manifest's fields.  When traversing directories,
    we extract hash references from inside the virtual blob (defined by
    that manifest). These are very different memory zones.

    Here is a writeup in form of a grammar/definitions:

    packet := length-limited data object (NDN, CCN)

              I assume that a data object has type information such that
              "real blob" can be distinguished from "manifest packet". Having
              a hash name and retrieving the data object will tell us what
              kind of packet it is. Anything else (e.g., tagging hash
              pointers with whether they point to real blobs or a manifest
              pkt) is a claim and thus unreliable. It's fine to keep such
              tags as a decoration but these are not needed to implement
              the virtual blob abstraction.

    r_pkt := packet with TLV for "real blob"
    m_pkt := packet with TLV for "manifest"

    r_pkt.payload = any                              # the "real blob"
    m_pkt.payload = manifest_spec

    manifest_spec := deco + cont_seq                 # two fields, must fit in one packet
    deco          := None | hashval | inlined
    cont_seq      := sequence of (hashval | inlined)

    vblob_content(h) is defined as follows:
               if type(fetch(h)) == r_pkt:
                       fetch(h).payload
               elif type(fetch(h)) == m_pkt:
                       concat( vblob_content(j) for all j in fetch(h).payload.cont_seq )
                       # if j is inlined content, this is used instead of vblob_content(j)
               else:
                       fail
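
A runnable sketch of the vblob_content definition above, assuming an in-memory dict stands in for fetch() and that packets are hashed over a toy encoding (their Python repr). The class names and `put` are hypothetical.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, Tuple, Union


@dataclass(frozen=True)
class RealPkt:
    payload: bytes                      # the "real blob"


@dataclass(frozen=True)
class Inlined:
    data: bytes                         # content embedded directly in a manifest


@dataclass(frozen=True)
class ManifestPkt:
    cont_seq: Tuple[Union[bytes, Inlined], ...]   # hash values or inlined content


Packet = Union[RealPkt, ManifestPkt]
Store = Dict[bytes, Packet]


def put(store: Store, pkt: Packet) -> bytes:
    """Store a packet under a hash name derived from a toy encoding."""
    h = hashlib.sha256(repr(pkt).encode()).digest()
    store[h] = pkt
    return h


def vblob_content(store: Store, h: bytes) -> bytes:
    pkt = store.get(h)                  # stands in for fetch(h)
    if isinstance(pkt, RealPkt):
        return pkt.payload
    if isinstance(pkt, ManifestPkt):
        parts = []
        for j in pkt.cont_seq:
            # Inlined content is used as is; hash values are followed recursively.
            parts.append(j.data if isinstance(j, Inlined) else vblob_content(store, j))
        return b"".join(parts)
    raise LookupError(f"cannot materialize {h.hex()}")   # missing or unknown packet


store: Store = {}
a = put(store, RealPkt(b"hello "))
b = put(store, RealPkt(b"world"))
inner = put(store, ManifestPkt((b, Inlined(b"!"))))
root = put(store, ManifestPkt((a, inner)))
content = vblob_content(store, root)
```

The LookupError branch corresponds to a failed claim: a hash value in the manifest for which no packet can be retrieved.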


    # The following would be part of a separate "FLIC decor draft":

    vblob_deco(h, fork_name) is defined as follows:
               if type(fetch(h)) == r_pkt:
                       None
               elif type(fetch(h)) == m_pkt:
                       d = fetch(h).payload.deco
                       if d == None:
                            None
                       elif d is inlined:
                            d
                       else:
                            vblob_content(d).access(fork_name)
               # the mapping for fork_name to the resource, labeled as "access" above,
               # would also be part of that separate "FLIC decor draft"
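
A sketch of vblob_deco under stated assumptions: the decoration table is encoded as a JSON object keyed by fork name, and `fetch` and `content` are passed in as callables standing in for fetch() and vblob_content(). It differs from the pseudocode above in one respect: the inlined case also looks up fork_name instead of returning the whole table.

```python
import json
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Union


@dataclass(frozen=True)
class RealPkt:
    payload: bytes


@dataclass(frozen=True)
class ManifestPkt:
    deco: Union[None, bytes, Dict[str, object]]   # None | hashval | inlined table
    # cont_seq omitted: it plays no role in decoration lookup


def access(table_bytes: bytes, fork_name: str) -> object:
    """Assumed encoding: the decoration vblob is a JSON object keyed by fork name."""
    return json.loads(table_bytes).get(fork_name)


def vblob_deco(fetch: Callable[[bytes], object],
               content: Callable[[bytes], bytes],
               h: bytes, fork_name: str) -> Optional[object]:
    pkt = fetch(h)
    if isinstance(pkt, RealPkt):
        return None                            # real blobs carry no decoration
    if isinstance(pkt, ManifestPkt):
        d = pkt.deco
        if d is None:
            return None
        if isinstance(d, dict):                # inlined decoration
            return d.get(fork_name)
        return access(content(d), fork_name)   # decoration stored in its own vblob
    raise LookupError("unknown or missing packet")
```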


    # How would composition e.g., directories look? We show content access
    # via a file's name and assume that the directory data structure maps
    # a file name to some hash value (which names a virtual blob
    # containing the file's content). This content access has no
    # similarities with above content retrievals, as it calls
    # vblob_content() twice:

    dir_get_content(dir_hash_val, file_name):
             vblob_content( vblob_content(dir_hash_val).access(file_name) )

    # we can decorate files because they are stored as virtual blobs:

    dir_get_deco(dir_hash_val, file_name, fork_name):
             vblob_deco( vblob_content(dir_hash_val).access(file_name), fork_name )
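
A sketch of the two-level lookup in dir_get_content, assuming the directory vblob is a JSON object that maps file names to hex-encoded hash values. The encoding and the helper names `encode_dir` and `dir_lookup` are hypothetical; `content` stands in for vblob_content().

```python
import json
from typing import Callable, Dict

Content = Callable[[bytes], bytes]   # stands in for vblob_content


def encode_dir(entries: Dict[str, bytes]) -> bytes:
    """Assumed directory encoding: JSON object, file name -> hex hash value."""
    return json.dumps({name: h.hex() for name, h in entries.items()}).encode()


def dir_lookup(dir_blob: bytes, file_name: str) -> bytes:
    """Extract a hash reference from inside the directory's virtual blob."""
    return bytes.fromhex(json.loads(dir_blob)[file_name])


def dir_get_content(content: Content, dir_hash_val: bytes, file_name: str) -> bytes:
    # First call materializes the directory, second call materializes the file.
    file_hash = dir_lookup(content(dir_hash_val), file_name)
    return content(file_hash)
```

Since the mapping table lives inside a virtual blob, neither the number of entries nor the length of a file name is bounded by the packet size.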

    Composition does not require new NDN/CCN types: it's an application
    level issue whether it wants to define a "type decoration", or put a
    magic number (type) inside each directory virtual blob, and how it
    encodes the composition spec.


    Conclusions: The sole role of manifests should be to abstract away
    from blob size limitations, providing transparent access to virtual
    blobs (vblobs) of arbitrary size. A manifest defined as a list of hash
    values, together with manifest recursion, are all the tricks we need -
    plus name constructors for forwarding reasons, not discussed in this
    writeup. "Decorations" (meta data) are important for speedup reasons,
    so there must be a hook for them. As decorations may require the
    support of vblobs themselves, the suggestion is that the hook comes in
    form of a single hash value (that references a vblob), instead of
    hardwiring decoration information into the manifest packet. Inlining
    is still possible as an optimization. It's also possible to use the
    manifest construct for simply decorating a real or virtual blob. For
    example, decorations could carry additional signatures, or type
    information outside NDN/CCN's TLV system. Composing objects (stored in
    vblobs) is different from glueing together blobs (to create
    vblobs). Similar to storing decoration data, composition should be
    layered on top of the vblob abstraction using one vblob for every
    component.

    ---

    On Wed, 5 Aug 2020, David R. Oran wrote:

    > 
    > These are a bit stream-of-consciousness, and definitely susceptible to TL;DR, so treat
    > accordingly.
    > 
    > I’ve been mulling over the issues we discussed at the ICNRG Interim around FLIC and I’d
    > like to unpack a few things along a slightly different axis than Marc did in his two
    > excellent messages.
    > 
    > My first thought is that the genesis of FLIC along the lines of iNodes in Unix was
    > probably a really good way to crystalize the problem of representing big single objects
    > that need to be chopped up into pieces. This is what happens when layering a file
    > system directly onto a disk with fixed-size blocks. If one carries this through
    > directly and solves only that problem, it means FLIC’s design derives from the
    > following:
    >
    >  1. It is designed to represent the chunking of a single object, not enumerating a
    >     collection or other uses.
    >  2. It has a hierarchical (tree, but possibly digraph) representation since the data
    >     structure itself has to fit in the same chunking limits as the underlying data
    >     objects.
    >  3. It needs the ability to extend/append to cover cases where the object itself
    >     supports append operations.
    >  4. It may need some indexing goop and size information if seek operations to
    >     particular bytes in the object are needed. If everything is fixed-size-per-chunk and
    >     that size is known a priori, pointer array indexing suffices.
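
A minimal sketch of the pointer array indexing mentioned in the last item, assuming every chunk except possibly the final one has the same known size. The function name `locate` is hypothetical.

```python
from typing import Tuple


def locate(offset: int, chunk_size: int) -> Tuple[int, int]:
    """Map a byte offset to (pointer index, offset within that chunk)."""
    if offset < 0 or chunk_size <= 0:
        raise ValueError("offset must be >= 0 and chunk_size > 0")
    return divmod(offset, chunk_size)


# Byte 4000 of an object cut into 1500-byte chunks lives in the third
# chunk (index 2), 1000 bytes in.
index, inner_offset = locate(4000, 1500)
```

With variable-size chunks this shortcut no longer applies, which is where the size information in the item above comes in.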
    > 
    > These seem fundamental to the model, but there are some less obvious implications if
    > one adheres closely to the iNode/filesystem analogy:
    >
    >  *  The data structure is meant to be interpreted by a single client piece of code, not
    >     directly read/written by multiple pieces of code that don’t know about one another
    >     or are unsure what the things its pointers point to are.
    >  *  It’s a data structure meant to be used by some middle system software, not random
    >     applications.
    >  *  It is closely bound to the particular client wanting to use it - most different
    >     filesystems in fact use a different format for iNodes - they are somewhat different
    >     for EXTx, VFS, HFS+, etc. and the “on-disk structures” are incompatible. The
    >     implication is that if we want different structures for ICN collections as the
    >     technology matures, one would expect the FLIC data structure to change in
    >     incompatible ways.
    >  *  Modern file systems actually eschew the first-level iNode entries for small files
    >     whose data fits in a single disk block and instead embed the data directly in the
    >     iNode containing the directory entry for the file. This has now mixed things
    >     together quite a bit in modern file systems, so iNodes aren’t the simple low-level
    >     thing they once were and directories are not so cleanly layered on iNodes as they
    >     once were. The performance advantages are dramatic, and one might expect pretty big
    >     performance gains for doing the same for ICN. This means, at a minimum, that
    >     supporting packaging a data object “inside” a manifest might in fact be pretty
    >     important. There are a bunch of ways to do this, but not considering this now would
    >     in my view be a mistake.
    > 
    > So, if we take this view for the architecture for FLIC:
    >
    >  *  FLIC is just a convention for a private data structure that can be used by an
    >     application
    >  *  FLIC is for breaking down single objects; not for describing any other kind of
    >     collection.
    >  *  There is no expectation that a FLIC Manifest is understandable by multiple
    >     applications of the same data
    >  *  It’s up to application “magic” using either namespace conventions or some separate
    >     discovery machinery to figure out what Name to use in an Interest to fetch a FLIC
    >     manifest (or piece thereof).
    >  *  We don’t need extension mechanisms, since an application can just make up its own
    >     manifest format by modifying FLIC as needed - the code will just “follow” the app
    >     in the same way that iNode formats are closely bound to the particular filesystem
    >     built on top.
    > 
    > If you find the above exposition at all convincing, I think there are a number of
    > possible ways forward. Let me try to enumerate them:
    >
    >  1. Define FLIC as limited to the above, and punt anything else for “later”.
    >  2. Keep the current FLIC design, but build in some extensibility features (but don’t
    >     define any extensions now). This would include at a minimum a versioning scheme and
    >     some TLVs so additional things other than pointer arrays of hashes can be
    >     expressed.
    >  3. Define FLIC as above, but instead make it an “interior” data structure of a more
    >     general Manifest format that we work on in parallel. That more general data
    >     structure would encompass the stuff we have currently put in the base FLIC
    >     Manifest. A good design would allow the FLIC manifest to either be embedded inside
    >     the general Manifest, or externally referenced via one of the pointers.
    >  4. Continue down the current path, by having one general Manifest format that is
    >     extensible and contains the features we considered important, like Name Constructors
    >     and annotated pointers.
    > 
    > In addition to the above, I think any general Manifest format should allow direct
    > embedding of content objects to handle the cases of simple small data (e.g. IoT sensors
    > etc.) so fetch doesn’t require extra RTTs. This might be useful even in the case of
    > FLIC, to allow the same primitive data object to either be fetched independently, or
    > with Manifest, and the signature bound to the manifest rather than the data object,
    > making re-signing independent of the original data producer code.
    > 
    > I have my own views on which of these directions we should pursue, but for the purposes
    > of this email the question I’d like to ask is whether the taxonomy above is a good way
    > to think about this and are there other options for how to architecturally represent
    > things I haven’t thought of?
    > 
    > DaveO
    > 
    > 
    >
