Re: [dtn] "Block data length" field in BPbis

Hi Brian,

Yup, we’re actually doing that with RFC 5050 already – my hope is to reuse the intermediate format (IF) we’ve built to support the new revision of BP as well, without adding too much complexity to the IF ☺

Still, it’d be nice if there were reasonable ways to make things easier for the software / hardware bits that handle that conversion.  That piece, specifically, is one that I’m pretty worried about: one needs to be able to design chips / applications that can handle specific rates (both in terms of Gbps and in terms of bundles / second) … and high variance in per-bundle processing times necessitates more conservative chip design (which means higher size, weight, power, etc. to guarantee we can hit a specific rate target).

My main point is that optimization in that respect would make things easier for high-rate systems.  It’s not necessarily a requirement, but … it sure would be nice!

-Gilbert

From: dtn <dtn-bounces@ietf.org> on behalf of Brian Sipos <BSipos@rkf-eng.com>
Date: Tuesday, July 10, 2018 at 4:47 PM
To: "Clark, Gilbert J. (GRC-LCN0)" <gilbert.j.clark@nasa.gov>, Matt Wronkiewicz <wronkiew@gmail.com>
Cc: Marc Blanchet <marc.blanchet@viagenie.ca>, "Burleigh, Scott C (JPL-312B)[Jet Propulsion Laboratory]" <scott.c.burleigh@jpl.nasa.gov>, "dtn@ietf.org" <dtn@ietf.org>, Felix Walter <felix.walter@tu-dresden.de>
Subject: Re: [dtn] "Block data length" field in BPbis

Gilbert,

One aspect of BPbis that had been discussed much earlier was alternate bundle encodings (with JSON used then as a specific example to interoperate with higher-level application protocols). The idea was that any alternate encoding could be used with a proper translating gateway, where one side of the gateway used the canonical CBOR encoding and the other side the customized encoding.

It seems like for a specific high-speed application like you are describing, under some circumstances there could be a benefit to translating the canonical CBOR encoding to some internal index-capable encoding, using that alternate (or modified/extended CBOR) encoding while the bundles transit your network, and then translating back to canonical encoding at the edges of your network. In this way, you could use a modified encoding to pre-cache any amount of derived data about the bundle and store it within or alongside the actual bundle.
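
A minimal sketch of the "store derived data alongside the bundle" idea, assuming an ingress gateway that has already split a bundle into its encoded blocks. The names `IndexEntry`, `build_sidecar_index` and `find_block` are hypothetical, the block type codes are arbitrary, and treating a bundle as a plain concatenation of blocks is a simplification of the real encoding:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexEntry:
    block_type: int  # block type code (arbitrary values in this sketch)
    offset: int      # byte offset of the block within the stored bundle
    length: int      # encoded length of the block in bytes


def build_sidecar_index(blocks):
    """Concatenate encoded blocks and record where each one starts.

    `blocks` is a list of (block_type, encoded_bytes) pairs. A real gateway
    would obtain them by parsing the canonical encoding once at ingress.
    """
    index, chunks, offset = [], [], 0
    for block_type, encoded in blocks:
        index.append(IndexEntry(block_type, offset, len(encoded)))
        chunks.append(encoded)
        offset += len(encoded)
    return b"".join(chunks), index


def find_block(bundle, index, block_type):
    """Return the bytes of the first block of `block_type`, or None."""
    for entry in index:  # cost depends on the block count, not the bundle size
        if entry.block_type == block_type:
            return bundle[entry.offset:entry.offset + entry.length]
    return None


# Illustrative data only: three opaque "blocks" of very different sizes.
bundle, index = build_sidecar_index(
    [(1, b"A" * 1000), (7, b"BB"), (10, b"CCCCC")]
)
print(find_block(bundle, index, 7))  # b'BB'
print(index[2])
```

Inside the network, a lookup then touches only the small index, and the egress gateway can discard the index when it re-emits the canonical encoding.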

________________________________
From: dtn <dtn-bounces@ietf.org> on behalf of Clark, Gilbert J. (GRC-LCA0) <gilbert.j.clark@nasa.gov>
Sent: Thursday, May 31, 2018 8:56:57 AM
To: Matt Wronkiewicz
Cc: Marc Blanchet; Burleigh, Scott C (JPL-312B)[Jet Propulsion Laboratory]; dtn@ietf.org; Felix Walter
Subject: Re: [dtn] "Block data length" field in BPbis

As a little context, I'm trying to parse and apply policy to many parallel bundle ... flows, I guess? ... at rates of (eventually) 200+ Gbps.  While 200 Gbps sounds impressive, that number is far less important than the number of bundles it translates to per second: processing times, after all, are generally much more tied to the number of bundles than to the actual data rates at which a specific link / path / whatever is operating.
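
A back-of-the-envelope sketch of that relationship. Only the 200 Gbps figure comes from the text above; the bundle sizes are purely illustrative assumptions, and the function names are hypothetical:

```python
# Rough bundle-rate arithmetic for a 200 Gbps link.
LINK_RATE_BPS = 200e9  # 200 Gbps, the target rate mentioned above


def bundles_per_second(link_rate_bps, bundle_size_bytes):
    """Bundles per second if the link carries only bundles of one size."""
    return link_rate_bps / (bundle_size_bytes * 8)


def time_budget_ns(link_rate_bps, bundle_size_bytes):
    """Average per-bundle processing budget, in nanoseconds."""
    return 1e9 / bundles_per_second(link_rate_bps, bundle_size_bytes)


# Assumed bundle sizes in bytes, chosen only to span a wide range.
for size in (100, 1_500, 64_000, 1_000_000):
    rate = bundles_per_second(LINK_RATE_BPS, size)
    budget = time_budget_ns(LINK_RATE_BPS, size)
    print(f"{size:>9} B: {rate:>13,.0f} bundles/s, {budget:>10,.1f} ns per bundle")
```

At the same link rate, 100-byte bundles leave an average of 4 ns per bundle, while 1,000,000-byte bundles leave 40,000 ns, which is why the bundle count matters more than the raw data rate.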

[rabbit_hole] This is related to one objection I have to proactive fragmentation: since processing time at intermediate nodes tends to scale with the number of bundles seen as opposed to the specific data rates one is operating at (within reason, of course), one generally wants to take steps to *minimize* the number of bundles flowing through the system at any given time.  By unnecessarily carving large pieces of data up into a number of smaller bundles without cause, one is forcing intermediate nodes to spend more cycles on processing said bundles flowing through them. [/rabbit_hole]

Having to walk entire bundles in search of specific canonical blocks is bad for me: it injects variance into the per-bundle processing time.  Large degrees of variance are suboptimal because they force one to over-design a system to hit relatively conservative targets in terms of the bundles that can be processed per second.

I'll also note that, worst-case, walking a bunch of variable-length fields seems like it could act as an effective denial of service attack: one could hypothetically craft bundles which would take forever (relatively speaking) to completely walk as part of a vain search for a target canonical block ... which may or may not exist.
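
To make the variance concrete, here is a minimal sketch of skipping over one CBOR data item. It is generic CBOR handling, not a BPbis parser: `read_head` and `skip_item` are hypothetical names, only definite-length encodings are handled, and bounds checking is omitted:

```python
def read_head(buf, pos):
    """Decode one CBOR item head; return (major_type, argument, next_pos)."""
    initial = buf[pos]
    major, info = initial >> 5, initial & 0x1F
    pos += 1
    if info < 24:  # argument is stored in the initial byte itself
        return major, info, pos
    if info in (24, 25, 26, 27):
        size = 1 << (info - 24)  # 1, 2, 4 or 8 following bytes
        return major, int.from_bytes(buf[pos:pos + size], "big"), pos + size
    raise ValueError("indefinite-length or reserved encoding not supported")


def skip_item(buf, pos):
    """Skip one complete data item; return (next_pos, heads_decoded)."""
    remaining, heads = 1, 0
    while remaining:
        major, arg, pos = read_head(buf, pos)
        heads += 1
        remaining -= 1
        if major in (2, 3):   # byte / text string: jump over the content
            pos += arg
        elif major == 4:      # array: every element must be visited
            remaining += arg
        elif major == 5:      # map: every key and every value
            remaining += 2 * arg
        elif major == 6:      # tag: the tagged item follows
            remaining += 1
        # major 0, 1 and 7 are fully described by their head
    return pos, heads


flat = bytes([0x58, 200]) + bytes(200)    # one 200-byte byte string
nested = bytes([0x98, 200]) + bytes(200)  # array of 200 one-byte integers
print(skip_item(flat, 0))    # (202, 1): a single head, then a jump
print(skip_item(nested, 0))  # (202, 201): one head per element
```

Both inputs are 202 bytes long, yet the second costs 201 head decodes to skip instead of one. The work depends on the structure of the data rather than on its size, which is the variance (and the crafted-bundle risk) described above.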

Anyway, I'm not terribly attached to the index idea, but ... *anything* that keeps seek / search times for individual canonical blocks closer to constant will make systems that need to process bundles at high rates quite a bit easier to design and more efficient to operate, you know?

FWIW,
Gilbert

P.S. - I'll note that building fast-path processing for variable-length fields does suck a little bit.  I know the goal is extensibility here, but ... fixed-size sets of headers would have been so much nicer to work with.

The views expressed in this mail reflect the opinions of the author.  They are, therefore, not intended to reflect official positions of NASA or the U.S. Government.

On 5/31/18, 12:49 AM, "Matt Wronkiewicz" <wronkiew@gmail.com> wrote:

    Serializing canonical blocks, or just the block-specific data, has
    some additional benefits.

    Some intermediate nodes may need to limit the CBOR tree depth for
    static analysis of memory usage. They might then encounter a bundle
    with an application-specific block whose tree depth exceeds their
    limit, even though they could otherwise have processed that bundle
    correctly.

    In the current version, canonical blocks have to be decoded just to
    find the next block. Encapsulating the block-specific data reduces the
    amount of parsing that needs to be done to find all the relevant
    blocks. Encapsulating whole blocks would also reduce parsing and make
    finding the total length faster.

    I would prefer to see the CBOR array format used rather than including
    an index of offsets. Adding it to the beginning of a bundle would be
    messy, and adding it to the end is redundant. Also sticking with CBOR
    serialization makes both the spec and the bundles more concise.

    CBOR has an optional tag (tag 24) for serialized CBOR structures. Here
    the encoding is specified by the protocol, so including tags is just
    wasted bytes.

    Matt

    On Wed, May 30, 2018 at 3:18 PM, Clark, Gilbert J. (GRC-LCA0)
    <gilbert.j.clark@nasa.gov> wrote:
    > What about e.g. an index field in the primary block that includes an array of both the offset and type of each canonical block included within that particular bundle?
    >
    > The ability to index directly to specific canonical blocks without needing to walk the bundle to find them at all would be nice.  It would also be useful to reduce the variance in processing time where e.g. a policy does need to be applied to a canonical block that comes later in a bundle.
    >
    > -Gilbert
    >
    > On 5/30/18, 4:36 PM, "dtn on behalf of Felix Walter" <dtn-bounces@ietf.org on behalf of felix.walter@tu-dresden.de> wrote:
    >
    >     Marc,
    >
    >     yes, for intermediate nodes this requires parsing the complete CBOR
    >     representation for all blocks. It also implies that a full CBOR parser
    >     always has to be used because it is not known in advance which types
    >     will be contained in unknown blocks.
    >
    >     An alternative would be to specify the "block payload" as being a single
    >     CBOR byte string, containing the (serialized) block-specific data.
    >     Though not as nice as having the bundle as a single large CBOR object,
    >     from an implementation perspective, this is probably the simplest
    >     solution. For example, the status blocks would then contain a serialized
    >     CBOR array in the payload field.
    >
    >     Felix
    >
    >     On 30.05.2018 at 21:37, Marc Blanchet wrote:
    >     > On 30 May 2018, at 15:10, Felix Walter wrote:
    >     >
    >     >> Scott,
    >     >>
    >     >> great, I think removing the field is the best solution.
    >     >>
    >     >
    >     > much safer to have one single place for authority. However, does that
    >     > require more parsing by intermediate nodes if, for example, they
    >     > need to parse some blocks for policy decisions?
    >     >
    >     > Marc.
    >     >
    >     >> Felix
    >     >>
    >     >> On 30.05.2018 at 20:54, Burleigh, Scott C (312B) wrote:
    >     >>> Felix, sorry, I am finally replying to this email: you are right
    >     >>> that CBOR representation would provide all of the individual lengths
    >     >>> of the block-type-specific data fields that are summed in the block
    >     >>> data length field, and as such the block data length field is
    >     >>> redundant.  My first impulse on re-reading your email was simply to
    >     >>> revise the definition of "Block data length" as you suggest.  But on
    >     >>> reflection I think it actually makes more sense to remove block data
    >     >>> length from the specification and instead specifically require that
    >     >>> all block-type-specific data fields appear in CBOR representation.
    >     >>>
    >     >>> I want to post version 11 of this specification later today, before
    >     >>> version 10 expires, and at this point I plan to go ahead with
    >     >>> removal of the block data length field.  If anyone has a technical
    >     >>> argument to make in defense of retaining block data length in bpbis,
    >     >>> please speak up this afternoon?
    >     >>>
    >     >>> Scott
    >     >>>
    >     >>> -----Original Message-----
    >     >>> From: dtn <dtn-bounces@ietf.org> On Behalf Of Felix Walter
    >     >>> Sent: Friday, March 23, 2018 9:22 AM
    >     >>> To: dtn@ietf.org
    >     >>> Subject: [dtn] "Block data length" field in BPbis
    >     >>>
    >     >>> Hi,
    >     >>>
    >     >>> We just had a short talk with Scott, Ed, and Rick about the "Block
    >     >>> data length" [1] field of the canonical block in BPbis that I would
    >     >>> like to forward to the list. As far as I understand it, this value
    >     >>> is the count of (serialized) bytes still belonging to the block, but
    >     >>> following the length field.
    >     >>>
    >     >>> Because CBOR is used for the block-type-specific data, the length
    >     >>> field by itself is redundant. For example, in the payload block, it
    >     >>> will always be followed by a "CBOR byte string" representing the
    >     >>> payload data. (This contains a length as well.) It needs to be
    >     >>> considered in implementations that all these length fields are
    >     >>> variable-length themselves - however, this turned out to be no big
    >     >>> issue. I also see the point of having the "Block data length"
    >     >>> available to the parser, to be able to skip over the whole block
    >     >>> data of complex extension blocks without even parsing them.
    >     >>>
    >     >>> Are there any further reasons for having the "Block data length"
    >     >>> available which I have missed? Or does anyone have a strong opinion
    >     >>> on whether this should be removed or kept?
    >     >>>
    >     >>> By the way, the language concerning the "Block data length" should
    >     >>> probably be modified slightly, as it refers to "[...] the aggregate
    >     >>> length of all remaining fields of the block, i.e., the
    >     >>> block-type-specific data fields.", although these may (now) be
    >     >>> followed by the CRC.
    >     >>>
    >     >>> Felix
    >     >>>
    >     >>> [1] https://tools.ietf.org/html/draft-ietf-dtn-bpbis-10#section-4.2.3
    >     >>>
    >     >>> _______________________________________________
    >     >>> dtn mailing list
    >     >>> dtn@ietf.org
    >     >>> https://www.ietf.org/mailman/listinfo/dtn