[dtn-interest] Comments on DTNRG documents from BBN SPINDLE project team

Rajesh Krishnan <krash@bbn.com> Wed, 16 March 2005 23:06 UTC

From: Rajesh Krishnan <krash@bbn.com>
Message-Id: <200503162306.j2GN63j11921@a.bbn.com>
To: dtn-interest@mailman.dtnrg.org
Date: Wed, 16 Mar 2005 18:06:03 -0500
Reply-to: krash@bbn.com

All,

We thank the DTNRG members for hosting a meeting with us (BBN SPINDLE
project) in Minneapolis last week.  We had a delightful discussion.
Included below are some comments (from me and my colleagues on the BBN 
SPINDLE project) on:

  1. the DTN Architecture (expired) 
  2. the Bundle Protocol Specification (soon to expire)
  3. preliminary thoughts on ways in which our work may feed into the DTNRG

We welcome comments on our comments.  

Best Regards,
Rajesh

...............................................................................
Comments on DTN Architecture 
============================

* Scope: The comments in this section refer specifically to the text 
  contained in draft-irtf-dtnrg-arch-02.txt

* Overall: The following aspects have not received adequate treatment
  in the architecture:

  - Mobility
    o especially, node mobility between regions and region mobility

  - Management
    o ownership and administrative boundaries for nodes and regions
    o instrumentation of DTN nodes for observing and controlling remote
      network state
    o error conditions at a node such as a persistent store failure
    o declarative (external) specification and enforcement of policies

  - Disruptions occurring at short timescales (the focus is on tolerance 
    to delays at large timescales)

* The abstract defines the delay tolerant network as an "overlay that 
  exists above the transport layer."  It appears that there is nothing
  in the architecture that fundamentally requires a DTN to be overlaid
  over the transport layer of the underlying network.  

  On the other hand, there appear to be some _implicit_ service requirements 
  assumed for the underlying network.  These requirements may be typically 
  satisfied by the transport layer in an IP network, but it is not clear
  that the same holds for other networks.  These assumptions must be made explicit
  irrespective of which layer of the underlying network DTN is overlaid upon.  

  Since DTN is an overlay, the underlying network appears to the DTN to be 
  a (more capable) link layer rather than a transport layer, and DTN needs 
  to implement functionality (e.g. routing) typically associated with the 
  network layer.

  We suggest generalizing this statement to "overlay that exists above any 
  layer of an underlying network that provides the <enumerated> capabilities 
  (possibly through a convergence sublayer)".  

* In Section 3.1 it is not clear whether aggregation of bundles is allowed.

* Section 3.2, assumptions 1-3. Assumptions (1) and (2) are not generally
  valid for tactical military networks -- a large application area seeking
  this technology. Storage is neither plentiful nor well-distributed. 
  However, it is not clear that the *architecture* per se *needs* 
  assumption 1.  Within this architecture, one could build routing 
  mechanisms that perform coordinated, distributed storage. 

* Sec 3.2. Bulk/Normal/Expedited classes. Is the treatment on a per-next-hop 
  basis? Can one give a Bulk bundle destined for the south pole to a contact 
  headed south while the queue contains an Expedited bundle destined for the 
  north pole? We guess the answer is yes, but it is not entirely clear from 
  the text.
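
To make the question concrete, here is a minimal sketch of the per-next-hop
reading, in which service classes order bundles only within the queue for a
given next hop.  The class name, method names, and bundle labels are
hypothetical and are not taken from the architecture draft; only the three
class names are.

```python
from collections import defaultdict
import heapq
import itertools

# Lower number is served first; the class names come from the draft.
PRIORITY = {"expedited": 0, "normal": 1, "bulk": 2}


class PerNextHopQueues:
    """One priority queue per next hop (hypothetical structure)."""

    def __init__(self):
        self._queues = defaultdict(list)
        self._arrival = itertools.count()  # FIFO tie-break within a class

    def enqueue(self, next_hop, service_class, bundle_id):
        entry = (PRIORITY[service_class], next(self._arrival), bundle_id)
        heapq.heappush(self._queues[next_hop], entry)

    def dequeue_for_contact(self, next_hop):
        """Return the best bundle routed via this contact, or None."""
        queue = self._queues.get(next_hop)
        if not queue:
            return None
        return heapq.heappop(queue)[2]


queues = PerNextHopQueues()
queues.enqueue("north", "expedited", "bundle-to-north-pole")
queues.enqueue("south", "bulk", "bundle-to-south-pole")

# A southbound contact appears: the Bulk bundle goes out even though an
# Expedited bundle (for a different next hop) is still waiting.
sent = queues.dequeue_for_contact("south")
```

Under a single global queue the Expedited bundle would instead block the
southbound contact, which is why the text should state which reading is
intended.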

* In Section 3.3 it is not clear whether the transferrer or transferee 
  generates the custody transfer indication. Also if these indications
  are all sent to the report-to node, then how many reports per bundle
  does it receive?  Will this scale?  Can reports be aggregated or
  piggybacked opportunistically or not?

* In Section 3.3, for the expiration time and the current time of day, 
  what is the reference?  Will GMT / Earth time be used in the IPN?

* Sec 3.4/3.5. Does the notion of a region include hierarchical nesting in a
  topologically significant way? That is, can the "A" in (R,A) in turn
  contain an (R1,A1) and so on? 

* In Section 3.5, the requirement that "any node not in R1 must treat A 
  as "opaque" (i.e. it cannot interpret A)" seems needlessly restrictive.  
  It seems sufficient to require that A not be modified outside R1.  To 
  support mobility efficiently (and avoid needless telescoping of routes), 
  it may become necessary to interpret A (e.g. location hints) outside R1.

  Furthermore, it can be argued that the "name invariance family" in the
  Bundle Protocol Specification is indeed an interpretation of A outside 
  R1.

* In Section 3.5 "Every node in a region R1 shall have the ability to 
  eventually deliver messages to every other node in R1 ..." needs to
  be qualified with "barring a permanent network partition or permanent 
  destruction of nodes, and provided the expiration time of the bundle 
  exceeds the time the network is partitioned, and there is sufficient
  storage in the network."

* Sec. 3.5, assumption 3 ("every node in a region should be able to eventually
  deliver to every other node"). It is not clear whether this assumption
  requires that the "eventual delivery" be made with or without the use of
  data hauling nodes (DTN routers). If we cannot use DTN routers within the
  region, then this seriously limits the applicability of this
  architecture to the tactical military environment. For instance, the region
  could be a brigade with HMMWVs, soldiers, UAVs etc. A brigade could be
  partitioned, with UAVs or helicopters doing the data hauling. In such a case
  we need a few DTN routers within the region. On the other hand, if
  DTN routers can be used to heal such partitions within a region, it is
  not clear how the addressing scheme will work.

* In Section 3.8 "graph" is used to mean "multi-graph".

* Section 3.8.1. Types of contacts. How is contact volatility managed?
  When you lose contact, you are not sure whether it is "still out there" or
  "gone". In other words, do you envisage the use of a link up/down protocol
  like in OSPF or MANETs to "maintain" contacts?  Is this function at the 
  bundle layer or in the convergence layer?

* Sec. 3.9, Reactive fragmenting. This implicitly assumes that the sender and
  receiver can communicate throughout the message exchange. What if the 
  link/subnet-path breaks after receiving part of the message? In this case 
  the sender cannot learn that only a portion of the message was received. 
  Actually, this is the likely case -- if X receives only part of a message
  from Y, it is probably because there was some disruption in the communication
  -- and this same disruption will likely be a barrier for the sender to
  "learn" anything at all about the outcome of its transmission.

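The dependence on feedback can be seen in a small sketch of the split that
reactive fragmentation implies.  The `Fragment` fields and function names
below are illustrative assumptions, not the header layout of the bundle
specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    offset: int        # position of this fragment in the original payload
    data: bytes
    total_length: int  # length of the original, unfragmented payload


def split_at_break(payload: bytes, bytes_delivered: int):
    """Split a payload at the point where the transfer broke.

    Returns (received, remaining).  `received` is what the receiver can
    forward as a fragment; `remaining` is what the sender still has to
    send, which it can compute only if it learns `bytes_delivered`.
    """
    if not 0 <= bytes_delivered <= len(payload):
        raise ValueError("bytes_delivered out of range")
    total = len(payload)
    received = Fragment(0, payload[:bytes_delivered], total)
    remaining = Fragment(bytes_delivered, payload[bytes_delivered:], total)
    return received, remaining


def reassemble(fragments):
    """Rebuild the payload; assumes the fragments do not overlap."""
    ordered = sorted(fragments, key=lambda f: f.offset)
    payload = b"".join(f.data for f in ordered)
    if len(payload) != ordered[0].total_length:
        raise ValueError("fragments do not cover the whole payload")
    return payload


payload = b"0123456789"
received, remaining = split_at_break(payload, bytes_delivered=4)
```

The receiver can always build `received` on its own.  The sender needs
`bytes_delivered` to build `remaining`, and that value is exactly what a
disrupted link may fail to convey.
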
* Section 3.9: What does it mean for DTN nodes to "share an edge"?

* Section 3.10, can ACKs be aggregated or piggybacked opportunistically?

* Section 3.11: The use of seconds for expiration seems to presuppose the
  use of DTN in high delay environments, and to preclude applications to
  high-speed optical networks (or computer interconnects), and other
  distributed applications, where custody transfers and data expiration
  may indeed happen at faster timescales.

  Why should DTN not scale _down_ in delays and disconnections the same
  way it scales up?

* Section 3.12. Congestion control. We are not sure if congestion control 
  should be considered within this architecture. Presumably, DTN is deployed 
  in a network where only a (small?) subset of the nodes are DTN capable (DTN
  routers/nodes). Thus there may be plenty of other traffic that DTN has no
  control over. Congestion is a network-layer/MAC layer phenomenon and best
  controlled there. No matter what we do at the DTN layer, it will be of no
  avail if other applications/protocols using the underlying network blast 
  away to glory.

* Section 3.12. Flow control as a solution to congestion. This works, even
  partially, only if the receiver capacity is lower than or equal to the
  capacity of the intermediate nodes.  That may not be a realistic
  assumption, especially in a tactical military environment where the bundle
  may have to go over multiple wireless hops.

* Section 3.13, The security architecture (based on checking at edges) may 
  be challenged by mobility, especially of the edges. What if the edges of 
  the network are continuously redefined due to mobility?

* Section 4.3 makes a critical point regarding state versus protocol
  behavior and the possibility of state being shared across protocols.
  A corollary of this statement is the need to identify information
  models for sharing (and storing) routing, forwarding, and name
  resolution state. 

* Section 5, For disruption tolerance, the application structuring issues 
  must include cases where the DTN node where the application is registered
  is mobile or is otherwise soon to be disconnected (and needs to transfer 
  custody of the registration).

* Section 5: Applications need to be better defined.  Are they strictly
  point-to-point, or can application endpoints reside in a distributed
  manner (on an anycast or multicast basis) in the delay tolerant network?

* Section 6 should list mandatory requirements for the convergence layers
  in terms of what services they must provide to the bundle layer.

* DTN has some goals similar to X.400; the X.400 delivery service classes 
  are worth evaluating in this context (a citation of X.400 is appropriate 
  as relevant prior work)

* Consider the terminology "hold-and-forward" to distinguish from the 
  usage elsewhere of "store-and-forward" to mean "queue-and-forward"

* Consider the terminology "data hauling" for the case where mobility
  is exploited for delivery e.g. data mules, message ferries, ...

Comments on the Bundle Protocol Specification 
=============================================

* Scope: The comments in this section refer specifically to the text
  contained in draft-irtf-dtnrg-bundle-spec-02.txt

* This document is well written.  In particular, the format used to 
  describe the primitives and their semantics aids the understanding
  of the document.

* The term "interface" is never defined rigorously and is used in multiple
  contexts such as a software/service interface or a network interface.
  It should be defined in the introduction.

* The endpoint ID terminology differs from what is used in the architecture.
  For example, the architecture indicates that a node must have at least one 
  MEI, but allows multiple MEIs per node; we think the MEI maps to the Agent 
  administration endpoint ID in the bundle protocol specification, which
  seems to suggest that there is exactly one agent administration endpoint ID.

* Does the endpoint ID apply to the node or the interface?  It appears
  to be the former, but this is worth clarifying.

* Name Mobility: Is the communications endpoint ID allowed to move from 
  node to node?  If so, what state related to the registration (e.g. 
  bundles waiting to be delivered, the "delivery failure" action) MUST/
  SHOULD/MAY be migrated? 

  In other words, can nodes transfer custody of a "registration"?  (As an 
  analogy consider arguments for and against TCP session migration.)

  Treating the registration as an "active mailbox" object may be a useful
  metaphor here.

  There may be a larger issue here related to the relinquishment of name
  bindings by individual nodes.

* Can the CancelBundle.request (sometimes aliased to Cancel.request) only
  apply if the bundle is locally stored?  For example, upon a cancel request
  can a node launch a "k-hop trace and delete" on a pattern of bundles that 
  were sent earlier?  This would be useful for purging bundles/replicas with
  large expiration times (that have either been delivered or canceled).

* The hash carried by bundles in the security header can have other uses.
  For example, this hash can be used to populate, say, a Bloom filter on en 
  route nodes for traceback.
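
A minimal sketch of the idea follows: an en route node records the hash of
each bundle it forwards in a Bloom filter, so that a later traceback query
can ask "might this bundle have passed through here?".  The filter
parameters are arbitrary, and the SHA-1 call is only a stand-in for
whatever hash the security header actually carries.

```python
import hashlib


class BloomFilter:
    """Small Bloom filter over byte strings (illustrative parameters)."""

    def __init__(self, size_bits=1024, num_hashes=4):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8 + 1)

    def _positions(self, item: bytes):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item: bytes):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: bytes) -> bool:
        # False means "definitely not seen"; True means "possibly seen".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


# Stand-in for the hash carried in a bundle's security header (assumption).
bundle_hash = hashlib.sha1(b"example bundle contents").digest()

forwarded = BloomFilter()
seen_before = forwarded.might_contain(bundle_hash)  # empty filter
forwarded.add(bundle_hash)
seen_after = forwarded.might_contain(bundle_hash)
```

A Bloom filter can return false positives but never false negatives, so a
traceback built this way can rule nodes out with certainty but can only
suggest which nodes a bundle visited.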

* What happens to any bundles that arrive between registration and start 
  delivery?   Does the default delivery action go into effect automatically
  upon registration?

  It may be useful to point out that there is a distinct poll mode (active 
  reception?) when passive reception is first introduced.

* What happens to bundles that arrive after a deregister.request?  We assume
  they MUST be discarded.

* MUST the SENDERROR.INDICATION always include the application data unit?

* For the administrative part of the endpoint ID, some rationale is required
  regarding why 8-bit, 16-bit, and URI forms in particular are supported.
  Why is a URI not sufficient?  Why not 32-bit or 48-bit?

* A generic user-defined experimental header mechanism (syntax and semantics
  for nodes not implementing that header) is needed in order to enable 
  experimentation.

* A source routing header extension will be useful.  

* Since the dictionary header only allows 16 strings, and since there can
  be only one header of each type, it appears that extension headers will 
  need their own dictionaries.  For example, consider a 17-hop source route.

* It appears that generating a unique ID based purely on the creation
  timestamp and endpointID may cause problems.  A node with concurrent 
  embedded processors may be able to generate more than one bundle per 
  clock tick.  Therefore, an additional sequence counter may be needed.
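
One way to read this suggestion is sketched below: the identifier becomes
the triple (endpoint ID, creation timestamp, sequence number), with the
sequence number reset on each new clock tick.  The one-second tick, the
class name, and the endpoint string are assumptions made for illustration;
the sketch also does not handle a clock that moves backwards.

```python
import threading
import time


class BundleIdGenerator:
    """Issue IDs that stay unique within a single clock tick."""

    def __init__(self, endpoint_id, clock=time.time):
        self.endpoint_id = endpoint_id
        self._clock = clock
        self._lock = threading.Lock()  # concurrent creators share one counter
        self._last_tick = None
        self._sequence = 0

    def next_id(self):
        with self._lock:
            tick = int(self._clock())  # one-second clock tick (assumed)
            if tick == self._last_tick:
                self._sequence += 1    # same tick: bump the counter
            else:
                self._last_tick = tick
                self._sequence = 0     # new tick: restart the counter
            return (self.endpoint_id, tick, self._sequence)


# A frozen clock simulates several bundles created within one tick.
frozen_clock = lambda: 1110992763.0
gen = BundleIdGenerator("example-endpoint", clock=frozen_clock)
ids = [gen.next_id() for _ in range(3)]
```

Without the third field, all three bundles above would carry the same
identifier.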

* It is desirable to use the same style in Section 4 as in Section 2 
  (i.e. explicitly call out primitives and describe their semantics).

* When a bundle is discarded (in Section 4.2) due to authentication 
  failure, can the report to the administrative endpoint contain a
  copy of the inauthentic bundle or not?  Since the bundle is otherwise 
  discarded and not processed further, this copy is potentially useful 
  for instrumenting various security mechanisms.

* There is a potential for denial-of-service attacks that make use
  of the report-to field. This is a problem in a DTN since the sender 
  can legitimately be disconnected and vanish.  Does the report-to
  third party have any say in the matter?  For example, does the
  bundle carry any credentials from the report-to?

* In the bundle fragmentation section, it is assumed that the sending
  agent "knows" when a transmission is terminated before the entire payload 
  has been transmitted.  Is this (signaling of partial transmission) an 
  explicit requirement for a DTN convergence layer?

* Currently, the fragmentation information carried in the headers does not
  provide for redundant encodings of the fragments -- i.e. to indicate
  how the bundle was fragmented and erasure-coded.  Yet one may wish to
  apply these codes precisely when there is fragmentation.

  In the current architecture, erasure or digital fountain type encoding 
  must occur either below the convergence layer (which means the protection
  cannot cross multiple networks), or end to end at the application which
  means the application cannot selectively erasure-code only when there
  is fragmentation.
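
As an illustration of what a redundant encoding of fragments could look
like, the sketch below uses the simplest erasure code, a single XOR parity
fragment, which can repair the loss of any one fragment.  It is a stand-in
for stronger codes such as digital fountain codes, and the function names
are hypothetical.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode_with_parity(payload: bytes, k: int):
    """Split payload into k equal-size fragments plus one parity fragment."""
    frag_len = -(-len(payload) // k)  # ceiling division
    padded = payload.ljust(frag_len * k, b"\0")
    fragments = [padded[i * frag_len:(i + 1) * frag_len] for i in range(k)]
    parity = bytes(frag_len)
    for frag in fragments:
        parity = xor_bytes(parity, frag)
    return fragments, parity


def recover_missing(fragments, parity):
    """Fill in at most one missing fragment (marked None) using parity."""
    missing = [i for i, f in enumerate(fragments) if f is None]
    if len(missing) > 1:
        raise ValueError("single parity can repair only one lost fragment")
    repaired = list(fragments)
    if missing:
        rebuilt = parity
        for frag in fragments:
            if frag is not None:
                rebuilt = xor_bytes(rebuilt, frag)
        repaired[missing[0]] = rebuilt
    return repaired


payload = b"delay tolerant bundle"
fragments, parity = encode_with_parity(payload, k=3)

# Simulate losing the middle fragment in transit, then repair it.
received = [fragments[0], None, fragments[2]]
repaired = recover_missing(received, parity)
restored = b"".join(repaired)[:len(payload)]
```

For a receiver to do this, the headers would have to say which fragment is
the parity, how many data fragments there are, and the original payload
length, none of which the current fragmentation fields express.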

* In Table 5 there is no way to indicate that a bundle was received
  in error, but was or was not reconstructable (using the erasure 
  coding). Such feedback would aid in adjusting the strength of the
  code.

Anticipated Requirements / Areas for Possible Draft Proposals  
=============================================================

* Experimental headers: Our SPINDLE project will use the bundle as the 
  means of message encapsulation; however, we will require the ability 
  to add and experiment with additional headers.  We request that a user
  definable experimental header mechanism be added to the bundle
  protocol specification.  In particular, we anticipate the need for 
  extension headers to:

  - support braid-spray routing
  - carry name resolution hints  
  - carry resource accounting tokens 
  - support management tasks such as bundle traceback, proactive 
    distributed purge of canceled or delivered bundles (as opposed 
    to waiting in storage until bundle expiration).
  - possibly others

* Interpretation of names: We expect we will require a modest change to 
  the DTNRG architecture to allow any region to interpret (but not modify) 
  the "opaque" A part of the {R,A} tuple that constitutes a DTNRG name.  
  Related modifiable information (e.g. resolution hints) may be carried in 
  a late binding header that we may specify.  We anticipate that the unique 
  part of our supername will map to the DTNRG name syntax.

* Region interpretation: We anticipate that the routing and late binding
  frameworks developed within the DARPA DTN program to support tactical 
  mobility may each interpret and use regions differently (e.g. what is 
  a region, who owns/administers a region, mobility of nodes across 
  regions, mobility of regions themselves, and sub-region partitioning).

* We anticipate the DTN program will propose new frameworks for bundle 
  routing (family of routing protocols) and delayed resolution of names.
  We anticipate we will generate specifications of information models for 
  storing and sharing topology, adjacency, reachability, and name resolution 
  across nodes and across algorithms.  

* To facilitate regular DTN-DTNRG interaction, a monthly teleconference
  between the DTN and DTNRG to address DTN-specific issues will be 
  beneficial.

...............................................................................