Re: [dtn] [EXTERNAL] Benjamin Kaduk's Discuss on draft-ietf-dtn-bpbis-22: (with DISCUSS and COMMENT)

Benjamin Kaduk <kaduk@mit.edu> Thu, 27 February 2020 01:19 UTC

Date: Wed, 26 Feb 2020 17:19:00 -0800
From: Benjamin Kaduk <kaduk@mit.edu>
To: "Burleigh, Scott C (US 312B)" <scott.c.burleigh@jpl.nasa.gov>
Cc: The IESG <iesg@ietf.org>, "draft-ietf-dtn-bpbis@ietf.org" <draft-ietf-dtn-bpbis@ietf.org>, Fred Templin <fred.l.templin@boeing.com>, "dtn-chairs@ietf.org" <dtn-chairs@ietf.org>, "dtn@ietf.org" <dtn@ietf.org>
Message-ID: <20200227011900.GA56312@kduck.mit.edu>
References: <158095903452.30594.18160625444164563541.idtracker@ietfa.amsl.com> <803a0379e44449a98c4a3900c2b2a78d@jpl.nasa.gov>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <803a0379e44449a98c4a3900c2b2a78d@jpl.nasa.gov>
User-Agent: Mutt/1.12.1 (2019-06-15)
Archived-At: <https://mailarchive.ietf.org/arch/msg/dtn/iSsKHi0XTRjUy95Nj3KdhY3CXlg>
Subject: Re: [dtn] [EXTERNAL] Benjamin Kaduk's Discuss on draft-ietf-dtn-bpbis-22: (with DISCUSS and COMMENT)

Hi Scott,

My MUA doesn't seem to want to give me syntax highlighting for the quoting
approach your MUA used, so my apologies if I miss something.  That said, I
will skip commenting on points that seem well-resolved already (and to save
another mail, I did see your follow-up about which fragment to reassemble
to but don't expect to reply to it).

On Fri, Feb 07, 2020 at 04:46:17AM +0000, Burleigh, Scott C (US 312B) wrote:
> Thanks for the very close reading, Ben.  Responses in-line below.
> 
> -----Original Message-----
> From: Benjamin Kaduk via Datatracker <noreply@ietf.org> 
> Sent: Wednesday, February 5, 2020 7:17 PM
> To: The IESG <iesg@ietf.org>
> Cc: draft-ietf-dtn-bpbis@ietf.org; Fred Templin <fred.l.templin@boeing.com>; dtn-chairs@ietf.org; fred.l.templin@boeing.com; dtn@ietf.org
> Subject: [EXTERNAL] Benjamin Kaduk's Discuss on draft-ietf-dtn-bpbis-22: (with DISCUSS and COMMENT)
> 
> Benjamin Kaduk has entered the following ballot position for
> draft-ietf-dtn-bpbis-22: Discuss
> 
> ----------------------------------------------------------------------
> DISCUSS:
> ----------------------------------------------------------------------
> 
> I support Roman's Discuss.
> 
> (1) It's not clear to me that we should be defining new (near-)application-layer protocols on the standards track without mandatory security mechanisms.  Even draft-ietf-dtn-bpsec defines a "BPSec threat model" that is largely the same as the RFC 3552 threat model, in which the network is completely untrusted and to provide end-to-end communications we must supply additional security mechanisms, yet BPSec is not required to implement or use.  I could perhaps see room for allowing waypoint nodes that do not act as endpoints to remain security-unaware, but the justification for security-unaware endpoints seems quite lacking.
> 
> 	The consensus position of the WG, I believe, is that BP may in some cases be deployed in closed, highly resource-constrained systems where the overhead of implementing, much less using, the security mechanisms would be considered both prohibitive and needless.  For such environments, making those mechanisms mandatory might result in either non-adoption of the IETF standard or adoption of a non-compliant implementation, both undesirable.
> 	The specification can easily be revised to require the implementation of BPsec, but I think that shouldn't be done until the WG has reached a revised consensus.

I'm happy to see what consensus arises in the WG, and don't want to
preclude "carve-out"s for such specific environments.  I think we can meet
everyone's needs but still have the default behavior/expectations be to use
the designated protocol security mechanisms.

> (2) The state machine for transitions between singleton EID and non-singleton EID seems highly unclear to be usable in a globally synchronized manner (I refer specifically to the text in Section 4.1.5.2: "A node's membership in a given singleton endpoint MUST be sustained at least until the nominal operation of the Bundle Protocol no longer depends on the identification of that node using that endpoint's ID").  Distinction between singleton-EID and non-singleton EID may need to be made an explicit protocol element.
> 
> 	An endpoint never makes a transition from being singleton to being non-singleton; a singleton endpoint is an endpoint that always contains exactly one member (3.1).  The intent of the design is that the singleton-ness of an endpoint can be discerned from the endpoint's ID in a manner that is scheme-dependent; for example, all endpoints identified by ipn-scheme IDs are singleton endpoints (section 10.8 will now include this statement).

Okay, so the transition in question is from singleton EID to
EID-with-no-membership, hmm.  I think we probably want to mention this
intent explicitly (that any given EID will be determined to be singleton or
not-singleton in a scheme-dependent manner).

> (3) The forwarding procedure in Section 5.4 refers to a "data label extension block (to be defined in a future document)" with no reference; it doesn't really seem like this sort of speculative forward-looking statement is appropriate in a Proposed Standard.
> 
> 	Fine; note removed.
> 
> (4) We discuss using a Previous Node block to "return a bundle to sender"
> when forwarding failed, but do not discuss whether Previous Node should be added (or updated or removed) on transmission, receipt, or both.
> 
> 	Good point.  Explicit language is being added to 4.3.1 and 5.4.
> 
> (5) The extensibility story seems incompletely described: what should an implementation do upon receiving a bundle with an unrecognized control flag bit set, or a block with an unrecognized control flag set?
> 
> 	Step 3 of 5.6 defines the action to be taken when any block of the bundle (including the primary block, which contains the bundle processing flags) is malformed according to this specification.

Okay, so the intent is that flags are not a native extensibility mechanism
(and thus that flags not defined in this document cannot be used absent
some external mechanism to indicate that the flag in question is supported,
presumably as part of some network-wide configuration).  I think many
people default to expecting flags to be an extension point, so I'd suggest
making some statement about the set of valid flags being fixed by this
specification, needing a new protocol version to define new flags, needing
to ensure support by external means before using new flags, or similar.
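
To make that reading concrete, here is a minimal sketch of what "the set
of valid flags is fixed" could look like on the receive side.  It assumes
the interpretation above (an unassigned bit makes the block malformed) and
takes the bit range from the "Bits 21-63 are unassigned" text quoted later
in this message; the helper name and the 64-bit bound are my own
illustrative choices, not text from the draft.

```python
# Hypothetical helper: one possible reading, not the draft's procedure.
# Bits 21-63 are taken from the "Bits 21-63 are unassigned" text.
UNASSIGNED_MASK = ((1 << 64) - 1) & ~((1 << 21) - 1)


def has_unassigned_flags(flags: int) -> bool:
    """Return True if any of bits 21-63 is set in the flags value."""
    if flags < 0 or flags >= 1 << 64:
        raise ValueError("flags must fit in 64 bits")
    return (flags & UNASSIGNED_MASK) != 0
```

A receiver following this reading would treat a True result as "malformed"
and apply step 3 of 5.6, rather than silently ignoring the unknown bit.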

> (6) The use of absolute times for creation timestamps suggests a strong dependence on accurate time (for nodes that do not acknowledge their lack of an accurate clock); the consequences of the failure of accurate time should be discussed in the security considerations section.
> 
> 	The absolute times (and sequence numbers) in bundle creation timestamps are used to form bundle IDs, so the accuracy of those times is moot for nearly all purposes; the exception is determination of the time at which a bundle's lifetime expires, normally the sum of the bundle's creation time and its time-to-live.  When the time reference at a bundle protocol agent is not accurate, the Bundle Age block compensates for the lack of an accurate bundle creation time as described in 4.3.2.

I'm not sure that the accuracy of a given node's time can be so blithely
asserted to be a clear local matter orthogonal to other concerns (but
perhaps it can): specifically, in the Internet scenarios I'm used to
working in, nodes tend to have fairly cheap local oscillators that do not
necessarily do well over time in isolation (especially with varying local
physical conditions such as temperature), and need to rely on network
protocols like NTP or PTP to remain synchronized with actual time.
Well-known vulnerabilities in those protocols, in turn, can be leveraged
into vulnerabilities in the cryptosystems built upon an expectation of
accurate time.  Now, I freely admit that DTN scenarios are unlike the
Internet scenarios I know, and if all nodes are going to have atomic clocks
keeping stable absolute time, then there are no concerns here.  But I haven't
yet seen anything to make me confident that that's the case, so I still
have to ask.

> (7) Section 4.1.6 should make a statement regarding whether leap seconds are included or excluded from the count of seconds since the DTN epoch.
> 
> 	The count of seconds since the DTN epoch is a count of seconds, elapsed intervals of time.  Leap seconds are not seconds; a leap second never elapses, it is merely asserted.  So by definition the count of seconds since the DTN epoch does not include leap seconds.

I'm not sure I understand the assertion that a "leap second never elapses"
in the sense that humans still observe a second of time passing while
clocks read 23:59:60.  Perhaps it would be more clear to talk about UTC vs.
UT1?  I'm still not sure which of the two you expect counts of seconds
since the DTN epoch to be measuring.
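
To illustrate why the distinction matters, here is a rough sketch of the
two candidate counts.  It assumes the DTN epoch is 2000-01-01 00:00:00 UTC
and uses the leap seconds inserted between that epoch and the date of this
message; the function names are hypothetical and the table would need
updating if further leap seconds are announced.

```python
from datetime import datetime, timezone

# Assumed DTN epoch: start of the year 2000 on the UTC scale.
DTN_EPOCH = datetime(2000, 1, 1, tzinfo=timezone.utc)

# UTC days that ended with an inserted leap second (23:59:60),
# limited to those after the assumed epoch and known as of early 2020.
LEAP_SECOND_DAYS = [
    datetime(2005, 12, 31, tzinfo=timezone.utc),
    datetime(2008, 12, 31, tzinfo=timezone.utc),
    datetime(2012, 6, 30, tzinfo=timezone.utc),
    datetime(2015, 6, 30, tzinfo=timezone.utc),
    datetime(2016, 12, 31, tzinfo=timezone.utc),
]


def seconds_excluding_leaps(t: datetime) -> int:
    """POSIX-style count: every UTC day is treated as 86400 seconds."""
    return int((t - DTN_EPOCH).total_seconds())


def seconds_including_leaps(t: datetime) -> int:
    """Elapsed-time count: adds one second per leap second already passed."""
    passed = sum(
        1 for day in LEAP_SECOND_DAYS
        if t > day.replace(hour=23, minute=59, second=59)
    )
    return seconds_excluding_leaps(t) + passed
```

For any time after the end of 2016 the two counts differ by five seconds,
so two nodes that disagree on the convention would disagree on bundle
creation times and expiry by that amount.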

> (8) The definition of Fragment offset needs to specify whether the lowest allowed byte index is zero or 1 (I believe zero, from other discussion).
> 
> 	Fragment offset is not a byte index, it is an offset.  The offset of X from Y is Y - X.  Therefore the minimum value of fragment offset can only be zero.
> 
> (9) Bundle status reports are only defined to include the creation timestamp of the bundle whose status is being reported on, but not the sequence number thereof.  Since we allow nodes without accurate clocks to use a creation timestamp of zero and rely solely on the sequence number to identify bundles, it seems that the status reports for such bundles are effectively useless without the sequence number information.
> 
> 	Creation timestamp is defined in 4.2.2 as including both the bundle's creation time and also the bundle's creation timestamp sequence number.

Oops, sorry to have missed that.

> (10) Please resolve the internal inconsistency in Section 10.6 that simultaneously claims that potential bundle protocol URI scheme types are integers of undefined length and only have 255 available codepoints (i.e., definite length).
> 
> 	There is no inconsistency.  The URI scheme code number field in each BP endpoint ID is a CBOR unsigned integer (see 4.1.5.1); the length of that CBOR unsigned integer is not defined in this specification.  However, the Bundle Protocol URI scheme types registry contains only 256 codepoints.  This means that, in practice, no URI scheme code number appearing in a transmitted bundle need be longer than 2 bytes (usually only 1), although a mischievous CBOR implementation might represent that number in 3 or 5 or 9 bytes with a lot of leading zeroes.

Why is there a restriction to only 256 codepoints, though?  If there is no
fixed-length encoding then this is a purely artificial limitation with no
explanation.
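
For reference, the encoded sizes under discussion can be reproduced with
a small hand-rolled encoder for CBOR unsigned integers (major type 0).
This is only a sketch: the force_width parameter is a hypothetical knob I
added to show the non-shortest forms, and nothing here comes from the
draft itself.

```python
import struct

# Argument width in bytes -> (CBOR additional information, struct format).
_FORMATS = {1: (24, ">B"), 2: (25, ">H"), 4: (26, ">I"), 8: (27, ">Q")}


def encode_uint(value: int, force_width: int = 0) -> bytes:
    """Encode a CBOR unsigned integer.

    With force_width == 0 the shortest form is used.  A force_width of
    1, 2, 4, or 8 forces that many argument bytes (non-shortest form).
    """
    if value < 0 or value >= 1 << 64:
        raise ValueError("value out of range for a CBOR unsigned integer")
    if force_width == 0:
        if value < 24:
            return bytes([value])  # value fits in the initial byte
        for width in (1, 2, 4, 8):
            if value < 1 << (8 * width):
                force_width = width
                break
    additional_info, fmt = _FORMATS[force_width]
    # Major type 0 occupies the top three bits, so the initial byte
    # is just the additional information value.
    return bytes([additional_info]) + struct.pack(fmt, value)
```

With this, code points 0 through 23 take one byte and 24 through 255 take
two, while encode_uint(2, 2), encode_uint(2, 4), and encode_uint(2, 8)
give the 3-, 5-, and 9-byte forms with leading zeroes.  Nothing in the
encoding itself stops at 255.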

> ----------------------------------------------------------------------
> COMMENT:
> ----------------------------------------------------------------------
> 
> It's pretty unfortunate that we have to have separate (built-in) CRC and (dedicated block type) BIB support; a more unified scheme that always provides cryptographic integrity protection would have simpler encoding rules.  I rather wonder why it's not possible to roll the CRC as a mechanism for detecting media-induced errors into the CLA functionality as opposed to needing to be in the top-level BPA.  Then media not susceptible to errors could use no or a small CRC and more malleable media could use stronger CRCs, leaving the decision closer to the knowledge of the transport properties.
> 
> 	This has been a point of contention for many, many years.  The BPv6 specification (RFC 5050) was exactly aligned with your suggestion, but at long last we were convinced to add CRCs to the design.  Many of us would prefer not to reopen that can of snakes.

Okay.  That's the WG's prerogative and I have no need to reopen the can of
snakes.

> Is a registration something that conceptually lives "in the network", or a purely local matter to the node in question?  What about active vs. passive state thereof?
> 
> 	Registrations are purely local.  There is no mechanism for advertising them to other nodes, and (at least currently) no need.

Adding "local" in its definition might help future readers.

> The creation timestamp, sequence number, and lifetime can serve to some extent to detect replay of old data, though given the environments we expect this to operate in, it may be hard to differentiate a replay from normal delivery.  I don't see any mechanisms that would allow for detection of dropped bundles (whether due to attack or other error), though it's not entirely clear that such a mechanism is possible in these environments.
> 
> 	A separate application service protocol running above BP, named "Delay-Tolerant Payload Conditioning", has been defined for these purposes, and has been implemented.  Whether or not running DTPC in real operations actually makes sense is a question whose answer will clearly be environment-specific.

Oh, cool!  And agreed about "clearly environment-specific".  I'll leave it
to you to decide whether an informative reference is merited.

> It may also be worth considering whether there is use for the ability to detect a truncated application-layer "stream" that's a group of bundles related to each other in some way.  I do see that the protocol is largely focused on bundles as discrete units, and so expect the answer to be "no", but this is one of the more standard security mechanisms, so I wanted to explicitly check.
> 
> 	Yes, DTPC can be used for this purpose.  In less forgiving environments the "data label" mechanism you remark on above would be used to identify bundles that are related to one another.  Mechanisms for streaming data over a delay-tolerant network have also been defined and implemented, though not yet standardized.
> 
> Similarly, one might imagine a block type that indicates the initial block types present in a bundle, which could be (required to be) subject to integrity protection so that removal of blocks could be detected at the recipient.
> 
> 	Yes, the idea of a "manifest block" has also been discussed for many years, though not yet defined or implemented.
> 
> Some further discussion of these points, in terms of the goals for these environments and what behaviors are actually attained, might be appropriate in the security considerations section.

I'm happy to hear that I'm not proposing new ideas.
It sounds like you don't think it's worth mentioning (any of) them in the
security considerations, which is a defensible position, even if it's not
the one I would take.

> Section 3.1
> 
>    Application Data Unit (ADU) - An application data unit is the unit
>    of data whose conveyance to the bundle's destination is the purpose
>    for the transmission of some bundle that is not a fragment (as
>    defined below).
> 
> I understand the desire for precision, but this definition feels very convoluted and could perhaps be made more accessible to an application author.
> 
> 	I think the precision is important.
> 
>    Bundle node - A bundle node (or, in the context of this document,
>    simply a "node") is any entity that can send and/or receive bundles.
>    Each bundle node has three conceptual components, defined below, as
>    shown in Figure 2: a "bundle protocol agent", a set of zero or more
>    "convergence layer adapters", and an "application agent".
> 
> If a node always has an application agent, won't anything done by that application agent make the node also part of an endpoint?  Yet we discuss waypoint nodes that are not endpoints...
> 
> 	No, all nodes are identified by node IDs, which are singleton endpoint IDs, which means that every node -- whether waypoint or end system -- is a member of at least one endpoint.  "Endpoint" in BP is not the same as "end system."

Perhaps expand the definition to include "Note that any bundle node may be
a member of multiple endpoints, and must be a member of at least one [since
nodes are identified by endpoint ID]" to clarify for naive readers like me?

>    Delivery - A bundle is considered to have been delivered at a node
>    subject to a registration as soon as the application data unit that
>    is the payload of the bundle, together with any relevant metadata
>    (an implementation matter), has been presented to the node's
>    application agent in a manner consistent with the state of that
>    registration.
> 
> Is this metadata considered attached to the ADU or the bundle or something else?
> 
> 	That metadata is purely an implementation artifact, a feature of the API.  Some of it might be information from the bundle's primary block -- or other blocks -- but some of it might be local information.

Okay.  I can't come up with a rewording that would help clarify.

>    Delivery failure action - The delivery failure action of a
>    registration is the action that is to be taken when a bundle that is
>    "deliverable" subject to that registration is received at a time
>    when the registration is in the Passive state.
> 
> Just to check my understanding: with registrations being per-(node,endpoint), in a case where many nodes are in an endpoint, we'll undertake a delivery failure action for each node with a passive registration even if the bundle is successfully delivered to a different node with an active registration?
> 
> 	Yes; this would be multicast, for which there is at least one prototype specification and implementation but no standard yet.  Notions of DTN "anycast" have been floating around for years but nobody has defined exactly what that would look like, AFAIK.
> 
> How does this interact with the "report-to" EID?
> 
> 	If the report-to EID happened to be the ID of an endpoint with multiple members, then the status report would be delivered at multiple nodes.

I think I was asking about whether all such delivery failure actions would
target the same report-to EID, but now that I look back at it I can't see
what else they would be doing ... thanks for clarifying.

> Section 3.2
> 
>    Every CLA implements its own thin layer of protocol, interposed
>    between BP and the (usually "top") protocol(s) of the underlying
>    native protocol stack; this "CL protocol" may only serve to
>    multiplex and de-multiplex bundles to and from the underlying native
>    protocol, or it may offer additional CL-specific functionality. The
>    manner in which a CLA sends and receives bundles, as well as the
>    definitions of CLAs and CL protocols, are beyond the scope of this
>    specification.
> 
> We still need to specify what interfaces a CLA needs to present to the BPA, though!  Section 7 is a start at this (and probably worth cross-referencing from here), though it feels pretty sparse.
> 
> 	I don't think we need to specify those interfaces in a protocol specification.  They will be different for different BPA implementations.

I think there's still going to be some minimal set of things the CLA needs
to be able to do in order for the system as a whole to function, and
Section 7 is smaller than that minimal set.

>    In the case of a node that serves simply as a BP "router", the AA
>    may have no application-specific element at all. The application-
> 
> (But earlier we said that a node has all three elements, including an application agent.  What would the Administrative Element do on such a "router" node?)
> 
> 	It would respond to (and/or issue) administrative records: status reports and potentially other administrative records yet to be defined.
> 
>    The destination of every bundle is an endpoint, which may or may not
>    be singleton.  The source of every bundle is a node, identified by
>    the endpoint ID for some singleton endpoint that contains that node.
> 
> It might be worth a forward-reference to Section 4.1.5.2 where we record the requirement for a valid endpoint ID of a given node.
> 
>    The bundle protocol is designed for extensibility.  Bundle protocol
>    extensions, documented elsewhere, may extend this specification by:
>       [...]
>       . defining additional mandates and constraints on processing
>          that conformant bundle protocol agents must perform at
>          specified points in the inbound and outbound bundle processing
>          cycles.
> 
> It's not clear to me how "additional mandates and constraints" can successfully be imposed by protocol extensions given that the source node cannot in general know the path(s) the bundle will take.
> 
> 	The source node can always state what it wants to happen; whether it will actually happen or not is unknowable.  Like so much in life.

Given the previous discussion about how to use new flags, it may have a
pretty good sense for what will happen, even though as you say it is
unknowable.

> Section 4.1.2
> 
>    When not omitted, the CRC SHALL be represented as a CBOR byte string
>    of two bytes (that is, CBOR additional information 2, if CRC type is
>    1) or of four bytes (that is, CBOR additional information 4, if CRC
>    type is 2); in each case the sequence of bytes SHALL constitute an
>    unsigned integer value (of 16 or 32 bits, respectively) in network
>    byte order.
> 
> This seems to preclude the possibility of any future CRC types being defined (without an update to this specification).
> 
> 	That's right.  We didn't want to add another open-ended configuration dimension unnecessarily.

Okay.

> Section 4.1.3
> 
>    If the bundle's source node is omitted (i.e., the source node ID is
>    the ID of the null endpoint, which has no members as discussed
>    below; this option enables anonymous bundle transmission), then the
>    bundle is not uniquely identifiable and all bundle protocol features
>    that rely on bundle identity must therefore be disabled: the "Bundle
>    must not be fragmented" flag value must be 1 and all status report
>    request flag values must be zero.
> 
> It's also impossible for the user application to acknowledge such an anonymous bundle, right, so that flag would also be zero?  Or do such reports go to the "Report-to EID"?
> 
> 	But that acknowledgment is by the user application of the destination, to the user application of the source; the user application isn't being asked to send an acknowledgment to the BPA of the source.  That acknowledgment will be possible if and only if the ID of the application-layer endpoint is included in the application data unit somewhere.
> 	In practice this feature is not used much, but DTPC does exercise it.
> 
>      . Bits 21-63 are unassigned.
> 
> Does this imply a mandatory encoding (width) of the CBOR unsigned integer representing the flags (and a hard limit on the number of possible flags)?
> 
> 	There is a hard limit on the number of possible flags (see 10.3), but the representation of this value on the wire is simply a CBOR unsigned integer; no mandatory length.

Okay.

> Section 4.1.4
> 
>      . This block must be replicated in every fragment.  (Boolean)
> 
>      . Transmission of a status report is requested if this block
>         can't be processed.  (Boolean)
> 
>      . Block must be removed from the bundle if it can't be processed.
>         (Boolean)
> 
>      . Bundle must be deleted if this block can't be processed.
>         (Boolean)
> 
> nit: perhaps we could reorder this list to match the order in which the bit flags are allocated?
> 
> 	Sure, good idea.
> 
> Section 4.1.5.1
> 
>    Each BP endpoint ID (EID) SHALL be represented as a CBOR array
>    comprising a 2-tuple.
> 
> nit: Is there a reason to not just say "a CBOR array of two elements"?
> 
> 	How about "comprising two items"?

Sure.

>    The first item of the array SHALL be the code number identifying the
>    endpoint's URI scheme [URI], as defined in the registry of URI
>    scheme code numbers for Bundle Protocol maintained by IANA as
>    described in Section 10. [URIREG].  Each URI scheme code number
> 
> I'm not sure why [URIREG] is the reference here; it has nothing to do with the "code numbers for Bundle Protocol".
> 
> 	Right, this will be fixed in version 23.
> 
>      . If the EID's URI scheme is "ipn" then the SSP SHALL be
>         represented as a CBOR array comprising a 2-tuple.  The first
> 
> [same comment about "two elements"]
> 
> 	Same fix.
> 
>         item of this array SHALL be the EID's node number represented
>         as a CBOR unsigned integer.  The second item of this array
>         SHALL be the EID's service number represented as a CBOR
>         unsigned integer.
> 
> Where are node and service numbers defined?
> 
> 	Adding a little text to 4.1.5.1.
> 
>      . Definitions of the CBOR representations of the SSPs of EIDs
>         encoded in other URI schemes are included in the specifications
>         defining those schemes.
> 
> This feels kind of weird; the vast majority of URI schemes are not going to define CBOR encoding rules for their SSPs, but above we claim that bundle protocol is usable with any registered URI scheme.  Absent some mechanism for a separate document from the scheme definition to specify the CBOR encoding rules for the SSP, these claims are incompatible.
> 
> 	Good catch!  I think those rules should be included in the entries of the BP URI Scheme Types registry.  Modifying that registry definition.

Thanks!

> Section 4.1.5.2
> 
>       . The EID of any singleton endpoint of which a node is a member
>         MAY be used to identify that node. A "node ID" is an EID that
>         is used in this way.
>       . A node's membership in a given singleton endpoint MUST be
>         sustained at least until the nominal operation of the Bundle
>         Protocol no longer depends on the identification of that node
>         using that endpoint's ID.
> 
> Per the Discuss point, this feels like it's unworkable in practice; if we had a way to reliably distribute knowledge of which EIDs are usable as node IDs and/or are in use as node IDs, we could also use that mechanism to distribute lots of other useful things, like key material, revocation information, etc.  Since we claim to not be able to do those things, it's a little boggling to see a claim that we can do this for node IDs.  It kind of seems like we may have to make a fundamental distinction between "singleton EIDs" and "EIDs that may have multiple member nodes" in order to get these properties to be workable globally.
> 
> 	Right, see the response to the Discuss point.
> 
> Section 4.1.7
> 
>    The second item of the array SHALL be the creation timestamp's
>    sequence number, represented as a CBOR unsigned integer.
> 
> We haven't introduced the term "sequence number" yet.
> 
> 	We just did.

I mean, we haven't given it semantics yet (and don't until § 4.2.2) so this
is effectively telling the reader "there's a sequence-number-shaped hole
here that will be filled somehow".

> Section 4.1.8
> 
>    Block-type-specific data in each block (other than the primary
>    block) SHALL be the applicable CBOR representation of the content of
> 
> nit: as a stylistic matter, this qualification seems too important to relegate to a parenthetical.
> 
> 	I disagree.
> 
> Section 4.2.1
> 
> We probably want to check for consistency between what's here and what's covered in the section-4 intro -- both use normative keywords, and Section 4 talks about the penultimate array item being the payload block, which we don't mention here.
> 
> 	We're consistent.  The payload block is a canonical block.
> 
> Section 4.2.2
> 
>    The primary block of each bundle SHALL be immutable.  The values of
>    all fields in the primary block must remain unchanged from the time
>    the block is created to the time it is delivered.
> 
> Is there a technical mechanism that can enforce this?
> 
> 	The CRC and/or BIB on the primary block.
> 
>    Lifetime: The lifetime field is an unsigned integer that indicates
>    the time at which the bundle's payload will no longer be useful,
>    encoded as a number of microseconds past the creation time. (For
>    high-rate deployments with very brief disruptions, fine-grained
>    expression of bundle lifetime may be useful.)  When a bundle's age
> 
> Does this mean that we assume that the creation time has microsecond accuracy as a reference even though it is only encoded with seconds-level precision?
> 
> 	No, only the lifetime has this sort of accuracy.  The operational utility of microsecond accuracy in the bundle lifetime has yet to be fully explored.

I think I'm still confused, then -- what good is a microsecond-precision
lifetime when my reference point for the start of the lifetime is so fuzzy?

>    the CRC type.  The CRC SHALL be computed over the concatenation of
>    all bytes (including CBOR "break" characters) of the primary block
>    including the CRC field itself, which for this purpose SHALL be
>    temporarily populated with the value zero.
> 
> nit: the CRC is a CBOR byte string; "the value zero" assumes an implied encoding of that byte string.  Perhaps "all bytes zero" is better.
> 
> 	Yes, that's better.
> 
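
As an illustration of the "all bytes zero" wording, here is a sketch of
the compute-with-zeroed-field procedure.  It assumes CRC type 1 is the
X-25 CRC-16 and that the caller already knows the byte offset of the
2-byte CRC content inside the encoded block; the function names and the
offset parameter are hypothetical.

```python
def crc16_x25(data: bytes) -> int:
    """Bitwise CRC-16/X-25: reflected poly 0x8408, init and xorout 0xFFFF."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ 0x8408 if crc & 1 else crc >> 1
    return crc ^ 0xFFFF


def fill_crc(block: bytearray, crc_offset: int) -> None:
    """Zero the CRC bytes, compute over the whole block, store big-endian."""
    block[crc_offset:crc_offset + 2] = b"\x00\x00"
    crc = crc16_x25(bytes(block))
    block[crc_offset:crc_offset + 2] = crc.to_bytes(2, "big")


def check_crc(block: bytes, crc_offset: int) -> bool:
    """Recompute the CRC on a scratch copy and compare with the received one."""
    received = block[crc_offset:crc_offset + 2]
    scratch = bytearray(block)
    fill_crc(scratch, crc_offset)
    return bytes(scratch[crc_offset:crc_offset + 2]) == received
```

Only the content bytes of the CBOR byte string are zeroed here; the
byte-string header stays in place, which is the distinction the
"all bytes zero" phrasing is meant to capture.
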
> Section 4.2.3
> 
>    Every block other than the primary block (all such blocks are termed
>    "canonical" blocks) SHALL be represented as a CBOR array; the number
>    of elements in the array SHALL be 5 (if CRC type is zero) or 6
>    (otherwise).
> 
> Is this an invariant that future-defined block types must also adhere to?
> 
> 	Yes.  Future block types have to comply here; their content (the block-type-specific data) may be anything at all, but the overall structure of the canonical block is fixed.

Okay.  I'd suggest calling this out with text like "This block structure
is a protocol invariant; new block types defined in the future will only
change the contents of the block-type-specific data" or similar.

>         computed over the concatenation of all bytes of the block
>         (including CBOR "break" characters) including the CRC field
>         itself, which for this purpose SHALL be temporarily populated
>         with the value zero.
> 
> [same comment as above re. "all bytes zero"]
> 
> 	Yes.
> 
> Section 4.3.2
> 
>    The Bundle Age block, block type 7, contains the number of
>    microseconds that have elapsed between the time the bundle was
>    created and time at which it was most recently forwarded.  It is
> 
> (Are we again assuming the creation time to have microsecond accuracy even though the precision of representation is in seconds?)
> 
> 	No, the creation time plays no role in procedures involving the Bundle Age block.

My understanding is that this bundle age is a difference of absolute times
(time forwarded - time created) and that time created is only known
precisely by the node doing the creation.  Any other node that needs to
update the bundle age will have a precise value for "time forwarded" but
only seconds accuracy for "time created", so normal propagation of
uncertainty techniques would require that this field can only convey
information useful to seconds precision.  (If the bundle age was instead an
accumulator for quantities known locally at each step, this could be
different, but that's not my understanding of what it's supposed to be.)
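
A rough sketch of the distinction, with hypothetical function names and
the assumption that the creation timestamp is truncated (not rounded) to
whole seconds: the difference-based age carries about a second of
uncertainty, while an accumulator of locally measured intervals does not
depend on the creation time at all.

```python
from typing import Tuple

US_PER_S = 1_000_000

def age_from_difference(forward_time_us: int, creation_time_s: int) -> Tuple[int, int]:
    """Age as (time forwarded - time created), in microseconds.

    The creation time is only known to whole seconds, so the result is a
    range (lowest possible age, highest possible age).
    """
    earliest_creation_us = creation_time_s * US_PER_S
    latest_creation_us = earliest_creation_us + US_PER_S - 1
    return (forward_time_us - latest_creation_us,
            forward_time_us - earliest_creation_us)

def age_from_accumulation(previous_age_us: int, received_at_us: int,
                          forwarded_at_us: int) -> int:
    """Age as a running sum of residence intervals measured on local clocks."""
    return previous_age_us + (forwarded_at_us - received_at_us)
```

With a forwarding time of 12,500,000 us and a creation time of 10 s, the
first function yields a range nearly one second wide; the second adds
exactly the locally measured residence time to the previous value.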

>    The block-type-specific data of this block is an unsigned integer
>    containing the age of the bundle in microseconds, which SHALL be
>    represented as a CBOR unsigned integer item. (The age of the bundle
>    is the sum of all known intervals of the bundle's residence at
>    forwarding nodes, up to the time at which the bundle was most
>    recently forwarded, plus the summation of signal propagation time
>    over all episodes of transmission between forwarding nodes.
>    Determination of these values is an implementation matter.) If the
> 
> I get that determination of these times will depend on the CLA(s) in use, but it sounds like we are making a hard requirement that an accurate value with microseconds precision is available?  That seems like a pretty stringent requirement to place on the underlying transport technologies.
> 
> 	We're not requiring any particular degree of accuracy; all we're requiring is representation in microseconds.

Perhaps this is overly pedantic, but "the age of the bundle in
microseconds" is a precisely defined value that has a single well-defined
value barring relativistic effects.  This value is in general different
from "the age of the bundle, rounded to the nearest millisecond,
represented to microsecond precision" which is what it sounds like you're
saying this field can contain.  Rewording to something like "the age of the
bundle, encoded in units of microseconds" does not have such a tight
binding in the definition.

>    bundle's creation time is zero, then the bundle MUST contain exactly
>    one (1) occurrence of this type of block; otherwise, the bundle MAY
>    contain at most one (1) occurrence of this type of block.  A bundle
>    MUST NOT contain multiple occurrences of the bundle age block, as
>    this could result in processing anomalies.
> 
> I'm a bit confused at the formal state of this extension block type.  Up in Section 4.3 we said that not all nodes will implement processing or production of extension blocks, but this text is in the core spec and says that this extension block MUST be present under some conditions.  Is this implicitly predicated on the implementation in question supporting Bundle Age, or is it a hard requirement?
> 
> 	It's a hard requirement.  If the creation time is zero, the Bundle Age block must be present; in this case, the implementation no longer has the option to omit procedures for processing the Bundle Age block.  Other implementations, that never set creation time to zero, do have that option.

I think I'm confused now: does an implementation need to be prepared to
process such blocks even if it will never create them, since some other
compliant node might be creating such bundles that would be routed through
it?

> Section 4.3.3
> 
> Where do we discuss the consequences of nodes failing to implement the Hop Count extension block (as is apparently allowed by Section 4.3)?
> 
> 	We don't discuss those consequences.  We think the implementer will probably be able to figure them out, i.e., the node lacks the safety mechanism discussed in the second paragraph of 4.3.3.

I think it's pretty typical to include such a discussion in the security
considerations, but won't insist upon it here.

>    unsigned integer. A bundle MAY contain at most one (1) occurrence of
>    this type of block.
> 
> nit: I think this is "MAY contain this type of block" but "MUST contain at most 1 occurrence".
> 
> 	Okay, let's say MAY contain one occurrence of this block but MUST NOT contain more than one.
> 
> Section 5.1
> 
>    Note that requesting status reports for any single bundle might
>    easily result in the generation of (1 + (2 *(N-1))) status report
> 
> [(1 + (2 *(N-1))) might be more concisely expressed as ((2*N) -1).]
> 
> 
>      . A "bundle reception status report" is a bundle status report
>         with the "reporting node received bundle" flag set to 1.
>      . A "bundle forwarding status report" is a bundle status report
>         with the "reporting node forwarded the bundle" flag set to 1.
>      . A "bundle delivery status report" is a bundle status report
>         with the "reporting node delivered the bundle" flag set to 1.
>      . A "bundle deletion status report" is a bundle status report
>         with the "reporting node deleted the bundle" flag set to 1.
> 
> These strings (the flag names) appear twice in the document: here and in Section 6.1.1; neither location explicitly says that it *defines* the flag values (though the latter does seem to actually do so, in the prose).
> 
> Section 5.3
> 
>    Step 1: If the bundle's destination endpoint is an endpoint of which
>    the node is a member, the bundle delivery procedure defined in
>    Section 5.7 MUST be followed and for the purposes of all subsequent
>    processing of this bundle at this node the node's membership in the
>    bundle's destination endpoint SHALL be disavowed; specifically, even
>    though the node is a member of the bundle's destination endpoint,
>    the node SHALL NOT undertake to forward the bundle to itself in the
>    course of performing the procedure described in Section 5.4.
> 
> The discussion so far in the document has not prepared me for the notion that a bundle would continue to be forwarded even after it has been delivered.  Please put a description somewhere of why this might occur.
> 
> 	This should not be a surprise.  An endpoint can contain multiple nodes.  A node that is forwarding a bundle might be just one of the members of the bundle's destination endpoint.

An endpoint can contain multiple nodes, sure, but if it's "one endpoint"
then I'd hope you'd forgive a reader for thinking that it's also "one
logical entity" whose nodes are capable of communicating with each other
internally, so that delivery to "the endpoint" terminates the protocol
processing and hands off to the internal synchronization protocol, as would
happen for (e.g.) TLS servers implemented with multiple functionally
equivalent endpoints for scaling purposes.  That's kind of implied by the
"point" part -- we don't call the identifier "a name for a potentially
loosely related group of entities interested in receiving the same
content".

> (Also, nit(?): everything in this paragraph is contingent on the node being a member of the destination endpoint, right?)
> 
> 	Yes.  I think the structure of the sentence makes that clear.
> 
> Section 5.4
> 
>    Step 2: The bundle protocol agent MUST determine whether or not
>    forwarding is contraindicated (that is, rendered inadvisable) for
>    any of the reasons listed in Figure 4. In particular:
> 
> I'm a bit confused by this wording, since Figure 4 is in essence a table of status report reason codes, i.e., values that appear on the wire in status report messages.  The contents of that table are not, logically, actual *reasons*, but rather the protocol constants.  It seems like it would be better to refer to some (possibly prose) description of the actual situations that would induce a "failure to forward" condition (and cause the generation of a report using one of those codes).
> 
> 	I don't think that would make the specification more accurate or easier to read.
> 
> Furthermore, Figure 4 only contains those codes that are currently defined; future extensions can define additional codes as well.
> (This is not the only place where Figure 4 is referenced as an alleged list of reasons to not forward.)
> 
> 	Good catch.  Changing those references to point to the registry defined in 10.5. 
> 
>    Step 4: For each node selected for forwarding, the bundle protocol
>    agent MUST invoke the services of the selected convergence layer
>    adapter(s) in order to effect the sending of the bundle to that
>    node. [...]
> 
> I appreciate the proper use of effect(v) -- thanks!
> 
>          Determining the time at which the bundle protocol agent
>    invokes convergence layer adapter services is a BPA implementation
>    matter.  Determining the time at which each convergence layer
>    adapter subsequently responds to this service invocation by sending
>    the bundle is a convergence-layer adapter implementation matter.
> 
> I appreciate that the actual procedures involved will depend on the CLAs and, to large extent, BPA implementation, but without giving some requirements as to what properties are needed, this protocol is incomplete.
> 
> 	I disagree.  These determinations are orthogonal to the rules defining the structure of Bundle Protocol data units and the behavior of interoperating nodes; no BP procedure that is performed at node A depends in any way on the manner in which these timing decisions are made at node B.

Whether or not you end up with a usable system depends on these timing
decisions.  It seems to me that this document is the right place to discuss
what needs to happen in order to have a usable BP system, even if how that
will happen is specified in a different specification.  But this is a
non-blocking comment, so agreeing to disagree is probably the right thing
to do.

> I could make a BPA implementation that chooses to never invoke CLA services and that would comply with this text (yet would not be a usable implementation at all).
> 
> 	Sure, there are lots of ways you could write BP software that is not usable.  It's not the job of this protocol specification to advise you against all of those possible errors.
> 
>      . If the bundle contains a data label extension block (to be
>         defined in a future document) then that data label value MAY
>         identify procedures for determining the order in which
> 
> This doesn't feel like it needs to be a normative "MAY" vs. descriptive "may".
> 
> 	Another reviewer has objected to this paragraph, so it is deleted.
> 
>      . If the bundle has a bundle age block, as defined in 4.3.2.
>         above, then at the last possible moment before the CLA
>         initiates conveyance of the bundle via the CL protocol the
>         bundle age value MUST be increased by the difference between
>         the current time and the time at which the bundle was received
>         (or, if the local node is the source of the bundle, created).
> 
> This does not seem to account for the transmission time from the previous node to this one, which was a required component in the definition of the bundle age.
> 
> 	Right, this is not a complete description of how to process the Bundle Age block; that's in 4.3.2.
> 
>    In that event, processing proceeds from Step 4 of Section 5.4.
> 
> [This is Section 5.4, so this is self-referential.]
> 
> 	It's pointing from Step 5 of 5.4 to Step 4 of 5.4.
> 
>    If completion of the data sending procedures by all selected
>    convergence layer adapters HAS resulted in successful forwarding of
> 
> ["HAS" is not a BCP 14 keyword.]
> 
> 	Does that mean we're not allowed to write it in all caps for emphasis?

I don't know of such a hard rule, but it may confuse people.  (Perhaps the
RFC Editor does have such a rule and I don't know about it...)

>    the bundle, or if it has not but the bundle protocol agent does not
>    choose to initiate another attempt to forward the bundle, then:
>    [...]
>         endpoint ID. The reason code on this bundle forwarding status
>         report MUST be "no additional information".
> 
> It's kind of weird to use "no additional information" for both the "success"
> case and the "I just decided not to" case.
> 
> 	And yet it's true: there is no additional information.
> 
> Section 5.4.1
> 
>    Otherwise, when -- at some future time - the forwarding of this
> 
> nit: two hyphens for the second em dash.
> 
> 	Thanks!
> 
> Section 5.6
> 
> Why is "Block unintelligible" used for both CRC failures and "extension not implemented"?
> 
> 	Because in both cases the block is unintelligible.
> 
> Section 5.8
> 
> I suggest to replace "fragmented bundle" with "bundle being fragmented", for clarity.
> 
> 	"Fragmented bundle" is defined clearly.  I think its subsequent use is then clear.

I think that defining a term to mean something different from what common
English usage would suggest is generally needlessly confusing, though I do
not dispute that there is a clear local definition here.

>      . Beyond these rules, replication of extension blocks in the
>         fragments is an implementation matter.
> 
> It seems prudent to give some indication of how the BPsec blocks are managed across fragmentation.
> 
> 	Yes, in the BPsec specification.

I think that my point here is (partly) to note that it's not *strictly* an
implementation matter since the specification for the extension block can
also include additional requirements, as is the case for the BPsec block
types.

> Section 5.9
> 
>    If the concatenation -- as informed by fragment offsets and payload
>    lengths -- of the payloads of all previously received fragments with
> 
> nit: talking about this as "concatenation of all fragment payloads" is a bit risky, since we admit the possibility of overlapping fragments; would it be better to talk about "recovering the full application-data-unit-length byte array by inserting fragment contents at the indicated offsets"?
> 
> 	I see what you're saying.  I think any reasonable implementer will understand what is meant, but I agree that the language is not strictly accurate.  Let's say "the non-overlapping portions of the payloads of all previously received fragments <etc>".

Okay.
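
For what it's worth, the offset-based view of reassembly can be sketched as
below.  This is an illustration only: the function name and the
(offset, payload) tuple representation are hypothetical, and overlapping
fragments are assumed to carry identical bytes in the overlap.

```python
from typing import Iterable, Optional, Tuple

def reassemble(total_length: int,
               fragments: Iterable[Tuple[int, bytes]]) -> Optional[bytes]:
    """Place each fragment payload at its offset in the application data unit.

    Returns the full byte array once every position is covered, or None
    while some bytes are still missing.  Overlaps simply overwrite.
    """
    adu = bytearray(total_length)
    covered = [False] * total_length
    for offset, payload in fragments:
        end = offset + len(payload)
        if offset < 0 or end > total_length:
            raise ValueError("fragment lies outside the application data unit")
        adu[offset:end] = payload
        for i in range(offset, end):
            covered[i] = True
    return bytes(adu) if all(covered) else None
```

Because coverage is tracked per byte, overlapping fragments neither break
the result nor count twice toward completeness.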

>      . This byte array -- the reassembled application data unit --
>         MUST replace the payload of this fragment.
>      . The "Reassembly pending" retention constraint MUST be removed
>         from every other fragment whose payload is a subset of the
>         reassembled application data unit.
> 
> Why is the last-received fragment special that its payload is replaced by the entire payload?  Wouldn't it make more sense to promote the fragment with offset zero, since that is guaranteed to have the right extension blocks?
> 
> 	Because those other fragments might have been discarded; all we need to retain is their payloads.  The extension blocks that need to be "right" should have been replicated in all fragments.
> 
> Section 6.1.1
> 
> There's a lot of nested arrays here; some examples would really help clarify the structure.
> 
>    Each item of the bundle status information array SHALL be a bundle
>    status item represented as a CBOR array; the number of elements in
>    each such array SHALL be either 2 (if the value of the first item of
>    this bundle status item is 1 AND the "Report status time" flag was
> 
> ["AND" is not a BCP 14 keyword]
> 
> It's somewhat surprising to have several of the reason codes in the figure that appear with no accompanying prose discussion of when they might be used.
> 
> With the presence of a "traffic pared" report code, one wonders if it might be worth defining a mechanism to consolidate multiple status reports, so that one might report (e.g.) paring several bundles in a single status report.
> 
> 	Interesting idea, but I don't think the benefit would justify the additional complexity.

Sure; I don't get the impression that the status reports are in heavy usage
at present.

> Section 6.2
> 
>    Step 1: The administrative record must be constructed. If the
>    administrative record references a bundle and the referenced bundle
>    is a fragment, the administrative record MUST contain the fragment
>    offset and fragment length.
> 
> To be clear: this is normative guidance that applies to all administrative record types that may be defined in the future?
> 
> 	Yes, as that's the only way the administrative record can identify the subject bundle.  But "with reference to some bundle" should be removed from the sentence above, as not all administrative records are generated with reference to a particular bundle.
> 
> Section 7.2
> 
>      . sending a bundle to a bundle node that is reachable via the
>         convergence layer protocol;
>      . notifying the bundle protocol agent when it has concluded its
>         data sending procedures with regard to a bundle;
>      . delivering to the bundle protocol agent a bundle that was sent
>         by a bundle node via the convergence layer protocol.
> 
>    The convergence layer service interface specified here is neither
>    exhaustive nor exclusive. That is, supplementary DTN protocol
>    specifications (including, but not restricted to, the Bundle
>    Security Protocol [BPSEC]) may expect convergence layer adapters
>    that serve BP implementations conforming to those protocols to
>    provide additional services such as reporting on the transmission
>    and/or reception progress of individual bundles (at completion
>    and/or incrementally), retransmitting data that were lost in
> 
> How is "reporting on the transmission and/or reception progress of individual bundles" different from the bullet points above?
> 
> 	Those bullet points don't say anything about transmission and/or reception progress, only completion.

Okay.

> Section 8
> 
>        . the Bundle Protocol (BP, RFC 5050),
>        . the Bundle Protocol version 7 specification draft (version 6),
> 
> Is this still accurate?  We're on version 21 of the draft, now, not version 6, and it rather defies belief that there have been no protocol-relevant changes since version -06.  (I understand this will get removed before publication as an RFC, but it would be somewhat telling if the implementation efforts had stalled.)
> 
> 	The microPCN implementation of BPv7 interoperates with the ION implementation, which is based on the current draft, so I am pretty sure their implementation efforts have not stalled.  I don't know exactly which version of the draft they're up to now.

Thanks for confirming!

> Section 9
> 
> Should we discuss the risk of a DoS on node storage posed by the presence of
> "reassembly pending" retention constraints, or do we think that's adequately covered already?
> 
> 	Actually I think this points to a potentially more general problem: DOS attack or not, extremely long bundle lifetimes can threaten the operation of a BP node.  I think this is a deficiency in the specification.
> 	I am going to go out on a limb and insert some language into 4.2.2 authorizing a BP agent to impose a (temporary, local) lifetime override when it detects a problem along these lines, together with some language in section 9 that references this option.  If there are objections to this approach, fine, but I think some sort of mitigation is needed.

I think implementations would end up doing something like this regardless
of whether the spec officially "blessed" it.  So agreed in principle, at
least.
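
As a sketch of what such a local policy might look like (hypothetical
names; the draft text for this override did not yet exist at the time of
writing, so this is purely an assumption about one possible shape):

```python
from typing import Optional

def effective_lifetime_us(bundle_lifetime_us: int,
                          local_cap_us: Optional[int]) -> int:
    """Lifetime this node will honor for a bundle, in microseconds.

    local_cap_us is a temporary, node-local override; None means the node
    is not currently imposing one.  The bundle itself is never modified.
    """
    if local_cap_us is None:
        return bundle_lifetime_us
    return min(bundle_lifetime_us, local_cap_us)
```

The cap only ever shortens retention at this node; it cannot extend a
lifetime beyond what the bundle's source requested.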

>    Additionally, convergence-layer protocols that ensure authenticity
>    of communication between adjacent nodes in BP network topology
>    SHOULD be used where available, to minimize the ability of
> 
> "Authenticity of communication" requires some sense of identity and credentials associated thereto; with the current formulation of node identities as EID URIs, I'm not sure what sort of credentials would be used for this purpose.  Can you give some examples?
> 
> 	This refers to security at the convergence (transport) layer under BP, so I guess TLS would be an example.

Okay, so I guess this is just inherently specific to the URI scheme of the
EID, then.  (I'm curious what is done for "dtn" and "ipn", but please don't
feel obligated to go to any effort to explain.)

>    different blocks.  One possible variation is to sign and/or encrypt
>    blocks using symmetric keys securely formed by Diffie-Hellman
>    procedures (such as EKDH) using the public and private keys of the
>    sending and receiving nodes.  For this purpose, the key distribution
>    problem reduces to the problem of trustworthy delay-tolerant
>    distribution of public keys, a current research topic.
> 
> It's important to inject some fresh entropy when using static-static DH for key generation, as otherwise the problems from cryptographic key reuse become basically unbearable.
> 
> 	This is another paragraph that is being deleted per objections from another reviewer.
> 
> Section 10.1
> 
> Technically we're codepoint squatting for the new allocations here (unless we say that these are "suggested values"), in a specification-required registry.  I assume the experts (as authors/shepherd) are aware of this work and would not allocate the requested codepoints to another document, though, so I am not making this a Discuss point.
> 
>    |     6    |     4 | Payload Confidentiality Blk | [RFC6257]     |
> 
> We already have rows that spill over to another line; spelling out "Block"
> as is done in the current registry contents seems best.
> 
> 	Okay.
> 
> Section 10.3
> 
> Why is RFC-to-be listed as a reference for bits 7-13 when the applicable version is only 6 (not 7)?
> 
> 	Because we say that those bits are Reserved in 4.1.3.

Ah, got it.

> Section 10.5
> 
> [same note about codepoint-squatting]
> 
>    |     6,7 |        5 | Destination endpoint ID          |[RFC5050],|
> 
>    |         |          |    unintelligible                |RFC-to-be |
> 
> The current wording is "Destination endpoint ID unavailable", so this is requesting a content change.
> 
> 	Good catch, will fix.
> 
>    |     6,7 |        8 | Block unintelligible             |[RFC6255],|
> 
>    |         |          |                                  |RFC-to-be |
> 
> The registry currently shows RFC 5050 as a reference, not RFC 6255.
> 
> 	Okay.
> 
>    |     6   |      255 | Reserved                         |[RFC6255] |
> 
> Not also RFC-to-be?
> 
> 	Okay.
> 
> Section 10.6
> 
>    The Bundle Protocol has a URI scheme type field - an unsigned
>    integer of undefined length - for which IANA is requested to create
> 
> (All CBOR unsigned integers are of "indefinite" (variable) length.  Also note that RFC 7049 prefers to use "indefinite length" over "undefined length", but also that 7049 does not use "indefinite length" for integers.)
> 
>     |            1 | dtn                         | RFC-to-be         |
> 
>     |            2 | ipn                         | [RFC6260]         |
> 
> Why is "ipn" left with RFC6260 as the reference even though we are updating its registration in this document just as we are for "dtn"?
> 
> 	Sure, adding RFC-to-be.
> 
> Section 10.7 (and with minimal changes, 10.8)
> 
>    dtn-hier-part = "//" node-name name-delim demux ; a path-rootless
> 
> Is there supposed to be more (or less) to that comment?
> 
> 	No.
> 
>         receives the bundle.  In both cases (and indeed in all bundle
>         processing), the node that receives a bundle should verify its
>         authenticity and validity before operating on it in any way.
> 
> How is this possible in the absence of BPSEC?  Does this imply a recommendation or requirement to implement BPSEC?
> 
> 	Not necessarily.  Convergence-layer security might be sufficient.
> 
> Section 11.1
> 
>    [BPSEC] Birrane, E., "Bundle Security Protocol Specification", Work
>    In Progress, October 2015.
> 
> I agree with the directorate reviewer that we need to give a more concrete reference here (e.g., draft-ietf-dtn-bpsec)
> 
> 	Done.
> 
> Section 11.2
> 
> The way in which we reference [UTC] is arguably normative.
> 
> 	Fine with me, as long as it passes the nits check.
> 
> Appendix B
> 
>    start = bundle / #6.55799(bundle)
> 
> This is the tag for "self-describe [sic] CBOR" per RFC 7049; did Carsten make any indication that a more specific tag could/should be allocated for BP?
> 
> 	None that I recall.

Okay.  I'm not a CBOR expert and don't see any problems with this approach;
I was just curious.
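
For reference, tag 55799 is just a fixed three-byte prefix on the wire, so
a receiver that wants to accept both forms of the "start" rule can handle
it as in this sketch (hypothetical function name; the trailing 0x9f in the
usage note assumes the bundle is an indefinite-length array):

```python
# CBOR tag 55799: major type 6 with a two-byte argument -> 0xd9 0xd9 0xf7.
SELF_DESCRIBE_PREFIX = bytes.fromhex("d9d9f7")

def strip_self_describe(encoded: bytes) -> bytes:
    """Remove the self-describe tag head if present; otherwise return as-is."""
    if encoded.startswith(SELF_DESCRIBE_PREFIX):
        return encoded[len(SELF_DESCRIBE_PREFIX):]
    return encoded
```

For example, stripping the prefix from d9 d9 f7 9f ... leaves 9f ..., the
start of the bundle array itself.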

Thanks,

Ben

>    ; The root bundle array
> 
>    bundle = [primary-block, *extension-block, payload-block]
> 
> I see that CDDL does not have provisions for noting indefinite- vs.
> definite-length encoding, so a comment here might be in order.
> 
>    bundleflagbits = &(
> 
>      reserved: 21,
>      [...]
> 
> RFC 8610 seems to suggest that omitting reserved bits may be appropriate (to disallow them from being set within this CDDL model).  (Similarly for the
> blockflagbits.)
> 
>