[dtn] Robert Wilton's Discuss on draft-ietf-dtn-dtnma-10: (with DISCUSS and COMMENT)

Robert Wilton via Datatracker <noreply@ietf.org> Thu, 15 February 2024 14:35 UTC

MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit
From: Robert Wilton via Datatracker <noreply@ietf.org>
To: The IESG <iesg@ietf.org>
Cc: draft-ietf-dtn-dtnma@ietf.org, dtn-chairs@ietf.org, dtn@ietf.org, rick.taylor@ori.co, rick.taylor@ori.co
Auto-Submitted: auto-generated
Precedence: bulk
Reply-To: Robert Wilton <rwilton@cisco.com>
Message-ID: <170800773840.41884.2737908790495853817@ietfa.amsl.com>
Date: Thu, 15 Feb 2024 06:35:38 -0800
Archived-At: <https://mailarchive.ietf.org/arch/msg/dtn/-nJ9XvmUMD6w7OP-KhnD-wX7G9E>
Subject: [dtn] Robert Wilton's Discuss on draft-ietf-dtn-dtnma-10: (with DISCUSS and COMMENT)

Robert Wilton has entered the following ballot position for
draft-ietf-dtn-dtnma-10: Discuss

When responding, please keep the subject line intact and reply to all
email addresses included in the To and CC lines. (Feel free to cut this
introductory paragraph, however.)

Please refer to https://www.ietf.org/about/groups/iesg/statements/handling-ballot-positions/
for more information about how to handle DISCUSS and COMMENT positions.

The document, along with other ballot positions, can be found here:
https://datatracker.ietf.org/doc/draft-ietf-dtn-dtnma/

----------------------------------------------------------------------
DISCUSS:
----------------------------------------------------------------------

Hi,

I have concerns with how this document is framed that I think rise to the
level of a DISCUSS.

(1) p 0, sec

DTN Management Architecture
draft-ietf-dtn-dtnma-10

I have raised a discuss because of how this document is framed:

- It explains some of the requirements that are specific to managing devices in
DTNs. For me, the key one is the unreliable availability of the network, which
means that synchronous RPCs are not a great idea and puts a stronger emphasis
on remote agents.

- It then critiques the existing IETF network management architecture, but this
description seems to be inaccurate in various places.

- It then uses that critique as a justification as to why the existing IETF
network management solutions cannot be used out of the box to meet the
requirements of the DTN architecture. Whilst I agree that this is true - I
also think that with a relatively small amount of work, or enhancements (many
of which are in the process of being pursued for other reasons), it would be
possible to extend the existing IETF network management architecture to work
for DTN. I.e., I don't regard the existing text in sections 5 and 6 to really
justify a new management architecture rather than reusing and extending what is
already there. This doesn't mean that a new architecture is not justified,
only that I think that this document currently doesn't really do a good job of
making the case. Hence, I wonder whether the real justification is that the
proposed management architecture is much closer to how these devices are
managed today and hence it is less of a shift in mindset?

- Finally, I haven't reviewed the proposed architecture in great detail, but I
think that the command based aspect of it is potentially inferior to the
intent based approach in the regular network management architecture, which I
believe is a more robust approach.

(2) p 0, sec

This document describes a DTN management architecture (DTNMA)
suitable for managing devices in any challenged environment but, in
particular, those communicating using the DTN Bundle Protocol (BP).
Operating over BP requires an architecture that neither presumes
synchronized transport behavior nor relies on query-response
mechanisms. Implementations compliant with this DTNMA should expect
to successfully operate in extremely challenging conditions, such as
over uni-directional links and other places where BP is the preferred
transport.

As per my other comments, I believe that the existing IETF network management
architecture already supports, or is mostly on the path to supporting, many of
the requirements. Hence, if a new management architecture is required here,
then I think that its scope should be narrowed to only work over the BP
protocol.

Specifically, I think that it is much better for the wider industry to converge
on using the existing NETCONF/RESTCONF/CORECONF + YANG(XML, JSON, CBOR+/-SIDs)
architecture where possible rather than fragmenting to different competing
solutions.

----------------------------------------------------------------------
COMMENT:
----------------------------------------------------------------------

(3) p 4, sec 1.2. Requirements Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119].

Please update this to use the newer requirements text in RFC 8174.

(4) p 18, sec 5.2. XML-Based Protocols

Several network management protocols, including NETCONF [RFC6241],
RESTCONF [RFC8040], and CORECONF [I-D.ietf-core-comi], share the same
XML information set [xml-infoset] to describe the abstract data model
necessary to manage the configuration of network devices. Each
protocol, however, provides a different encoding of that XML
information set.

I don't think that it is correct to describe these protocols as XML based,
and to list them under an "XML-Based Protocol" section.

Whilst it is correct that NETCONF and YANG were originally standardised around
XML, they have all evolved, or are evolving, beyond that:

- YANG already has JSON, and CBOR+/-SIDs encodings in addition to XML.

- RESTCONF is HTTP based rather than XML, and already supports XML and JSON
encodings of the YANG data, with CBOR surely to follow.

- NETCONF is still XML based, but really the RPCs are defined in YANG, and we
are already considering enhancing a future version of NETCONF to support a CBOR
based encoding, primarily for streamed operational data.

- CORECONF isn't XML based, but instead uses a CoAP based tight encoding of
HTTP verbs with CBOR encoding of the YANG data.

(5) p 18, sec 5.2.1. The YANG Data Model

* YANG notifications [RFC8639] and YANG-Push notifications [RFC8641]
allow a client to subscribe to the delivery of specific containers
or data nodes defined in the model, either on a periodic or "on
change" basis. These notification events can be filtered
according to XPath [xpath] or subtree [RFC6241] filtering as
described in [RFC8639] Section 2.2.

Today, some of YANG Push is quite heavyweight, i.e., the resync requirement,
and some of the on-change requirements, but I suspect that a future lighter
version of YANG Push may happen (that is closer to how gNMI telemetry works).

(6) p 19, sec 5.2.1. The YANG Data Model

1. Size. Data nodes within a YANG model are referenced by a
verbose, string-based path of the module, sub-module, container,
and any data nodes such as lists, leaf-lists, or leaves, without
any explicit hierarchical organization based on data or object
type. Existing efforts to make compressed identifies for YANG
objects (such as SIDs) are still relatively verbose (~8 bytes per
item) and do not natively support ways to glob multiple SIDs.

On the wire, I would expect CBOR SIDs to average about 2 bytes each, depending
on how the models are structured. The CBOR SID encoding only encodes the
difference between a child node SID and its parent node SID, so if the models
are structured sensibly, and CBOR SIDs are allocated sensibly, then they will
generally only be 1 or 2 bytes long.
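
The delta argument above can be sketched roughly as follows. This is only an
illustration: the SID values are made up, and the helper cbor_int_size() is a
hypothetical name that just applies the standard CBOR integer length rules. It
counts the bytes for the map keys only, ignoring values and any tags.

```python
def cbor_int_size(value: int) -> int:
    """Bytes needed to encode an integer as a CBOR major type 0/1 item."""
    # CBOR encodes a negative integer n using the unsigned argument -1 - n.
    arg = value if value >= 0 else -1 - value
    if arg < 24:
        return 1  # argument fits inside the initial byte
    if arg < 2**8:
        return 2  # initial byte + 1-byte argument
    if arg < 2**16:
        return 3  # initial byte + 2-byte argument
    if arg < 2**32:
        return 5  # initial byte + 4-byte argument
    return 9      # initial byte + 8-byte argument


# Hypothetical SIDs (assumption): one container and three of its children.
PARENT_SID = 60000
CHILD_SIDS = [60001, 60002, 60030]

# Key size if every child key carried its absolute SID.
absolute = [cbor_int_size(sid) for sid in CHILD_SIDS]
# Key size if every child key carries only the delta from its parent.
deltas = [cbor_int_size(sid - PARENT_SID) for sid in CHILD_SIDS]

print("absolute key sizes:", absolute)
print("delta key sizes:   ", deltas)
```

With closely allocated SIDs the deltas stay small, which is where the 1 or 2
byte estimate comes from; widely scattered SIDs would lose that benefit.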

(7) p 19, sec 5.2.1. The YANG Data Model

2. Protocol Coupling. A significant amount of existing YANG tooling
presumes the use of YANG with a specific management protocol.

I don't think that this is correct. E.g., OpenConfig uses YANG but doesn't use
NETCONF - instead they have defined their own transport protocol using gRPC
(called gNMI), which also serves as an alternative to YANG Push.

(8) p 19, sec 5.2.1. The YANG Data Model

RPC execution is strictly limited to those issued by the client.

This is basically true but not entirely relevant, in that there is no
restriction that a management client cannot be run as an agent on the device.
Some vendor implementations support this, to serve similar purposes desired by
this architecture. I.e., for an agent on the device to act on predictable
events that occur on the device without interaction with a main management
controller, either for simplicity or latency issues (e.g., take some action
within a few msecs of when an interface goes down).

(9) p 19, sec 5.2.1. The YANG Data Model

Commands are
executed immediately and sequentially as they are received by the
server, and there is no method to autonomously execute RPCs
triggered by specific events or conditions.

Please see https://www.ietf.org/archive/id/draft-ietf-netmod-eca-policy-01.txt.
This document is adopted, but currently expired. It would be worth checking
with the authors on its current status, but there is already interest in
standardising a data model for agents on a device to act as you describe (or at
least similarly to what you describe).
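
As a rough sketch of the kind of on-device event/condition/action behaviour
discussed above: everything here (Rule, LocalAgent, the "link-down" event, the
state keys) is a hypothetical illustration, and it is not the data model from
draft-ietf-netmod-eca-policy nor anything defined by the DTNMA draft.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    """One event/condition/action triple (hypothetical structure)."""
    event: str
    condition: Callable[[dict], bool]
    action: Callable[[dict], str]


class LocalAgent:
    """Runs on the device and reacts to events without a remote controller."""

    def __init__(self, rules):
        self.rules = list(rules)
        self.log = []

    def handle(self, event: str, state: dict) -> None:
        # Fire every rule registered for this event whose condition holds.
        for rule in self.rules:
            if rule.event == event and rule.condition(state):
                self.log.append(rule.action(state))


def switch_to_backup(state: dict) -> str:
    state["active_link"] = "backup"
    return "switched to backup"


rules = [
    Rule("link-down", lambda s: s["active_link"] == "primary", switch_to_backup),
]
agent = LocalAgent(rules)
state = {"active_link": "primary"}

agent.handle("link-down", state)  # condition holds, action runs locally
agent.handle("link-down", state)  # condition no longer holds, nothing happens
print(state, agent.log)
```

The point is only that the rules are pre-installed and evaluated locally, so
the reaction does not depend on a round trip to the management controller.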

(10) p 19, sec 5.2.2. XML-Based Management Protocols

NETCONF [RFC6241], RESTCONF [RFC8040], and CORECONF
[I-D.ietf-core-comi] each provide the mechanisms to install,
manipulate, and delete the configuration of network devices. These
network management protocols use the same XML information set, but
provide different encodings of the abstract data model it describes.

I don't really understand what you mean by the same XML information set. The
data is modelled in YANG and encoded in XML, JSON, CBOR (with or without SIDs).
Even in the conventional network management space, it is plausible that over
time there will be a migration away from XML towards JSON + CBOR.

(11) p 19, sec 5.2.2.1. NETCONF

NETCONF is a stateful, XML-based protocol that provides a RPC syntax
to retrieve, edit, copy, or delete any data nodes or exposed
functionality on a server. It requires that underlying transport
protocols support long-lived, reliable, low-latency, sequenced data
delivery sessions.

NETCONF is defined and used this way, but I don't think that this has to be its
fundamental nature. E.g., some controllers open a new NETCONF connection to
make a configuration change and then close it again, and such are not really
relying on any sort of long-lived connection. Further, the NMDA architecture,
RFC 8342, is designed around the concept of decoupling changing the
configuration from enacting that configuration change, and monitoring it through
subscriptions rather than in the RPC reply. I.e., the architecture is already
migrating towards a path where the synchronous reply to a NETCONF edit operation
may not be all that important.

(12) p 20, sec 5.2.2.2. RESTCONF

RESTCONF is a stateless RESTful protocol based on HTTP. RESTCONF
configures or retrieves individual data elements or containers within
YANG data models by passing JSON over REST. This JSON encoding is
used to GET, POST, PUT, PATCH, or DELETE data nodes within YANG
modules.

As per above, RESTCONF supports XML or JSON, and very likely CBOR in future,
which would likely be a small update.

(13) p 20, sec 5.2.2.2. RESTCONF

RESTCONF is a stateless protocol because it presumes that it is
running over a stateful secure transport (HTTP over TLS). Also,
RESTCONF presumes that a single pull of information can be made in a
single round-trip. In this way, RESTCONF is only stateless between
queries - not internal to a single query.

Yes, but mechanisms like YANG Push can somewhat be used to mitigate this.
E.g., already in the gNMI world, there is a move away from synchronous get
requests to asynchronous requests (once or periodic) where the request is made
and the results are then streamed back (either to the client, or somewhere
else) at some point in the future. These operations are asynchronous with
respect to the caller.

(14) p 21, sec 6. Motivation for New Features

Management mechanisms that provide DTNMA desirable properties do not
currently exist.

This is true - but I still believe that small enhancements, some of which are
already planned, may well get you to where you need to go without starting from
scratch with an entirely new management architecture.

(15) p 21, sec 6. Motivation for New Features

1. Open Loop Control. Freedom from a request-response architecture,
API, or other presumption of timely round-trip communications.
This is particularly important when managing networks that are
not built over an HTTP or TCP/TLS infrastructure.

As per above, the existing network management architecture supports this by
allowing management clients to run on the device (and potentially expose their
own North-bound management interfaces).

(16) p 21, sec 6. Motivation for New Features

2. Standard Autonomy Model. An autonomy model that allows for
standard expressions of policy to guarantee deterministic
behavior across devices and vendor implementations.

As per above, there is already some work happening in this area.

(17) p 21, sec 6. Motivation for New Features

3. Compressible Model Structure. A data model that allows for very
compact encodings by defining and exploiting common elements of
data schemas.

As per above, YANG CBOR SIDs already allow for a pretty compact encoding. This
may still be too high for a use case, but there is a clear tradeoff here
between flexibility vs encoding size. E.g., if you know the exact schema that is
being encoded, and if the server sends all values, then it potentially doesn't
need to encode the keys. But this requires client and server being exactly in
sync w.r.t. the data model, so it is probably less resilient to failures.
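
The tradeoff above can be sketched as follows. This is an illustration only:
JSON stands in for CBOR purely to keep the example within the standard
library, and the field names and the two schema versions are made up.

```python
import json

# Hypothetical schemas (assumption): same fields, different agreed order.
SCHEMA_V1 = ["mtu", "enabled", "name"]
SCHEMA_V2 = ["mtu", "name", "enabled"]


def encode_keyed(record: dict) -> bytes:
    """Self-describing encoding: every value travels with its key."""
    return json.dumps(record, separators=(",", ":")).encode()


def encode_positional(record: dict, schema: list) -> bytes:
    """Keyless encoding: values only, in the order the schema dictates."""
    values = [record[key] for key in schema]
    return json.dumps(values, separators=(",", ":")).encode()


def decode_positional(data: bytes, schema: list) -> dict:
    """Reattach keys using the receiver's idea of the schema."""
    return dict(zip(schema, json.loads(data)))


record = {"mtu": 1500, "enabled": True, "name": "eth0"}
keyed = encode_keyed(record)
positional = encode_positional(record, SCHEMA_V1)

print("keyed bytes:", len(keyed), "positional bytes:", len(positional))
# Receiver in sync with the sender: the record round-trips.
print(decode_positional(positional, SCHEMA_V1) == record)
# Receiver on a different schema version: values land under the wrong keys.
print(decode_positional(positional, SCHEMA_V2))
```

The keyless form is smaller, but a receiver whose schema has drifted decodes
it without any error and silently assigns values to the wrong fields.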

(18) p 22, sec 7. Reference Model

There are a multitude of ways in which both existing and emerging
network management protocols, APIs, and applications can be
integrated for use in challenged environments. However, expressing
the needed behaviors of the DTNMA in the context of any of these pre-
existing elements risks conflating systems requirements, operational
assumptions, and implementation design constraints.

This is true. But if you want to properly justify why a new architecture is
needed, then I still think that it is worth looking at what an architecture would
look like extending the existing management infrastructure that IETF supports.
Of course, there are benefits in having a targeted architecture specifically
for your use-case, but there are also big benefits in reusing much of what is
already there, or focussing the changes on the specific pieces that you need
(and that then other use cases may also benefit from).

(19) p 23, sec 7.1. Important Concepts

The DTNMA differs from some other management architectures in three
significant ways, all related to the need for a device to self-manage
when disconnected from a managing device.
1. Pre-shared Definitions. Managing and managed devices should
operate using pre-shared data definitions and models. This
implies that static definitions should be standardized whenever
possible and that managing and managed devices may need to
negotiate definitions during periods of connectivity.

This appears to be broadly the same as the regular NETCONF YANG management
architecture, and is also one of the goals with the IETF YANG packages and
versioning work
(https://datatracker.ietf.org/doc/html/draft-ietf-netmod-yang-packages). I.e.,
so that the device can efficiently share a reference to its schema with a
management client without needing to download the full schema for YANG library.

(20) p 23, sec 7.1. Important Concepts

2. Agent Self-Management. A managed device may find itself
disconnected from its managing device. In many challenged
networking scenarios, a managed device may spend the majority of
its time without a regular connection to a managing device. In
these cases, DAs manage themselves by applying pre-shared
policies received from managing devices.

As per my previous comments, I think that existing management clients are
already starting to do this. Perhaps a key difference here is that in regular
network management, the agents would likely only provide an assistance role
rather than being the primary configuration management mechanism.

(21) p 23, sec 7.1. Important Concepts

3. Command-Based Interface. Managing devices communicate with
managed devices through a command-based interface. Instead of
exchanging variables, objects, or documents, a managing device
issues commands to be run by a managed device. These commands
may create or update variables, change data stores, or impact the
managed device in ways similar to other network management
approaches. The use of commands is, in part, driven by the need
for DAs to receive updates from both remote management devices
and local autonomy.

Okay, this one we don't do, but then I'm not convinced that it is a great way
of managing remote devices. Mostly, the existing network management is moving
towards intent based configuration, i.e., where the data model expresses the
desired configuration state for the device, and the device is responsible, on
its own, for taking whatever steps are required to transition from the
current state to that desired state. It feels to me that this is more
robust than sending a sequence of commands, because if the device isn't in the
anticipated state, or if some of those commands fail, then it risks leaving the
device in an unknown and unexpected state. I think that sending commands
should probably be restricted to fixing stuff when it is broken rather than
used as a mainline configuration mechanism.
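
A deliberately simplified sketch of the robustness concern above. The command
list, the state keys, and both helper functions are hypothetical, and neither
function reflects how DTNMA or NETCONF is actually implemented; a real
command-based system could add checks or rollback.

```python
def apply_commands(state: dict, commands: list) -> bool:
    """Run commands in order; stop at the first one whose precondition fails."""
    for _name, precondition, effect in commands:
        if not precondition(state):
            return False  # state is left wherever the earlier commands put it
        effect(state)
    return True


def reconcile(state: dict, desired: dict) -> None:
    """Make state match desired, whatever state currently holds."""
    for key in list(state):
        if key not in desired:
            del state[key]
    state.update(desired)


DESIRED = {"mtu": 9000, "enabled": True}

# The command sequence was written assuming a "vlan" entry exists.
COMMANDS = [
    ("set-mtu", lambda s: True, lambda s: s.update(mtu=9000)),
    ("remove-vlan", lambda s: "vlan" in s, lambda s: s.pop("vlan")),
    ("enable", lambda s: True, lambda s: s.update(enabled=True)),
]

# The device is not in the state the command author anticipated.
unexpected = {"mtu": 1500, "enabled": False}

by_commands = dict(unexpected)
ok = apply_commands(by_commands, COMMANDS)

by_intent = dict(unexpected)
reconcile(by_intent, DESIRED)

print("commands:", ok, by_commands)
print("intent:  ", by_intent)
```

The command sequence stops part way through and leaves a mix of old and new
settings, whereas the reconcile step reaches the desired state regardless of
the starting point.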
