[Anima] Benjamin Kaduk's Discuss on draft-ietf-anima-asa-guidelines-05: (with DISCUSS and COMMENT)

Benjamin Kaduk via Datatracker <noreply@ietf.org> Wed, 19 January 2022 23:14 UTC

MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 8bit
From: Benjamin Kaduk via Datatracker <noreply@ietf.org>
To: The IESG <iesg@ietf.org>
Cc: draft-ietf-anima-asa-guidelines@ietf.org, anima-chairs@ietf.org, anima@ietf.org, tte@cs.fau.de, tte@cs.fau.de
Auto-Submitted: auto-generated
Precedence: bulk
Reply-To: Benjamin Kaduk <kaduk@mit.edu>
Message-ID: <164263408026.23202.10107292021027396373@ietfa.amsl.com>
Date: Wed, 19 Jan 2022 15:14:40 -0800
Archived-At: <https://mailarchive.ietf.org/arch/msg/anima/EHjd3NU-rBppQKHCEJa6Tbeujuk>
Subject: [Anima] Benjamin Kaduk's Discuss on draft-ietf-anima-asa-guidelines-05: (with DISCUSS and COMMENT)

Benjamin Kaduk has entered the following ballot position for
draft-ietf-anima-asa-guidelines-05: Discuss

When responding, please keep the subject line intact and reply to all
email addresses included in the To and CC lines. (Feel free to cut this
introductory paragraph, however.)


Please refer to https://www.ietf.org/blog/handling-iesg-ballot-positions/
for more information about how to handle DISCUSS and COMMENT positions.


The document, along with other ballot positions, can be found here:
https://datatracker.ietf.org/doc/draft-ietf-anima-asa-guidelines/



----------------------------------------------------------------------
DISCUSS:
----------------------------------------------------------------------

It looks like the indentation in the example MAIN PROGRAM in Appendix C
is incorrect, or at least confusing, in the "do forever" loop therein.
Specifically, assuming semantic whitespace as in Python, we never
actually perform grasp negotiation for the "good_peer in peers" case.
Additionally, I think we may have a risk of getting stuck in a loop
making no progress so long as good_peer remains in the set of discovered
peers but does not have enough resources available for our
request/negotiation to succeed.  I think we want to clear out good_peer
if a negotiation fails, to avoid that scenario.
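
To make the two points above concrete, here is a rough Python sketch of
the loop shape I have in mind.  It is only an illustration under my own
assumptions: discover_peers, negotiate, choose, and the bounded "rounds"
counter are hypothetical stand-ins, not the draft's pseudocode and not
any real GRASP API.

```python
import random


def main_loop(discover_peers, negotiate, needed, rounds, choose=random.choice):
    """Hypothetical rendering of the Appendix C 'do forever' loop.

    discover_peers() returns a list of peers, and negotiate(peer, amount)
    returns the amount granted (0 on failure).  Both are assumed helpers.
    The loop is bounded by `rounds` only so that the sketch terminates.
    """
    good_peer = None
    pool = 0
    for _ in range(rounds):
        if pool >= needed:
            break
        peers = discover_peers()
        if not peers:
            continue  # nothing discovered this round
        if good_peer in peers:
            peer = good_peer
        else:
            peer = choose(peers)
        # Dedented relative to the if/else above, so negotiation runs
        # for the remembered peer as well as for a newly chosen one.
        granted = negotiate(peer, needed - pool)
        if granted > 0:
            pool += granted
            good_peer = peer   # remember this choice
        else:
            good_peer = None   # forget a peer that could not help
    return pool, good_peer
```

With good_peer cleared on failure, the next round falls through to the
"any choice among peers" branch instead of retrying the same peer
indefinitely.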


----------------------------------------------------------------------
COMMENT:
----------------------------------------------------------------------

Section 1

   management services to network operators and administrators.  For
   example, the services envisaged for network function virtualisation
   [RFC8568] or for service function chaining [RFC7665] might be managed
   by an ASA rather than by traditional configuration tools.

RFC 8568 is an IRTF document on "Network Virtualization Research
Challenges"; is that really the best reference for this concept?
I note that RFC 8014 exists (IETF, Informational) on "An Architecture
for Data-Center Network Virtualization over Layer 3 (NVO3)", which
admittedly is not exactly the same thing, but might still be suitable.

Section 2

   A typical ASA will have a main thread that performs various initial
   housekeeping actions such as:
   [...]
   *  Define data structures for relevant GRASP objectives.

I may just be confused, but wouldn't defining a GRASP objective/data
structure mapping be a matter of protocol design, not something that
requires ongoing activity in a housekeeping or startup thread of a
running ASA?  If the intent was just that the values in a predetermined
data structure template should be populated at this step, I would be
much less confused.

   *  Obtain authorization credentials, if needed.

In a non-autonomic context, I would say that getting credentials needs
to be a periodic task, not something done once on startup and retained
indefinitely.  It seems that in the ASA context, the situation may be
different, since the authorization credentials in question are likely
to be (or be derived from) the device's LDevID, which might reasonably
have a very long lifetime.  On the other hand, there might also be
scenarios where the LDevID lifetime is short and certificate renewal
occurs often, so that the ASA would benefit from having credential
management be a periodic task.  Do you think it's better to list this
item in the initial housekeeping or in the main loop tasks?
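
If it helps to picture the "periodic task" variant, a minimal Python
sketch might look like the following.  The credential dictionary, the
obtain_credential callback, and the one-hour margin are all assumptions
of mine for illustration; times are plain seconds.

```python
def needs_renewal(not_after, now, margin=3600.0):
    """True once `now` is within `margin` seconds of the expiry time."""
    return now >= not_after - margin


def main_loop_step(credential, obtain_credential, now):
    """One pass of a hypothetical main loop: (re)obtain credentials if needed.

    `credential` is assumed to be a dict with a "not_after" timestamp, and
    obtain_credential(now) is an assumed helper that returns a fresh one.
    """
    if credential is None or needs_renewal(credential["not_after"], now):
        credential = obtain_credential(now)
    return credential
```

The same check covers the startup case (no credential yet) and the
renewal case, which is why I lean towards listing it with the main loop
tasks.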

Section 3.3

   Section 2.  However, if ASAs require additional communication between
   themselves, they can do so using any desired protocol, such as a TLS
   session over the ACP if that meets their needs.  One option is to use
   GRASP discovery and synchronization as a rendez-vous mechanism
   between two ASAs, passing communication parameters such as a TCP port
   number via GRASP.  As noted above, the ACP should be used to secure
   such communications.

I agree that there is earlier text suggesting that the ACP should be
used for normal inter-ASA communications.  I'm commenting here
specifically on the phrase "used to secure such communications".  If the
ACP is providing secure communications, then why would there be a need
to use an additional TLS session on top of it (which is specifically
called out as an option in the quoted text)?  It seems that there is
some additional subtlety here, in that the ACP provides some baseline
security property of "the communications peer belongs to the ACP" but
that in some cases there is a need for additional protection such that
the data being exchanged is visible only to the two communicating
endpoints in the ACP and not to any intermediate ACP nodes that it
traverses.
That's a long bit of discussion to just suggest s/used to secure/used
for/, but I think it's important to treat use of the word "secure"
carefully, as it can be interpreted in many different ways by different
readers when used without further qualification.

Section 4

   For example, if such management is performed by NETCONF [RFC6241],
   the ASA must interact with the NETCONF server as an independent
   NETCONF client in the same node to avoid any inconsistency between
   configuration changes delivered via NETCONF and configuration changes
   made by the ASA.

I will, of course, defer to my ops&mgmt colleagues on this, but my
understanding was that the NMDA allowed for changes made "out of band"
from the normal management protocol and that NETCONF clients were
expected to be able to cope with that.  If my understanding is correct,
that would suggest a slight rephrasing to something like "would
typically interact", rather than "must interact".

Section 3.1, 3.2

   Note that the [...] process itself may include special-
   purpose ASAs that run in a constrained insecure mode.

In light of the other discussion on this phrase, I submit "run in a
restricted mode in an insecure network setting" for consideration as a
possible alternative.

Section 5

   A mapping from YANG to CBOR is defined by [I-D.ietf-core-yang-cbor].
   Subject to the size limit defined for GRASP messages, nothing
   prevents objectives using YANG in this way.

I would suggest mentioning the YANG data structure extension of RFC 8791
in this context of using YANG to define data structures that are
divorced from configuration or operational state.

Section 6.1

                                               The provisioning of the
   infrastructure is realized in the installation phase and consists in
   installing (or checking the availability of) the pieces of software
   of the different ASAs in a set of Installation Hosts.  Installation
   Hosts may be nodes of an autonomic network, or servers dedicated to
   storing the software images of the different ASAs.

I'm a bit confused about whether the Installation Host is the target of
the installation operation (i.e., the software is installed on this host
and will/might eventually run there) or functions more as a software
repository, from which the software will be fetched prior to running on
some other node.  The phrases "servers dedicated to storing the software
images" and "checking the availability of" suggest the latter, but down
in §6.1.1 we talk of a list of ASAs installed on "[list of Installation
Hosts]" and even here we talk of "installing ... software ... in a set
of Installation Hosts".  It would be good to clarify which of these
roles are intended (or that both distinct roles are covered).

Section 7.1

   necessary.  This issue is considered in detail in
   [I-D.ciavaglia-anima-coordination].

The referenced draft expired in 2016 and contains a disclaimer that [the
latest version] "has been issued to reactivate the document in order to
allow discussion within the ANIMA WG about the coordination of autonomic
functions", which does not exactly send a strong signal that that
document is a definitive source of information on this topic.

Section 7.2

                           Each ASA designer will need to consider this
   issue and how to avoid clashes and inconsistencies.  [...]

Is an ASA designer even expected to be in a position to know what other
configuration/management tools/mechanisms the ASA might be competing
with?

Section 8

   8.   On the other hand, the definitions of GRASP objectives are very
        likely to be extended, using the flexibility of CBOR or JSON.
        Therefore, ASAs should be able to deal gracefully with unknown
        components within the values of objectives.  The specification
        of an objective should describe how unknown components are to be
        handled (ignored, logged and ignored, or rejected as an error).

Do we want to encourage specifications of objectives to also describe
ways in which extensions are envisioned (which should not necessarily be
taken as excluding other potential extension points)?
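
For reference, the three handling options named in the quoted text
(ignored, logged and ignored, or rejected as an error) could be sketched
in Python roughly as below.  The function name, the policy strings, and
the assumption that the objective value decodes to a map are mine, not
the draft's.

```python
import logging


def filter_known(value, known_keys, policy="ignore"):
    """Apply an unknown-component policy to a decoded objective value.

    `value` is assumed to be a dict decoded from CBOR or JSON, and
    `known_keys` is the set of components this ASA understands.
    """
    if policy not in ("ignore", "log", "reject"):
        raise ValueError(f"unsupported policy: {policy!r}")
    unknown = sorted(k for k in value if k not in known_keys)
    if unknown:
        if policy == "reject":
            raise ValueError(f"unknown components: {unknown}")
        if policy == "log":
            logging.warning("ignoring unknown components: %s", unknown)
    # In the "ignore" and "log" cases, only the known components are kept.
    return {k: v for k, v in value.items() if k in known_keys}
```

An objective specification that also lists its envisioned extension
points would tell the implementer which keys are likely to show up in
the "unknown" list later.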

Section 9

Do we want to mention again the lack of transactional integrity in GRASP
(mentioned previously in §5)?

   ASAs are intended to run in an environment that is protected by the
   Autonomic Control Plane [RFC8994], admission to which depends on an
   initial secure bootstrap process such as BRSKI [RFC8995].  [...]

Perhaps banal, but I'd suggest including some explicit statement of
"those documents describe security considerations relating to the use of
and properties provided by the ACP and BRSKI, respectively" as a trigger
to actually read them, vs just noting their existence.

It may also be prudent to reference the GRASP security considerations.

                                  Thus, ASAs must be designed to avoid
   loopholes such as passing on executable code, and should if possible
   operate in an unprivileged mode.  [...]

I would add "or proxying unverified commands" to "such as passing on
executable code", since that seems more likely to me than literally
sending around executable binaries and running them.

   A similar situation will arise if an ASA acts as a gateway between
   two separate autonomic networks, i.e. it has access to two separate
   ACPs.  Such an ASA must also be designed to avoid loopholes and to
   validate incoming information from both sides.

This makes me think of the phrasing "the ASA must act as a trust
boundary between the distinct ACPs", a phrasing which might also be put
to use in the previous discussion of loophole avoidance.

   The initial version of the autonomic infrastructure assumes that all
   autonomic nodes are trusted by virtue of their admission to the ACP.
   ASAs are therefore trusted to manipulate any GRASP objective, simply
   because they are installed on a node that has successfully joined the
   ACP.  [...]

Thanks for stating this clearly; I think it's valuable to reiterate it
in this context.  Would this be an appropriate place to discuss how this
"trusted by virtue of ACP membership" relates (or doesn't relate) to the
notion of "unprivileged mode" or "without special privilege" that's
mentioned in a couple places earlier in the document?

Section 12.2

Google appeared to find me http://ceur-ws.org/Vol-204/P07.pdf when I
started searching from what's listed for [DeMola06].

Appendix C

In the NEGOTIATOR thread, do we need to return A to the resource_pool if
the negotiation failed?
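
A minimal Python sketch of what I mean follows.  It assumes that the
amount A is taken out of the pool before the offer is made; the
ResourcePool class and the negotiate callback are hypothetical and are
not taken from the draft's pseudocode.

```python
import threading


class ResourcePool:
    """Hypothetical resource pool guarded by a lock."""

    def __init__(self, amount):
        self._lock = threading.Lock()
        self.amount = amount

    def take(self, a):
        with self._lock:
            if a > self.amount:
                raise ValueError("insufficient resource in pool")
            self.amount -= a

    def give_back(self, a):
        with self._lock:
            self.amount += a


def negotiator(pool, a, negotiate):
    """Reserve `a`, offer it to the peer, and return it if the offer fails.

    negotiate(a) is an assumed helper returning True when the peer accepts.
    """
    pool.take(a)
    try:
        accepted = negotiate(a)
    except Exception:
        pool.give_back(a)  # error during negotiation: do not leak the resource
        raise
    if not accepted:
        pool.give_back(a)  # peer declined: the reserved amount goes back
    return accepted
```

Without the give_back calls, every failed negotiation would shrink the
pool permanently.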

NITS

Abstract

We definitely want to keep the "so-called" in "so-called autonomic
networking"?  It looks really odd to me, but I have no horse in this
race.

Section 1

   this way.  This document mainly addresses issues affecting quite
   complex ASAs, but the most useful ones may in fact be rather simple
   developments from existing scripts.

I think the intent here is that the most useful ASAs might just be simple
evolutions/conversions of existing scripts into simple ASAs, but the
current text doesn't convey that sense very well.  "the most useful
ones" might formally bind more tightly to 'issues' than 'ASAs', and I
think a different word than "developments" would be better.

Section 2

   According to the degree of parallelism needed by the application,
   some of these threads might be launched in multiple instances.  In
   particular, if negotiation sessions with other ASAs are expected to
   be long or to involve wait states, the ASA designer might allow for
   multiple simultaneous negotiating threads, with appropriate use of
   queues and locks to maintain consistency.

I think there are synchronization primitives usable in this scenario
that don't technically involve locks, so s/locks/synchronization
primitives/ might be more pedantically correct.  I am not taking the
stance that pedantic correctness should trump readability, though.

Section 5

   acceptable by the GRASP API will limit the options in practice.  A
   generic solution is for the API to accept and deliver the value field
   in raw CBOR, with the ASA itself encoding and decoding it via a CBOR
   library.

I would suggest "in encoded binary form", to avoid an implication that
the API is going to check that there is a valid CBOR structure being
provided.

   GRASP maximum message size.  If the default maximum size specified by
   [RFC8990] is not enough, the specification of the objective must

It might help the reader if we write out GRASP_DEF_MAX_SIZE (assuming
that is actually the intended limit).

Section 6.1

   *  The decoupling property allows controlling resources of an
      autonomic node from a remote ASA, i.e. an ASA installed on a host
      machine different from the autonomic node resources.

There seems to be a missing possessive or word order issue here; maybe
"different from the autonomic node whose resources are being
controlled"?

Section 6.1.1

   *  [ASA placement function] specifies how the installation phase will
      meet the operator's needs and objectives for the provision of the
      infrastructure.  This function is only required in the decoupled
      mode.  [...]

I wonder if it's more clear to say that this function "is the identity
function" or "just returns the list of candidate Installation Hosts"
when decoupled mode is not in use.

   The condition to validate in order to pass to next phase is to ensure
   that [list of ASAs] are well installed on [list of Installation

I think there was an exchange with a previous reviewer that proposed
just removing "well" here; I would counter with a proposal to use
"properly", "completely", or "successfully" instead -- the previous
paragraph already determined that the ASAs in question are "installed
on" the list of installation hosts.

Section 6.2.2

   *  [Set of ASAs - Resources relations] describing which resources are
      managed by which ASA instances, this is not a formal message, but
      a resulting configuration of a set of ASAs.

Comma splice (first comma).
