[Anima] Benjamin Kaduk's Discuss on draft-ietf-anima-autonomic-control-plane-16: (with DISCUSS and COMMENT)

Benjamin Kaduk <kaduk@mit.edu> Thu, 02 August 2018 00:30 UTC

Return-Path: <kaduk@mit.edu>
X-Original-To: anima@ietf.org
Delivered-To: anima@ietfa.amsl.com
Received: from ietfa.amsl.com (localhost [IPv6:::1]) by ietfa.amsl.com (Postfix) with ESMTP id 5119A130EF2; Wed, 1 Aug 2018 17:30:10 -0700 (PDT)
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit
From: Benjamin Kaduk <kaduk@mit.edu>
To: The IESG <iesg@ietf.org>
Cc: draft-ietf-anima-autonomic-control-plane@ietf.org, Sheng Jiang <jiangsheng@huawei.com>, anima-chairs@ietf.org, jiangsheng@huawei.com, anima@ietf.org
X-Test-IDTracker: no
X-IETF-IDTracker: 6.83.0
Auto-Submitted: auto-generated
Precedence: bulk
Message-ID: <153316981032.22048.6996271018423269893.idtracker@ietfa.amsl.com>
Date: Wed, 01 Aug 2018 17:30:10 -0700
Archived-At: <https://mailarchive.ietf.org/arch/msg/anima/g5RzDw2mDy5BPF3ZlTkDm1f_KhU>
Subject: [Anima] Benjamin Kaduk's Discuss on draft-ietf-anima-autonomic-control-plane-16: (with DISCUSS and COMMENT)
X-BeenThere: anima@ietf.org
X-Mailman-Version: 2.1.27
List-Id: Autonomic Networking Integrated Model and Approach <anima.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/anima>, <mailto:anima-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/anima/>
List-Post: <mailto:anima@ietf.org>
List-Help: <mailto:anima-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/anima>, <mailto:anima-request@ietf.org?subject=subscribe>
X-List-Received-Date: Thu, 02 Aug 2018 00:30:10 -0000

Benjamin Kaduk has entered the following ballot position for
draft-ietf-anima-autonomic-control-plane-16: Discuss

When responding, please keep the subject line intact and reply to all
email addresses included in the To and CC lines. (Feel free to cut this
introductory paragraph, however.)


Please refer to https://www.ietf.org/iesg/statement/discuss-criteria.html
for more information about IESG DISCUSS and COMMENT positions.


The document, along with other ballot positions, can be found here:
https://datatracker.ietf.org/doc/draft-ietf-anima-autonomic-control-plane/



----------------------------------------------------------------------
DISCUSS:
----------------------------------------------------------------------

This is a really exciting protocol to read about -- the prospect of
dropping a bunch of just-manufactured devices in place, spinning up a
registrar (and maybe a MASA), and getting a control plane like magic is
pretty impressive.  That said, I don't believe that this document is ready
to publish as-is.  I have a list of specific points below for discussion,
but it may be more effective to strip down the document a lot (providing a
well-defined core protocol and leaving out speculative future work, along
the lines of Alissa's comments) and only then start to work on specific
rough spots.

In particular, in its current form, it's not clear to me why this document
is targeting the standards-track -- there are lots of places where
determinations of what works best or how to do some things is left for
future work.  Are there lots of implementations or consumers clamoring for
this stuff that it makes sense to go for PS as opposed to Experimental (so
as to figure out what works and nail down a slimmer protocol for the
standards track)?  I see in A.4 that the choice of RPL was motivated by
experience with a pre-standard version of ACP; it would have been great to
hear more about those deployments in an Implementation Status section (per
RFC 7942) or the Shepherd writeup.

I also think the document needs to be more clear about what security
properties it does or does not intend to provide: we hear in the abstract
and introduction that ACP will be "secure" (and similar platitudes are
repeated throughout), but we don't really get a sense of the specifics
until Section 4, with ACP5.  This has a MUST for authenticated and "SHOULD
(very strong SHOULD)" be encrypted.  But text elsewhere in the document
seems to be using "secure" to also mean encrypted, and there is even one
place that flatly asserts that "ACP mandates the use of encryption".  This
internal inconsistency needs to be resolved, at a minimum, and ideally the
intended posture more clearly conveyed.  (It's also not really stated under
what cases encryption would not be used, so that the "very strong SHOULD"
could not be a MUST.)

Section 3.2 claims that the ACP provides "additional security" for
bootstrap mechanisms due to the hop-by-hop encryption.  But in what sense
is actual additional security gained?  Against an attacker with what
capabilities?  If there is security gain from such hop-by-hop encryption,
doesn't that point to a weakness in the bootstrap scheme?

I think there needs to be some justification of why rfc822Name is chosen
over a more conventional structure in the otherName portion of the
subjectAltName, which is explicitly designed to be extensible.

The requirement in Section 6.1.2 for CRL and OCSP checks seem impossible to
satisfy for a greenfield node without non-ACP connectivity, as it must join
the ACP domain (and supposedly validate the CRL and OCSP validity before
doing so) before establishing an ACP link with its peer, but cannot
validate anything with no connectivity.

Throughout, the document seems to implicitly conflate authentication with
authorization.  I understand that the main authorization check is just the
domain membership test in Section 6.1.2; nonetheless, as a pedagogical
matter I cannot support propagating their conflation.

In a few places, the MTI cryptographic mechanisms are under-specified,
whether the cipher mode for IKE or the TLS ciphersuites.  I have attempted
to note these locations in my section-by-section comments.

Section 6.11.1.14 places a normative ("SHOULD") requirement on the RPL
root, but if I understand correctly the RPL root is automatically
determined within the ACP, and thus the operator does not a priori know
which node will become the RPL root.  Am I misunderstanding, or is this
effectively placing this requirement on all ACP nodes?

The IANA considerations specifically do register SRV.est in the GRASP
Objective Names Table, and then follows up with a paragraph that this is
only a "proposed update".  I don't know if there's actually anything
problematic here, but the document does need clarity on what is proposed
for future work and what is to be done now.


----------------------------------------------------------------------
COMMENT:
----------------------------------------------------------------------

Some high-level comments that do not quite meet DISCUSS criteria appear
below, followed by section-by-section inline comments.  My apologies for
not splitting them between substantive and editorial, but I don't think I
have enough time left before the telechat to do that and finish the other
reviews I have remaining.

The whole premise of the ACP is for it to be almost entirely autonomic and not
require external configuration.  But some pieces/functionality do require
explicit configuration (e.g., manual ACP addresses, configured remote ACP
neighbors, etc.), so I would have liked to see a section discussing what
sort of interface might be used to inject manual configuration into the
otherwise autonomic system.  Would this be done using ACP control messages
from an NMS using an ACP connect node, or an out-of-band (serial)
management port, or something else?

I think the document needs to be more clear about its stance towards
constrained nodes: DTLS is supported (along with IKEv2+IPSEC), supposedly
for the benefit of constrained nodes, but then the more heavyweight TLS is
required for several operations within the ACP itself.

Section 1

   of the ACP (after all the details have been defined), Section 10
   provides operational recommendations, Appendix A provides additional
   explanations and describes additional details or future work
   possibilities that where considered not to be appropriate for

nit: s/where/were/

   [...] The ACP can
   be implemented and operated without any other components of autonomic
   networks, except for the GRASP protocol which it leverages.

This could probably benefit from being disambiguated between the single
ACP-wide GRASP instance and the DULL GRASP used for link-local
flooding/discovery.

Section 1.1

   [...] The ACP can operate equally on
   layer 3 equipment and on layer 2 equipment such a bridges (see
   Section 7).  The encryption mechanism used by the ACP is defined to

nit: such as

   be negotiable, therefore it can be extended to environments with
   different encryption protocol preferences.  The minimum
   implementation requirements in this document attempt to achieve
   maximum interoperability by requiring support for few options: IPsec,
   DTLS - depending on type of device.

nit: This last sentence could be reworded for clarity, "[...] requiring
support for multiple options (IPsec and DTLS), depending on the type of
device."


Section 2

   ACP address:  An IPv6 address assigned to the ACP node.  It is stored
      in the domain information field of the ->"ACP domain certificate"
      ().

nit: The "->" and "()" seem like artifacts from an editor also usable for
source code?  ("->" also appears in "ACP domain"'s definition, and both
appear in the "in band (managemnet) definition and the "(virtual)
out-of-band network" definition.)

   EST:  "Enrollment over Secure Transport" ([RFC7030]).  IETF standard
      protocol for enrollment of a node with an LDevID.  BRSKI is based
      on EST.

RFC 7030 is only Proposed Standard, so "standards-track protocol"
seems more appropriate.

   (virtual) out-of-band network:  An out-of-band network is a secondary
      network used to manage a primary network.  The equipment of the
      primary network is connected to the out-of-band network via
      dedicated management ports on the primary network equipment.
      Serial (console) management ports where historically most common,

nit: s/where/were/

Please use the actual RFC 8174 instead of attempting to reproduce (but not
exactly) its updated boilerplate.

Section 4

   ACP4:  The ACP MUST be generic.  Usable by all the functions and
          protocols of the ANI.  Clients of the ACP MUST NOT be tied to
          a particular application or transport protocol.

nit: The second sentence is only a sentence fragment.

   ACP5:  The ACP MUST provide security: Messages coming through the ACP
          MUST be authenticated to be from a trusted node, and SHOULD
          (very strong SHOULD) be encrypted.

authenticated as coming from a trusted node, or authenticated to be
from a specific node, which is known to be trusted?
Also, maybe "MUST, except for [...]" is better than "very strong
SHOULD".  (What are the execptional cases where plaintext is allowed?)
Integrity protection of authenticated traffic may be worth mentioning
explicitly.

   Th eACP operates hop-by-hop, because this interaction can be built on

nit: s/Th eACP/The ACP/

Section 5

   3.  For each node in the candidate peer list, it authenticates that
       node and negotiates a mutually acceptable channel type.

There seems to be an implicit step in here, "confirms that the node is
authorized to be a part of the same ACP domain".  Presumably this is usally
the "ACP domain membership check" of Section 6.1.2; a forward reference
seems in order.


Section 6.1.1

It's slightly jarring to use ABNF to specify the contents of an
ASN.1 field.

hex-dig is case-insensitive; is that intended?

"acp-address" seems under-specified and maybe over-constrained, in
that it does not say how to get what digits to put there, and in
that limiting to 32 hex digits may prevent the use of alternative
ACP address schemes in the future, as is suggested as a possibility
in the body text.

   [...] If the operator does not own any FQDN, it should
   choose a string (in FQDN format) that intends to be equally unique

I don't know if it's worth cautioning against making up fake
TLD-equivalents, given how this has bitten people as new TLDs come
online.

I'm not sure that "people implementing our stuff have an easier
time" is a great reason to stuff randomly-shaped stuff into an
existing hole, especially when there's this nice otherName-shaped
hole available right next to it.

   o  The format of the rfc822Name is chosen so that an operator can set
      up a mailbox called   rfcSELF@<domain> that would receive emails
      sent towards the rfc822Name of any node inside a domain.  This is
      possible because in many modern mail systems, components behind a
      "+" character are considered part of a single mailbox.  In other
      words, it is not necessary to set up a separate mailbox for every
      ACP node, but only one for the whole domain.

This is effectively codifying that foo+bar@baz === foo@baz in email
addresses, which perhaps merits some broader discussion (especially in the
context of security issues when different providers disagree about whether
local components of email addresses differing after a '+' are the same or
not!).

Section 6.1.2

  o  The peer's certificate is signed by one of the trust anchors
      associated with the ACP domain certificate.

This seems to preclude having a PKI structure that is common in the web
world, of a highly secure, offline trust anchor that only certifies
intermediate CA certificates, with the intermediates certifying end-entity
certificates.  Perhaps the intention is that the peer's certificate chains
to such a trust anchor?

   o  The peers certificate has a syntactically valid domain information
      field (subjectAltName / rfc822Name)  and the domain name in that
      peers domain information field is the same as in this ACP node
      certificate.

Is this supposed to be an exact byte-for-byte match, or is some form of
insensivity allowed that would require normalization/canonicalization prior
to comparison?


Section 6.1.3

I had noted (in my local notes) on the -13 that using the ACP address and
only storing one EST server makes for a single point of failure; the
situation seems somewhat improved in the -16 in that the remebmered value
is used as the first attempt for renewal, but presumably with GRASP
announcement as a fallback there is less of a single point of failure.

Section 6.1.3.1

Does the example need a comma after 255 to indicate the absent
objective-value?  (Also, putting the example after the CDDL might help the
reader know what they're looking at.)

The "formal CDDL definition" of flood-message seems somewhat
informal at times, e.g,. for loop-count.

Using both "[t]he objective value" and an "objective-value" field for
different things is needlessly confusing; can the body text be clarified
somewhat about the value "SRV.est"?

Can "negligbile traffic" be quantified?

Section 6.1.3.3

   [...] If the CDP URL uses an IPv6 address (ULA address when using
   the addressing rules specified in this document), the ACP node will
   connect to the CDP via the ACP.

Seems to be duplicated?

   HTTPs connections.  The connecting ACP node SHOULD verify that the
   CDP certificate used during the HTTPs connection has the same ACP
   address as indicated in the CDP URL of the nodes ACP domain
   certificate

Presumably only if the CDP URL listed an IPv6 address.
(Also, nit: full stop at end of sentence.)

Section 6.1.3.4

A reference to draft-ietf-acme-star and/or draft-nir-saag-star might be
useful to inform the reader of related work.  (Note that the latter was not
adopted by the LAMPS WG yet, with the indication that some changes were
needed before it would be appropriate for adoption.)


Section 6.2

   [...] An ACP node MUST
   maintain this adjacency table up to date.

Up to date on what timescale?

   The adjacency table MAY contain information about the validity and
   trust of the adjacent ACP node's certificate.  However, subsequent
   steps MUST always start with authenticating the peer.

Also verifying that it is authorized for the operation in question?

Section 6.3

   The result of the discovery is the IPv6 link-local address of the
   neighbor as well as its supported secure channel protocols (and non-
   standard port they are running on).  It is stored in the ACP
   Adjacency Table, see Section 6.2 which then drives the further
   building of the ACP to that neighbor.

nit: "see section 6.2" is probably better in parentheses, but if commas
are used, they need to be both before and after it.

Section 6.4

   o  Build the ACP across all domains that have a common parent domain.
      For example ACP nodes with domain "example.com", nodes of
      "example.com", "access.example.com", "core.example.com" and
      "city.core.example.com" could all establish one single ACP.

If this wasn't an example it sounds like it'd need to reference the
public suffix list?

Section 6.5

   o  An ACP node may choose to attempt initiate the different feasible

nit: to attempt to initiate

Section 6.6

"Exponential backoff" requires the base of the exponent to be specified in
order to be well-defined.  (An base of, e.g., 1.0000001 is hardly any
backoff at all, over our normal timescales.)

Section 6.7.1.1

   [...] It MUST then
   support ESP with AES256 for encryption and SHA256 hash and MUST NOT
   permit weaker crypto options.

That does not fully specify cryptographic parameters for
communication security, e.g., CTR vs. CBC vs. GCM mode of AES.
(Similarly in Section 6.7.1.2.)

   In terms of IKEv2, this means the initiator will offer to support
   IPsec tunnel mode with next protocol equal 41 (IPv6).

nit: "equal to"

   ESP is used because ACP mandates the use of encryption for ACP secure
   channels.

I thought this was only a "very strong SHOULD", not mandatory.
(Similarly in Section 6.7.1.2.)

Section 6.7.1.2

(Lots of this section duplicates 6.7.1.1 and could be consolidated into
the toplevel 6.7.1.)

   If IKEv2 initiator and responder support GRE, it will be selected.
   The version of GRE to be used must the according to [RFC7676].

nit: the grammar in the last sentence is weird; maybe "must be determined
according to"?

Section 6.7.2

   To run ACP via UDP and DTLS v1.2 [RFC6347] a locally assigned UDP
   port is used that is announced as a parameter in the GRASP AN_ACP
   objective to candidate neighbors.  All ACP nodes supporting DTLS as a
   secure channel protocol MUST support AES256 encryption and MUST NOT
   permit weaker crypto options.

You should specify actual ciphersuite, signature, and hash
algorithms.

Section 6.7.3

I would recommend calling out the "terminate channel when certificate
expires" behavior again in the security considerations, as it would be
surpirsing to readers expecting the "standard" behavior.

   nodes with an area of baseline ACP nodes MUST therefore support IPsec
   and DTLS and supports threefore the baseline and constrained profile.

nit: s/threefore/therefore/


Section 6.8.2

The figure does not really aid my understanding absent some
additional explanation.

   GRASP unicast messages inside the ACP always use the ACP address.
   Link-local ACP addresses must not be used inside objectives.  GRASP

Link-local *ACP* addresses, or IPv6 ones?

   [...] GRASP
   unicast messages inside the ACP are transported via TLS 1.2
   ([RFC5246]) connections with AES256 encryption and SHA256.

Same comment as before about ciphersuite/etc.  Also, TLS or DTLS (noting
that constrained devices are assumed to only implement DTLS)?
Also, TLS 1.3 is in the RFC Editor's queue; is there work underway
to adapt to it?

   [...] TLS and TLS connections for GRASP in the ACP use the IANA assigned
   TCP port for GRASP (7107).

Is one of those supposed to be DTLS?  Is the IANA-assigned port
assigned for both TCP and UDP?

Section 6.8.2.1

As a side note, I don't mind seeing discussion about potential future work
to avoid the double authentication/encryption, but my intuition is that
it's not really worth pursuing.

Section 6.10.1

   o  Addresses in the ACP are not considered sensitive on privacy
      grounds because ACP nodes are not expected to be end-user devices.

I feel like this claim requires additional justification.

Section 6.10.3.1

   A node knowing it is in a zone MUST also use that Zone-ID != 0
   address in GRASP locator fields. [...]

What does "also" mean here?  Is this another requiment being placed on a
node that knows it is in a zone, or must this node use both the zone-id==0
and the zone-id!=0 addresses in GRASP locator fields (i.e., duplicating all
such)?

Section 6.10.5

   o  V: Virtualization bit: Values 0 and 1 are assigned in the same way
      as in the Zone-ID sub-scheme.

There is not a single 'V bit' -- the V field is either 8 or 16 bits long --
so saying "in the same way" is confusing.  I believe that the intent is to
distinguish between "zero" and "not-zero", with the zero value meaning the
same as the zero bit in the Zone-ID sub-scheme.  That is, the final bit
need not be 1 to indicate a "virtual" usage.  Or do I misunderstand?

Section 6.10.7.3

   In a simple allocation scheme, an ACP registrar remembers
   persistently across reboots for its currently used Registrar-ID and
   for each addressing scheme (zone with Subnet-ID 0, Vlong with /112,
   Vlong with /120), the next Node-Number available for allocation and
   increases it after successful enrollment to an ACP node.  In this
   simple allocation scheme, the ACP registrar would not recycle ACP
   address prefixes from no longer used ACP nodes.

It's probably better to say "increments it during successful enrollment"
since if the registrar crashed right after issuing a certificate but before
incrementing the next available node-number, it would issue a duplicate
when it came back up.

Section 6.10.7.4

   [...] Even when the renewing/rekeying ACP registrar is not
   the same as the one that enrolled the prior ACP certificate.  See
   Section 10.2.4 for an example.  ACP address information SHOULD also
   be maintained even after an ACP certificate did expire or failed.
   See Section 6.1.3.5 and Section 6.1.3.6.

Both the first and the last sentence quoted have grammar nits; the former
is a sentence fragment (perhaps "This holds even when [...]"), and the
second has inconsistent verb tense (perhapse "expired or failed").


Section 6.11

   All routing updates are automatically secured in transit as the
   channels of the autonomic control plane are by default secured, and
   this routing runs only inside the ACP.

Again, I thought encryption was only "very strong SHOULD".
If this "secured" only was intended to refer to authentication (and
presumably implicitly integrity protection), then "by default" is
not needed, since the latter protection is mandatory.

Section 6.11.1.1

   In summary, the profile chosen for RPL is one that expects a fairly
   reliable network reasonably fast links so that RPL convergence will
   be triggered immediately upon recognition of link failure/recovery.

Is there a missing "with" in here, or something else in order to get
it to parse?

   [...] This the same
   behavior as that of other IGPs that do not have the Data-Plane
   options as RPL.

Is this suppposed to be ", as is the case for RPL"?  (Also, "This is"?)

In general, this section has an unclear overall structure/organization and
several instances of strange grammar/wording.  The RFC Editor will be of
some help with the latter, but generally is unwilling to take the
initiative to make the sorts of changes needed to address the former.

Section 6.11.1.7

   Local Repair: As soon as link breakage is detected, send No-Path DAO
   for all the targets that where reachable only via this link.  As soon

nit: s/where/were

Section 6.11.1.9

Please don't treat "security" as some single black-box concept; there are
gradiations within it and different attributes that can be relevant.  For
example, here we would probably say something like "Because the ACP links
already include provisions for confidentiality and integrity protection,
their usage at the RPL layer would be redundant, and so RPL security is not
used".  I guess the RPL security bits needed for per-participant
authentication (as opposed to a group key) are not entirely in place yet,
so it's hard to claim that RPL security would do much better than even
hop-by-hop ACP security measures.

Section 6.11.1.12-14

These sections do not match up with the template entries I see in
draft-ietf-roll-applicability-template-09; can you explain the discrepancy?

Section 6.12.5

I'm confused about the "ACP multi-access virtual interface" -- is
this only for the initial "link-local" flooding/discovery?
Otherwise, aren't the ACP secure channels inherently two-party?  I
don't think I understand what the multi-access benefit is, since my
understanding was that RPL was running on top of these link-local
secure channels.

Section 6.12.5

   The ACP virtual interface IPv6 link local address can be derived from
   any appropriate local mechanism such as node local EUI-48 or EUI-64
   ("EUI" stands for "Extended Unique Identifier").  It MUST NOT depend
   on something that is attackable from the Data-Plane such as the IPv6
   link-local address of the underlying physical interface, which can be
   attacked by SLAAC, or parameters of the secure channel encapsulation
   header that may not be protected by the secure channel mechanism.

Is this the same EUI that might be used on the Data-Plane like the MAC
address of the physical interface?

nit: s/Charly/Carol/

Section 7.2

   The description in the previous paragraph was specifically meant to
   illustrate that on hybrid L3/L2 devices that are common in
   enterprise, IoT and broadband aggregation, there is only the GRASP
   packet extraction (by Ethernet address) and GRASP link-local
   multicast per L2-port packet injection that has to consider L2 ports
   at the hardware forwarding level.  The remaining operations are
   purely ACP control plane and setup of secure channels across the L3
   interface.  This hopefully makes support for per-L2 port ACP on those
   hybrid devices easy.

Have you talked to any hardware manufacturers that would be able to remove
the "hopefully" from this statement?

   A generic issue with ACP in L2 switched networks is the interaction
   with the Spanning Tree Protocol.  Ideally, the ACP should be built
   also across ports that are blocked in STP so that the ACP does not
   depend on STP and can continue to run unaffected across STP topology
   changes (where re-convergence can be quite slow).  The above
   described simple implementation options are not sufficient for this.
   Instead they would simply have the ACP run across the active STP
   topology and the ACP would equally be interrupted and re-converge
   with STP changes.

This "Instead" is a little unclear, perhaps "They fail because the ACP
simply runs across the active STP topology [...]"?

Section 8.1.1

   The ACP connect interface must be (auto-)configured with an IPv6
   address prefix.  Is prefix SHOULD be covered by one of the (ULA)
   prefix(es) used in the ACP.  If using non-autonomic configuration, it
   SHOULD use the ACP Manual Addressing Sub-Scheme (Section 6.10.4).

I'm confused in what case ACP connect would be used with autonomic
configuration (and thus why the qualification is needed on the SHOULD).
I thought the whole thing was premised on the presence of a NMS that does
not implement ACP.

   In the simple case where the ACP uses only one ULA prefix and all ACP
   connect subnets have prefixes covered by that ULA prefix, NMS hosts
   can rely on [RFC6724]

Please include some exposition on the property being provided instead of
just citing the RFC.

   ACP Edge Nodes MUST only forward IPv6 packets received from an ACP
   connect interface into the ACP that has an IPv6 address from the ACP
   prefix assigned to this interface (sometimes called "RPF filtering").

This sentence is hard to parse as to what the "that has" restriction
applies to.  I think it's supposed to be that you only forward (IPv6 packets
with a source address from the ACP prefix) into the ACP, right?

   To limit the security impact of ACP connect, nodes supporting it
   SHOULD implement a security mechanism to allow configuration/use of
   ACP connect interfaces only on nodes explicitly targeted to be
   deployed with it (such as those physically secure locations like a
   NOC).  For example, the certificate of such node could include an
   extension required to permit configuration of ACP connect interfaces.

I think this would be better as "[...] could include a specific extension,
and that extension would be required to be present in order to permit
configuration [...]".  But who would enforce this requirement -- the ACP
implementation on the node that is compromised?  That does not seem to
provide the desired security property.  This also falls into Alissa's
comment about "future work".

Section 8.2.1

I think you need to include a reference for ABNF.

Section 9.1

      [...] Since the revocation check is only
      done at the establishment of a new security association, existing
      ones are not automatically torn down.  If an immediate disconnect
      is required, existing sessions to a freshly revoked node can be
      re-set.

How would the revoked node's peers know to perform such a re-set?  This
would seem to require some signaling protocol at revocation time.

   After a network partition, a re-merge will just establish the
   previous status, certificates can be renewed, the CRL is available,
   and new nodes can be enrolled everywhere.  Since all nodes use the
   same trust anchor, a re-merge will be smooth.

I believe this document has described schemes where not all nodes use the
same trust anchor [for signing their LDevID], so maybe this should be
"trust anchor(s)" plural?

   Merging two networks with different trust anchors requires the trust
   anchors to mutually trust each other (for example, by cross-signing).
   As long as the domain names are different, the addressing will not
   overlap (see Section 6.10).

Subject to the risk of a 40-bit collision in SHA256!  While not necessarily
a critical flaw at this time, the limitation should probably be mentioned.

Section 9.2.1

I think you need informative references to all the listed protocols that
the ACP could serve as protection for.

% remote attacks are impossible

Remote attacks to DoS by resource consumption the nodes involved would
still work fine, so "impossible" is probably overstating it.

Section 9.2.2

I expected to see something about the importance of being able to detect a
compromised node and revoke its certificate.  Ideally this could be
automated (with the detecting node providing proof of compromise in some
fashion), though the details of that would probably be hard to get right.

Section 9.3

"independent of configuration" is in conflict with the discussion of
configuring ACP connect, configured remote ACP neighbors, etc.

Section 10.2.2

      For BRSKI or other mechanisms using vouchers: Parameters to
      determine how to retrieve vouchers for specific type of secure
      bootstrap candidate ACP nodes (such as MASA URLs), unless this
      information is automatically learned such as from the LDevID of
      candidate ACP nodes (as defined in BRSKI).

I thought the LDevID was essentially synonymous with "ACP domain
certificate" in this document, so I can't understand what this means
(unless IDevID was intended).

Section 10.2.3

   [...] And without additional centralized tracking of
   assigned certificates there is no way to do this - assuming one can
   not retrieve this information from the .

Missing the end of the sentence?

Section 10.2.4

   Or let it expire and not renew it, when the certificate of the sub-CA
   is appropriately short-lived.

This sort of sentiment has been highly controversial in other contexts
(e.g., draft-nir-saag-star).  (Also, that's a sentence fragment.)

   Therefore ACP domain membership for an ACP node enrolled from a
   compromised ACP registrar will fail.

>From a compromised *and detected* registrar.

Section 10.3.x

This whole section feels more like a sketch of an idea than a
well-specified model or protocol.  It might be better if spun off into a
separate document, with time spent to produce (e.g.) a YANG module or state
machines for devices in various states.

Section 10.3.5.1

   [...] Automatic enablement of ACP/ANI in networks
   where the operator does not only not want ACP/ANI but where he likely
   never even heard of it could be quite irritating to him.  Especially
   when "down" behavior is changed to "admin down".

The behavior mentioned in this last sentence really ought to be called out
more clearly in the previous section as changing the semantics of existing
administrative controls.

Section 10.3.7

The control names indicated within double quotes are mostly incomplete
references; it seems better to say, e.g., """the "up-if-only" option for
node-level ACP/ANI enablement""".

Section 11

   An ACP is self-protecting and there is no need to apply configuration
   to make it secure.  Its security therefore does not depend on
   configuration.

It seems like there are some higher-level/potentially "external"
configurations needed, including but not limited to: setting up the
registrar, definng the Intent, assigning a ULA to use for the domain,
policy for the CA issuing domain certificates, and any interaction with
external systems that is needed.  (That is, a fully autonomous system would
be totally self-contained, and thereby not of much use to the humans
involved!)

"correct operation" to me usually means "this system is behaving as
expected", but I think the intended usage here is more like "being operated
and managed correctly".  I don't know if I'm enough of an outlier to make
it not worth changing the text.

I would like to see more text about the scope of damage that a compromised
ACP node can do, and suggesting detection/remediation measures.

Section 12

   Note that the objective format "SRV.<service-name>" is intended to be
   used for any <service-name> that is an [RFC6335] registered service
   name.  This is a proposed update to the GRASP registry subject to
   future work and only mentioned here for informational purposed to
   explain the unique format of the objective name.

I'm confused about what this is actually trying to say.  It sounds like it
is actually registering SRV.est (in the previous paragraph), but following
that up by saying that this isn't actually a real registered service name
yet, and we're just registering the objective name proactively for future
work?  If so, that does not seem like the correct thing to do.

Section A.2

   [...] This requires only that the BRSKI
   registrar honors expired domain certificates and that the pledge
   first attempts to perform TLS authentication for BRSKI bootstrap with
   its expired domain certificate - and only reverts to its IDevID when
   this fails.

The "first" is unclear -- perhaps "that the pledge attempts to perform TLS
authentication for BRSKI bootstrap using its expired domain certificate
before falling back to attempting to use its IDevID for BRSKI"?

Section A.4

      [...] RPL also has other scalability improvements,
      such as selecting only a subset of peers instead of all possible
      ones, and trickle support for information synchronization.

(But trickle support is not used in the ACP profile of RPL.)

Section A.8

This discussion of reusing preexisting MAC addresses violates the claim
that the ACP-internal addresses are not guessable from the data plane.