[Anima] review of autonomic-control-plane-04

Michael Richardson <mcr+ietf@sandelman.ca> Wed, 30 November 2016 16:40 UTC

From: Michael Richardson <mcr+ietf@sandelman.ca>
To: anima@ietf.org
MIME-Version: 1.0
Content-Type: multipart/signed; boundary="=-=-="; micalg="pgp-sha1"; protocol="application/pgp-signature"
Date: Wed, 30 Nov 2016 11:22:25 -0500
Message-ID: <12657.1480522945@obiwan.sandelman.ca>
Archived-At: <https://mailarchive.ietf.org/arch/msg/anima/nZpEphrTqDCBdzsKMpaIn2gsIzI>
Subject: [Anima] review of autonomic-control-plane-04
Precedence: list

This is a review of: draft-ietf-anima-autonomic-control-plane-04.

I will attempt to reply to parts of my review with more pertinent subject
lines, because some substantive comments/discussion are embedded, or feel
free to do that yourself.

Sorry that it's 400 lines long; if you'd like, I can find an XML file and
submit patches.

section 1:

s/access devices through console ports/
 /access devices through console ports (craft ports)/

{
 cf: http://tldp.org/HOWTO/Remote-Serial-Console-HOWTO/intro-why.html
 There are many pages asking why the telcos call the console port the
 "Craft" port, and it seems to have something to do with providing
 access to the system to the "craftsmen personnel" to verify if the system
 they just installed was operational.
}


change: For example, GRASP
      [I-D.ietf-anima-grasp] can run inside the ACP.

to: For example, GRASP
      [I-D.ietf-anima-grasp] can run securely inside the ACP.

section 3, can we number the requirements like in GRASP-08, i.e:
        ACP1, ACP2, etc.


I think that this is confusing:
   It may be necessary to have end-to-end connectivity in some cases,
   for example to provide an end-to-end security association for some
   protocols.  This is possible, but then has a dependency on routable
   address space.

I think that you mean to say that the ACP could run *OVER* some kind of
global end-to-end connectivity, but that then it depends upon routable
address space.

But, as I read it, it suggests that some protocols *inside* the ACP
might need end-to-end connectivity, and this would depend upon routable
address space.  (My take is that the purpose of the ACP is to provide
end-to-end connectivity for protocols that run inside the ACP, and I think
we all agree about that)

section 4:
        "Intent can override this default policy."
        Instead of getting into what an Intent is, and confusing the security
        reviewers, since we don't define it, can we just instead say:

        "Unless overridden by some other policy, the default policy is: To
         all adjacent nodes in the same domain.  "

Can we number the steps in this section?

Please turn the following three points into numbered paragraphs or points
separate from the previous "steps", since they are really notes:
   o  Non-autonomic NMS systems or controllers have to be manually
      connected into the ACP.
   o  Connecting over non-autonomic Layer-3 clouds initially requires a
      tunnel between autonomic nodes.
   o  None of the above operations (except manual ones) is reflected in
      the configuration of the device.

Your diagram is great.

I have heard some say that they would want to enable the ACP on interfaces
which were marked Admin Down, with maybe even some kind of auto-negotiate
(or auto-guess based upon energy detection) of lambdas.  Is it worth saying
something in section 4 about this?

5.1:
   specific Unique Device Identifier (UDI) or IDevID certificate.
   (Note: the UDI used in this document is NOT the UUID specified in
   [RFC4122].)

how about telling us what the UDI is, rather than what it isn't?
Isn't this a Cisco internal term?
Is this here to steer your colleagues correctly? (I don't object to it being
there, I just want to make sure that it doesn't confuse others)

====
   The domain certificate (LDevID) of an autonomic node MUST contain
   ANIMA specific information, specifically the domain name, and its ACP
   address with the zone-ID set to zero.  This information MUST be
   encoded in the LDevID in the subjectAltName / rfc822Name field in the
   following way:

   anima.acp+<ACP address>@<domain>

   An example:

   anima.acp+FD99:B02D:8EC3:0:200:0:6400:1@example.com

This puts some pretty clear and pretty strong requirements onto the
Registrar, which I think belongs in the bootstrap document.  We don't really
have a place for this.  I will start a new thread about this part.
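
To make the encoding concrete, here is a minimal sketch of how a peer might
parse the rfc822Name quoted above.  The function name and the error handling
are hypothetical; only the "anima.acp+<ACP address>@<domain>" layout comes
from the draft text.

```python
import ipaddress

ACP_PREFIX = "anima.acp+"  # literal prefix from the quoted draft text

def parse_acp_rfc822name(value):
    """Split 'anima.acp+<ACP address>@<domain>' into (IPv6Address, domain).

    Hypothetical helper: raises ValueError if the layout does not match.
    """
    local, sep, domain = value.rpartition("@")
    if not sep or not domain:
        raise ValueError("missing @<domain> part")
    if not local.startswith(ACP_PREFIX):
        raise ValueError("missing anima.acp+ prefix")
    # ipaddress raises AddressValueError (a ValueError subclass) on bad input
    addr = ipaddress.IPv6Address(local[len(ACP_PREFIX):])
    return addr, domain

addr, domain = parse_acp_rfc822name(
    "anima.acp+FD99:B02D:8EC3:0:200:0:6400:1@example.com")
print(addr, domain)
```

Whatever creates the LDevID has to produce exactly this string, which is why
the requirement lands on the Registrar.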

5.1.2:
please move this elsewhere, as the table has not yet been defined, and you
are already making exceptions to it:
   Where the next autonomic device is not directly adjacent, the
   information in the adjacency table can be supplemented by
   configuration.  For example, the node-ID and IP address could be
   configured.

This also seems like a premature optimization:
   The adjacency table MAY contain information about the validity and
   trust of the adjacent autonomic node's certificate.  However,
   subsequent steps MUST always start with authenticating the peer.

In diagram Figure 2, please change "ANrtrI" to another letter, because
"I" and "1" are hard to distinguish.

It's true that a full mesh of ACP channels will be built: we ideally need to
create some metrics for RPL to use to pick parents.  It would be desirable
to be aware of the L2 fabric.

You are, I think, suggesting having the ANswitchX block forwarding of the
ALL_GRASP_NEIGHBOR mcast group.  I'm not sure I like this solution.

One possible solution to the large number of channels is not to create the
IPsec CHILD SA unless needed.  This would be possible if the IKEv2 daemon
was also the RPL routing daemon, and we did RPL over the IKEv2 layer.
Also really sick layer violations :-)

5.2.2:
   Unfortunately, they [CDP? mDNS?] will also
   terminate their messages if they do not support the ACP and would
   then inhibit ACP neighbor discovery

Can you explain this?  I don't understand what you are saying from the text.
Are you saying that an L2 switch that spoke LLDP, but didn't speak ACP,
would eat the LLDP rather than forward it, and that in this case we want it
to be forwarded?  (A switch which didn't speak LLDP would just forward it.)

It's good to point out that LLDP is not forwarded, and that L2 switches
already do this kind of thing when considering how to limit the ACP discovery.

It seems like the point of 5.2.2 is to discuss why we can't use mDNS.
Maybe we could split the CDP/LLDP section from the mDNS section. (Looks like
just a section header would do that)

5.2.3: we need to define it more clearly in this document, and if we want
       to point elsewhere, we need to point to new anobjectives document.

5.2.4: I will write some text in coordination with the next update to BRSKI
       to point at M_FLOOD.

XXX    I feel it is important to combine the ACP discovery with the proxy
       discovery.

       Thanks for noticing richardson-anima-6join-discovery and pointing to
       it.  I think that there will be some changes to this too.
       {so many documents, so little time}

5.3:
        This is interesting, you are suggesting that while many nodes may
        be part of the "example.com" domain, that the ACP would only be
        established among some subset of it.
        I can see how it might be important to connect CPE devices
        (with "*.access.example.com" certificates)
        to a different ACP than the core routers (with "*.core.example.com").
        One way is to run two instances of GRASP, and enroll each instance
        separately with different certificates.  Another way might be to give
        the access concentrators certificates with multiple CNs.

        A third way might be to create some kind of ACP proxy/tunnel
        mechanism that permitted the CPE devices to build ACP tunnels
        *through* the access concentrators, via the "core" ACP, to the
        access network infrastructure.
        I have another use for such a thing, which is providing ACP backhauls
        in multi-tenant data centers.  I will start an entirely new thread
        on this.

        I suggest third paragraph, "Intent can change.." be written:
           This ACP document puts a requirement that Intents be able to
           change this default behaviour.  The precise way in which this
           should be expressed needs to be defined outside this document.
           Example Intent policies which need to be supported include:

5.4:
   From the use-cases it is clear that not all type of autonomic devices
   can or need to connect directly to each other or are able to support
   or prefer all possible mechanisms.  For example, code space limited
   IoT devices may only support dTLS (because that code exists already

I claim that any "IoT" device that is "big enough" to participate meaningfully
in the ACP is also big enough to support the common protocols other than
DTLS.   The ACP should be connecting lighting controllers, not light bulbs.

As for MACsec vs IPsec, it is my understanding that MACsec does have a key
management protocol defined for it by the IEEE, so really the common
situation is that one supports IKEv2 to negotiate if one supports IPsec
or MACsec.

As for the two stage process, I don't want to do this.  I want to just
use IKEv2, and I claim that there will be no people who will say, "I can not
live with this". (Of course, some may have other preferences, but preferences
do not equal rough consensus)

     ...Alice must be able
     to simultaneously act as a responder in parallel for all of them - so
     that she can respond to any order in which Bob wants to prefer...

it's this part that I think is too complex and error prone to code.

5.5.1:
   encryption.  Further parameter options can be negotiated via IKEv2 or
   via GRASP/TLS.

I think that the last sentence should be struck out; I think it is
meaningless.  IKEv2 negotiates everything, and there is no GRASP/TLS.

5.5.2: ACP via GRE/IPsec

Given that you add GRE here, I don't understand 5.5.1.
Do you mean to write that 5.5.1 is really IPsec(transport-mode) IPIP(94)?
And 5.5.2 is really IPsec(transport-mode) GRE(47)?

   Note that without explicit negotiation (eg: via GRASP/TLS), this
   method is incompatible to direct ACP via IPsec, so it must only be
   used as an option during GRASP/TLS negotiation.

That's not true. IKEv2 can negotiate this quite well.  We may want to
define some Notify messages to make it abundantly clear that this is
an ACP negotiation going on, but that's easy.

5.5.3.  ACP via dTLS
So, it's UDP and then... ? GRE inside UDP? (there is a draft tsvwg-gre-in-udp-encap-19)

   When Alice and Bob successfully establish the GRASP/TSL session, they
   will initially negotiate the channel mechanism to use.

Yeah, no.  Tons of code with no benefit.
Who is actually asking for these options?

5.5.5.  ACP Security Profiles

   A baseline autonomic device MUST support IPsec and SHOULD support
   GRASP/TLS and dTLS.  A constrained autonomic device MUST support
   dTLS.

if we want to do something for constrained devices, then we should say that
they always initiate, that they should join as RPL leafs (so no forwarding of
packets), and that the LWIG version of IKEv2 should be supported,
and maybe the diet-ESP mechanisms.  We should also be clear if we are trying
to support constrained devices, constrained networks, or challenged networks.

5.7:
to:
   If possible by the platform SW architecture,
   separation options that minimize shared components are preferred.

add:
   ..such as a logical container (reference to Linux container), or
   virtual machine instance (reference to KVM and also to the Cisco
   router VM platform)

   o  Usage: Autonomic addresses are exclusively used for self-
      management functions inside a trusted domain.  They are not used
      for user traffic.  Communications with entities outside the

s/user/customer/
  - whichever term we use, we may want to put this into the terminology

s/consensus was to use standard ULA, because it was deemed to be/
 /consensus was to use ULA-random [RFC4193 with L=1], because it was deemed
 to be/

      as the first 40 bits of the MD5 hash of the domain name, in the
      example "example.com".

we will get beat up for using MD5 by someone who uses grep, even as a PRF
here.  Might as well just say SHA256, as it costs nothing here.
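
A minimal sketch of the SHA256 variant, assuming the domain name is hashed as
plain ASCII bytes and that the 40 hash bits follow the fd00::/8 ULA prefix
(L=1).  The function name and the input encoding are assumptions, not taken
from the draft.

```python
import hashlib
import ipaddress

def ula_prefix_from_domain(domain):
    """Return a /48 prefix: 0xfd followed by the first 40 bits of SHA-256(domain).

    Assumption: the domain is hashed exactly as given, ASCII-encoded.
    """
    digest = hashlib.sha256(domain.encode("ascii")).digest()
    global_id = int.from_bytes(digest[:5], "big")   # first 40 bits of the hash
    prefix_int = (0xFD << 120) | (global_id << 80)  # 8 + 40 bits, rest zero
    return ipaddress.IPv6Network((prefix_int, 48))

print(ula_prefix_from_domain("example.com"))
```

The hash is only used as a deterministic way to spread domains over the ULA
space, so swapping MD5 for SHA256 changes nothing except the resulting bits.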

   o  Type: This field allows different address sub-schemes in the

In IANA Considerations, I suggest Standards Action, with 111 reserved for
private use.

I would like V to be at least three bits, maybe 8 bits.
In the bootstrap proxy IPIP mechanism, we need to allocate an ACP address
for each insecure L2-domain ("port") so that traffic from the Registrar
(which inside the IPIP header is v6LL) can get back to the correct link-layer.

I have mixed feelings about the 48-bit Registrar ID.
I know why you did it, and why you'd want to use 48-bits.
(It took two reads to realize it was the Registrar's MAC, not the enrolled
node's MAC address).

So the diagram is really:

        48        3   13         48          15        1
   +-------------+-+--------+-------------+----------+---+
   | hash(domain)|T| ZoneID |Registrar ID |Device Num| V |
   +-------------+-+--------+-------------+----------+---+

Since we never care about the /64 boundary in RPL, since we pass around
/128 routes in the ACP, do we care if we've placed the Registrar ID
here?  Clearly it is nice because we have ZoneID as a /64.
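
As a sanity check on the first diagram (48+3+13+48+15+1 = 128 bits), here is a
small sketch that packs the fields into an address.  The field names and the
helper are hypothetical.  The sample values are chosen so that the result
coincides with the example address quoted from 5.1; whether the draft derived
its example this way is an assumption.

```python
import ipaddress

# (name, width in bits), most significant field first, as in the diagram
FIELDS = [("prefix", 48), ("type", 3), ("zone_id", 13),
          ("registrar_id", 48), ("device_num", 15), ("v", 1)]

def pack_acp_address(**values):
    """Concatenate the fields into one 128-bit IPv6 address."""
    assert sum(width for _, width in FIELDS) == 128
    addr = 0
    for name, width in FIELDS:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        addr = (addr << width) | value
    return ipaddress.IPv6Address(addr)

addr = pack_acp_address(prefix=0xFD99B02D8EC3, type=0, zone_id=0,
                        registrar_id=0x020000006400, device_num=0, v=1)
print(addr)  # fd99:b02d:8ec3:0:200:0:6400:1
```

With this layout the Type and ZoneID fields fill the last 16 bits of the
upper /64, and the Registrar ID starts the lower 64 bits.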

I'm thinking that I would like to instead do something like:

        48        2   46            32       16
   +-------------+-+------------+----------+-------------+
   | hash(domain)|T|Registrar ID|Device Num| V           |
   +-------------+-+------------+----------+-------------+

Where RegistrarID is still MAC address derived, with the G and U
bits removed, and we now have 2^32 space for devices (I think 2^16
might be too small if one includes CPE devices in the ACP, and
one has some churn over a decade+ of CPE devices).
There are now 16 bits available to do things, and we can pass /114
routes around in RPL, btw.  I'm open as to whether V remains
as a specified bit, or if "physical" machine is just V=0x0000.

If we need ZoneID, then I suggest that we can easily get it by
having different Registrar IDs for each zone.  If you want them in different
/64s then just construct the RegistrarID to be unique in the upper 14 bits.

This brings up an important aspect, which I know we have discussed before,
which is what does the certificate say, and how does it relate to IPsec
SA permissions, and therefore to ability for GRASP to trust things.
XXX I need to write something here to make it clearer that the ACP
    isn't so squishy in the middle...

5.8.4:
   If a device learns through an autonomic method or through
   configuration that it is part of a zone, it MUST also respond to its
   ACP address with that zone number.  In this case the ACP loopback is

I don't like this, because it seriously breaks up the aggregation that
might otherwise be possible.  I don't want explicit ZoneID, I'd rather
go with the Note: in 5.8.4, or use the RegistrarID.

5.9:
        Needs to say more clearly that we are using 6550 RPL,
        and we need to decide if we are using storing or non-storing mode.
        I strongly suggest that we want storing mode.
        We will have to define a bunch of other RPL parameters.

        We also need to be clear that RPL is occurring *within* the ACP
        channels.

        (Alternatively, we could run RPL outside the ACP channels, using RPL
        layer-3 security, and then set up ACP channels when we pick a parent.
        There are definite advantages to this, and also many downsides.
        I don't suggest this, but, it might be worth saying why)

5.10:
        When we establish multiple ACP channels RPL (or any other routing
        protocol!) will need to have some metrics to pick among them.
        I'm not sure what we can provide here, at the least, we should
        prefer shorter paths to longer ones.

        If an autonomic node decides to have a limit on how many channels
        it sets up, or how many it will set up with a particular peer,
        it SHOULD indicate a clear "thanks, I'm full" message in the ACP
        channel negotiation protocol (i.e. an IKEv2 Notification).

6.1:
        Is there a distinction between marking a port on a switch as
        "ACP access" (no ACP channel) to connect to the NMS, from
        a case where the switch is told to negotiate an ACP channel
        with the NMS machines (extending the ACP via explicit configuration,
        rather than via discovery)?
        I think so, does 6.1 cover only the "ACP access" case then?
        Can we give it a clear name?

I'd like to add a 6.3:
    ACP through third-party L3 Clouds

I'm thinking that a cooperating L3 device could M_FLOOD ACP, and then,
when the IKEv2 negotiation comes in, could have been configured to forward
the traffic back to a designated ACP node at the edge of the NMS.
(maybe over IPv4 including NATs).
The resulting tunnel would be an ESP-over-UDP tunnel.
A multi-tenant datacenter might provide this as a service to its tenants.
(where would the bandwidth come from?  The datacenter would probably
buy that from its transit tenants and be multihomed)

7.
   o  If an existing device gets revoked, it will automatically be
      denied access to the ACP as its domain certificate will be
      validated against a Certificate Revocation List during
      authentication.  Since the revocation check is only done at the

First mention of CRLs, btw!  This is one of the details that belong
in section 5.5.1/5.5.2.

      automatically torn down.  If an immediate disconnect is required,
      existing sessions to a freshly revoked device can be re-set.

the problem is that the knowledge to know to re-set is not distributed,
unless we do it via GRASP.  The detail missing is that we should be
restarting the IKEv2 Parent SAs periodically (we can do this without killing
the Child SAs), and doing the OCSP checks there.
I suggest that OCSP is probably the better solution rather than CRLs here as
we have an ACP over which to do it.  Max may have an opinion here, and maybe
we should do CRLs instead for reasons of network partition.

   There are few central dependencies: A certificate revocation list
   (CRL) may not be available during a network partition; a suitable
   policy to not immediately disconnect neighbors when no CRL is
   available can address this issue.

Assuming that "immediately" means that we eventually disconnect neighbours
when no CRL is available, isn't that the same as just making the CRL recheck
time longer?
  i.e. CRL check time of X + grace period Y
  same as: CRL check time = X+Y
           rekey time = X
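
The equivalence can be spelled out in a few lines.  This is only a sketch:
the function names and the sample values for X and Y are hypothetical, and
"age" means seconds since the last successful CRL fetch.

```python
def allowed_with_grace(crl_age, check_time, grace):
    """Policy A: CRL refreshed every check_time, plus a grace period."""
    return crl_age <= check_time + grace

def allowed_longer_check(crl_age, check_time):
    """Policy B: no grace period, just a (longer) CRL check time."""
    return crl_age <= check_time

X, Y = 3600, 1800  # hypothetical values, in seconds

# For every CRL age, A with (X, Y) decides exactly like B with X+Y.
for age in range(0, 3 * 3600, 60):
    assert allowed_with_grace(age, X, Y) == allowed_longer_check(age, X + Y)
```

The only remaining difference between the two is how often the rekey (and
with it the recheck) is attempted, which is the "rekey time = X" line.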

section 10 needs a discussion about source address spoofing within the ACP.

Appendix A:
      of the network, the less state needs to be maintained.  This
      adapts nicely to the typical network design.  Also, all changes
      below a common parent node are kept below that parent node.

this implies that we are using storing mode.  It's not true for non-storing
mode.


--
Michael Richardson <mcr+IETF@sandelman.ca>, Sandelman Software Works
 -= IPv6 IoT consulting =-

Attachment: signature.asc
