Re: [Anima] Benjamin Kaduk's No Objection on draft-ietf-anima-grasp-api-08: (with COMMENT)

Brian E Carpenter <> Thu, 03 December 2020 04:50 UTC

To: Benjamin Kaduk <>, The IESG <>
Cc:,,, Sheng Jiang <>
From: Brian E Carpenter <>
Date: Thu, 3 Dec 2020 17:50:25 +1300

Thanks Ben. Some comments below (no comment means I agree with you):

On 03-Dec-20 15:03, Benjamin Kaduk via Datatracker wrote:
> Benjamin Kaduk has entered the following ballot position for
> draft-ietf-anima-grasp-api-08: No Objection
> When responding, please keep the subject line intact and reply to all
> email addresses included in the To and CC lines. (Feel free to cut this
> introductory paragraph, however.)
> Please refer to
> for more information about IESG DISCUSS and COMMENT positions.
> The document, along with other ballot positions, can be found here:
> ----------------------------------------------------------------------
> ----------------------------------------------------------------------
> I have two comments in particular that I would like to call your
> attention to: my comment on cache flushing in Section 2.3.4, and my
> comment on the CBOR data model used for validation in Appendix A.
> Section 1
>    An ASA runs in an ACP node and therefore inherits all its security
>    properties, i.e., message integrity, message confidentiality and the
>    fact that unauthorized nodes cannot join the ACP.  All ASAs within a
> I agree with Roman's comment that the "it" whose security properties are
> inherited is the ACP *node*, not the ACP itself, and thus that some
> rewording is appropriate.
>    The GRASP API library would need to communicate with the GRASP core
>    via an inter-process communication (IPC) mechanism.  The details of
> Hmm, if the GRASP core is in kernel-space and the API library in
> userspace, wouldn't we normally refer to that exchange as a system call
> rather than IPC?  (Figure 1 also labels this interaction "IPC".)

That could be, and it may depend on the terminology used by the
language and/or operating system in use. However, it isn't clear to
me that the GRASP core *needs* to be in kernel space. Its lowest-level
operations are socket calls plus whatever it uses to support asynchronous
operations. So I think the text should be noncommittal on this.

> Section 2.1
>    *  Authorization of ASAs is not defined as part of GRASP and is not
>       supported.
> Any chance I could interest you in s/not supported/a subject for future
> work/?  It is looking somewhat likely since such a statement is already
> present in the security considerations...
>    *  User-supplied explicit locators for an objective are not
>       supported.  The GRASP core will supply the locator, using the ACP
>       address of the node concerned.
> This would seem to prevent any non-ACP use of GRASP; I suggest adding
> some language with a caveat about "for example" or similar, unless the
> intent is to limit the API usage to ACP (or DULL) scenarios.
> Section 2.2.1
> I think that the possibility for a single outbound message to get a
> sequence of incoming replies (at different times) further complicates
> the design of an asynchronous mechanism, and we would do well to discuss
> how such scenarios (e.g., broadcast discovery messages) would be handled
> by the implementation and API.  (I see that we do end up using a timeout
> in practice to resolve this topic, but would probably still mention it
> as an issue that has been resolved, here.)

There's quite a lot of text in the main GRASP spec about that, because
indeed a discovery message may get an unpredictable number of replies.
Some of that text is a direct result of prototyping experience. I think
the summary is that discovery is not a deterministic process. If there
are N nodes handling objective O, you will discover some number of
them <=N before the timeout expires. That semantic should probably
be mentioned in the API spec.
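
As a rough illustration of that semantic (not code from the draft or from
any GRASP implementation), a caller-side loop that gathers replies until
the timeout expires might look like the sketch below. The function name,
the use of a queue.Queue fed by some receiver thread, and the string
locators are all assumptions made for the example.

```python
import queue
import time

def collect_discovery_replies(replies, timeout_s):
    """Drain locators from `replies` until `timeout_s` seconds have elapsed.

    `replies` is a queue.Queue that a (hypothetical) receiver thread fills
    as discovery responses arrive. With N responders, the result holds
    anywhere between 0 and N distinct locators.
    """
    deadline = time.monotonic() + timeout_s
    found = []
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break                         # timeout expired
        try:
            locator = replies.get(timeout=remaining)
        except queue.Empty:
            break                         # nothing more arrived in time
        if locator not in found:          # ignore duplicate responses
            found.append(locator)
    return found
```

The caller cannot tell whether it has seen every responder; it only knows
what arrived before the deadline.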

> Section 2.2.2
>    ports rather than a separate port per session.  Hence the GRASP
>    design includes a session identifier.  Thus, when necessary, a
>    'session_nonce' parameter is used in the API to distinguish
>    simultaneous GRASP sessions from each other, so that any number of
>    sessions may proceed asynchronously in parallel.
> I do see that there was previous discussion on the 'nonce' terminology
> here, and I am unsure why there is need to move away from the "session
> ID" terminology used in GRASP itself.  In particular, the
> "session_nonce" is not a number used *once*, rather, it is used only for
> one session (but potentially multiple times within that session).  That,
> to me, makes it a (short-lived) identifier, not a nonce.  Roman's
> proposal of 'handle' would resolve this apparent disparity.

Yes, I don't have strong feelings about that (except running code
using the "_nonce" terminology).

> Section 2.2.3
>    On the first call in a new GRASP session, the API returns a
>    'session_nonce' value based on the GRASP session identifier.  This
> What does "based on" mean?  Does there need to be a one-to-one
> correspondence?  Or just in one direction?  Are we going to be
> constrained by the (IMO, too limited) 32 bits of randomness limit of the
> GRASP Session ID?

Re the "based on", we have other comments that lead to touching up that text.

Re the 32 bits, it's only intended to make a collision unlikely. It
only needs to be unique within a node, and that is trivial to check.
There is no presumption that it has cryptographic value.
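
To show why the per-node uniqueness check is trivial, here is a minimal
sketch. It assumes the node keeps a set of its active session IDs; the
function name and the use of Python's secrets module are choices made for
this example, not anything the GRASP spec requires.

```python
import secrets

def new_session_id(active_ids):
    """Return a random 32-bit ID not present in `active_ids`, and record it."""
    while True:
        candidate = secrets.randbits(32)
        if candidate not in active_ids:   # uniqueness only matters within this node
            active_ids.add(candidate)
            return candidate
```

A retry happens only on a local collision, which is rare unless the node
has a very large number of sessions open at once.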

> Section
>    -  Note 3: In a language such as C the preferred implementation
>       may be to represent the Boolean flags as bits in a single byte,
> Which aspect(s) of C are relevant for the "such as"?

Dunno... I'd certainly do it that way in assembly language too.

>    An essential requirement for all language mappings and all
>    implementations is that, regardless of what other options exist
>    for a language-specific representation of the value, there is
>    always an option to use a raw CBOR data item as the value.  The
>    API will then wrap this with CBOR Tag 24 as an encoded CBOR data
>    item [RFC7049] for transmission via GRASP, and unwrap it after
>    reception.
> I'm not sure I understand why the bstr wrapping is mandatory -- I would
> have thought that the attraction of using a raw encoded CBOR data item
> would be that it could be used directly, without additional wrapping.

Generally, yes. But this covers a corner case where for whatever
reason the user provides a CBOR-encoded thing where you'd expect
the thing itself. It isn't a situation that arises if the programming
language is object-oriented, but might arise if an ANSI C programmer
wants to pass a compound object as the value.

(Coding this in my prototype really made my head hurt.)
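
For readers unfamiliar with Tag 24, the following sketch shows the wrapping
and unwrapping at the byte level. It is hand-rolled and deliberately
minimal (definite lengths only); the helper names are invented for this
example and it is not the code from my prototype.

```python
def _head(major, n):
    """Encode a CBOR initial byte plus argument `n` for major type `major`."""
    if n < 24:
        return bytes([(major << 5) | n])
    for extra, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if n < (1 << (8 * size)):
            return bytes([(major << 5) | extra]) + n.to_bytes(size, "big")
    raise ValueError("argument too large for CBOR")

def _read_head(data, pos):
    """Decode one head at `pos`; return (major type, argument, next position)."""
    initial = data[pos]
    major, info = initial >> 5, initial & 0x1F
    if info < 24:
        return major, info, pos + 1
    if info > 27:
        raise ValueError("indefinite or reserved length not handled in this sketch")
    size = 1 << (info - 24)               # 1, 2, 4 or 8 argument bytes
    n = int.from_bytes(data[pos + 1:pos + 1 + size], "big")
    return major, n, pos + 1 + size

def wrap_tag24(encoded_item):
    """Wrap already-encoded CBOR bytes as tag 24 enclosing a byte string."""
    return _head(6, 24) + _head(2, len(encoded_item)) + encoded_item

def unwrap_tag24(data):
    """Return the encoded CBOR bytes carried inside a tag 24 item."""
    major, tag, pos = _read_head(data, 0)
    if (major, tag) != (6, 24):
        raise ValueError("not a tag 24 item")
    major, length, pos = _read_head(data, pos)
    if major != 2 or pos + length != len(data):
        raise ValueError("tag 24 must enclose exactly one byte string")
    return data[pos:pos + length]
```

For example, the encoded array [1, 2, 3] (hex 83010203) becomes
d8184483010203 after wrapping.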

>     int loop_count;
>     int value_size;           // size of value in bytes
> Some people might argue for using unsigned types for at least sizes
> (e.g., size_t), and often for things like loop counts that cannot be
> negative (though the argument for an unsigned type there is somewhat
> weaker).

All improvements to my weak C skills are very welcome.

>         self.value = 0      # Place holder; any valid Python object
> Wouldn't None be a more conventional placeholder in Python?

I think that's right. This class definition is the very first piece
of real Python code that I ever wrote, so I'll patch it.
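
A cut-down sketch of what the patched placeholder could look like. This is
not the full class from the draft: the class name is changed, only three
attributes are shown, and the default loop count of 6 is an assumption
made for the example.

```python
class Objective:
    """Illustrative container for a GRASP objective (not the draft's full class)."""

    def __init__(self, name):
        self.name = name        # objective name (string)
        self.loop_count = 6     # assumed default, for illustration only
        self.value = None       # placeholder; any valid Python object
```

With None as the placeholder, "no value set yet" can be told apart from a
legitimate value of 0.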

> Section
>    *  The following cover all locator types currently supported by
>       GRASP:
>       -  is_ipaddress (Boolean) - True if the locator is an IP address
>       -  is_fqdn (Boolean) - True if the locator is an FQDN
>       -  is_uri (Boolean) - True if the locator is a URI
> Are these mutually exclusive?


> Section
> As for the GRASP session ID, I think that a 32-bit cap is too
> restrictive.  I think we should be in the habit of using 128-bit nonces
> and needing to justify anything smaller.  (64 bits would *probably* be
> fine here, FWIW, and might make it easier to represent in common
> language bindings.)

As above, we really aren't looking for cryptographic unguessability here.

>    Section  Another possible implementation is to hash the
>    name of the ASA with a locally defined secret key.
> I recognize that this is a throwaway line, 

I'm inclined to delete it. When we get around to authorization
mechanisms in ANIMA, we'll have to do something more serious anyway
(or recycle an existing authzn mechanism).

> but the naive keyed hash
> construction is subject to length-extension attacks (for certain hash
> constructions such as the Merkle-Damgård family that includes SHA-2);
> HMAC is more robust for this type of usage and can be phrased in a
> similarly concise manner ("compute an HMAC of the name of the ASA under
> a locally defined secret key").
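
The HMAC construction suggested in the comment above can be sketched with
the Python standard library as follows. The function name, the choice of
SHA-256 and the hex output are assumptions for illustration; neither the
draft nor the review specifies them.

```python
import hashlib
import hmac

def asa_handle(asa_name, secret_key):
    """Return a hex HMAC-SHA-256 of the ASA name under a locally defined key."""
    return hmac.new(secret_key, asa_name.encode("utf-8"), hashlib.sha256).hexdigest()
```

Unlike a naive hash(key || name), this is not exposed to length-extension
attacks.
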
> Section 2.3.3
>    *  deregister_asa()
>       [...]
>       -  Note - the ASA name is strictly speaking redundant in this
>          call, but is present for clarity.
> So what happens if the wrong name is passed?

A fail. We have to update this anyway because of another comment.

>          transmit to other ASAs.  It is not necessary to register an
>          objective that is only received by GRASP synchronization or
>          [...]
>          Registration is not needed for "read-only" operations, i.e.,
>          the ASA only wants to receive synchronization or flooded data
>          for the objective concerned.
> These seem to have high overlap and thus be candidates for
> deduplication.

The difference is that synch is for rarely required data and flood
is for generally required data. There's also ongoing work for
adding a pub/sub model (draft-ietf-anima-distribution).

>       -  The 'ttl' parameter is the valid lifetime (time to live) in
>          milliseconds of any discovery response for this objective.  The
> (nit?) I'd suggest to add "generated", since it would not apply to any
> hypothetical received discovery response for the objective in question.
>       -  If the parameter 'overlap' is True, more than one ASA may
>          register this objective in the same GRASP instance.
> Do all ASAs registering this objective have to set it to True, or just
> the first one, in order for the subsequent registrations to succeed?

Somebody else asked about that too, and the answer seems to be that
we need consistency, otherwise it's an error. (It's useful for
make-before-break updates of a running ASA, but that's a whole
other story.)
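
One possible reading of "we need consistency" is that every ASA sharing an
objective must have registered it with overlap set to True. The sketch
below implements that reading only; the class and method names are
hypothetical, and the final spec text may choose different rules.

```python
class ObjectiveRegistry:
    """Toy registry tracking which ASAs have registered each objective."""

    def __init__(self):
        self._entries = {}   # objective name -> (overlap flag, set of ASA names)

    def register(self, asa_name, objective_name, overlap=False):
        """Return True on success, False if the registration is inconsistent."""
        if objective_name not in self._entries:
            self._entries[objective_name] = (overlap, {asa_name})
            return True
        existing_overlap, owners = self._entries[objective_name]
        if not (overlap and existing_overlap):
            return False     # all registrants must have set overlap=True
        owners.add(asa_name)
        return True
```

Under this reading, a second registration fails if either the first or
the second caller left overlap as False.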

> Section 2.3.4
>       -  If the parameter 'minimum_TTL' is greater than zero, any
>          locally cached locators for the objective whose remaining time
>          to live in milliseconds is less than or equal to 'minimum_TTL'
>          are deleted first.  Thus 'minimum_TTL' = 0 will flush all
>          entries.
> Why does one ASA's request flush entries from the cache shared with
> other ASAs?  I am forced to infer the motivation for including the
> minimum_TTL parameter in the first place, but it seems like it is useful
> if the requesting ASA needs to find something that will remain active
> for a given period of time, but different ASAs may have different needs
> for the peer's stability, and so flushing the cache in this way could
> hamper the operation of peer ASAs.
> If the intent is only to not return those cached locators *for this
> discovery operation*, then say that, not that they are flushed from the
> cache entirely.

Yes, that question has been raised by someone else too. I'm not sure I
agree with you. In any case, flushing the discovery cache does not affect
GRASP sessions that are already in progress and does not force another ASA
to forget an address that it's already discovered. Once you've discovered
an address, you can go on using it until it fails, in which case you will
repeat discovery anyway. 
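
To make the two behaviours under discussion concrete, here is a small
sketch contrasting them. It assumes the cache is a dict mapping a locator
to its expiry time in milliseconds; both function names are invented for
the example, and the special case of 'minimum_TTL' = 0 is not modelled.

```python
def flush_short_lived(cache, minimum_ttl_ms, now_ms):
    """Delete entries whose remaining lifetime is <= minimum_ttl_ms.

    This changes the shared cache, so other ASAs see the effect too.
    """
    doomed = [loc for loc, expiry in cache.items()
              if expiry - now_ms <= minimum_ttl_ms]
    for loc in doomed:
        del cache[loc]
    return list(cache)

def filter_short_lived(cache, minimum_ttl_ms, now_ms):
    """Return only the long-lived locators for this call; the cache is untouched."""
    return [loc for loc, expiry in cache.items()
            if expiry - now_ms > minimum_ttl_ms]
```

Both return the same locators to the caller; they differ only in whether
the short-lived entries remain available to other ASAs afterwards.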

> Section 2.3.5
> Thanks for the figure (I probably should have put one into RFC 7546,
> which is basically this section but for the GSS-API).

I borrowed the format from RFC4101's example for IKE.

> I suggest noting in the first paragraph that the negotiation occurs in
> lockstep, with the initiator starting the negotiation and preparing a
> message, the responder processing that message and generating a new
> negotiation message in turn, with at most one negotiation message in
> flight at any given time.  It seems particularly important to note
> whether this also applies to negotiate_wait() calls/messages, or if
> those can be made at any time by either entity.  (This probably relates
> to some of the genart reviewer's comments.)


> I note that the prospect of the loop count going up (and, thus, risk of
> infinite looping) was pointed out by the genart review.  I share such
> concerns and am happy to see that improved discussion of this topic (and
> the related 'lifetime' extension) is planned.
>          For this and any other error code, an exponential backoff is
>          recommended before any retry.
> Any guidance about whether this should be by doubling vs a different
> exponent base?  I guess the security considerations do say that it's
> dependent on the semantics of the objective in question, which may be
> enough (though a pointer or mention here would be appreciated).
> (Also, any reason to not use the 2119 RECOMMENDED?)

Well, it's informational. I agree about the pointer to the sec. cons.
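
For what it's worth, a doubling backoff with a cap could be sketched as
below. The base delay, factor, cap and optional jitter are all assumptions
chosen for illustration; as noted, suitable values depend on the semantics
of the objective concerned.

```python
import random

def backoff_delays(base_s=1.0, factor=2.0, cap_s=60.0, retries=5, jitter=False):
    """Yield one delay in seconds per retry: base_s * factor**n, capped at cap_s."""
    for attempt in range(retries):
        delay = min(cap_s, base_s * factor ** attempt)
        if jitter:
            delay = random.uniform(0, delay)   # optional randomisation of each delay
        yield delay
```

A caller would sleep for each yielded delay before the corresponding
retry and give up once the generator is exhausted.
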
>       -  This function must be followed by calls to 'negotiate_step'
>          and/or 'negotiate_wait' and/or 'end_negotiate' until the
>          negotiation ends. 'listen_negotiate' may then be called again
>          to await a new negotiation.
> We just recommended a few paragraph previously that listen_negotiate()
> should be called again *immediately* after the first listen_negotiate()
> returns; I don't see why it's useful to also say that it might be called
> again after a given negotiation ends.

Yes. In any case there might be a use case where it would be better to wait.

>       -  Executes the next negotiation step with the peer.  The
>          'objective' parameter contains the next value being proffered
>          by the ASA in this step.  It must also contain the latest
>          'loop_count' value received from request_negotiate() or
>          negotiate_step().
> This is interesting; negotiate_step() must preserve the loop count from
> the previous call, so only the initial negotiation response (the
> request_negotiate() 'proffered_objective' output) can increase the loop
> count, not any arbitrary negotiation step?  That seems to limit concerns
> about infinite looping (as raised by the genart reviewer and apparently
> acknowledged in the response to the genart review).
>          o  Threaded implementation: Called in the same thread as the
>             preceding 'request_negotiate' or 'listen_negotiate', with
>             the same value of 'session_nonce'.
> IIUC it is *expected* to be called in the same thread as the previous
> call, but is not strictly speaking *required* to do so, since the
> session_nonce tracks the library state for the negotiation in question.
> Or am I mistaken?

No, that's true. I don't think there'd be a problem in principle.
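
A sketch of why it works in principle: if the library keeps negotiation
state in a table keyed by the session handle and guards it with a lock,
any thread holding the handle can continue the session. The class and
method names are hypothetical and do not come from the API.

```python
import threading

class SessionTable:
    """Per-session negotiation state keyed by handle, usable from any thread."""

    def __init__(self):
        self._lock = threading.Lock()
        self._state = {}

    def open(self, handle, state):
        with self._lock:
            self._state[handle] = dict(state)

    def step(self, handle, update):
        """Apply `update` to the session's state and return a copy of it."""
        with self._lock:
            self._state[handle].update(update)
            return dict(self._state[handle])

    def close(self, handle):
        """Remove and return the session's state, or None if unknown."""
        with self._lock:
            return self._state.pop(handle, None)
```

The thread that opened the session has no special role; only the handle
matters.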

>          'result' = True for accept (successful negotiation), False for
>          decline (failed negotiation).
>          'reason' = optional string describing reason for decline.
> What happens if I pass a reason string with result of True?

It will be ignored. We should specify that.

> Section 2.3.6
>       -  If the 'peer' parameter is null, and the objective is already
>          available in the local cache, the flooded objective is returned
>          immediately in the 'result' parameter.  In this case, the
>          'timeout' is ignored.
>       -  Otherwise, synchronization with a discovered ASA is performed.
>          If successful, the retrieved objective is returned in the
>          'result' parameter.
> From context this 'otherwise' seems to be the "'peer' parameter is null
> but the objective is not available in the local cache" case (as opposed
> to also covering the "'peer' parameter is not null" case).  It might be
> possible to clarify this with formatting and/or rewording.

Yes, this needs some fixing because of other comments too.

>    *  synchronize()
>       [...]
>       -  Since this is essentially a read operation, any ASA can do it,
>          unless an authorization model is added to GRASP in future.
>          Therefore the API checks that the ASA is registered, but the
>          objective does not need to be registered by the calling ASA.
>       [...]
>       -  Since this is essentially a read operation, any ASA can use it.
>          Therefore GRASP checks that the calling ASA is registered but
>          the objective doesn't need to be registered by the calling ASA.
> These seem redundant and candidates for de-duplication.

>       -  In the case of failure, an exponential backoff is recommended
>          before retrying.
> [same remark as previously]
> Section 2.3.7
>          'info' = optional diagnostic data.  May be raw bytes from the
>          invalid message.
> This means it does not have to be well-formed CBOR, and will be wrapped
> in a bstr by the library?  (The GRASP spec suggests that a different
> CBOR structure would be permitted, though of course the API need not be
> required to expose such flexibility.)

Yes, since it's ?any in the CDDL definition of the message, it could indeed
be a wrapped bstr. This is a bit like ICMP.

> Section 4
> If we're going to keep the 32-bit nonce/handle/etc, it's probably worth
> a mention of collision/guessing probability.
> It might be worth a reference to the RFC 3986 security considerations
> since we do allow URI locators.  This is not really any different than
> for GRASP itself, but the URI is exposed to the API consumer and so
> reminding them about it seems worthwhile.
> The session_nonce is nominally opaque to (non-ACP, at least) ASAs, but
> is likely to be implemented in a way that does preserve some state.  Is
> there a risk if an ASA attempts to "peek through the abstraction
> barrier"?  (I am not sure I see one, but you're the expert!)

Well, that's the case where a custom hash might be a safer implementation.
However, knowing a GRASP session ID really isn't a big win for an
attacker that's already got control of an ASA, and the ACP is
supposed to keep 3rd parties out anyway.
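
On the collision probability for 32-bit identifiers, the usual birthday
approximation gives a feel for the numbers. The sketch below assumes the
identifiers are drawn uniformly at random; the function name is made up
for the example, and it says nothing about guessing by an attacker.

```python
import math

def collision_probability(n_sessions, bits=32):
    """Approximate chance that at least two of n random `bits`-bit IDs collide."""
    space = 2 ** bits
    # Birthday approximation: 1 - exp(-n(n-1) / (2 * space))
    return -math.expm1(-n_sessions * (n_sessions - 1) / (2 * space))
```

With 1000 simultaneous sessions the result is roughly 0.00012, and it
reaches about one half at around 77,000 sessions.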

>    GRASP objective concerned.  These precautions are intended to assist
>    the detection of malicious denial of service attacks.
> I suggest to drop the word "malicious"; such denial of service
> conditions need not be malicious and can occur by accident.


>    As a general precaution, all ASAs able to handle multiple negotiation
>    or synchronization requests in parallel may protect themselves
>    against a denial of service attack by limiting the number of requests
>    they can handle simultaneously and silently discarding excess
>    requests.
> I think that best practices would also include some limit on the number
> of objectives registered by a given ASA and possibly the number of ASAs
> registered, to protect the core library/kernel resources.
> (nit?) I suggest dropping 'can'.
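
The precaution quoted above (limit simultaneous requests, silently discard
the excess) could be sketched like this. The class name and the
handler/request shapes are assumptions for illustration, not part of the
API.

```python
import threading

class RequestLimiter:
    """Allow at most `max_parallel` requests to be handled at the same time."""

    def __init__(self, max_parallel):
        self._slots = threading.BoundedSemaphore(max_parallel)

    def try_handle(self, handler, request):
        """Run handler(request) if a slot is free; otherwise drop it silently."""
        if not self._slots.acquire(blocking=False):
            return False          # excess request: discarded without any reply
        try:
            handler(request)
            return True
        finally:
            self._slots.release()
```

The same pattern could cap the number of registered objectives or ASAs,
as suggested in the comment.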


> Appendix A
> There was some discussion with the genart reviewer about the CBORfail
> error code as being particularly useful.  I note that
> draft-ietf-cbor-7049bis is in AUTH48 and introduces a hierarchy of
> "levels of validation" (in the form of different data models).  CBOR
> that is valid in the generic data model might not be valid in the
> extended data model or a data model specific to a given application.  I
> strongly encourage this document to update to referencing 7049bis and
> giving an indication of what data model is in use for processing both
> information received from the peer and any CBOR-encoded data received
> from the ASA.

I will cc this off-list to Carsten.

====>>>> Carsten! Help!

>    'noSecurity' error will be returned to most calls if GRASP is running
>    in an insecure mode (no ACP), except for the specific DULL usage mode
> My understanding of the text in the GRASP spec itself was that non-ACP
> security services were allowed.  Is the API intended to be limited to
> only ACP usage?

Well, there's a difference between *the* ACP (draft-ietf-anima-autonomic-control-plane)
and *an* ACP (an equivalent security substrate). Using GRASP with
no security substrate is definitely NOT RECOMMENDED ;-)

Will rephrase slightly.

>    ASAfull          4 "ASA registry full"  (register_asa)
>    dupASA           5 "Duplicate ASA name" (register_asa)
>    noASA            6 "ASA not registered"
>    notYourASA       7 "ASA registered but not by you"
> Giving this much detail is making things much easier for malicious ASAs
> ... but given that the deployment model basically assumes that such
> things don't exist (even if we do give some small consideration to the
> possibility in some places), I will not complain about retaining this
> level of detail in the error messages.
>    noDiscReply     17 "No reply to discovery"
>                                  (req_negotiate)
> There is perhaps some explanation to give about the distinction between
> noReply and noDiscReply, i.e., in the body text.  Maybe it is
> self-explanatory, though, provided that the author of the code notices
> that noDiscReply exists at all.
> Likewise for noNegReply, noSynchReply, noValidSynch, and, possibly,
> noValidStep.

Yes. Will do.

Thanks for the careful review.