Re: [Anima] Review of ANIMA GRASP

Brian E Carpenter <brian.e.carpenter@gmail.com> Sat, 25 June 2016 02:13 UTC

To: Sheng Jiang <jiangsheng@huawei.com>, "anima@ietf.org" <anima@ietf.org>, "draft-ietf-anima-grasp.all@ietf.org" <draft-ietf-anima-grasp.all@ietf.org>
References: <5D36713D8A4E7348A7E10DF7437A4B927CA883CF@NKGEML515-MBX.china.huawei.com>
From: Brian E Carpenter <brian.e.carpenter@gmail.com>
Organization: University of Auckland
Message-ID: <85a59bee-fab6-9962-29c3-b3a49788da37@gmail.com>
Date: Sat, 25 Jun 2016 14:13:30 +1200
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:45.0) Gecko/20100101 Thunderbird/45.1.1
MIME-Version: 1.0
In-Reply-To: <5D36713D8A4E7348A7E10DF7437A4B927CA883CF@NKGEML515-MBX.china.huawei.com>
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
Archived-At: <https://mailarchive.ietf.org/arch/msg/anima/GHeDlL_9vL478plZX8jZmEqJ58o>
Subject: Re: [Anima] Review of ANIMA GRASP
Precedence: list

Hi Sheng,

Thanks so much. I will give some personal first responses below. I have
not commented on the points that are editorial or simple to correct.

For timing reasons, and because we are still hoping for some other reviews
before Berlin, I expect we will post a version before Berlin that only answers
some of the points.

On 24/06/2016 22:10, Sheng Jiang wrote:
> Hi, Brian & authors of GRASP,
> 
> Thanks so much for your hard work on GRASP. With my WG chair hat on, I have done a thorough review. (My review went much slower than I expected, so I did not complete it. This email only covers most of the concepts and considerations, up to section 3.4. I will send another mail next week for the rest, covering the protocol details.) In general, I feel the current document is already in good shape. My comments are below; most of them are minor and concern the requirements and overview sections.
> 
> Before going into details: in general, the current text still has room to improve in clearly distinguishing the target, role and responsibility of GRASP, ASAs and the autonomic network as a whole.

Yes, although of course some of that material intersects with the Reference Model
document. I think we also need some deep reviews of the Reference Model to get
this aspect correct.

> Second paragraph of Section 1, "Autonomic Service Agent" should be capitalized and a reference to RFC7575 should be given. It is an important term for understanding the context.
> 
> In the same paragraph, “There is no restriction on the type of parameters and resources concerned.” I am not sure what "restriction" means here. I guess this sentence means any parameters and resources could be the negotiation objective. Am I right?

Yes - we don't want to limit this in any way.

> In the same paragraph, should the atomic unit of discovery also be mentioned? Actually, it is not clear whether discovery is device oriented or ASA oriented.

Good point. I personally think it is objective oriented. A node can contain multiple
ASAs and an ASA can handle multiple objectives, so discovery must operate for
one objective. It is the most flexible design.
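
To make the objective-oriented view concrete, here is a minimal sketch.
It is not taken from the prototype or the draft; the class, the method
names and the objective names are all hypothetical. The point is only
that the lookup key for discovery is the objective, not the node or
the ASA.

```python
from collections import defaultdict


class Node:
    """Hypothetical node hosting several ASAs (illustration only)."""

    def __init__(self, locator):
        self.locator = locator
        # objective name -> names of local ASAs that handle it
        self._handlers = defaultdict(set)

    def register(self, asa, objective):
        """Record that a local ASA handles the given objective."""
        self._handlers[objective].add(asa)

    def discover(self, objective):
        """Answer a discovery request for exactly one objective.

        Returns this node's locator if any local ASA handles the
        objective, otherwise None.
        """
        if self._handlers.get(objective):
            return self.locator
        return None


node = Node("fe80::1")
node.register("asa_a", "example_prefix")   # one ASA, two objectives
node.register("asa_a", "example_dns")
node.register("asa_b", "example_prefix")   # two ASAs, same objective
```

Each discovery request names one objective, so a node with many ASAs
answers each request independently.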

> Third paragraph of Section 1, “Negotiation is … between the negotiating devices”.  Is it possible to negotiate between ASAs or instances in the same device?

I can answer for my prototype code: yes. I can show you this in Berlin: three ASAs
negotiating simultaneously in my laptop, with three instances of GRASP running
in fact. (Thanks to Toerless; this works because he persuaded us to change to a
separate TCP port for each instance.)

> Fourth paragraph of Section 1, I don’t understand the purpose of the text from “Although” onwards. It seems the 3rd, 4th and 5th sentences are saying, in a very obscure way, that negotiation or GRASP is not restricted by the topology. If my reading is right, please just say so. The last sentence looks like a widow to me, although the word “therefore” appears. For me, bootstrap may be a special ASA, but to GRASP it is no different from any other ASA, unless it raises some special requirements, which I did not see.

Yes, I think we can simplify this text to say that GRASP can run in any kind of
network topology. If you remember, a very early draft had quite a discussion
of hierarchy and this is really left-over text from that.

(I think the 'infrastructure ASAs' such as bootstrap and ACP-formation might have
some special technical requirements, but they should be out of scope here.)

> First paragraph of Section 2, might multiple ASAs manage the same technical objective? If yes, it should also be mentioned.

OK (of course this raises the coordination issue, but that is out of scope here).

> 
> Second paragraph of Section 2.1, “In some cases, when a new application session starts up within a device, the device or ASA may again lack information about relevant peers. It might be necessary to set up resources on multiple other devices, coordinated and matched to each other so that there is no wasted resource.” I found both sentences difficult to understand, until I assumed that the second sentence describes an example scenario for the first. If that is what the authors want to express, it would be better to merge them into one sentence using “for example”.
> 
> D3 in Section 2.1, “When an ASA starts up, it must require no information about any peers in order to discover them.” It is worth clarifying the other side: if there is existing information, the ASA may use it.

Are you sure? I think any discovery results obtained before a crash should
be flushed, for example. Maybe the point is that we require no *configured*
information in an autonomic network.

> 
> D4 in Section 2.1, “so discovery needs to be repeated to find counterparts for each objective.” The word “repeated” may imply a linear time order. Maybe replace it with “separated” or “split”?
> 
> D5 in Section 2.1, “the discovered peers should be associated with the objective.” may be more clear.
> 
> The first paragraph of D7 in Section 2.1, it looks like a subset of D2, no pre-known network topology knowledge.
> 
> N1 in Section 2.2, “arbitrary subsets of participating nodes”, maybe a “selected” is needed.
> 
> N4 in Section 2.2, “the protocol should be able to carry the message formats used by existing configuration protocols (such as NETCONF/YANG) in cases where that is convenient.” I like this requirement. However, I doubt whether it is feasible, particularly, there are multiple protocols with different message formats. If we are not sure the feasibility, it is better to lower this requirement to “MAY”.

I don't really see the problem - in the GRASP design, the only restriction is that
the value of an objective can be expressed in CBOR, and as far as I know CBOR
can express anything. Maybe it needs to be rephrased as "able to encapsulate the
data formats used by existing configuration protocols".

> 
> N5 in Section 2.2, “the protocol … must be capable of running in any device that would otherwise need human intervention.” This looks too strong to me with the words “must” and “any”, unless we are talking about a fully autonomic network, which excludes any human intervention, although that may be our ultimate goal. The same applies to N6: the words “must” and “any” may be too strong.

I think that's probably true; we need to make this text more logical. Actually there
is an interaction here with the reference model, which should discuss constrained
vs unconstrained nodes.

> On the other hand, I do not fully understand the technical impact of this requirement on the protocol design. Is there any technical difficulty that prevents running on any device, so that we have to do some special design in the protocol?

No, there is just the constrained node question (resources).

> N7 in Section 2.2, if a dependency chain becomes too long, it may slow down the decision too much. If so, the performance of the whole AN may also suffer. So, I guess a mechanism to avoid long chains of dependencies is also needed. However, I am not sure whether it is a matter for ASAs or for GRASP.

I don't think the protocol itself can help with this. Actually, this problem is
why I recently became interested in distributed consensus algorithms, where
convergence time is a big issue.

> In paragraph 3, “think ahead” could actually have two meanings here: 1, a dry run to see the impact of a new parameter; 2, predicting some situation which may be input for the parameter decision. If you want to express the second somewhere else, then at least “In other words” is not a suitable conjunction.

Agreed.

> N8 in Section 2.2, more discussion is needed on why this leads to the design choice that we made later in this document.

Yes, my opinion is that we have far too little experience to fix an information
model now, so this has to be a separate issue from the protocol design. Therefore
we chose a flexible and easily extensible message format.

> 
> T1 in Section 2.3, “it should be possible for ASAs to be implemented independently of each other as user space programs rather than as kernel code.” This looks like a recommendation/consideration for GRASP implementation rather than a technical requirement for protocol design. I understand the meaning and importance of such information, but it looks to me as if it appears in the wrong category.

It sits in the middle. In fact this *did* change the protocol design; when we added
the protocol and port number to the Discovery Response, it was exactly for this reason.

> 
> T5 in Section 2.3, “the prerequisite is discovering a peer’s locator by any method.” Are scenarios without any discovery possible, such as sending out an on-link multicast sync request?

Well, the GRASP design doesn't have that - the method is discover before synchronize.
However, I guess that isn't actually a requirement, it's a design choice. I think we
can simply delete the "prerequisite" clause.

> 
> T6 in Section 2.3, I believe this is a requirement for ASA, even a recommendation for ASA implementation. It is NOT a requirement for GRASP protocol design.

I'm not sure. It means that GRASP must be capable of supporting multiple parallel
operations, for which we need the Session ID (or some equivalent mechanism). So it
does affect the protocol design. However, the reference here to ASAs is out of scope.
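
As a sketch of why a Session ID (or equivalent) is needed for parallel
operations: incoming messages have to be routed to the right ongoing
exchange. The code below is only an illustration under my own
assumptions; the class name, the 32-bit random ID and the message
strings are hypothetical and are not taken from the draft.

```python
import secrets
from collections import deque


class SessionTable:
    """Hypothetical demultiplexer keyed by session ID."""

    def __init__(self):
        self._sessions = {}   # session ID -> queue of pending messages

    def open(self):
        """Start a new operation and return its unused session ID."""
        while True:
            sid = secrets.randbits(32)   # ID width is an assumption
            if sid not in self._sessions:
                self._sessions[sid] = deque()
                return sid

    def deliver(self, sid, message):
        """Queue a message for its session; False if the ID is unknown."""
        queue = self._sessions.get(sid)
        if queue is None:
            return False
        queue.append(message)
        return True

    def receive(self, sid):
        """Return the next message for a session, or None if none waits."""
        queue = self._sessions[sid]
        return queue.popleft() if queue else None

    def close(self, sid):
        """End the operation and discard anything still queued."""
        self._sessions.pop(sid, None)


table = SessionTable()
first = table.open()
second = table.open()
table.deliver(second, "reply for second")
table.deliver(first, "reply for first")
```

Two operations can then be in flight at once, and replies arriving in
any order still reach the operation that is waiting for them.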

> 
> “Negotiation” in Section 3.1, “a process by which two (or more) ASAs”. Is multi-party negotiation supported in this document? Or is it achieved by multiple bilateral negotiations? According to section 3.2, it is the latter. It is slightly misleading here.
>
> “State Synchronization” in Section 3.1, according to the definition of discovery, “as sources of synchronization data”, synchronization in this document seems to follow a pure request-response model. If so, it is worth clarifying here.
>
> “Objective” in Section 3.1, “a given objective will occur during discovery and negotiation, or during discovery and synchronization, but not in all three contexts.” Discovery is not bound to negotiation or synchronization. A better description here may be “a given objective will not occur in negotiation and synchronization contexts simultaneously.”
> 
> “Objective” in Section 3.1, third paragraph, “That node is generally expected to contain an ASA which may itself manage other nodes.” In my understanding, an ASA could only manage local nodes although it may influence other nodes by negotiating with the peer ASAs on them.
> 
> The second bullet point in Section 3.2, “It will provide services to ASAs via a suitable application programming interface”. My understanding is that the API is out of scope for this document and may be defined in a future document. If so, this should be explicitly mentioned.
> 
> The third bullet point in Section 3.2, I believe some statement should be added, “As a design choice, the protocol itself is not provided with built-in security functionality.”
> 
> “Organizing of synchronization or negotiation content” bullet point in Section 3.2. I believe this point should be rewritten as a recommendation for ASA. GRASP is a generic platform. Consequently, it is independent from content organizing.

Correct. From the work I've done on the reference model and the GRASP API, I think
we will need a document all about ASAs and how they are organized.

> 
> “Self-aware network device” bullet point in Section 3.2, last sentence “A device has no pre-configuration for the particular network in which it is installed.” I am not sure the purpose of this sentence although it looks like a requirement. If so, it is too strong. I think a “may” should be added.

I wonder whether this whole paragraph even belongs in this document. It seems
like an extract from the reference model.

> 
> Section 3.3, more description may be added regarding the use of GRASP to signal between two autonomic networks/domains.

Agreed, but at the moment I'm not sure what to say.

> 
> In section 3.3.3, “In other words an might perform discovery because it only wishes to receive synchronization data.” The phrase “In other words” should be replaced by “for example”, since there are other possible scenarios for the sentence before it.
> 
> In section 3.3, “Discovery Procedures”, “the device MAY respond to the link-local multicast”. Why “MAY”? As an important behavior for successful discovery, I think it should be “SHOULD”. 

Good question. Actually this is a very old MAY - it is present in section 5.2.2
of draft-carpenter-anima-gdn-protocol-00, in slightly different wording.
I have always assumed that it was because an ASA might wish to hide for some
time (for example, if it has no free resources to offer, there is no point
in being discovered). Maybe we could say "SHOULD respond unless temporarily
unavailable", or something like that?
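
A rough sketch of that rule, just to show the two separate conditions
involved (knowing the objective at all, and being available right
now). The names and the resource counter are hypothetical; this is not
text or code from the draft.

```python
from dataclasses import dataclass, field


@dataclass
class Responder:
    """Hypothetical ASA-side decision on answering a discovery."""

    objectives: set = field(default_factory=set)
    free_units: int = 0   # assumed measure of resources on offer

    def should_respond(self, objective):
        # Unknown objective: nothing to say, stay silent.
        if objective not in self.objectives:
            return False
        # Known objective: respond unless temporarily unavailable.
        return self.free_units > 0


busy = Responder(objectives={"example_prefix"}, free_units=0)
idle = Responder(objectives={"example_prefix"}, free_units=4)
```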

> At the end of the second paragraph, you should define the default behavior if a receiving device does not support the objective and has no information about another ASA, such as silently dropping the message.
> 
> Here, the word “support” has two meanings: a) understands the objective, b) has the source regarding the objective. These may need to be distinguished during discovery.
> 
> One scenario is missing: in multi-interface scenarios, a device may send out the discovery message on multiple interfaces, and the discovery result MUST be associated with the interface receiving it.
> 
> In multi-interface scenarios, “it MUST relay the query by re-issuing a Discovery message as a link-local multicast on its other interfaces.” I do NOT think “MUST” is right here. It means an objective that is not supported by any devices, or only supported by a few devices, would certainly cause a signaling storm. I suggest softening this to “SHOULD” and making it changeable by intent.

Why would it cause a storm? There are various mechanisms to prevent that. I don't think
there is any discovery mechanism possible without relaying discovery. So I think
all multi-interface nodes MUST relay by default.

You are correct, there might be objectives which are only useful if on-link,
so being able to restrict discovery to on-link would be valuable in that case.
That does seem like Intent.
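
To illustrate the kind of mechanism I mean, here is a minimal sketch
of a relay decision with a hop limit, duplicate suppression, and a
local on-link-only policy. It is an assumption-laden illustration, not
the draft's procedure: the message fields, the function name and the
policy set are all hypothetical.

```python
def relay_discovery(msg, arrived_on, interfaces, seen, on_link_only=()):
    """Return (interface, message) pairs to re-issue, possibly none.

    msg          -- dict with 'initiator', 'session_id', 'objective',
                    'loop_count' (hypothetical field names)
    arrived_on   -- interface the discovery was received on
    interfaces   -- all interfaces of this node
    seen         -- set of (initiator, session_id) already relayed
    on_link_only -- objectives that local policy says never to relay
    """
    key = (msg["initiator"], msg["session_id"])
    if key in seen:
        return []                      # duplicate: already relayed once
    seen.add(key)
    if msg["objective"] in on_link_only:
        return []                      # policy: only useful on-link
    if msg["loop_count"] <= 1:
        return []                      # hop limit exhausted
    relayed = dict(msg, loop_count=msg["loop_count"] - 1)
    return [(ifc, relayed) for ifc in interfaces if ifc != arrived_on]


seen = set()
query = {"initiator": "fe80::1", "session_id": 7,
         "objective": "example_prefix", "loop_count": 3}
out = relay_discovery(query, "eth0", ["eth0", "eth1", "eth2"], seen)
again = relay_discovery(query, "eth1", ["eth0", "eth1", "eth2"], seen)
```

With checks like these, every multi-interface node can relay by
default and a discovery still dies out after a bounded number of hops.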

> Also, there should be a way for the initiator to indicate that the discovery message should not be relayed, as in scenarios where the initiator only wants to discover counterparts among its neighbors.

That would be a protocol change; we can make it an open issue for now.

> Section 3.3.4.1, I am not sure whether Rapid mode is only used on-link or not; in other words, whether a discovery message with a negotiation objective option should or should not be relayed. Either way, it should be clarified further.

I have no real opinion about that. To be honest I am not convinced by the argument
for Rapid mode. It seems like complexity for quite a small gain, since discovery
results will normally be cached. Let's discuss...

> Last paragraph, Section 3.3.5.1, “In practice this means that they MUST NOT be transmitted and MUST be ignored on receipt unless there is an operational ACP.” I guess “or other strong security mechanisms” should be added here.
> 
> The same paragraph, “synchronization objectives that are flooded SHOULD NOT contain unencrypted sensitive information.” There is no definition of “sensitive information”. Therefore, the meaning of this sentence is questionable.
> 
> The second paragraph of section 3.3.5.2, “This rapid synchronization function SHOULD be configured off by default.” Why off by default? More explanation is needed for this design choice.

That's because all nodes in the network need to agree whether Rapid Mode is in use
or not, I think. In any case the failure modes could get quite complicated (if one
peer responds with just a discovery response, and another one several hops away
responds with a negotiation response, but the initiator has already started
negotiating with the nearest neighbor).

Thanks again,
   Brian
