Re: [Anima] AD review of draft-ietf-anima-asa-guidelines-03

Brian E Carpenter <brian.e.carpenter@gmail.com> Wed, 24 November 2021 02:55 UTC

Return-Path: <brian.e.carpenter@gmail.com>
X-Original-To: anima@ietfa.amsl.com
Delivered-To: anima@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id BB9533A0888; Tue, 23 Nov 2021 18:55:59 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -3.95
X-Spam-Level:
X-Spam-Status: No, score=-3.95 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, FREEMAIL_FROM=0.001, NICE_REPLY_A=-1.852, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001] autolearn=ham autolearn_force=no
Authentication-Results: ietfa.amsl.com (amavisd-new); dkim=pass (2048-bit key) header.d=gmail.com
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id uRa8BnU71h4j; Tue, 23 Nov 2021 18:55:55 -0800 (PST)
Received: from mail-pl1-x62c.google.com (mail-pl1-x62c.google.com [IPv6:2607:f8b0:4864:20::62c]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 3F9853A0884; Tue, 23 Nov 2021 18:55:55 -0800 (PST)
Received: by mail-pl1-x62c.google.com with SMTP id y8so677375plg.1; Tue, 23 Nov 2021 18:55:55 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=from:subject:to:references:message-id:date:user-agent:mime-version :in-reply-to:content-language:content-transfer-encoding; bh=DlmYi1sYXu3nrZgLF3NWxuRYDHuId+o1XoBK7URj1g0=; b=eAnWGD57Ds+lF5+z72qi1XhK2yIAM8zGZDrg5pTEcGFtd6k7fEFhrCT5I1SdxlO7v2 AQtdPV46hvqJAsRHzHfKPJijYRaIJGgPTqZyU39R691DPjpB+aZUWIynw2SROympT2Qm SaMsuHlRWKpt/IgIw3+ZGQEyDNZAacMK90JSSQbv+JgHuIQLX17kpMLHtEvtLmiLiGEH NHzv4BqdhI+NWfj8ozoCMsJn4XIK95gZpUFgRmlsV3jxeaiFe/58Pctso59sKNvXK1Go Es2vXcrWiGOE3mcvjC0W/c7yJbBfWC0h5ANYWnZ/Ao9UASO4ORuy6KpDXyqgkT49ZZHF 36BA==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:subject:to:references:message-id:date :user-agent:mime-version:in-reply-to:content-language :content-transfer-encoding; bh=DlmYi1sYXu3nrZgLF3NWxuRYDHuId+o1XoBK7URj1g0=; b=hdfFSCFXLlM395+caROzwyhEM0TvP2zbIDUmZ1pNkpXEFGMLCQMSm3X5BKsr5zf6KV vNcgUJMEiuMLuOQIpB6BiEKz9zZBS5xzxOZ1eURwzOYkYsgnrJT3AoUVmlHrnurQVgVt k4OffqP562+95IhCGCuMgq/3prY1oSa+B9TtMNeqzphtHGhaTN7owVtEdviLfbDca8UT AiePhM5jR92qmmrI72Vlo6ashjyG8thi9TQ+8VzQlmxy9REqm3rO4FYag5vauQyIE/x3 q9Hj/8RFNSJ26p/iWYjHmWswu7lCaffraNWfbauc0sHhCVqMr2c/jden/+mkA2YEsOZ9 bW9Q==
X-Gm-Message-State: AOAM530ICVKcC6EkgGrVHE2lq2kaVVYxs1c0zwyw6gHl7wtN4Xfydod+ OfJqXPxdqHD3lyqcMrsFwfu9HvjI1bNxTA==
X-Google-Smtp-Source: ABdhPJxFGglBOgKkEOvqq+mwDNwGeSQ6CSPuUfva/EHy41z/7GKdmc3tsjCmwIHB44YTXPUPlBgt3A==
X-Received: by 2002:a17:903:1c7:b0:141:e630:130c with SMTP id e7-20020a17090301c700b00141e630130cmr14005433plh.80.1637722552846; Tue, 23 Nov 2021 18:55:52 -0800 (PST)
Received: from ?IPv6:2406:e003:102d:e801:80b2:5c79:2266:e431? ([2406:e003:102d:e801:80b2:5c79:2266:e431]) by smtp.gmail.com with ESMTPSA id i185sm14176118pfg.80.2021.11.23.18.55.50 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Tue, 23 Nov 2021 18:55:52 -0800 (PST)
From: Brian E Carpenter <brian.e.carpenter@gmail.com>
To: "Rob Wilton (rwilton)" <rwilton@cisco.com>, "anima@ietf.org" <anima@ietf.org>, "draft-ietf-anima-asa-guidelines.all@ietf.org" <draft-ietf-anima-asa-guidelines.all@ietf.org>, Toerless Eckert <tte@cs.fau.de>
References: <BY5PR11MB4196D271165E0D9EEE20512DB59B9@BY5PR11MB4196.namprd11.prod.outlook.com>
Message-ID: <33604a27-563e-436e-2f37-629e0183f705@gmail.com>
Date: Wed, 24 Nov 2021 15:55:47 +1300
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101 Thunderbird/78.10.0
MIME-Version: 1.0
In-Reply-To: <BY5PR11MB4196D271165E0D9EEE20512DB59B9@BY5PR11MB4196.namprd11.prod.outlook.com>
Content-Type: text/plain; charset="utf-8"; format="flowed"
Content-Language: en-US
Content-Transfer-Encoding: 7bit
Archived-At: <https://mailarchive.ietf.org/arch/msg/anima/sdnKIFypljE4P3M_kNFUieAHGWI>
Subject: Re: [Anima] AD review of draft-ietf-anima-asa-guidelines-03
X-BeenThere: anima@ietf.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Autonomic Networking Integrated Model and Approach <anima.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/anima>, <mailto:anima-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/anima/>
List-Post: <mailto:anima@ietf.org>
List-Help: <mailto:anima-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/anima>, <mailto:anima-request@ietf.org?subject=subscribe>
X-List-Received-Date: Wed, 24 Nov 2021 02:56:00 -0000

Hi Rob, thanks for such a careful review. The -04 version posted a few seconds ago should respond to your points, but we have inserted comments below.

On 19-Nov-21 07:27, Rob Wilton (rwilton) wrote:
> Hi Authors, ANIMA, Toerless,
> 
> My AD review of draft-ietf-anima-asa-guidelines-03 is inline.  I have also attached a copy of my review because the IETF mailer likes to truncate my review emails, so please check that you reached my signature for the full review.
> 
> Toerless, please can you also update the Shepherd writeup to indicate that I'm the responsible AD for this document.
> 
> The draft is well written, and mostly I have fairly minor comments to improve the readability of the document in a few places.  I also ran the document through an automatic grammar checker, and the nits raised are at the end of my review.
> 
> 
> 
> 1. Would it be useful to have a terminology section for some of the acronyms, to make
> it easier for readers to refer back to?  This could also help indicate where the terms are defined?  I'm happy to leave this entirely to the authors discretion.


Yes, this seems useful (and should have been in RFC8993). Will add as an Appendix.
    
> 1.  Introduction
> 
>     Another example is that an existing script for locally monitoring or
>     configuring functions or services on a router could be upgraded as an
>     ASA that could communicate with peer scripts on neighboring or remote
>     routers.  A high-level API will allow such upgraded scripts to take
>     full advantage of the secure ACP and the discovery, negotiation and
>     synchronization features of GRASP.  Familiar tasks such as
>     configuring an Interior Gateway Protocol (IGP) on neighboring routers
>     or even exchanging IGP security keys could be performed securely in
>     this way.  This document mainly addresses issues affecting quite
>     complex ASAs, but the most useful ones may in fact be rather simple
>     developments from existing scripts.
> 
> 2. In this example, is the assumption that the scripts are running on the devices?  It wasn't entirely clear to me.  If so, perhaps reword the first part of the paragraph to make this more clear?


Yes, done.

> 2.  Logical Structure of an Autonomic Service Agent
> 
>     As mentioned above, all but the simplest ASAs will need to suport
>     asynchronous operations.  Not all programming environments explicitly
>     support multi-threading.  In that case, an 'event loop' style of
>     implementation could be adopted, in which case each thread would be
>     implemented as an event handler called in turn by the main loop.  For
>     this, the GRASP API (Section 3.3) must provide non-blocking calls and
>     possibly support callbacks.  When necessary, the GRASP session
>     identifier will be used to distinguish simultaneous operations.
> 
> 
> 3. Various languages have better concurrency paradigms than threads and locks (e.g., actors, futures, goroutines, etc).  So, I think that you are using threads here as a way to describe what operations are expected to be handled in an asynchronous fashion, and to describe that using a fairly simple frame of reference.  You cover the single threaded case, but I didn't know whether it would be helpful to indicate that other concurrency mechanisms can be used, or perhaps that would just be obvious to folks who are familiar with them anyway ...


Well, I'd say "different concurrency paradigms" to avoid a value judgment, but yes. We did cover this at more length in the API (RFC8992) so probably we should point to that discussion. To avoid confusing text, I will move most of the text about asynchronous operations into the section about the API.

(( From a quick glance at goroutines, I can only see a syntactic sugar difference between _tcp_listen(listen_sock).start() in Python and go _tcp_listen(listen_sock) in Go, for example. All these things are event loops under the skin, anyway. ))

> 
> 
> 
> 
>     The logic of the main loop will depend on the details of the
>     autonomic function concerned.  Whenever asynchronous operations are
>     required, extra threads will be launched, or events added to the
>     event loop.  Examples include:
> 
> 4. Suggest "extra threads may be launched, ...", since in some cases the suggestion is
> to reuse existing threads.


Ack

> 
> 5. What are the examples of, i.e., are they examples of logic in the main loop, or examples of asynchronous operations.  Should this be clarified?


In the threaded model, they are threads.
    
> 4.  Interaction with Non-Autonomic Components
> 
>     An ASA, to have any external effects, must also interact with non-
>     autonomic components of the node where it is installed.  For example,
>     an ASA whose purpose is to manage a resource must interact with that
>     resource.  An ASA whose purpose is to manage an entity that is
>     already managed by local software must interact with that software.
>     For example, if such management is performed by NETCONF [RFC6241],
>     the ASA must interact directly with the NETCONF server in the same
>     node.
>     
> 6. I presume that the ASA would be treated like any other NETCONF management client?  Perhaps this is worth stating, and that it may need to coordinate configuration updates with other management clients.


Oh, it's so confusing to me that the NETCONF server (in the protocol sense) is in the managed device, which in the big picture is not really a server at all. Here's my attempt:

       For example, if such management is performed by NETCONF
       <xref target="RFC6241"/>,
       the ASA must interact directly with the NETCONF server in the same node,
       to avoid any inconsistency between configuration changes delivered
       via NETCONF and configuration changes made by the ASA.

> 
> 
> 5.  Design of GRASP Objectives
> 
>     The general rules for the format of GRASP Objective options, their
>     names, and IANA registration are given in [RFC8990].  Additionally
>     that document discusses various general considerations for the design
>     of objectives, which are not repeated here.  However, note that the
>     GRASP protocol, like HTTP, does not provide transactional integrity.
>     In particular, steps in a GRASP negotiation are not idempotent.  The
>     design of a GRASP objective and the logic flow of the ASA should take
>     this into account.  For example, if an ASA is allocating part of a
>     shared resource to other ASAs, it needs to ensure that the same part
>     of the resource is not allocated twice.  The easiest way is to run
>     only one negotiation at a time.  If an ASA is capable of overlapping
>     several negotiations, it must avoid interference between these
>     negotiations.
>     
> 7. Is the alternative approach valid here, which is to design the GRASP Objectives such that they can be treated idempotently?  I.e., where the receiver can detect that it is a duplicate request and ignore it.  Generally, I find that to be a simple and robust way to do concurrent system design.


True, will add. For example, Toerless has a model that amounts to DNS-SD lookup over GRASP. That is idempotent. But a shared resource objective can't be idempotent, I don't think, because if A gives some resource to B, that needs to be an atomic transaction.


> 
> 
> 
> 3.3.  Interaction with GRASP and its API states:
> 
>     ...  The format and size
>     of the value is not restricted by the protocol, except that it must
>     be possible to serialise it for transmission in CBOR [RFC8949], which
>     is no restriction at all in practice.
>     
> 
> and 5.  Design of GRASP Objectives states:
> 
>     The maximum size of the value field of an objective is limited by the
>     GRASP maximum message size.  If the default maximum size specified by
>     [RFC8990] is not enough, the specification of the objective must
>     indicate the required maximum message size, both for unicast and
>     multicast messages.
>     
>     A mapping from YANG to CBOR is defined by [I-D.ietf-core-yang-cbor].
>     Subject to the size limit defined for GRASP messages, nothing
>     prevents objectives using YANG in this way.
>     
> 8. Does the description of the maximum message size conflict between section 3.3 and the two paragraphs in section 5?


Good catch. I will make the text consistent. If there's anything in GRASP that we need to revisit, it's this. What I realised (and implemented) is that it will work with pretty much any reasonable message size limit; it's just matter of receive buffer size (and avoidance of buffer overrun bugs, of course). Documented in https://www.ietf.org/archive/id/draft-carpenter-anima-grasp-config-00.html

>     
>     
>     5.  Design of GRASP Objectives
> 
>    
>     A mapping from YANG to CBOR is defined by [I-D.ietf-core-yang-cbor].
>     Subject to the size limit defined for GRASP messages, nothing
>     prevents objectives using YANG in this way.
>     
> 9. Given that you are taking about size limits, would it be worth mentioning the YANG
> CBOR SID encoding that allows for even more compact messages?


I see that SIDs are mentioned at least 156 times in the yang-cbor draft, so I think we are covered.

> 
> 
> 
> 6.2.2.  Instantiation phase inputs and outputs
> 
>     [Instantiation_target_parameters]  that specifies which are the GRASP
>        objectives to be set to ASAs (e.g. an optimization target)
> 	
> 10. Did you mean set to, or perhaps set on, or sent to?


"sent to" (unless Laurent disagrees)

> 
> 
> 6.3.  Operation phase
> 
>        Modify ASAs managed resources: by updating the instance mandate
>        which would specify different set of resources to manage (only
>        applicable to decouples ASAs).
> 	
> 11. decouples ASAs => decoupled ASAs?


"decoupled"

> 
> 
>     During the Operation phase, running ASAs can interact the one with
>     the other:
>     
> 12. Suggest => interact with other ASAs:


Yes

> 
> 
>        in order to exchange knowledge (e.g. an ASA providing traffic
>        predictions to load balancing ASA)
> 
> 13. Suggest => to a load balancing ASA


Yes

> 
> 
>     During the Operation phase, running ASAs are expected to apply
>     coordination schemes
> 
>        then execute their control loop under coordination supervision/
>        instructions
> 
> 14. I wasn't really sure that I understand what was meant by the above.


We can simplify the text. The immediately following section describes this better.

> 
> 
> 10.  Robustness
> 
>     It is of great importance that all components of an autonomic system
>     are highly robust.  In principle they must never fail.  This section
>     lists various aspects of robustness that ASA designers should
>     consider.
>     
> 15. I would suggest rewording this paragraph to something like follows (with
> an emphasis on recovery than trying to avoid failures):
> 
>     It is of great importance that all components of an autonomic system
>     are highly robust.  Although ASA designers should aim for the
>     component to never fail, it is more important to design the ASA to
>     assume that failures will happen and to gracefully recover from those
>     failures when they occur.  Hence, this section lists various aspects
>     of robustness that ASA designers should consider.


Yes, excellent.

> 
> 
> Appendix B.  Example Logic Flows
> 
> 16. Using a periodic timer is arguably better than a sleep loop (but
>      requires the code to be event based)


Right. sleep() is the simple-minded approach, actually. In Python,
I could use a combination of time.sleep() and time.monotonic()
to make a true periodic timer, even in threaded code, in a case
where it matters. We'll change the comment in the pseudocode.

> 17. Should the NEGOTIATOR thread exit when it has completed?


Yes. My brain has been unduly influenced by Python, where a thread
does exit if you drop out of the code. You can't write exit()
because that exits the whole shebang. Will clarify with a comment.

> 
> 
> 
> 18. Minor nits from a grammar checking tool are below.

Will fix.

Thanks again,
      Brian and co-authors.

> 
>     
> Potential Spelling Errors:
> successfull, suport, un-install
> 
> e.g. => e.g.,
> 
> Grammar Warnings:
> Section: 1, draft text:
> To achieve this objective, the autonomic networks ecosystem must include at least a library of ASAs and corresponding GRASP objective definitions.
> Warning:  Apostrophe might be missing.
> Suggested change:  "networks'"
> 
> Section: 1, draft text:
> However, for the present document, the basic definitions and goals for autonomic networking given in [RFC7575] apply .
> Warning:  Don't put a space before the full stop.
> Suggested change:  "."
> 
> Section: 1, draft text:
> Of course it also interacts with the specific targets of its function, using any suitable mechanism.
> Warning:  The comma is probably missing here: course, it.
> Suggested change:  "course, it"
> 
> Section: 3.3, draft text:
> Thus ASAs may operate without special privilege, unless they need it for other reasons.
> Warning:  Did you forget a comma after a conjunctive/linking adverb?
> Suggested change:  "Thus,"
> 
> Section: 5, draft text:
> Additionally that document discusses various general considerations for the design of objectives, which are not repeated here.
> Warning:  Did you forget a comma after a conjunctive/linking adverb?
> Suggested change:  "Additionally,"
> 
> Section: 6.2, draft text:
> having a piece of code available to run on a host and 2.
> Warning:  This sentence does not start with an uppercase letter.
> Suggested change:  "Having"
> 
> Section: 6.2, draft text:
> having an agent based on this piece of code running inside the host.
> Warning:  This sentence does not start with an uppercase letter.
> Suggested change:  "Having"
> 
> Section: 6.2.1, draft text:
>   Additionally in this phase, the operator may want to set goals to autonomic functions e.g. by configuring GRASP objectives.
> Warning:  Did you forget a comma after a conjunctive/linking adverb?
> Suggested change:  "Additionally,"
> 
> Section: 10, draft text:
> In principle they must never fail.
> Warning:  The comma is probably missing here: principle, they.
> Suggested change:  "principle, they"
> 
> Section: 11, draft text:
> Independently of this, interfaces between ASAs and the router configuration/monitoring services of the node can be subject to authentication that provides more fine grained authorization for specific services.
> Warning:  This word is normally spelled with hyphen.
> Suggested change:  "fine-grained"
> 
> Section: Appendix B, draft text:
> Thus the ASA logic has two operating modes: origin and delegator.
> Warning:  Did you forget a comma after a conjunctive/linking adverb?
> Suggested change:  "Thus,"
> 
> Thanks,
> Rob
>