Re: [Anima] Review of draft-ietf-anima-stable-connectivity-02

Toerless Eckert <tte@cs.fau.de> Tue, 27 June 2017 04:21 UTC

Return-Path: <eckert@i4.informatik.uni-erlangen.de>
X-Original-To: anima@ietfa.amsl.com
Delivered-To: anima@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 9BD23128B8E; Mon, 26 Jun 2017 21:21:32 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -4.199
X-Spam-Level:
X-Spam-Status: No, score=-4.199 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.001, RCVD_IN_DNSWL_MED=-2.3, RP_MATCHES_RCVD=-0.001, URIBL_BLOCKED=0.001] autolearn=ham autolearn_force=no
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id SS5LHslaoQ_T; Mon, 26 Jun 2017 21:21:29 -0700 (PDT)
Received: from faui40.informatik.uni-erlangen.de (faui40.informatik.uni-erlangen.de [IPv6:2001:638:a000:4134::ffff:40]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 4DECA127136; Mon, 26 Jun 2017 21:21:29 -0700 (PDT)
Received: from faui40p.informatik.uni-erlangen.de (faui40p.informatik.uni-erlangen.de [131.188.34.77]) by faui40.informatik.uni-erlangen.de (Postfix) with ESMTP id 4656558C4D5; Tue, 27 Jun 2017 06:21:24 +0200 (CEST)
Received: by faui40p.informatik.uni-erlangen.de (Postfix, from userid 10463) id 2D50BB0C3EC; Tue, 27 Jun 2017 06:21:24 +0200 (CEST)
Date: Tue, 27 Jun 2017 06:21:24 +0200
From: Toerless Eckert <tte@cs.fau.de>
To: Sheng Jiang <jiangsheng@huawei.com>
Cc: Anima WG <anima@ietf.org>, "anima-chairs@ietf.org" <anima-chairs@ietf.org>
Message-ID: <20170627042123.GA18207@faui40p.informatik.uni-erlangen.de>
References: <5D36713D8A4E7348A7E10DF7437A4B927CDE5CCF@NKGEML515-MBX.china.huawei.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Disposition: inline
In-Reply-To: <5D36713D8A4E7348A7E10DF7437A4B927CDE5CCF@NKGEML515-MBX.china.huawei.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
Archived-At: <https://mailarchive.ietf.org/arch/msg/anima/J6_u7VEG_YeoU1KKMLB8dUe6hDk>
Subject: Re: [Anima] Review of draft-ietf-anima-stable-connectivity-02
X-BeenThere: anima@ietf.org
X-Mailman-Version: 2.1.22
Precedence: list
List-Id: Autonomic Networking Integrated Model and Approach <anima.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/anima>, <mailto:anima-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/anima/>
List-Post: <mailto:anima@ietf.org>
List-Help: <mailto:anima-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/anima>, <mailto:anima-request@ietf.org?subject=subscribe>
X-List-Received-Date: Tue, 27 Jun 2017 04:21:33 -0000

Thanks a lot, Sheng! 

Integrated fixes for all comments (see inline discuss below). Pushed stable connectivity draft
into www.github.com/anima-wg/autonomic-control-plane together with ACP draft because i ended up
having to do a fix for the ACP draft as a result of your comments.

Diff between -02 and fixed up stable connectivity draft here: 

http://tools.ietf.org/tools/rfcdiff/rfcdiff.pyht?url1=https://raw.githubusercontent.com/anima-wg/autonomic-control-plane/master/draft-ietf-anima-stable-connectivity/draft-ietf-anima-stable-connectivity-02.txt&url2=https://raw.githubusercontent.com/anima-wg/autonomic-control-plane/master/draft-ietf-anima-stable-connectivity/draft-ietf-anima-stable-connectivity.txt

If you are fine with the fixes let me know, and i'll push it to datatracker as -03.

If not ok. feel free to use email or now also add issue to the git.

Raw txt/git files of fixed stable connectivity text:

https://raw.githubusercontent.com/anima-wg/autonomic-control-plane/master/draft-ietf-anima-stable-connectivity/draft-ietf-anima-stable-connectivity.txt
https://raw.githubusercontent.com/anima-wg/autonomic-control-plane/master/draft-ietf-anima-stable-connectivity/draft-ietf-anima-stable-connectivity.xml

Diff for change done in ACP (explaining ACP connect better):

http://tools.ietf.org/tools/rfcdiff/rfcdiff.pyht?url1=https://raw.githubusercontent.com/anima-wg/autonomic-control-plane/master/draft-ietf-anima-autonomic-control-plane/draft-ietf-anima-autonomic-control-plane-06.txt&url2=https://raw.githubusercontent.com/anima-wg/autonomic-control-plane/master/draft-ietf-anima-autonomic-control-plane/draft-ietf-anima-autonomic-control-plane.txt

Cheers
    Toerless

On Tue, Jun 20, 2017 at 08:00:33AM +0000, Sheng Jiang wrote:
> Hi, authors of draft-ietf-anima-stable-connectivity,
> 
> I am doing a thorough review as the document shepherd with my ANIMA chair hat on. Please address the below comments so that we could process this document further.
> 
> First, I have issues for section 2.1.4, "IPv4 only NOC application devices". It would be an unlike scenario to manage an IPv6 network (all managed devices are IPv6 enabled) with IPv4 only NOC application devices. Furthermore, the NAT64 setup in this scenario is complex, and connectivity between IPv4 only NOC application devices and NAT is unsecure. I would suggest to reduce the whole section or add a clear statement that it is a not-recommended scenario at the end of this section.

Yes, it is a mess. But it is documenting an actual deployment experience with an enterprise
customer. TAnd my understanding is tht it would be a quite common scenario for enterprises. There
was just recently a very nice RTG area WG chair tutorial that to me reconfirmed the ongoing
challenges with IPv6 only management planes.

I have added paragraphs to justify the need for this section and that its all undesirable workarounds
Please check. To me, this messy section is the price we have to pay so that the ACP
can simply be IPv6 only. Its a price i am happy to pay.

> Secondly, the description and category in section 2.1.2, "Limitations and enhancement overview". For me, only the point 1 & 3 are really limitation of ACP itself. 

I have improved the text and categorization in this chapter to make it hopefully clearer to read and to be more precise in terminology.

> Point 2 is a precondition (it is actually conflicted with section 2.1.4.)

Yes, IPv6 is a precondition. If a deployment can not meet this precondition, that is a challenge for which there is a workaround. Those workarounds are described in section 2.1.4. I've tried to improve all that text. I do not understand why point 2 would be a conflict with section 2.1.4 instead of rather pointing to 2.1.4.

> I don't understand point 4. What does "exposing the ACP natively" mean?

That is the term that was defined in ACP CP draft, section 6.1. Reading through that section, i improved it in the ACP section, and i updated the text about it in stable connectivity.

Oh, and this change had me change the term "NOC application device" to "NMS host" throughout the document because that is the term used in the ACP draft.

> Thirdly, I have issues to use names to distinguish the path selection policies.
> This is a chicken&egg issue in the autonomic scenario.
> Who and how the DNS names are setup, by human administrator? 

IMHO, considerations for names are one of the most crucial part of
operationalizing the ACP. The little text we have about the ACP is IMHO
a good compromise between overlooking the problem and writing a lot more
text.

The concept of using different addresses for a device for different actions
is not novel to ACP. Think of using a loopback to "reliably" talk to a router
vs. ping'ing specific interface addresses of a router to discover whether
an peer (reachable via that interface) is up/down. If you look at service 
providers name setups (often in DNS), then they will have different names
for those different addresses and use those different names in the different
tools. The text in this draft simply explains the same concept for
ACP vs. other addresses. 

DNS names can be setup by various locally built automation scripting or templates.
Same think for DNS names for ACP. Operators will likely point out that automating
the DNS name creation is more difficult because we do not use topology/semantic
ACP addresses, but in principle it's the same automation task.

> If there is a mechanism to distinguish the IP addresses of ACP and data-plane without human intelligence, why does it bore to use DNS?

ALl the existing "actions" that you want to do for a device are just too stupid:
They do not include the logic of knowing which address is best for them to
use. That's why you use names for subsets of the possible addresses to allow
you to specify exatly those address(es) that will reach the device via the
network connection matching the intent of the action you're taking. 

> DNS registration & lookup by itself is a very complex and time-consumption procedure.

Depends on your automation. Its certainly an interesting aspect especially moving
from IPv4 to IPv6 where embedding logic into adresses becomes a whole different
ballgame.

> If we don't want to show the semantic of these addresses to any human, names are meaningless.

But you need new NOC tools where each action would need to know what type
of connection is best for it.

I have added one paragraph to 2.1.5 to outline this logic and justification for why
to describe DNS. Look for "Ideally, a NOC system would learn and keep track of all addresses of a device"


> In section 2.2, "The ACP can provide common direct-neighbor discovery and capability negotiation". This is a wrong statement. ANI does provide this, but it is done by GRASP, not ACP.

Ack. Changed to: The ANI (ACP, Bootstrap, GRASP) can provide via the GRASP protocol common direct-neighbor 
discovery and capability negotiation (GRASP via ACP and/or data-plane) and stable and secure
connectivity for functions running distributed in network devices (GRASP via ACP).

> At the end of section 2.1.3, "A simple short-term workaround could be a physical external loopback cable into two ports of ANrtr1 to connect the data-plane and ACP VRF as shown in the picture." Does this has to be "a PHYSICAL external loopback CABLE"? It sounds like a very strong requirement. Personally, I believe this could be done by a virtual loopback interface.

Ack. Changed text to: A workaround without additional software functionality could be a physical external loopback
cable into two ports of ANrtr1 to connect the data-plane and ACP VRF as shown in the picture. A (virtual) 
 software loopback between the ACP and data plane VRF would of course be the better solution.

The key notion why the "gross" physcial loopback may be appreciated is because it doesn't require software work. Or specification of the behavior of such a software loopback in the roting behavior in the ACP draft. I would love to have that software loopback, but i think we (ACP doc authors) felt that we wanted to keep the ACP spec lightweight enough to be implementable without all possible enhancements. But i hope its fine to mention the option here in this document as you suggested.

> Section 3, "Security considerations" is more like a deployment considerations. Both ULA-C and reverse DNS are additional deployment

Yes, but i think if you look through various IETF documents you will see that it is quite common for security consideration sections to outline additional steps to secure the documents work goal especially when those steps are otherwise orthogonal to the documents spec (eg: leverage existing components).

Whats the process for informational documents. It will get SEC AD review, right ? Maybe hold that thought for that review ? Or else i can proactively ask, because right now i am a bit at a loss whats the right limit of what to put into sec considerations vs. what to extract into a separate chapter.

> Minor comments,
> 
> Section 2.1.6, "Autonomic NOC device/applications" should be moved to early part of this document. For me, it is the default scenario/requirement. It should be section 2.1.1, I guess.

The text unfortunately builds up in the order it is written, so it would be a lot of rewrite to make sure it would still read after such a reordering. Instead i have added the following as the first paragraph:

<t>This section describes stable connectibity for centralized OAM operations via ACP/ANI
starting by what we expect to be the most likely easiest short-term deployment option. It
then describes limitation/challenges of that approach and their solutions/workarounds to finish
with the preferred target option of autonomic NOC devices in <xref target="autonomic-devices" />.</t>

<t>This order was choosen because it helps to explain how simple one can start using the ACP, 
how difficult workarounds can become (and therefore what to avoid), and finally because one very 
promising long-term solution alternative is exactly like the most easy short-term solution only virtualized
and automated.</t>

The last sentence is that punchline: The likely easiest/best atonomic NOC device solution is one where you do everything you do in the short-term approach, just virtualized and integrated into the NOC device. WHich means you'd need to explain all that stuff anyhow...

> It is worth of mentioning even the ACP provides only IPv6 connectivity, through it, IPv4 configuration or even non-IP configuration could be managed.

Good point. Added the following sentence:

<t>Note that even though the ACP only uses IPv6, it can and should be used to providestable connectivity for management of any network: IPv4 only, dual-stack or IPv6 only.</t>

> The document does not properly quota references in the text. The references defined are mostly not used.

Ack. Removed unnecessary references, introduced terms/references in the beginning of the document. All references now used at least once ;-)

> The document separate sections for Informative/Normative References

Hmm. It DOES NOT distinguish between normative and informational documents because i thought that an informational document like this could not have normative references. If it can and should have them let me know, then i'll make Bootstrap/Grasp/ACP normative and reference/definitions informational.

> The empty section 5 "Further considerations", should be removed.

Done.

> Most of references are out of date. behringer-anima-reference-model > ietf-anima-reference-model, irtf-nmrg-an-gap-analysis > RFC 7576, irtf-nmrg-autonomic-network-definitions > RFC 7575.

Fixed.

> In section 1.1, "the introduction of IPv6 or other mayor re-hauls in the infrastructure design." What is the mean for "other mayor"?

added: Examples include change of IGP protocols or areas, PD (Provider
Dependent) to PI (Provider Independent) addressing, systematic topology changes.

> In section 1.3, "the Autonomic Networks Autonomic Control plane (ACP)" may be better to presented as "the Autonomic Control plane (ACP) in Autonomic Networks"

Done

> There is no full names for "NOC" in section 1.1, "PSTN" in section 1.3, AT" in section 2.1, "DNS" in section 2.1.1, "RoI" in section 2.1.4, MP-TCP in section 2.1.5, "KARP" in section 2.2, even the first appearances.

Done, added RFC references for those TLAs that have them as well.

> Last sentence of section 2.2 should be removed.

Done.

> In section 3, first sentence, add "In this section,"

Done.

> There are typos, needed to be fixed too:
> 
> Two "the the" in the end of section 2.1.2 and end of section 4;

Done
> 
> A couple of "randomn" -> "random";

Done
> 
> "networ" in section 2.1.5;

Done
> 
> "jut" -> "just" at the end of section 2.1.6, I guess.
> 
> "The most simple" -> "the simplest" in section 2.17.

Done.