[Nemops-interest] Re: [Nemops-workshop-attendees] Re: NEMOPS Workshop Report

Kristian Larsson <k@centor.se> Thu, 20 February 2025 21:33 UTC

From: Kristian Larsson <k@centor.se>
To: Michael Richardson <mcr+ietf@sandelman.ca>
Date: Thu, 20 Feb 2025 22:32:33 +0100
References: <CAP7zK5YV5s2-0jutfN0mg3CBUNbZgA4KJbkFdqFXvbEqphsj-Q@mail.gmail.com> <30029.1740080212@obiwan.sandelman.ca>
User-agent: mu4e 1.8.13; emacs 29.4
In-reply-to: <30029.1740080212@obiwan.sandelman.ca>
Message-ID: <87frk8i9og.fsf@centor.se>
CC: Dhruv Dhody <dd@dhruvdhody.com>, nemops-interest@iab.org, architecture-discuss@iab.org, nemops-workshop-attendees@iab.org
List-Id: Next Era of Network Management Operations <nemops-interest.iab.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/nemops-interest/wuSfrdIxmgjgZFhr4dEn1KK7rRw>

What is the point you are trying to make when you say that you “never spoke of YANG”? It sounds like you, as the tutor, chose not to teach it. I can’t relate at all to your claim that RESTCONF does not scale down. What do you mean? What prevents you from using it? Please be specific.

Today, I did a production deployment for a network where I’ve been working over the last 8 months to build up an automation system. It went perfectly. It uses NETCONF / YANG exclusively, both for configuration and for retrieval of operational state. It is not really about telemetry per se, just polling for now. I’d bet it would have been gNMI/YANG if telemetry had been important. Cisco NSO was used for config / oper state now.
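For anyone who has not seen what “polling” operational state over NETCONF looks like on the wire, here is a minimal sketch in Python. It is not what the deployment above ran (that went through Cisco NSO); it only builds the XML for a NETCONF <get> RPC with a subtree filter. The choice of the ietf-interfaces model, the helper name build_get_rpc and the message-id value are my assumptions for illustration. Sending it would additionally need a NETCONF-over-SSH session, which is left out here.

```python
import xml.etree.ElementTree as ET

# Namespaces defined by the NETCONF base protocol and the ietf-interfaces YANG module.
NC_NS = "urn:ietf:params:xml:ns:netconf:base:1.0"
IF_NS = "urn:ietf:params:xml:ns:yang:ietf-interfaces"

# Purely cosmetic: use readable prefixes instead of ns0/ns1 when serializing.
ET.register_namespace("nc", NC_NS)
ET.register_namespace("if", IF_NS)


def build_get_rpc(message_id: str) -> str:
    """Return a NETCONF <get> RPC asking for the 'interfaces' subtree.

    Hypothetical helper for illustration; it only builds the XML payload
    and does not open a session or talk to any device.
    """
    rpc = ET.Element(f"{{{NC_NS}}}rpc", {"message-id": message_id})
    get = ET.SubElement(rpc, f"{{{NC_NS}}}get")
    # A subtree filter limits the reply to the parts of the data tree we name.
    flt = ET.SubElement(get, f"{{{NC_NS}}}filter", {"type": "subtree"})
    ET.SubElement(flt, f"{{{IF_NS}}}interfaces")
    return ET.tostring(rpc, encoding="unicode")


if __name__ == "__main__":
    print(build_get_rpc("101"))
```

The same YANG path is what a RESTCONF or gNMI client would ask for; only the transport and encoding differ.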

In parallel, as I also stated during the workshop, I and others, primarily at Deutsche Telekom, are working on an open source system doing YANG-native orchestration - Orchestron (<https://github.com/orchestron-orchestrator/orchestron/>). I think it scales, both up and down. Admittedly, we’re lagging behind on documentation, but I think it does scale down to the kind of use cases you mention. I hope we will soon be in a better position to make it user-friendly for people outside the project!

As far as NEMOPS goes, I think it is important to scale up and down but with an emphasis on solving industry problems. That leans more towards large scale SP deployments than 3-switch deployments. I think the NEMOPS report should primarily reflect that.

Kind regards,
   Kristian.


Michael Richardson <mcr+ietf@sandelman.ca> writes:

> Thank you for the write up.
>
> Dhruv Dhody <dd@dhruvdhody.com> wrote:
>     > Wes and I have posted the initial version of the NEMOPS Workshop Report -
>
>     > <https://datatracker.ietf.org/doc/draft-iab-nemops-workshop-report/>
>     > <https://www.ietf.org/archive/id/draft-iab-nemops-workshop-report-01.html>
>
>     > We’d appreciate your feedback, ideally on the nemops-interest@iab.org list
>     > or directly to the authors. You can also submit a PR on GitHub:
>
> }   Many operators
> }   prioritize operational conferences over standards development
> }   organizations (SDOs), such as RIPE, NANOG, APRICOT, LACNIC, AutoConn,
> }   etc.
> }
> }   To address this, the IAB workshop’s Program Committee (PC) planned
> }   outreach initiatives to foster discussions and gather interest by
> }   engaging with operators at these venues and conducting information/
> }   requirement-gathering sessions.  Participants were encouraged to
>
> Did we succeed in getting more operators to the workshop?
> I observed the PC outreach at RIPE89, but I’m not sure it resulted in the
> groundswell that was desired.
>
> Yesterday, I walked a new (very young) network admin through adding a new
> [not new to us, just been sitting in a box since 2019] switch to CREDIL.org’s
> network monitoring tool… That is, we connected it to SNMP…
> The level of idiosyncrasy is great, caused in large part by
> heterogeneous equipment which is definitely not new.
> We had to fix quite a number of things along the way, so it took about 5 hours.
> This was the new person’s first experience with Cisco-style router CLIs, and
> she was really impressed with how useful completion was.  We used a Web
> interface for a bit to add a static IPv6 for the control access, but the Web
> interface *did the wrong thing*.  We never spoke of YANG.
> We use tftp to backup configurations, which we then manually check into git.
> (“git add -p” is a very good way to review what you actually did, and also to
> learn what went wrong with the web interface attempt)
>
> We’d love to use scp, but the security posture of it is almost worse.
> We also never get ssh access to our switches to work consistently, so serial
> consoles are mandatory as backup, and guess what: RESTCONF does not work
> across that link.   RFC8994 would be nice.
> (Yes, I have running code)
>
> I don’t think section 3.1.3 really articulates this well enough:
> “Additionally, the lack of
>  simple tools for smaller networks operating under tight timelines and
>  budgets was emphasized.”
>
> This is just insufficient emphasis.
> I could think to replace all of section 3.1.x with:
>   “Additionally, the lack of
>    simple tools other than Net-SNMP for smaller networks operating under
>    tight timelines, a mix of older and newer systems and very small budgets
>    was emphasized.”
>
> Organizations slightly larger than CREDIL typically “solve” these problems
> by buying all same-vendor equipment every ~10 years or so.  It comes with a
> thick manual which one person reads, just before being lured away by higher
> salary to be an Application Engineer for said vendor.  If they were rich at
> the time, or the discount was large, they bought the management platform with
> a pretty interface, but in practice no ability to do anything really useful.
>
> section 3.2 seems to all be about problems with managing systems at larger
> scale.  My complaint all along is that RESTCONF(YANG) does not scale.  Scale doesn’t
> mean “big”, it means, something can work from smallest to largest scale. RESTCONF
> doesn’t work in the small.  One never starts a three switch network with
> RESTCONF, and then keeps using it as the network grows.  One starts with a three hundred
> switch network.
>
> Lots of talk about open source, but not really any mention of how such a
> thing could be funded.  That’s really the problem.
>
>
> }   1.  In network deployments, operations are typically at the bottom of
> }       the ladder.  It’s the most squeezed for time and resources.
> }       Network engineers are not typically seasoned developers.
>
> It’s way way worse than that.  Most government and NGO network engineers are
> prevented by policy (from their own organization!) from even having developer
> tools on their desktop.  Getting access to equipment to do testing on (like
> developing new tools) is essentially impossible in many places.
>
> }   4.  It was suggested that other domains (e.g., K8N/automation) are
> }       years ahead of the current network engineering stack.
>
> k8n has useless networking.  It’s “years ahead”, in that it does a few things
> autonomically, but does them wrong.  Managers are told the wrong way is “better”.
>
> }   5.  There was a point about navigating non-device-specific models
> }       being difficult.  If understood correctly, the Network Engineer
> }       knows the CLI command but has trouble grepping for it in YANG
> }       modules defined by SDOs.
>
> I can discover most anything in the CLI by hitting TAB/? enough,
> even on switches and routers which I’ve never touched or read the manual for.
>
> --
> ]               Never tell me the odds!                 | ipv6 mesh networks [
> ]   Michael Richardson, Sandelman Software Works        |    IoT architect   [
> ]     mcr@sandelman.ca  <http://www.sandelman.ca/>        |   ruby on rails    [