Re: [CCAMP] Balazs Review of alarm-module-04

stefan vallin <stefan@wallan.se> Mon, 22 October 2018 12:14 UTC

Return-Path: <stefan@wallan.se>
X-Original-To: ccamp@ietfa.amsl.com
Delivered-To: ccamp@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 2E057130DFC for <ccamp@ietfa.amsl.com>; Mon, 22 Oct 2018 05:14:20 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -1.9
X-Spam-Level:
X-Spam-Status: No, score=-1.9 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, DKIMWL_WL_MED=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_NONE=-0.0001] autolearn=ham autolearn_force=no
Authentication-Results: ietfa.amsl.com (amavisd-new); dkim=pass (2048-bit key) header.d=wallan-se.20150623.gappssmtp.com
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id KkIuq-wJfaD7 for <ccamp@ietfa.amsl.com>; Mon, 22 Oct 2018 05:14:16 -0700 (PDT)
Received: from mail-lj1-x22c.google.com (mail-lj1-x22c.google.com [IPv6:2a00:1450:4864:20::22c]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id DEC08128D0C for <ccamp@ietf.org>; Mon, 22 Oct 2018 05:14:15 -0700 (PDT)
Received: by mail-lj1-x22c.google.com with SMTP id p89-v6so36751700ljb.3 for <ccamp@ietf.org>; Mon, 22 Oct 2018 05:14:15 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=wallan-se.20150623.gappssmtp.com; s=20150623; h=from:mime-version:subject:date:references:to:in-reply-to:message-id; bh=ehojlOwqSIZ1uxt6op6MhKsUeamOYOcPHCQw05F8ow0=; b=P7z4PHY2+kcrrU0qnj9wg1e0joYuD5KfrMRdUX9IwLxOv/PWaHYFCel4HRWDWol7Ei s76D0i5CaeyYusEoF0fTqVBTVCnvwvPRM3MWCTD2r2TNWRrF7id3euxrVKrHwPruv26e uBx7OuvHzXXaKiGvVHKAdty6ZtEeBXGg23TobP9Zh0P+2IVShlPZnqlSu2dap7SWjsPm fyj62d8SzjIvYcIedGBcGD+GOf2/39J9BwbqSob40BSGoCOdhO4D52AVscnaR/PUjd1+ +TQNxvkne+iHW+o+TyubBz4H/utySSvNzpQW0RY2Rr/Gih1CQHBPponJ2nQhf0cvYmmh Fdwg==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:mime-version:subject:date:references:to :in-reply-to:message-id; bh=ehojlOwqSIZ1uxt6op6MhKsUeamOYOcPHCQw05F8ow0=; b=LdnUOU2gyUvyC+lFm+/QacXQ9uQdltoOu6U31ffRiuwj8ep4Y8qxuPkZYk7QjOazNb Rou/9yisJb2ruJ+3uXC6OaKoiHcordbtzPk4OB14tYs42UaKTOVMJX0F4pBl8+ddzhdc O62KGNa8HbLukvKO7z4lhB1xGj75mIAfec6PDqMI7Xg6VX9FlGAUK0nd55tELeHSziiH 6M62lcBBY77CaJ3REH2X7Mcr3wRsriovYs1rUoj6q+o8Q0jK3rKiVLI5VXUGJh2vN04+ eCEKo2FeKtqApix4I9D1gNGEVK7OQBBxsYcvg9zywNEAjU8+g8hGV1dd0O90zgdc0dsg bT/Q==
X-Gm-Message-State: ABuFfohoh86oxgjg2cL8wTpQYN8UQFD5191f5f0d+rm9ggKb+TSAcdul v6AC6DXVME17zHQ9sWOVKjsvJg==
X-Google-Smtp-Source: ACcGV60GkIeQ6Exp+q8p/ZCRWOklFfs63bOC5lmbzhSTZpN6WTRcoDdotbplhJD/O4gTjAxUjDMd+w==
X-Received: by 2002:a2e:2f1b:: with SMTP id v27-v6mr28695383ljv.31.1540210453860; Mon, 22 Oct 2018 05:14:13 -0700 (PDT)
Received: from [192.168.99.186] ([195.234.15.132]) by smtp.gmail.com with ESMTPSA id i186-v6sm6832823lfe.41.2018.10.22.05.14.12 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Mon, 22 Oct 2018 05:14:13 -0700 (PDT)
From: stefan vallin <stefan@wallan.se>
Content-Type: multipart/alternative; boundary="Apple-Mail=_95ECB095-8C5E-4A12-9079-F8CD79E0C895"
Mime-Version: 1.0 (Mac OS X Mail 11.2 \(3445.5.20\))
Date: Mon, 22 Oct 2018 14:14:11 +0200
References: <41cfebeb-0e81-838a-e3e1-9aaac3fea947@ericsson.com>
To: Balázs Lengyel <balazs.lengyel@ericsson.com>, "ccamp@ietf.org" <ccamp@ietf.org>
In-Reply-To: <41cfebeb-0e81-838a-e3e1-9aaac3fea947@ericsson.com>
Message-Id: <E8A526FA-19A6-4AC9-B612-CEB2B044EB24@wallan.se>
X-Mailer: Apple Mail (2.3445.5.20)
Archived-At: <https://mailarchive.ietf.org/arch/msg/ccamp/fR235prB1jh2Om8RdQxODlctCw8>
Subject: Re: [CCAMP] Balazs Review of alarm-module-04
X-BeenThere: ccamp@ietf.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Discussion list for the CCAMP working group <ccamp.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/ccamp>, <mailto:ccamp-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/ccamp/>
List-Post: <mailto:ccamp@ietf.org>
List-Help: <mailto:ccamp-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/ccamp>, <mailto:ccamp-request@ietf.org?subject=subscribe>
X-List-Received-Date: Mon, 22 Oct 2018 12:14:20 -0000

Hi Balazs!
Thanks for your comments, see inline
br stefan and martin

> On 12 Oct 2018, at 15:52, Balázs Lengyel <balazs.lengyel@ericsson.com <mailto:balazs.lengyel@ericsson.com>> wrote:
> 
> Hello,
> 
> I have reviewed draft-ietf-ccamp-alarm-module-04
> 
> General: 
> IMHO the draft is in a fairly good shape. 
> The draft https://tools.ietf.org/html/draft-lengyel-netmod-yang-instance-data-04 <https://tools.ietf.org/html/draft-lengyel-netmod-yang-instance-data-04> describes how to document YANG instance data. One of the main use case for documenting YANG instance data in a file is to provide information about the YANG server's capabilities. The alarm-inventory is exactly such a set of server capabilities that could, that SHOULD be documented in a YANG Instance Data file already in design-time to help users integrate management systems. The draft is in the process of being adopted by the Netmod WG.
> 
> It should be mentioned somewhere that a cleared alarm can be raised again thus become is-cleared=false; potentially many times during its life.
> 
This is stated in 3.5.1.  Resource Alarm Life-Cycle

  From a resource perspective, an alarm can for example have the
  following life-cycle: raise, change severity, change severity, clear,
  being raised again etc.  

See also the is-cleared leaf:
      leaf is-cleared {
        type boolean;
        mandatory true;
        description
          "Indicates the current clearance state of the alarm.  An
           alarm might toggle from active alarm to cleared alarm and
           back to active again.”;

> IMHO the counters defined are not the best. For me the two most important statistics would be:
> 
> 	• How many active (is-cleared=false) alarms do I have in my system?   total-not-cleared This could be added both to the list alarm and the list alarm-summary. I know this can be derived from the total minus total-cleared but I feel that this is the primary information, so this should be available just by a read operation without any maths.
Good comment, we should have a counter for number of active alarms. Will add in next revision
> 	• How many alarms did my system generate in the last hour/day/week ?   total-raised, this should be a yang:counter32 that increments whenever a new alarm is added or set to is-cleared=false. (types, severity, shelving could be considered)
> We have neither counter in the module.
I think this is outside the scope of the alarm module, the management application can do alarm statistics
> 
> I would generally prefer using YANG actions instead of rpcs. I would place all RPCs as actions under the /alarms container.
> 	• These are not general system-wide operations, they are specific to alarm-handling. They belong under /alarms
> 	• It makes setting up access control simpler. As the access control for the /alarms container already controls access to the RPCs too. Otherwise you need separate rules for each RPC.
> 	• On a CLI if you ask for help, all top level RPCs are listed as options. If there are many RPCs the list gets really long. With actions you do not have this problem.

We see your point, actions vs rpc can always be discussed.
How RPCs are presented in a CLI is a totally different thing though, that is up to the CLI engine
However, we prefer to keep these as RPCs at this point.


> 4.4) I don't understand the first sentence.
" The alarm list (/alarms/alarm-list) is a function from (resource,
  alarm type, alarm type qualifier) to the current composite alarm
  state.  The composite state includes states for the resource life-
  cycle such as severity, clearance flag and operator states such as
  acknowledgment.”

This states by using the keys (resource, alarm type, alarm type qualifier) as input you get
the current resource (severity, is-cleared, text) and operator state (e.g. ack) of the alarm as output.

The essence of the alarm list is that function, not for example, to be a list of time-stamped notifications.
Will improve this paragraph in next revision

> 
> 5) 
> type resource)
> 
> 	• required-instance false; Is not needed as that is the default meaning for instance-identifiers. Specifying default statements is not recommended by RFC6087bis
No, the default for require-intance is "true".  See 7950, section 9.9.3.

> 	• IMHO Distinguished names are more common then UUIDs. I would list them as more preferred then UUIDs.
Agree, will change in next revision
> typedef resource-match) The following two lines are redundant and contradictory (choose one :-)   )
> 
> 	• The function library is the core function library and the functions defined in Section 10 of RFC 7950. 
> 	• The function library is the core function library
Thanks for catching this, typo...
> 	•
> The context node is the root node in the data tree. 
> AFAIK YANG has many root nodes. Please clarify.
No, a tree has always exactly one root, and the root has all top-leve
nodes from all modules as children.  See 7950 section 6.4.1.
> 
> list-alarm)
> 
> I am missing a leaf: time-alarm-raised.  As an alarm can be raised then cleared, then raised again I have no way of knowing when the current alarm condition started. If an interface is down I can't check how long has it been down. We have the leaf time-created, however if the alarm is repeatedly raised/cleared/raised but never purged this leaf will tell me when this alarm was first raised in the history of the system, which is not very interesting.
This can be done by inspecting:
       +--ro status-change* [time]
          +--ro time                    yang:date-and-time
          +--ro perceived-severity      severity-with-clear
          +--ro alarm-text              alarm-text


> container shelved-alarms)
> 
> alarm-shelf-last-changed) The name of the leaf seems to indicate that this refers to the time when one specific alarm-shelf changed. However as I understand this is the time when any of the shelves changed. Please clarify / rename 
Naming can be improved, agree

> list alarm-profile)
> 
> Why is this user-ordered ? I see no meaning of the order.
If a user configures overlapping criterias, the first match is used.

> 
> rpc compress-alarms)  leaf resource) IMHO this could be of type resource-match
Good suggestion, will consider a change.

> 6)  module-description)
> 
> Mention that shelved-alarms are also augmented
Ok!

> 
> Appendix F.1)
> 
> Clarify whether this annex is normative or informative. To me this seems, this should be normative text.
This is informative
> regards Balazs
> 
> P.S. Please send any replies to me personally too as I am not always reading the cccamp list.
> -- 
> Balazs Lengyel                       Ericsson Hungary Ltd.
> Senior Specialist
> Mobile: +36-70-330-7909              email: 
> Balazs.Lengyel@ericsson.com <mailto:Balazs.Lengyel@ericsson.com>
> 
> 
> _______________________________________________
> CCAMP mailing list
> CCAMP@ietf.org <mailto:CCAMP@ietf.org>
> https://www.ietf.org/mailman/listinfo/ccamp <https://www.ietf.org/mailman/listinfo/ccamp>