Re: [CCAMP] New Version Notification for draft-vallin-ccamp-alarm-module-01.txt

NICK HANCOCK <nick.hancock@adtran.com> Tue, 12 December 2017 17:28 UTC

From: NICK HANCOCK <nick.hancock@adtran.com>
To: stefan vallin <stefan@wallan.se>, "ccamp@ietf.org" <ccamp@ietf.org>
CC: JOEY BOYD <joey.boyd@adtran.com>
Date: Tue, 12 Dec 2017 17:28:32 +0000
Message-ID: <BD6D193629F47C479266C0985F16AAC7F068A928@ex-mb1.corp.adtran.com>
References: <BD6D193629F47C479266C0985F16AAC7F0675134@ex-mb1.corp.adtran.com> <7215B14E-CA90-4EDB-BE4C-594EC3FA0B1B@wallan.se>
In-Reply-To: <7215B14E-CA90-4EDB-BE4C-594EC3FA0B1B@wallan.se>
Archived-At: <https://mailarchive.ietf.org/arch/msg/ccamp/kCROpP52timRP_AZ2cy3QsMxHd4>
Subject: Re: [CCAMP] New Version Notification for draft-vallin-ccamp-alarm-module-01.txt

Hi Stefan and Martin,

Thanks for your constructive feedback on our comments below. I would like to come back to two of the points.

Shelving criteria and wildcards

If a client specifies only one criterion when defining a shelf, say, the 'alarm-type-id', then all alarm instances of that 'alarm-type-id' will be shelved, regardless of their 'resource' or 'alarm-type-qualifier' values. In this case, the criteria for which no configuration has been made, i.e., 'resource' and 'alarm-type-qualifier', act as wildcards. Given that an omitted criterion acts as a wildcard for that criterion, we would intuitively expect that omitting all criteria would mean that wildcards apply for all criteria. Thus, creating a shelf with no criteria would shelve all alarms.
If you consider the criteria as acting as a filter on the instances of alarms, then specifying no filter criteria will not filter out any alarms, so a shelf would select all instances of alarms. We believe that such behavior would make the application of the criteria consistent.
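
The wildcard reading argued for above can be sketched as follows. This is only an illustration of the proposed semantics, not of the draft's current behavior (the authors' answer further down treats a shelf with no criteria as a no-op). The class names, the resource paths and the alarm type names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Alarm:
    resource: str
    alarm_type_id: str
    alarm_type_qualifier: str = ""


@dataclass(frozen=True)
class Shelf:
    # None means the criterion is not configured and acts as a wildcard.
    resource: Optional[str] = None
    alarm_type_id: Optional[str] = None
    alarm_type_qualifier: Optional[str] = None

    def matches(self, alarm: Alarm) -> bool:
        """AND together every configured criterion; skip the omitted ones."""
        pairs = (
            (self.resource, alarm.resource),
            (self.alarm_type_id, alarm.alarm_type_id),
            (self.alarm_type_qualifier, alarm.alarm_type_qualifier),
        )
        return all(want is None or want == got for want, got in pairs)


# Hypothetical alarm instances.
alarms = [
    Alarm("/interfaces/eth0", "link-down"),
    Alarm("/interfaces/eth1", "link-down"),
    Alarm("/hardware/fan1", "fan-failure"),
]

by_type = Shelf(alarm_type_id="link-down")  # one criterion configured
no_criteria = Shelf()                       # nothing configured

shelved_by_type = [a for a in alarms if by_type.matches(a)]
shelved_by_empty = [a for a in alarms if no_criteria.matches(a)]
```

With these inputs, the shelf that only names an 'alarm-type-id' selects both link-down alarms, and the shelf with no criteria selects every alarm, which is the consistency argument made above.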

Severity-based alarm suppression

We have encountered requirements for applying some form of alarm suppression based on the current severity of an alarm.
Given that shelving is configuration-based, but also given the need to be able to suppress alarms, or at least the notification of changes in the state of alarms, below a given severity, maybe it would be possible to consider supporting some other mechanism in the core alarm module to fulfill such a requirement. In the current version of the alarm module you already have the global configuration 'notify-status-changes', which allows an operator to control whether notifications are sent on all status changes or only when an alarm is raised or cleared. Maybe a separate configuration that operates in a similar way, controlling whether notifications are sent based on the perceived severity of an alarm, would be a possible solution.
Such a mechanism could use the same filtering criteria as a shelf, i.e., 'resource', 'alarm-type-id' and 'alarm-type-qualifier[-match]', plus a severity threshold below which notifications of state changes of the alarm will not be sent. If an alarm covered by the filter criteria is raised for the first time and is not also covered by a shelf, it would be entered into the 'alarm-list' by the system, but since its 'perceived-severity' is below the notification threshold, no 'alarm-notification' would be sent. If the 'perceived-severity' of the alarm increases above the notification threshold, the 'alarm-notification' would be sent. If the alarm is then cleared, a notification would be sent.
However, the interworking with the existing configuration 'notify-status-changes' will need to be defined.
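
A minimal sketch of the threshold behavior described above, under several assumptions that the text leaves open: a severity equal to the threshold is notified, a clear is only notified if an earlier state of the same alarm was notified, and the filter criteria are left out entirely. The class and attribute names are hypothetical, the severity ordering follows X.733, and the numeric values are assumed.

```python
from enum import IntEnum
from typing import Dict, List, Optional, Set, Tuple


class Severity(IntEnum):
    INDETERMINATE = 2
    WARNING = 3
    MINOR = 4
    MAJOR = 5
    CRITICAL = 6


AlarmKey = Tuple[str, str]  # (resource, alarm-type-id)


class ThresholdNotifier:
    def __init__(self, threshold: Severity) -> None:
        self.threshold = threshold
        self.alarm_list: Dict[AlarmKey, Optional[Severity]] = {}
        self.notified: Set[AlarmKey] = set()
        self.sent: List[Tuple[AlarmKey, str]] = []

    def update(self, key: AlarmKey, severity: Optional[Severity]) -> None:
        """Record a state change; a severity of None means the alarm cleared."""
        self.alarm_list[key] = severity  # the alarm is always listed
        if severity is None:
            # Assumption: a clear is only notified if an earlier state was.
            if key in self.notified:
                self.sent.append((key, "cleared"))
                self.notified.discard(key)
        elif severity >= self.threshold:
            # Assumption: a severity equal to the threshold is notified.
            self.sent.append((key, severity.name.lower()))
            self.notified.add(key)
        # Below the threshold: no notification is sent.


notifier = ThresholdNotifier(Severity.MAJOR)
key = ("/interfaces/eth0", "link-down")  # hypothetical alarm
notifier.update(key, Severity.MINOR)     # listed, not notified
notifier.update(key, Severity.MAJOR)     # notified
notifier.update(key, None)               # clear notified
```

How this would combine with 'notify-status-changes' is not modelled here; that interworking is the open point raised above.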

Best regards
Nick

From: stefan vallin [mailto:stefan@wallan.se]
Sent: Monday, November 06, 2017 4:38 AM
To: NICK HANCOCK
Cc: ccamp@ietf.org; JOEY BOYD
Subject: Re: [CCAMP] New Version Notification for draft-vallin-ccamp-alarm-module-01.txt

Hi Joey and Nick!
Thanks for your comments, really good ones.
See answers inline

br Stefan and Martin

On 03 Nov 2017, at 16:50, NICK HANCOCK <nick.hancock@adtran.com> wrote:

Hi!

We have some questions and comments on the concept of shelving and how it is intended to be implemented and used within the YANG Alarm Module.


1. Alarm shelving and alarm suppression

3GPP TR 32-859, based on a presentation from ANSI/ISA 18.2, defines alarm shelving as one method of alarm suppression, the others being out-of-service alarms and alarms suppressed by design.

You effectively combined out-of-service alarms with alarm shelving, deciding not to have separate lists for out-of-service alarms and shelved alarms. 3GPP TR 32-859 also defines 'a time limit for shelving', but this was not included as part of alarm shelving in the YANG Alarm Module. What was the reasoning behind these decisions?
Answer:
Correct, we did let shelving cover out of service as well. You could just create a shelf named OOS if you like.
Actually in an early version of this module there was a timer as well. But we removed it.
The shelf is configuration data. The timer feature would then make the server change its configuration autonomously, which is not the intent of configuration.
Therefore we consider the management system (NETCONF client) to be responsible for managing the timer.
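
A client-side timer along these lines might look like the sketch below. It only tracks expiry times and calls back when a shelf is due for removal; the `delete_shelf` callback stands in for whatever configuration edit the client would issue and is hypothetical, as are the class and method names.

```python
from typing import Callable, Dict, List


class ShelfTimers:
    """Client-side expiry tracking; the shelf configuration itself has no timer."""

    def __init__(self, delete_shelf: Callable[[str], None]) -> None:
        self._delete_shelf = delete_shelf  # hypothetical: removes the shelf entry
        self._expires_at: Dict[str, float] = {}

    def shelve_for(self, name: str, now: float, duration_s: float) -> None:
        """Remember when the named shelf should be removed again."""
        self._expires_at[name] = now + duration_s

    def tick(self, now: float) -> List[str]:
        """Delete every shelf whose time limit has passed; return their names."""
        due = sorted(n for n, t in self._expires_at.items() if t <= now)
        for name in due:
            self._delete_shelf(name)
            del self._expires_at[name]
        return due


deleted: List[str] = []
timers = ShelfTimers(deleted.append)
timers.shelve_for("OOS", now=0.0, duration_s=3600.0)
timers.tick(now=1800.0)  # nothing is due yet
timers.tick(now=3600.0)  # the "OOS" shelf is deleted
```

Time is passed in explicitly so the logic stays independent of any particular scheduler or NETCONF library.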




2. Shelving and un-shelving alarms

This is our understanding of how alarms are shelved and un-shelved in the YANG Alarm Module:

An alarm will be shelved by a system when the alarm meets the ANDed criteria of one or more entries in the list 'shelf'.

So to shelve an existing alarm within the list 'alarm' an operator has to create an entry in the list 'shelf' or modify the criteria of one or more existing entries in the list 'shelf', so that the existing alarm is now included in the shelving criteria as defined by all entries within the list 'shelf'.
And to un-shelve an existing alarm within the list 'shelved-alarm' an operator has to delete an entry in the list 'shelf' or modify the criteria of one or more existing entries in the list 'shelf', so that the existing alarm is no longer included in the shelving criteria as defined by all entries within the list 'shelf'.

Is our understanding correct?
Answer:
Yes


However: the action 'set-operator-state' connected to the list 'alarm' currently allows the value 'shelved' to be specified as a valid input for the leaf 'state'. How can this be reconciled with alarm shelving as described above?

Specifically, if the operator is able to explicitly shelve an alarm by means of this action, this would mean that there would be a shelved alarm in the list 'shelved-alarm', for which there is no entry in the list 'shelf' with criteria that meet the shelved alarm. In addition, there is no corresponding action connected to the list 'shelved-alarm' to enable an operator to un-shelve an alarm.
Answer:
Good catch, we were not clear on the intended usage.

The set-operator-state action is not intended to be used for shelving, and the "shelved" state shall only be set by the server, not by the client.
        enum shelved {
          value 4;
          description
            "Alarm shelved.  Alarms in alarms/shelved-alarms/
             MUST be assigned this operator state by the server as
             the last entry in the operator-state-change list.";
        }
        enum un-shelved {
          value 5;
          description
            "Alarm moved back to alarm-list from shelf.
             Alarms 'moved' from /alarms/shelved-alarms/
             to /alarms/alarm-list MUST be assigned this
             state by the server as the last entry in the
             operator-state-change list.";
        }

However, we lack descriptions that make this clear. We should also split the operator-state enum into
- the settable values
- the readable values


Something like this:
typedef writable-operator-state {
  type enumeration {
    enum none;
    enum ack;
    enum closed;
  }
}

typedef operator-state {
  type union {
    type writable-operator-state;
    type enumeration {
      enum shelved;
      enum un-shelved;
    }
  }
}

So the idea is that shelving shall appear in the history of operator actions on the alarm.
But shelving is not performed by the set-operator-state.
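
The split between settable and readable values can be illustrated with a small behavioral sketch. It assumes that a 'set-operator-state' request carrying a server-only value is rejected; the class and method names are hypothetical and only the state names are taken from the typedefs above.

```python
from enum import Enum
from typing import List


class OperatorState(Enum):
    NONE = "none"
    ACK = "ack"
    CLOSED = "closed"
    SHELVED = "shelved"
    UN_SHELVED = "un-shelved"


# Subset a client may pass to set-operator-state (writable-operator-state).
WRITABLE = frozenset(
    {OperatorState.NONE, OperatorState.ACK, OperatorState.CLOSED}
)


class AlarmHistory:
    def __init__(self) -> None:
        self.operator_state_change: List[OperatorState] = []

    def set_operator_state(self, state: OperatorState) -> None:
        """Client-facing action: rejects the server-only states."""
        if state not in WRITABLE:
            raise ValueError(f"{state.value} can only be set by the server")
        self.operator_state_change.append(state)

    def server_mark(self, shelved: bool) -> None:
        """Server-side bookkeeping when the alarm moves to or from a shelf."""
        self.operator_state_change.append(
            OperatorState.SHELVED if shelved else OperatorState.UN_SHELVED
        )


history = AlarmHistory()
history.set_operator_state(OperatorState.ACK)  # allowed for a client
history.server_mark(shelved=True)              # recorded by the server
try:
    history.set_operator_state(OperatorState.SHELVED)
    rejected = False
except ValueError:
    rejected = True
```

The shelving still shows up in the operator-state-change history, but only the server can put it there.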




3. Shelf criteria

A shelf defines shelving criteria and these criteria are ANDed. But each of the leafs representing the criteria is also optional, which means that it would be valid to define a shelf with no criteria.
Since we see a valid use case where an operator wishes to shelve all alarms of a system with a single configuration, we believe it would be very useful if a shelf configured with no criteria shelved all alarms.

Answer:
Your questions/comments retriggered a thought Martin and I have had around resource wild-carding and resource sub-trees.
We prefer the shelf to be explicitly configured but using a resource wild-card "*".
Also it should be possible to specify a resource sub-tree like "/interfaces".



Could the authors clarify what the intended behavior would be in the case of a shelf configured with no criteria?
Answer:
No criteria would be a noop.
We suggest a resource wild-card mechanism as described above.
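
One possible reading of the wild-card and sub-tree matching suggested here is sketched below. It treats resources as plain path strings and matches a sub-tree only on a path boundary; both are assumptions, since the mechanism has not been specified yet, and the function name and example paths are hypothetical.

```python
def resource_matches(pattern: str, resource: str) -> bool:
    """Match a resource against "*" or a sub-tree prefix such as "/interfaces"."""
    if pattern == "*":
        return True  # explicit wild-card: every resource
    # Sub-tree: the node itself, or anything below it on a path boundary.
    return resource == pattern or resource.startswith(pattern.rstrip("/") + "/")


everything = resource_matches("*", "/hardware/fan1")
below = resource_matches("/interfaces", "/interfaces/eth0")
sibling = resource_matches("/interfaces", "/interfaces-old/eth0")
```

Matching on a path boundary keeps "/interfaces" from also selecting a sibling such as "/interfaces-old".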


To define an entry in the list 'x733-mapping', 'alarm-type-qualifier-match' was used instead of just 'alarm-type-qualifier'. Wouldn't the use of 'alarm-type-qualifier-match' instead of just 'alarm-type-qualifier' as a shelf criterion be a better and more flexible solution?
Answer:
Good suggestion!
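
To illustrate the extra flexibility, here is a small sketch that assumes 'alarm-type-qualifier-match' holds a regular expression that must match the whole qualifier. Python's `re` syntax is not identical to the regular expression dialect a YANG module would use, and the qualifier values are hypothetical.

```python
import re


def qualifier_matches(pattern: str, qualifier: str) -> bool:
    """True if the whole qualifier matches the pattern (implicitly anchored)."""
    return re.fullmatch(pattern, qualifier) is not None


# One pattern covers a family of qualifiers that would otherwise each
# need their own shelf entry.
pattern = "temp-.*"
high = qualifier_matches(pattern, "temp-high")
low = qualifier_matches(pattern, "temp-low")
other = qualifier_matches(pattern, "fan-temp-high")
```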



4. Shelving alarms for multiple resources

For maintenance purposes, it may be practical to be able to shelve all alarms for all resources associated with a given card within an equipment. In the current implementation, this would mean creating separate shelf entries, one for each affected resource located on the card. Wouldn't it be advantageous if the criterion 'resource' were a leaf-list instead of just a leaf?
Answer:
Good suggestion!



5. Shelving based on perceived severity

We are aware that in some network implementations there is the requirement to be able to shelve alarms based on their severity.

The severity of an alarm is not included in the criteria of a shelf in the YANG Alarm Module. Was this considered? What are the authors' views on supporting shelving based on the severity of an alarm?
Answer:
Shelving and severity do not really go well together.
Severity levels are notification-focused.
An alarm is a state on a resource and it might change state with resulting notifications, for example: major, minor, and then a clear flag for the minor alarm state.

So shelving cannot really be done on severity levels.



6. Shelving an active alarm only until it is cleared

We are considering the use case in which an operator needs to shelve an active alarm only until it is cleared, i.e., when the alarm is cleared, it should automatically be un-shelved by the system. This use case is not covered by the current implementation of the YANG Alarm Module, because alarms are shelved based only on the criteria defined by the entries in the list 'shelf', which apply independently of whether an alarm is active or cleared.

What are the authors' views on supporting such a use case and how could this be implemented?
Answer:
We do not consider this to be a shelving feature. This is more a filtering function in the management system.
There are several reasons for this; see above on severity levels.
Also, this would make the system change its configuration (shelf criteria) autonomously, as in question #1, which is not preferable.



7. Notifications

When alarms are shelved, the system is to stop sending notifications for the shelved alarms.
We infer that at the point in time when an alarm is shelved, no notification would be sent as the change of the 'operator-state' of an alarm is never notified.

We are concerned that in the case of multiple clients, some clients will not be made aware that an alarm has been shelved.

Why was an explicit 'alarm shelved/un-shelved' notification, which would be sent when an alarm is shelved and when it is un-shelved, not implemented in the YANG Alarm Module?

Answer:
Makes sense. Will add notifications for this.
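
A sketch of how such notifications could be derived: whenever the 'shelf' list changes, re-evaluate which alarms are shelved and emit one event per alarm that changed sides. An alarm counts as shelved if it matches any shelf entry. The notification names 'alarm-shelved' and 'alarm-un-shelved', the function name and the example alarms are hypothetical, and each shelf entry is reduced to a simple predicate.

```python
from typing import Callable, Iterable, List, Set, Tuple

AlarmKey = Tuple[str, str]  # (resource, alarm-type-id)
Predicate = Callable[[AlarmKey], bool]


def shelving_events(
    alarms: Iterable[AlarmKey],
    was_shelved: Set[AlarmKey],
    shelves: List[Predicate],
) -> Tuple[Set[AlarmKey], List[Tuple[str, AlarmKey]]]:
    """Re-evaluate shelving and list the notifications the change implies."""
    now_shelved = {a for a in alarms if any(match(a) for match in shelves)}
    events = [("alarm-shelved", a) for a in sorted(now_shelved - was_shelved)]
    events += [("alarm-un-shelved", a) for a in sorted(was_shelved - now_shelved)]
    return now_shelved, events


def link_down(alarm: AlarmKey) -> bool:
    """Stand-in for a shelf entry whose only criterion is the alarm type."""
    return alarm[1] == "link-down"


alarms = [("/interfaces/eth0", "link-down"), ("/hardware/fan1", "fan-failure")]

# Creating the shelf entry shelves one alarm; deleting it un-shelves it again.
shelved, on_create = shelving_events(alarms, set(), [link_down])
shelved, on_delete = shelving_events(alarms, shelved, [])
```

Each event would be sent to all subscribed clients, which addresses the multi-client concern raised above.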




If any of these topics have already been discussed before, please point us to where we can find the discussion.

Joey Boyd & Nick Hancock

_______________________________________________
CCAMP mailing list
CCAMP@ietf.org
https://www.ietf.org/mailman/listinfo/ccamp