[CCAMP] draft-ietf-ccamp-alarm-module discussion points

stefan vallin <stefan@wallan.se> Tue, 13 February 2018 15:27 UTC

Archived-At: <https://mailarchive.ietf.org/arch/msg/ccamp/bMpd2Y9ps_pMKl_2OdoRh82WTO4>

Hi!

After publishing draft-ietf-ccamp-alarm-module-01 we have a couple of open discussion points, which we summarize below together with our recommendations.

Severity filtering/shelving
===================
# Original comment from Nick/Joey:
---
We have encountered requirements for applying some form of alarm suppression based on the current severity of an alarm.
Given that shelving is configuration-based, but also given the need to be able to suppress alarms or at least the notification of the changes in state of alarms below a given severity, maybe it would be possible to consider supporting some other mechanism in the core alarm module to fulfill such a requirement. In the current version of the alarm module you already have the global configuration ‘notify-status-changes’ which allows an operator to control whether notifications are sent on all status changes or only when an alarm is raised or cleared. Maybe a separate configuration that operates in a similar way to control whether notifications are sent or not based on the perceived severity of an alarm would be a possible solution. Such a mechanism could use the same filtering criteria as a shelf, i.e., ‘resource’, ‘alarm-type-id’ and ‘alarm-type-qualifier[-match]’, plus a severity threshold below which notifications of state changes of the alarm will not be sent. If an alarm covered by the filter criteria is raised for the first time and is not also covered by a shelf, it would be entered into the ‘alarm-list’ by the system, but since its perceived-severity is below the notification threshold, no ‘alarm-notification’ would be sent. If the ‘perceived-severity’ of the alarm increases above the notification threshold, the ‘alarm-notification’ would be sent. If the alarm is then cleared, a notification would be sent. However, the interworking with the existing configuration ‘notify-status-changes’ will need to be defined.
---

# Stefan/Martin:
We suggest expanding the notify-status-changes configuration to include a severity level. We recommend not allowing further filtering criteria, since this would “interfere” with shelving.
Since an alarm might have several severity levels during its life-cycle, it is important to understand that notifications will be sent for:
- any status change “crossing” the configured level (towards more or less severe)
- all status changes while the alarm is at or above that level
- clear, always

Configured threshold = major

example life-cycle #1
================
raise, warning : no notif
minor : no notif
major : notif
critical : notif
clear: notif

example life-cycle #2
================
raise, warning : no notif
minor : no notif
major : notif
critical : notif
minor: notif (since the alarm was major and now is moving below the threshold)
warning: no notif (the alarm stays below the threshold)
clear: notif (clear is always sent, even though the alarm has a severity level below the threshold)
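
The three rules and the two life-cycles above can be sketched as a small decision function. This is only an illustration of the proposal as we read it, not text from the draft: the names Severity, should_notify and trace are hypothetical, and treating the configured level itself as "notify" is an assumption taken from the examples (major : notif with threshold = major).

```python
from enum import IntEnum
from typing import List, Optional


class Severity(IntEnum):
    # Ordered so that a larger value means a more severe alarm.
    CLEARED = 0
    WARNING = 1
    MINOR = 2
    MAJOR = 3
    CRITICAL = 4


def should_notify(previous: Optional[Severity], new: Severity,
                  threshold: Severity) -> bool:
    """Decide whether one status change produces a notification."""
    if new == Severity.CLEARED:
        return True   # rule 3: clear is always sent
    if new >= threshold:
        return True   # rules 1 and 2: crossing upwards, or at/above the level
    # New severity is below the level: notify only when crossing downwards.
    return previous is not None and previous >= threshold


def trace(states: List[Severity], threshold: Severity) -> List[bool]:
    """Replay a life-cycle and return the notify decision per status change."""
    decisions = []
    previous = None  # None means the alarm has not been raised yet
    for state in states:
        decisions.append(should_notify(previous, state, threshold))
        previous = state
    return decisions


S = Severity
life_cycle_1 = trace([S.WARNING, S.MINOR, S.MAJOR, S.CRITICAL, S.CLEARED], S.MAJOR)
life_cycle_2 = trace([S.WARNING, S.MINOR, S.MAJOR, S.CRITICAL,
                      S.MINOR, S.WARNING, S.CLEARED], S.MAJOR)
```

Note that with these rules a clear is sent even for an alarm whose raise was never notified, which a client has to tolerate.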

Question to the working group: do we need to be able to filter those notifications also on ‘resource’, ‘alarm-type-id’ and ‘alarm-type-qualifier[-match]’?
We are afraid of making the alarm module too complex to start with. An advanced notification filter concept in combination with shelving also adds complexity, both for the server and the client. Being able to filter on severity can be considered a management application function. Also, if the rules on alarm quality and alarm rate in Appendix F are followed, there will be no issue of notification overload.

What is the use-case for notification filtering? Notif overload? Filtering for users?

Temporal ordering
==============
# Original comment from Nick/Joey:
---
We have encountered requirements for a relatively simple means for a NMS to be able to identify the temporal order, in which alarms have been raised by a system. Such information could, of course, be derived by a client by inspecting the system’s ‘alarm-list’ and the entries in the ‘status-change’ list of each entry in the ‘alarm-list’, but then only if supported by a system. Although there is the leaf ‘last-changed’ in the lists ‘alarm’ and ‘shelved-alarm’ in the current implementation (equivalent to the leaf ‘time’ in ‘alarm-notification’), which provides a timestamp of when the status of an alarm instance last changed, there are no explicit timestamps indicating when the alarm was last raised or last cleared. If these leafs were also supported and also included in ‘alarm-notification’, they could provide a means for a NMS to correlate alarm activations within the network.

Alternatively, a relatively simple addition to the core alarm module that could streamline this task would be, for example, to maintain a global alarm activation counter within a system and assign the value of the counter to an instance of an alarm each time it is raised, incrementing the counter after assignment. This activation identifier, which would be stored for an alarm instance within the lists ‘alarm-list’ and ‘shelved-alarm’ and included within the ‘alarm-notification’ could then be used by a client to correlate the alarms of a system and easily construct a timeline. The counter’s value itself would have no meaning, except to reconstruct the order of alarms being raised. If an alarm is re-raised after being raised, it would receive a different value for its activation identifier.
---

# Stefan/Martin:
1) It can be done already today if the YANG feature alarm-history is supported.
2) Adding time stamps for last-raised and last-cleared in the alarm list as well as in the notifications would make it simpler for the manager.
3) An alarm activation counter would make the explicit ordering even simpler.

We would like to hear the working group's opinion on these options. We are a bit reluctant to add more features to the alarm model, as we are afraid of making it too complex.
So maybe stay at option 1?
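
To make option 3 concrete, here is a minimal sketch of a system-wide activation counter as described in the original comment: the current value is assigned to an alarm each time it is raised, and the counter is incremented afterwards. The class and method names are hypothetical and nothing here is part of the draft; the alarm key is assumed to be the usual (resource, alarm-type-id, alarm-type-qualifier) triple.

```python
from typing import Dict, List, Tuple

# Assumed alarm key: (resource, alarm-type-id, alarm-type-qualifier).
AlarmKey = Tuple[str, str, str]


class ActivationCounter:
    """Assigns a monotonically increasing activation id on every raise."""

    def __init__(self) -> None:
        self._next_id = 1
        self.activation_id: Dict[AlarmKey, int] = {}

    def raise_alarm(self, key: AlarmKey) -> int:
        # Assign the current counter value, then increment it.
        assigned = self._next_id
        self.activation_id[key] = assigned
        self._next_id += 1
        return assigned

    def timeline(self) -> List[AlarmKey]:
        # The value has no meaning of its own; only the relative order matters.
        return sorted(self.activation_id, key=self.activation_id.get)


counter = ActivationCounter()
link = ("eth0", "link-alarm", "")          # hypothetical alarm instances
fan = ("fan-tray-1", "fan-failure", "")
first = counter.raise_alarm(link)
second = counter.raise_alarm(fan)
third = counter.raise_alarm(link)          # re-raise gets a new identifier
order = counter.timeline()
```

A client would only compare the identifiers of alarms from the same system; the sketch ignores counter wrap-around and persistence across restarts, which a real design would have to specify.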

Tagging of alarm types in alarm inventory
===============================

# Original comments from Nick and Joey:
---
Some operators require to be able to associate their own specific information with alarm types for use within their networks. Such a requirement could be supported within the alarm module through a generic mechanism whereby one or more tag/values could be mapped to an alarm-type-id and/or alarm-type-qualifier-match in a similar way as the list ‘x733-mapping’ allows a client to override the default X.733 mapping. Any mapped value would also need to be listed within the alarm inventory entry for that alarm-type and included within the ‘alarm-notification’.

We believe that just a single configurable description leaf would be somewhat restrictive. We would rather take a more generic and extensible direction, whereby a client can map zero or more tagged values as required by his network implementation to an alarm-type-id and/or alarm-type-qualifier-match. For example, something like that shown below.

As configuration:

 +--rw custom-tags {custom-tags}?
    +--rw custom-tag* [alarm-type-id alarm-type-qualifier-match tag-name]
       +--rw alarm-type-id               alarm-type-id
       +--rw alarm-type-qualifier-match  string
       +--rw tag-name                    string
       +--rw tag-value?                  tag-value

In the alarm inventory (state):

 +--ro alarm-type* [alarm-type-id alarm-type-qualifier]
    +--ro alarm-type-id                       alarm-type-id
    +--ro alarm-type-qualifier                alarm-type-qualifier
    +--ro resource*                           string
   :
    +--ro custom-tags {custom-tags}?
       +--ro custom-tag* [tag-name]
          +--ro tag-name   string
          +--ro tag-value? tag-value
---
# Stefan/Martin:
We see the point of this, but are afraid that we are designing alarm manager application functionality into the alarm module.
Doesn’t it make more sense to let the manager/client perform this function? Imagine that there are two managers with different views on which tags apply.
We suggest leaving this out of the module.
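
As an illustration of what "let the manager/client perform this function" could look like, the sketch below keeps the tag rules on the client and applies them to received alarms. All names (TagRule, tags_for, the example alarm types and tags) are hypothetical. It assumes that the qualifier match is a regular expression, approximated here with Python's re module, and it compares alarm-type-id by exact string equality, which is a simplification.

```python
import re
from typing import Dict, List, NamedTuple


class TagRule(NamedTuple):
    alarm_type_id: str
    qualifier_match: str   # regular expression, matched against the qualifier
    tag_name: str
    tag_value: str


def tags_for(rules: List[TagRule], alarm_type_id: str,
             qualifier: str) -> Dict[str, str]:
    """Collect the operator-defined tags that apply to one alarm type."""
    tags: Dict[str, str] = {}
    for rule in rules:
        if (rule.alarm_type_id == alarm_type_id
                and re.fullmatch(rule.qualifier_match, qualifier)):
            # A later rule with the same tag-name overrides an earlier one.
            tags[rule.tag_name] = rule.tag_value
    return tags


# Each manager keeps its own rule set, so two managers can disagree freely.
noc_rules = [
    TagRule("link-alarm", ".*", "owner", "transport-team"),
    TagRule("link-alarm", "optical-.*", "runbook", "RB-17"),
]
plain = tags_for(noc_rules, "link-alarm", "")
optical = tags_for(noc_rules, "link-alarm", "optical-los")
other = tags_for(noc_rules, "fan-failure", "")
```

Because the rules live in the manager, nothing in the alarm inventory or in ‘alarm-notification’ needs to change.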


Notifications for shelf criteria:
======================
# Original comments from Nick and Joey:
---
When alarms are shelved, the system is to stop sending notifications for the shelved alarms.
We infer that at the point in time when an alarm is shelved, no notification would be sent as the change of the ‘operator-state’ of an alarm is never notified. We are concerned that in the case of multiple clients, some clients will not be made aware that an alarm has been shelved.
Why was an explicit ‘alarm shelved/un-shelved’ notification, which would be sent when an alarm is shelved and when it is un-shelved, not implemented in the YANG Alarm Module?
---

# Stefan/Martin:
Notifications for changed configuration in general can be done by using:
- RFC6470 config-change notification
- YANG Datastore Subscription, draft-ietf-netconf-yang-push-14



br Martin and Stefan

Stefan Vallin
stefan@wallan.se
+46705233262