Re: [CCAMP] Alarm Module work

NICK HANCOCK <nick.hancock@adtran.com> Fri, 01 June 2018 16:13 UTC

From: NICK HANCOCK <nick.hancock@adtran.com>
To: stefan vallin <stefan@wallan.se>, "Common Control and Measurement Plane Discussion List" <ccamp@ietf.org>
Date: Fri, 1 Jun 2018 16:13:32 +0000
Message-ID: <BD6D193629F47C479266C0985F16AAC7F0705D2A@ex-mb1.corp.adtran.com>
References: <D174588E-1233-4B53-B5BB-D29DE14B3888@wallan.se> <BD6D193629F47C479266C0985F16AAC7F07058E9@ex-mb1.corp.adtran.com> <7906650D-4E83-4386-AA08-43B120CD6866@wallan.se>
In-Reply-To: <7906650D-4E83-4386-AA08-43B120CD6866@wallan.se>
Archived-At: <https://mailarchive.ietf.org/arch/msg/ccamp/EwtfxaKT7zEFZ8qppuwu5e4akeQ>

Hi Stefan!

Thanks for the very prompt response :) and allowing us to discuss the proposals in advance.

Please see my comments below.

Thanks
Nick

From: stefan vallin [mailto:stefan@wallan.se]
Sent: Thursday, May 31, 2018 9:03 PM
To: NICK HANCOCK
Cc: Common Control and Measurement Plane Discussion List
Subject: Re: [CCAMP] Alarm Module work

Hi Nick!
Thanks for your comments, really appreciated!

On 31 May 2018, at 18:24, NICK HANCOCK <nick.hancock@adtran.com> wrote:

Hi Stefan,

Thanks for sharing this for discussion.

I absolutely agree that it is most important that we can progress the YANG Alarm Module to RFC asap, so that support for it can begin to be implemented in the industry.


1) Extended notification filtering.

The introduction of 'notify-severity-level' here has one drawback: it is a global configuration that applies to all alarm types, so the behaviour cannot be assigned per type of resource, such as interface or object. For example, you cannot suppress alarms below a certain severity for less important interface types while keeping normal alarm behaviour for more important interfaces.
mmm, we could add resource and alarm type to notification filtering, but then we have alarm shelving and notification filtering as overlapping mechanisms.
I could argue that the manager/client could do the notification filtering.
My experience is that severity level filtering is of less importance. You normally want to shelve based on resource and alarm type.

Also, just 'crossing the specified level' may fulfil the desired behaviour for alarm types with a 1:1 mapping to severity level, but not for alarm types with severity life cycles. If the severity of the alarm continues to increase above the configured severity level, alarm notifications would also need to be sent.
That was the intention, will clarify.


Consider the following example assuming the severity-level is set to 'major':
           [(Time, severity, clear)]:
           [(T1, major, -), (T2, critical, -), (T3, major, -), (T4, minor, -), (T5, major, -), (T6, major, clear)]
I would expect alarm notifications at T1, T2, T3, and T6.
Adding some severity changes to your scenario
[(T1, major, -), (T2, critical, -), (T3, major, -), (T4, minor, -), (T5, warning), (T6, major), (T7, clear)]

Notifications will be sent at T1, T2, T3, T4, T6, T7
[Nick] So you are saying that, with the method you are proposing, you did intend the notification to be sent at T2 (critical) in your example above, although strictly the severity did not 'cross' the specified severity level? I didn't understand it that way from the description. In that case, that is what I would have expected.
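
For illustration, here is a minimal Python sketch of the option 3 behaviour as I read it from this thread: notify for any change at or above the configured level, for the first change that drops below it, and always for clear. The names SEVERITY_RANK and notified_times are hypothetical, the severity ordering is simplified, and representing T7 as a clear at severity major is an assumption, since the scenario does not give a severity for it.

```python
# Simplified severity ordering (assumption: warning < minor < major < critical).
SEVERITY_RANK = {"warning": 1, "minor": 2, "major": 3, "critical": 4}


def notified_times(changes, level):
    """Return the times at which a notification would be sent.

    changes: list of (time, severity, cleared) tuples in chronological order.
    level:   the configured notify-severity-level.
    """
    threshold = SEVERITY_RANK[level]
    previous_rank = 0  # no previous state yet, treated as below any level
    notified = []
    for time, severity, cleared in changes:
        rank = SEVERITY_RANK[severity]
        if cleared:
            notify = True                        # clear is always notified
        elif rank >= threshold:
            notify = True                        # at or above the level
        else:
            notify = previous_rank >= threshold  # first drop below the level
        if notify:
            notified.append(time)
        previous_rank = rank
    return notified


# Example from the draft description, level = major.
draft_example = [
    ("T1", "major", False), ("T2", "minor", False), ("T3", "warning", False),
    ("T4", "minor", False), ("T5", "major", False), ("T6", "major", True),
]
print(notified_times(draft_example, "major"))   # ['T1', 'T2', 'T5', 'T6']

# Stefan's extended scenario, level = major (T7 modelled as a clear).
extended = [
    ("T1", "major", False), ("T2", "critical", False), ("T3", "major", False),
    ("T4", "minor", False), ("T5", "warning", False), ("T6", "major", False),
    ("T7", "major", True),
]
print(notified_times(extended, "major"))        # ['T1', 'T2', 'T3', 'T4', 'T6', 'T7']
```

Under this reading, an alarm that is first raised below the level produces no notification until it reaches the level or is cleared; that is an assumption the description does not settle.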

Changing this configuration to a choice does have its advantages, such as to allow the addition of other cases later or allow other SDOs or vendors to augment other cases.
So maybe a pragmatic approach, so as not to block progress of this draft, could be to keep the choice format but omit the 'notify-severity-level' for now and continue discussions.
Or keep it :)
[Nick] See above.


2) Alarm Severity Assignment Profile

The proposed implementation provides the means to override severities for alarm types with severity life cycles, but at the same time the implementation is relatively simple.
Also, the use of criteria to assign the severity to alarms - and I am assuming that it would work as it does for the shelf - allows resource-independent overriding of factory default severities through specification of an alarm type only, but also allows additional overrides for specific resources. The only question is: if multiple assignments apply to a specific alarm instance, which assignment should apply? For example, if I create an entry for
[(alarm-type-id, alarm-type-qualifier, resource, severity-level)]
[(los,,,major),(los,,"interface 1", critical)]
which severity assignment applies to interface 1? The more specific?
Good comment!
Priority order:
1) the more specific
  1.1) resource
  1.2) alarm type (remember hierarchical)
2) order in list
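
The priority order above could be sketched as follows in Python. This is only an illustration under stated assumptions: Assignment, PARENT, ancestors and select_assignment are hypothetical names, the identity hierarchy is invented for the example, an empty resource is taken to mean "any resource", resources are compared as exact strings rather than with al:resource-match semantics, and the alarm type qualifier is ignored.

```python
from typing import NamedTuple, Optional


class Assignment(NamedTuple):
    alarm_type: str  # alarm type identity name
    resource: str    # "" means "any resource" (assumption)
    severity: str


# Hypothetical alarm type hierarchy: identity -> base identity.
PARENT = {"los": "communications-alarm", "communications-alarm": None}


def ancestors(alarm_type: str) -> list:
    """Return alarm_type followed by its base identities, nearest first."""
    chain = []
    current = alarm_type
    while current is not None:
        chain.append(current)
        current = PARENT.get(current)
    return chain


def select_assignment(entries, alarm_type: str, resource: str) -> Optional[Assignment]:
    """Pick the entry that applies: specific resource first, then the
    closest alarm type in the hierarchy, then the earliest list position."""
    chain = ancestors(alarm_type)
    best, best_key = None, None
    for index, entry in enumerate(entries):
        if entry.alarm_type not in chain:
            continue
        if entry.resource not in ("", resource):
            continue
        key = (entry.resource == "", chain.index(entry.alarm_type), index)
        if best_key is None or key < best_key:
            best, best_key = entry, key
    return best


entries = [
    Assignment("los", "", "major"),
    Assignment("los", "interface 1", "critical"),
]
print(select_assignment(entries, "los", "interface 1").severity)  # critical
print(select_assignment(entries, "los", "interface 2").severity)  # major
```

With Nick's two entries, interface 1 gets critical because the resource-specific entry wins, while any other interface falls back to major.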



The implementation is also very specific to alarm severity assignment. The mechanism itself, though, is relatively generic: it maps information to alarms, in this case alarm severities. Other SDOs or vendors may wish to augment the list with other data nodes, using this mechanism to associate other data with alarm types and avoid having to implement multiple lists. So I believe that there would be a great advantage and added value if this list were made more generic, for example by renaming it to just 'alarm-profile'.
We are circling back and forth here.
[Nick] I nevertheless feel we are converging on a solution and that we are almost there!
The list has expressed the need to support ITU Alarm Severity Assignment Profile, this is exactly what the suggested model does.
[Nick] Yes, but the proposed ASAP list could do much more is what I am saying. Maybe the question is: does the list require a specific implementation, specifically an ASAP, or does it require the alarm module to provide a function that allows severities to be assigned to alarm types? My understanding is the latter. Maybe others on the list could also comment.

As you may recall, we have also been discussing other features in past threads, for example, to assign operator specific information to alarm types. We can argue whether the core module should or should not support this directly. More importantly we need the alarm module to progress to RFC. If certain features are not supported in the core model, then other SDOs and/or vendors would need to provide a solution and the obvious one would be to extend/augment some existing mechanism. This list would be a suitable candidate to be extended/augmented to provide a solution to enable this custom information to be assigned to alarm types, for example. But currently given its name, the list would be limited to alarm severity assignment.

My recommendation would be:

- rename the list, say to 'alarm-profile'
- add a reference statement to severity-level (I would use the singular and not plural as it reads better in XML) referencing M.3100/M.3160
- move it to ietf-alarms (or to its own module, but that may be overkill)
- leave ietf-alarms-x733 as is.

And although it does fulfil the requirements of M.3100/M.3160,
as requested :)

including the list within the module ietf-alarms-itu would basically restrict the use and possible extension of the ASAP to ITU requirements only.
Not sure what you mean with "restrict"
[Nick] What I am saying here is that if the list is implemented as part of a module that specifically supports ITU requirements, augmenting the list with non-ITU features goes against the grain.


Since possible augmentations could originate from requirements coming from other SDOs and vendors, it would IMHO not be prudent to include it in this module.
Nothing stops augmentations and additions of other features. Just felt there was a high pressure for ASAP which the suggestion captures.
[Nick] I would put it another way. Yes, there is high pressure to support severity assignments to alarm types, as I discussed above, but the core module should balance that with being as generic as possible - which it generally does very well - and allow extension/augmentation by other SDOs and vendors in a generic way.

best regards Stefan


Best regards
Nick




From: CCAMP [mailto:ccamp-bounces@ietf.org] On Behalf Of stefan vallin
Sent: Monday, May 28, 2018 2:12 PM
To: Common Control and Measurement Plane Discussion List
Subject: [CCAMP] Alarm Module work

Hi!
Martin and I are working on an updated version of the alarm module, addressing several smaller things pointed out by reviewers. Thank you all.

We would like to share 2 things for discussion:
1) Extended notification filtering
2) Alarm Severity Assignment Profile

We are now stretching the limit for being a first core module with only relevant features.
At this point I think it is more important to start having implementation support rather than adding even more features, which might scare people off from implementing it.


1) Extended notification filtering
========================
See suggestion below, added the capability to filter on severity.
We did not include resource filtering since that would be too much overlap with shelving.

      choice notify-status-changes {
        description
          "This choice controls the notifications sent for alarm status
           updates. There are three options:
           1. notifications are sent for all updates, severity level
              changes and alarm text changes
           2. notifications are only sent for alarm raise and clear
           3. notifications are sent for status changes equal to or
              above the specified severity level. Clear notifications
              shall always be sent.
              Notifications shall also be sent for state changes that
              make an alarm less severe than the specified level.
           In option 3, assuming the severity level is set to major,
           and that the alarm has the following state changes
           [(Time, severity, clear)]:
           [(T1, major, -), (T2, minor, -), (T3, warning, -),
            (T4, minor, -), (T5, major, -), (T6, major, clear)]
           In that case, notifications will be sent at
           T1, T2, T5 and T6";
        leaf notify-all-state-changes {
          type empty;
          description
            "Send notifications for all status changes.";
        }
        leaf notify-raise-and-clear {
          type empty;
          description
            "Send notifications only for raise, clear, and re-raise.
             Notifications for severity level changes or alarm text
             changes are not sent.";
        }
        leaf notify-severity-level {
          type severity;
          description
            "Only send notifications for alarm state changes
             crossing the specified level. Always send clear
             notifications.";
        }
      }



2) Alarm Severity Assignment Profile
============================

We have renamed ietf-alarms-x733 to ietf-alarms-itu since it now includes X.733 as well as M.3100/M.3160 features

  list alarm-severity-assignment-profile {
      if-feature alarm-severity-assignment-profile;
      key "alarm-type-id alarm-type-qualifier resource";
      ordered-by user;
      description
        "If an alarm matches the criteria in one of the entries
         in this list the configured severity levels shall be
         used instead of the system default. Note well that the
         mapping allows for several severity levels since this
         alarm module uses a stateful alarm model where
         the same alarm can have the following states:
         [(warning, not cleared),(minor, not cleared),
          (minor, cleared)]

         The configuration of this list shall update the
         /al:alarms/al:alarm-inventory/al:alarm-type list so that a
         client can always get a full picture of the possible alarms
         by reading the alarm inventory. If an alarm matches several
         entries in this list, the first match is used.";
      reference
        "M.3160/M.3100 Alarm Severity Assignment Profile, ASAP";
      leaf alarm-type-id {
        type al:alarm-type-id;
        description
          "The alarm type identifier to match for severity
           assignment.";
      }
      leaf alarm-type-qualifier {
        type string;
        description
          "A W3C regular expression that is used to match
           an alarm type qualifier.";
      }
      leaf resource {
        type al:resource-match;
        description
          "Specifies which resources to match for severity
           assignment.";
      }
      leaf-list severity-levels {
        type al:severity;
        ordered-by user;
        description
          "Specifies the configured severity level(s) for the
           matching alarm. If the alarm has several severity
           levels the leaf-list shall be given in rising severity
           order. The original M.3100/M.3160 ASAP function only
           allows for a one-to-one mapping between alarm type and
           severity but since the IETF alarm module supports stateful
           alarms the mapping must allow for several severity levels.

           Assume a high-utilisation alarm type with two
           thresholds with the system default severity levels of
           threshold1 = warning and threshold2 = minor. Setting this
           leaf-list to (minor, major) will assign the severity
           levels threshold1 = minor and threshold2 = major.";
      }
      leaf description {
        type string;
        mandatory true;
        description
          "A description of the alarm severity profile.";
      }
    }
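
As a rough illustration of the severity-levels mapping described above, the following Python sketch pairs the system default levels with the configured leaf-list by position, both in rising severity order. The function name remap_severities is hypothetical, and rejecting lists of different length is an assumption; the description does not say what should happen in that case.

```python
def remap_severities(default_levels, configured_levels):
    """Map each system default severity to the configured one by position.

    Both lists are expected in rising severity order, one entry per
    threshold of the alarm type.
    """
    if len(default_levels) != len(configured_levels):
        # Assumption: a mismatch in the number of levels is treated as an error.
        raise ValueError("number of configured levels must match the defaults")
    return dict(zip(default_levels, configured_levels))


# The high-utilisation example: threshold1 = warning, threshold2 = minor,
# overridden with the leaf-list (minor, major).
mapping = remap_severities(["warning", "minor"], ["minor", "major"])
print(mapping)  # {'warning': 'minor', 'minor': 'major'}
```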


Stefan Vallin
stefan@wallan.se
+46705233262