Re: [CCAMP] Alarm Module work

stefan vallin <stefan@wallan.se> Wed, 06 June 2018 17:31 UTC

From: stefan vallin <stefan@wallan.se>
Message-Id: <26A1246E-C89B-45D0-A378-882E91EE3AEE@wallan.se>
Content-Type: multipart/alternative; boundary="Apple-Mail=_D5DD7B0B-0B1E-445F-B78C-BB1A1878B8C0"
Mime-Version: 1.0 (Mac OS X Mail 11.2 \(3445.5.20\))
Date: Wed, 06 Jun 2018 19:31:20 +0200
In-Reply-To: <AM5PR0701MB23385334660EC433FD678B0AF1620@AM5PR0701MB2338.eurprd07.prod.outlook.com>
Cc: NICK HANCOCK <nick.hancock@adtran.com>, Common Control and Measurement Plane Discussion List <ccamp@ietf.org>
To: "De La Marche, Dirk (Nokia - BE/Antwerp)" <dirk.de_la_marche@nokia.com>
References: <D174588E-1233-4B53-B5BB-D29DE14B3888@wallan.se> <BD6D193629F47C479266C0985F16AAC7F07058E9@ex-mb1.corp.adtran.com> <7906650D-4E83-4386-AA08-43B120CD6866@wallan.se> <AM5PR0701MB23385334660EC433FD678B0AF1620@AM5PR0701MB2338.eurprd07.prod.outlook.com>
X-Mailer: Apple Mail (2.3445.5.20)
Archived-At: <https://mailarchive.ietf.org/arch/msg/ccamp/Kj6oGaD706f-7rTcQWR38Q4362E>
Subject: Re: [CCAMP] Alarm Module work

Hi Dirk!
Thanks for your comments!

> On 1 Jun 2018, at 16:11, De La Marche, Dirk (Nokia - BE/Antwerp) <dirk.de_la_marche@nokia.com> wrote:
> 
> Stefan, Nick,
>  
> Thank you for this interesting exchange of alarm ideas.
>  
> I have two questions related to the extended notification filtering proposal.
>  
> (1) In the two examples in this mail exchange the clear of the alarm always happens when the alarm is above the severity threshold line. What if the clear happens below the severity threshold line? Will we in this case also send an alarm state change notification? Similarly, if an alarm stays below the severity threshold line during its whole lifetime, we do not expect any notification, right? Wouldn’t this mean that the alarm application needs to remember the history of the alarm, i.e. whether a notification was once sent (e.g. alarm severity was shortly above the severity threshold), in order to generate or not generate a clear notification?
> I would imagine that an external alarm manager loses interest in an alarm once it drops below the severity threshold line (as indicated by the T4 notification). Would it still be interested in the notification once it is actually cleared on the box?
I vote for your last comment. Once the severity level drops below the filter level, no notifications will be sent, including clear notifications.
Above the notification threshold, a clear will always be sent.
Good comment! I will clarify according to the above if you agree.
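
To make the rule concrete, here is a minimal Python sketch of the filtering behaviour described above. It is only an illustration, not part of the YANG module: the four-level ordering (warning < minor < major < critical) is a simplifying assumption, and the names SEVERITY_RANK and notified_times are hypothetical.

```python
from typing import List, Optional, Tuple

# Assumed ordering for illustration only; the module's severity type
# may define more values than these four.
SEVERITY_RANK = {"warning": 1, "minor": 2, "major": 3, "critical": 4}

# One alarm state change: (time label, severity, cleared?)
Event = Tuple[str, str, bool]


def notified_times(events: List[Event], level: str) -> List[str]:
    """Return the time labels at which a notification would be sent."""
    threshold = SEVERITY_RANK[level]
    previous: Optional[int] = None  # rank before this change; None if no active alarm
    sent = []
    for time, severity, cleared in events:
        was_at_or_above = previous is not None and previous >= threshold
        if cleared:
            # A clear is only notified if the alarm was at or above the level.
            notify = was_at_or_above
            previous = None
        else:
            rank = SEVERITY_RANK[severity]
            # Notify at or above the level, and also for the change that
            # takes the alarm from at/above the level to below it.
            notify = rank >= threshold or was_at_or_above
            previous = rank
        if notify:
            sent.append(time)
    return sent


# The T1..T7 scenario from earlier in the thread, with the level set to major.
scenario = [
    ("T1", "major", False), ("T2", "critical", False), ("T3", "major", False),
    ("T4", "minor", False), ("T5", "warning", False), ("T6", "major", False),
    ("T7", "major", True),
]
print(notified_times(scenario, "major"))  # ['T1', 'T2', 'T3', 'T4', 'T6', 'T7']
```

With this rule the only history that needs to be remembered is the previous severity of the alarm: a clear that arrives while the alarm is below the level is silently dropped.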

>  
> (2) If indeed the severity filtering functionality overlaps with the shelving feature, does this mean that the same rules apply to shelved alarms as apply to severity-filtered alarms, i.e. that a clear of the alarm always results in a clear notification? The case I am thinking about is an active alarm that is shelved. There is no notification defined at the moment the alarm is shelved, which is inconsistent from an external alarm manager’s point of view. When the alarm clears while shelved, don’t we also need a clear notification? How else will the external alarm manager stay in sync with the box’s alarm situation?
To shelve an alarm means to ignore the alarm. No notifications will be sent; that is the purpose.
I do not agree that a manager should be able to stay in sync with shelved alarms. Shelved alarms are ignored on purpose.

>  
> Based on this mail exchange I would agree that alarm shelving is a workable method that can replace the severity filtering if we can provide a means (e.g. using notifications) to keep the alarm application on both NC server and NC client in sync. In my opinion this is an important criterion to validate any alarm model.
Notification filtering and alarm shelving are two different things.
To shelve an alarm means that alarms associated with the configured resource/alarm-type are ignored, for example for a specific interface under test.
>  
> Concerning the second proposal, a workable Alarm Severity Assignment Profile is something that is very valuable (since it is actually used by several, if not all, of the operators I have worked with). If we can overrule vendor-specific alarm severities on alarm-id and/or on object-id using wildcards we are fine.
Agreed. I will probably go with the proposal from Nick: move the profile into the core module, name it alarm-profile, and include the severity mapping.
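
As a rough illustration of how the severity mapping could resolve overlapping entries, here is a Python sketch of the priority order from my earlier reply quoted below (resource-specific entries first, then the more specific alarm type, then order in the list). It rests on several assumptions that are not in the model: an empty string is treated as a wildcard, matching is by exact string instead of regular expression or resource-match, the alarm-type hierarchy is reduced to "qualifier given or not", and the names Assignment and resolve_severity are hypothetical.

```python
from typing import List, NamedTuple, Optional


class Assignment(NamedTuple):
    alarm_type_id: str
    alarm_type_qualifier: str  # "" is treated as a wildcard (assumption)
    resource: str              # "" is treated as a wildcard (assumption)
    severity: str


def resolve_severity(entries: List[Assignment], alarm_type_id: str,
                     qualifier: str, resource: str) -> Optional[str]:
    """Pick the severity of the best matching entry, or None if none match."""
    best = None
    best_key = None
    for index, entry in enumerate(entries):
        if entry.alarm_type_id != alarm_type_id:
            continue
        if entry.alarm_type_qualifier and entry.alarm_type_qualifier != qualifier:
            continue
        if entry.resource and entry.resource != resource:
            continue
        # 1.1) resource given, 1.2) qualifier given, 2) earlier in the list.
        key = (bool(entry.resource), bool(entry.alarm_type_qualifier), -index)
        if best_key is None or key > best_key:
            best, best_key = entry, key
    return best.severity if best is not None else None


# Nick's example: a general "los" entry and one specific to interface 1.
entries = [
    Assignment("los", "", "", "major"),
    Assignment("los", "", "interface 1", "critical"),
]
print(resolve_severity(entries, "los", "", "interface 1"))  # critical
print(resolve_severity(entries, "los", "", "interface 2"))  # major
```

In Nick's example the entry naming interface 1 wins for that interface, and every other resource falls back to the general entry.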

Thanks again!
Br Stefan

>  
> Kind regards,
> Dirk
> From: CCAMP [mailto:ccamp-bounces@ietf.org] On Behalf Of stefan vallin
> Sent: Thursday, May 31, 2018 9:03 PM
> To: NICK HANCOCK <nick.hancock@adtran.com>
> Cc: Common Control and Measurement Plane Discussion List <ccamp@ietf.org>
> Subject: Re: [CCAMP] Alarm Module work
>  
> Hi Nick!
> Thanks for your comments, really appreciated!
>  
> On 31 May 2018, at 18:24, NICK HANCOCK <nick.hancock@adtran.com> wrote:
>  
> Hi Stefan,
>  
> Thanks for sharing this for discussion.
>  
> I absolutely agree that it is most important that we can progress the YANG Alarm Module to RFC asap, so that support for it can begin to be implemented in the industry.
>  
>  
> 1) Extended notification filtering.
>  
> The introduction of ‘notify-severity-level’ here has one drawback: it is a global configuration applying to all alarm types, so it does not allow the behaviour to be assigned based on type of resource, such as interface or object. For example, you cannot suppress alarms below a certain severity for some less important interface types while keeping normal alarm behaviour for other, more important interfaces.
> Mmm, we could add resource and alarm type to notification filtering, but then we have alarm shelving and notification filtering as overlapping mechanisms.
> I could argue that the manager/client could do the notification filtering.
> My experience is that severity level filtering is of less importance. You normally want to shelve based on resource and alarm type.
>  
> Also, just 'crossing the specified level' may fulfil the desired behaviour for alarm types with a 1:1 mapping to severity level, but not for alarm types with severity life cycles. If the severity of the alarm continues to increase above the configured severity level, alarm notifications would also need to be sent.
> That was the intention, will clarify.
> 
> 
> Consider the following example assuming the severity-level is set to ‘major’:
>            [(Time, severity, clear)]:
>            [(T1, major, -), (T2, critical, -), (T3, major, -), (T4, minor, -), (T5, major, -), (T6, major, clear)]
> I would expect alarm notifications at T1, T2, T3, and T6.
> Adding some severity changes to your scenario:
> [(T1, major, -), (T2, critical, -), (T3, major, -), (T4, minor, -), (T5, warning, -), (T6, major, -), (T7, clear)]
> <image003.png>
>  
> Notifications will be sent at T1, T2, T3, T4, T6, T7
> 
> 
>  
> Changing this configuration to a choice does have its advantages, such as to allow the addition of other cases later or allow other SDOs or vendors to augment other cases.
> So maybe a pragmatic approach, so as not to block progress of this draft, could be to keep the choice format but omit the ‘notify-severity-level’ for now and continue discussions.
> Or keep it :)
> 
>  
>  
> 2) Alarm Severity Assignment Profile
>  
> The proposed implementation provides the means to override severities for alarm types with severity life cycles, but at the same time the implementation is relatively simple.
> Also the use of criteria to assign the severity to alarms (and I am assuming that this would work as it does for the shelf) allows resource-independent overriding of factory default severities through specification of an alarm-type only, but also additional overrides for specific resources. The only question is: if multiple assignments apply to a specific alarm instance, which assignment should apply? For example, if I create entries for
> [(alarm-type-id, alarm-type-qualifier, resource, severity-level)]
> [(los,,,major),(los,,"interface 1",critical)]
> which severity assignment applies to interface 1? The more specific?
> Good comment!
> Priority order:
> 1) the more specific
>   1.1) resource
>   1.2) alarm type (remember hierarchical)
> 2) order in list
> 
> 
>  
> The implementation is also very specific to alarm severity assignment. The mechanism itself, though, is relatively generic: it maps information to alarms, in this case alarm severities. Other SDOs or vendors may wish to augment the list with other data nodes, using this mechanism to associate other data with alarm types and avoid having to implement multiple lists. So I believe that there would be a great advantage and added value if this list were made more generic, for example by renaming it to just ‘alarm-profile’.
> We are circling back and forth here.
> The list has expressed the need to support ITU Alarm Severity Assignment Profile, this is exactly what the suggested model does.
> 
>  
> And although it does fulfil the requirements of M.3100/M.3160,
> as requested :)
> 
> including the list within the module ietf-alarms-itu would basically restrict the use and possible extension of the ASAP to ITU requirements only.
> Not sure what you mean by "restrict".
> 
> 
> Since possible augmentations could originate from requirements coming from other SDOs and vendors, it would IMHO not be prudent to include it in this module.
> Nothing stops augmentations and additions of other features. Just felt there was a high pressure for ASAP which the suggestion captures.
>  
> best regards Stefan
> 
>  
> Best regards
> Nick
>  
>  
>  
>  
> From: CCAMP [mailto:ccamp-bounces@ietf.org] On Behalf Of stefan vallin
> Sent: Monday, May 28, 2018 2:12 PM
> To: Common Control and Measurement Plane Discussion List
> Subject: [CCAMP] Alarm Module work
>  
> Hi!
> Martin and I are working on an updated version of the alarm module. Several smaller things were pointed out by reviewers. Thank you all.
>  
> We would like to share 2 things for discussion:
> 1) Extended notification filtering
> 2) Alarm Severity Assignment Profile
>  
> We are now stretching the limit for being a first core module with only relevant features.
> At this point I think it is more important to start having implementation support rather than adding even more features, which might scare people off from implementing it.
>  
>  
> 1) Extended notification filtering
> ========================
> See suggestion below, added the capability to filter on severity.
> We did not include resource filtering since that would be too much overlap with shelving.
>  
>       choice notify-status-changes {
>         description
>           "This choice controls the notifications sent for alarm status
>            updates. There are three options:
>            1. notifications are sent for all updates, severity level
>               changes and alarm text changes
>            2. notifications are only sent for alarm raise and clear
>            3. notifications are sent for status changes equal to or
>               above the specified severity level. Clear notifications
>               shall always be sent.
>               Notifications shall also be sent for state changes that
>               make an alarm less severe than the specified level.
>            In option 3, assuming the severity level is set to major,
>            and that the alarm has the following state changes
>            [(Time, severity, clear)]:
>            [(T1, major, -), (T2, minor, -), (T3, warning, -),
>             (T4, minor, -), (T5, major, -), (T6, major, clear)]
>            In that case, notifications will be sent at
>            T1, T2, T5 and T6.";
>         leaf notify-all-state-changes {
>           type empty;
>           description
>             "Send notifications for all status changes.";
>         }
>         leaf notify-raise-and-clear {
>           type empty;
>           description
>             "Send notifications only for raise, clear, and re-raise.
>              Notifications for severity level changes or alarm text
>              changes are not sent.";
>         }
>         leaf notify-severity-level {
>           type severity;
>           description
>             "Only send notifications for alarm state changes
>              crossing the specified level. Always send clear
>              notifications.";
>         }
>       }
>  
>  
>  
> 2) Alarm Severity Assignment Profile
> ============================
>  
> We have renamed ietf-alarms-x733 to ietf-alarms-itu since it now includes X.733 as well as M.3100/M.3160 features
>  
>   list alarm-severity-assignment-profile {
>       if-feature alarm-severity-assignment-profile;
>       key "alarm-type-id alarm-type-qualifier resource";
>       ordered-by user;
>       description
>         "If an alarm matches the criteria in one of the entries
>          in this list the configured severity levels shall be
>          used instead of the system default. Note well that the
>          mapping allows for several severity levels since this
>          alarm module uses a stateful alarm model where
>          the same alarm can have the following states:
>          [(warning, not cleared),(minor, not cleared),
>           (minor, cleared)]
>  
>          The configuration of this list shall update the
>          /al:alarms/al:alarm-inventory/al:alarm-type list so that a
>          client can always get a full picture of the possible alarms
>          by reading the alarm inventory. If an alarm matches several
>          entries in this list, the first match is used.";
>       reference
>         "M.3160/M.3100 Alarm Severity Assignment Profile, ASAP";
>       leaf alarm-type-id {
>         type al:alarm-type-id;
>         description
>           "The alarm type identifier to match for severity
>            assignment.";
>       }
>       leaf alarm-type-qualifier {
>         type string;
>         description
>           "A W3C regular expression that is used to match
>            an alarm type qualifier.";
>       }
>       leaf resource {
>         type al:resource-match;
>         description
>           "Specifies which resources to match for severity
>            assignment.";
>       }
>       leaf-list severity-levels {
>         type al:severity;
>         ordered-by user;
>         description
>           "Specifies the configured severity level(s) for the
>            matching alarm. If the alarm has several severity
>            levels the leaf-list shall be given in rising severity
>            order. The original M.3100/M.3160 ASAP function only
>            allows for a one-to-one mapping between alarm type and
>            severity, but since the IETF alarm module supports stateful
>            alarms the mapping must allow for several severity levels.
>  
>            Assume a high-utilisation alarm type with two
>            thresholds with the system default severity levels of
>            threshold1 = warning and threshold2 = minor. Setting this
>            leaf-list to (minor, major) will assign the severity
>            levels threshold1 = minor and threshold2 = major";
>       }
>       leaf description {
>         type string;
>         mandatory true;
>         description
>           "A description of the alarm severity profile.";
>       }
>     }
>  
>  
> Stefan Vallin
> stefan@wallan.se
> +46705233262