Re: [CCAMP] draft-ietf-ccamp-alarm-module-02

stefan vallin <stefan@wallan.se> Tue, 02 October 2018 06:18 UTC

From: stefan vallin <stefan@wallan.se>
In-Reply-To: <A81686D187412242AD51AAC709D0844334297C1E@Exchange2010.kamstrup.dk>
Date: Tue, 2 Oct 2018 08:18:17 +0200
Cc: Martin Bjorklund <mbj@tail-f.com>, "ccamp@ietf.org" <ccamp@ietf.org>, "netmod@ietf.org" <netmod@ietf.org>
Message-Id: <06297C39-B42A-492C-ABB5-3EB3C094F35D@wallan.se>
References: <A81686D187412242AD51AAC709D0844334296B31@Exchange2010.kamstrup.dk> <20180920.103107.750560007019896412.mbj@tail-f.com> <A81686D187412242AD51AAC709D0844334297C1E@Exchange2010.kamstrup.dk>
To: Karen Elisabeth Egede Nielsen <KEE@kamstrup.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/ccamp/OWiLxCTqRKqOw1BfrtvjmWNALd4>
Subject: Re: [CCAMP] draft-ietf-ccamp-alarm-module-02

Hi Karen!
See inline
br Stefan and Martin

> 
> I hope that you can accept the follow up right below:
> 
> * Would it not be relevant in the draft to outline the relation to the alarm-state in RFC8348 ?
> 
> ** Possibly even in the substance of the document rather than in an appendix - assuming that the two are seen as complementary mechanisms potentially based on the same underlying alarm framework (that you define in this draft)
We can add a short description of the relationship between the Alarm Module and RFC 8348.
As Martin stated, they serve different purposes:
"The "alarm-state" in RFC 8348 (and EntityAlarmStatus in RFC 4268) is
just a summary of the alarms that may be active on the specific
hardware component.  It doesn't say anything about how alarms are
reported, and it doesn't provide any details of the alarms; it is just
a bitmask. The alarm-module draft specifies how alarms are reported,
generically.  It also provides a list of all active alarms."

The mapping between the data models is outlined below.

Alarm YANG (alarm list)          RFC 8348
* resource                       corresponds to /hardware/component/
* is-cleared                     no bit set in /hardware/component/state/alarm-state
* perceived-severity             corresponding bit set in /hardware/component/state/alarm-state
* operator-state-change/state    if the alarm is acked by the operator, it could correspond to under-repair
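
As a rough illustration of the mapping above, here is a minimal Python sketch that derives a per-component alarm-state summary from an alarm list. It is not taken from either document. The Alarm class, the alarm_state_summary function, and the component names are hypothetical, the bit names are the RFC 8348 alarm-state bits as I recall them, and mapping an acked alarm to under-repair is only the "could correspond" option mentioned above.

```python
from dataclasses import dataclass

# Severity bit names of the RFC 8348 alarm-state type (assumed from memory).
SEVERITY_BITS = {"critical", "major", "minor", "warning", "indeterminate"}


@dataclass
class Alarm:
    """Hypothetical, simplified view of one entry in the alarm list."""
    resource: str                 # name of the /hardware/component entry
    perceived_severity: str       # "cleared" or one of SEVERITY_BITS
    is_cleared: bool
    operator_state: str = "none"  # e.g. "none", "ack", "closed"


def alarm_state_summary(alarms):
    """Return {component: set of alarm-state bits} for an alarm list."""
    summary = {}
    for alarm in alarms:
        bits = summary.setdefault(alarm.resource, set())
        if alarm.is_cleared:
            continue  # a cleared alarm contributes no bit
        if alarm.perceived_severity in SEVERITY_BITS:
            bits.add(alarm.perceived_severity)
        if alarm.operator_state == "ack":
            bits.add("under-repair")  # optional mapping, an assumption
    return summary


alarms = [
    Alarm("psu-1", "major", False, "ack"),
    Alarm("psu-1", "minor", False),
    Alarm("fan-2", "cleared", True),
]
summary = alarm_state_summary(alarms)
```

Note that the summary loses information: it cannot tell how many alarms of a given severity are active, which is the point Martin makes about alarm-state being just a bitmask.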

> 
> ** In the draft you have a "closed" state of an alarm. Wouldn't it be relevant, in your opinion, with this alarm framework in mind, also to have the closed state in the alarm-state object of RFC8348 ?
> 
> * The same question (should be included in alarm-state of RFC8348) for the shelved alarms ?
This would be an update to RFC 8348, and is out of scope for this work.
> 
> Something else:
> 
> * Assuming that one has an alarm which has no clear (see next question below) or where clear may not always come.
>   Would an operator close of this alarm make it disappear from the active alarms summary ? Can that be an implementation decision  - possibly depending on the alarm type, possibly configurable ?
The general answer is no. As stated in the document, there is no automatic purge/deletion of an alarm on clear from the resource or on close from the operator.
This is by design: from an ops perspective it makes sense to be able to view the alarm even after it is cleared/closed.
For example, you might want to study the root cause afterwards and take proactive action so that it does not appear again.

But as you say, you can make it an implementation decision: "purge on clear" or "purge on close".
If it is hard-coded per alarm type, describe it in the alarm inventory.
You can also make it configurable per alarm type by augmenting the alarm profile with a purge-policy: "purge on clear" or "purge on close".
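
To make the per-alarm-type option concrete, here is a minimal Python sketch of how an implementation might apply such a policy. The draft defines no purge-policy leaf, so the policy names, the Alarm class, and the functions should_purge and apply_purge_policy are all hypothetical. The default keeps every alarm, matching the "no automatic purge" behaviour described above.

```python
from dataclasses import dataclass

# Hypothetical policy values; none of these names come from the draft.
KEEP = "keep"
PURGE_ON_CLEAR = "purge-on-clear"
PURGE_ON_CLOSE = "purge-on-close"


@dataclass
class Alarm:
    """Hypothetical, simplified view of one entry in the alarm list."""
    alarm_type: str
    is_cleared: bool = False
    operator_state: str = "none"  # e.g. "none", "ack", "closed"


def should_purge(alarm, policy_by_type):
    """Decide whether an alarm is removed under its type's policy."""
    policy = policy_by_type.get(alarm.alarm_type, KEEP)
    if policy == PURGE_ON_CLEAR:
        return alarm.is_cleared
    if policy == PURGE_ON_CLOSE:
        return alarm.operator_state == "closed"
    return False  # default: keep cleared/closed alarms for later analysis


def apply_purge_policy(alarm_list, policy_by_type):
    """Return the alarm list with purged entries removed."""
    return [a for a in alarm_list if not should_purge(a, policy_by_type)]


policies = {"link-down": PURGE_ON_CLEAR, "door-open": PURGE_ON_CLOSE}
alarm_list = [
    Alarm("link-down", is_cleared=True),
    Alarm("door-open", operator_state="closed"),
    Alarm("door-open", is_cleared=True),
    Alarm("fan-failure", is_cleared=True, operator_state="closed"),
]
remaining = apply_purge_policy(alarm_list, policies)
```

In this example the cleared link-down alarm and the closed door-open alarm are purged, while the cleared (but not closed) door-open alarm and the fan-failure alarm, which has no policy, stay in the list.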
> 
> * RFC3877 has the following statement: "Alarms SHOULD  be modelled so Notifications are sent on alarm Clear."  
> I did not find this statement in the substance of the draft nor in Appendix F (But it may have escaped me).  
> Is this also the mindset of this draft ?
Yes. According to this alarm module, an alarm-notification will be sent with perceived-severity set to cleared.
> 
> * Is it correctly understood that the Alarm Summary and the Alarm list contain the alarms which are presently in the system - i.e. which have not been purged ?
Correct
>  * Would it be relevant for the Alarm Summary list to tell when alarms were last purged due to administrative action ?
We do not want to load the alarm module with more features at this point; this could be done in the management application/client.
> 
> * Are you considering to implement support for statistics ?
What do you mean by statistics?
Do you mean a) statistics on alarms, or b) a performance monitoring module?
If a, no; that is up to the management application.
If b, that is a separate module, not part of this one.

> 
> 
> BR, Karen
> 
> -----Original Message-----
> From: Martin Bjorklund <mbj@tail-f.com>
> Sent: 20. september 2018 10:31
> To: Karen Elisabeth Egede Nielsen <KEE@kamstrup.com>
> Cc: ccamp@ietf.org; stefan@wallan.se; netmod@ietf.org
> Subject: Re: draft-ietf-ccamp-alarm-module-02
> 
> Hi,
> 
> Karen Elisabeth Egede Nielsen <KEE@kamstrup.com> wrote:
>> Hi,
>> 
>> This draft is new to me and modelling of alarm management also 
>> somewhat....
>> 
>> Could you enlighten me on the relationship, if any, in between the 
>> alarm module of this draft and the Device/resource alarm state within 
>> RFC8348 (equivalently the EntityAlarmStatus of RFC4268) ?
> 
> The "alarm-state" in RFC 8348 (and EntityAlarmStatus in RFC 4268) is just a summary of the alarms that may be active on the specific hardware component.  It doesn't say anything about how alarms are reported, and it doesn't provide any details of the alarms; it is just a bitmask.
> 
> The alarm-module draft OTOH, specifies how alarms are reported, generically.  It also provides a list of all active alarms.
> 
>> E.g. are the two considered complementary mechanisms (modules), 
>> just different view glasses, or are they non-compatible or redundant 
>> ..?
> 
> So if both modules are implemented (they don't have to be), the information can be viewed as redundant or just different views.
> 
> 
> /martin
> 
> 
> 
>> 
>> Many Thanks in advance !
>> 
>> 
>> BR, Karen Nielsen
>>