Re: [CCAMP] review of draft-ietf-ccamp-alarm-module-01

stefan vallin <stefan@wallan.se> Fri, 17 August 2018 08:31 UTC

From: stefan vallin <stefan@wallan.se>
In-Reply-To: <067d01d433c4$8694eae0$4001a8c0@gateway.2wire.net>
Date: Fri, 17 Aug 2018 10:31:11 +0200
Cc: "ccamp@ietf.org" <ccamp@ietf.org>
Content-Transfer-Encoding: quoted-printable
Message-Id: <F76C3EFE-2A9D-4806-9BE3-0164DD3A90C0@wallan.se>
References: <B8F9A780D330094D99AF023C5877DABA9AF5BDE8@nkgeml513-mbx.china.huawei.com> <E597E310-27B8-4091-89BB-F510CE1AC3C0@wallan.se> <04c501d430a0$3c5cc3c0$4001a8c0@gateway.2wire.net> <8944F55D-94C0-4CD3-9445-9446F41F5D44@wallan.se> <067d01d433c4$8694eae0$4001a8c0@gateway.2wire.net>
To: tom petch <ietfc@btconnect.com>
X-Mailer: Apple Mail (2.3445.5.20)
Archived-At: <https://mailarchive.ietf.org/arch/msg/ccamp/dziFiLdbEtTKfcxKHhmdmKvt_7Q>
Subject: Re: [CCAMP] review of draft-ietf-ccamp-alarm-module-01

Hi Tom!
OK, I will read through the document thoroughly to make sure I use the following terms consistently:
* alarm
* alarm notification
* alarm composite state (rather than just state)
  I will make sure to emphasise that it is a composite state.

When referring to alarm severity, alarm clearance, etc., I will make it clear that I am referring to a specific state.

Going back to your concerns about my definition of alarm and other standards:
it is interesting to follow how the 3GPP Alarm IRP has changed its definition over the years.

3GPP TS 32.111-1 Part 1: 3G fault management requirements

V12.0.0 (2013-06)
* alarm: abnormal network entity condition, which categorizes an event as a fault


V15.0.0 (2018-06)
* alarm: An alarm signifies an undesired condition of a resource
             (e.g. network element, link) for which an operator action is
             required. It emphasizes a key requirement that operators [...]
             should not be informed about an undesired condition unless it
             requires operator action.

* alarm notification: Notification used to inform the recipient about the
             occurrence of an alarm.


Please note how the later version from 2018-06 is almost literally the same
as the definition in the IETF alarm draft. The earlier version from 2013 uses
your notion of alarms as a special case of event.

So I totally disagree that I do not follow standards. We are progressing standards.
(The 3GPP working group read my earlier work during their revisions.)

So the definition of the term “alarm” focuses on the resulting condition (composite state), not the individual notifications.
We use “alarm notifications” to refer to the individual notifications that inform of an update of the alarm composite state, which might be a severity change, a clearance, ...
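
To make the distinction concrete, here is a minimal Python sketch of how I think about it. It is only an illustration, not text from the draft: the class names (AlarmNotification, Alarm, AlarmList), the method names, and the simplification that a clearance keeps the last known severity are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Key of the alarm list: (resource, alarm-type-id, alarm-type-qualifier)
AlarmKey = Tuple[str, str, str]


@dataclass
class AlarmNotification:
    """One individual notification; it reports an update of the composite state."""
    resource: str
    alarm_type_id: str
    alarm_type_qualifier: str
    time: str
    perceived_severity: str  # e.g. "minor", "major", "critical"
    is_cleared: bool
    alarm_text: str


@dataclass
class Alarm:
    """The alarm itself: a composite state, loosely following the tree leaves."""
    time_created: str
    last_changed: str
    is_cleared: bool
    perceived_severity: str
    alarm_text: str
    status_change: List[AlarmNotification] = field(default_factory=list)


class AlarmList:
    """Hypothetical alarm list: a function from key to composite state."""

    def __init__(self) -> None:
        self._alarms: Dict[AlarmKey, Alarm] = {}

    def apply_notification(self, n: AlarmNotification) -> Alarm:
        key = (n.resource, n.alarm_type_id, n.alarm_type_qualifier)
        alarm = self._alarms.get(key)
        if alarm is None:
            # First notification for this key creates the alarm.
            alarm = Alarm(n.time, n.time, n.is_cleared,
                          n.perceived_severity, n.alarm_text)
            self._alarms[key] = alarm
        else:
            alarm.last_changed = n.time
            alarm.is_cleared = n.is_cleared
            alarm.alarm_text = n.alarm_text
            # Assumption: clearance is separate from severity, so a
            # clearance leaves the last severity untouched.
            if not n.is_cleared:
                alarm.perceived_severity = n.perceived_severity
        alarm.status_change.append(n)
        return alarm

    def state(self, resource: str, alarm_type_id: str,
              alarm_type_qualifier: str = "") -> Alarm:
        return self._alarms[(resource, alarm_type_id, alarm_type_qualifier)]


alarms = AlarmList()
alarms.apply_notification(AlarmNotification(
    "FastEthernet1/0", "linkAlarm", "", "2018-08-17T08:00:00Z",
    "major", False, "link down"))
alarms.apply_notification(AlarmNotification(
    "FastEthernet1/0", "linkAlarm", "", "2018-08-17T08:05:00Z",
    "major", True, "link up"))

current = alarms.state("FastEthernet1/0", "linkAlarm")
print(current.is_cleared, current.perceived_severity, len(current.status_change))
```

Two notifications arrive, but there is still only one alarm for (FastEthernet1/0, linkAlarm); asking for its state returns the current composite state, including whether it is cleared and the list of changes.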

Br Stefan


> On 14 Aug 2018, at 13:50, tom petch <ietfc@btconnect.com> wrote:
> 
> Stefan
> 
> I have the same difficulties with your response as I have with the
> I-D:-(
> 
> When you say,
> " (resource, alarm-type-id, alarm-type-qualifier)->(alarm state)"
> I read it as meaning that
> a given value of a resource and
> a given value of an alarm-type-id and
> a given value of an alarm-type-qualifier
> defines a value of alarm-state with other values such as
> time-created
> perceived-severity
> alarm-text
> being irrelevant.
> 
> When you then say
> "So by alarm state the composite state of an alarm comprises the alarm
> severity, if it is cleared, the text, list of resource alarm state
> changes, list of operator state changes)"
> I understand that the definition of alarm state includes
> alarm severity
> if it is cleared
> the text
> list of resource alarm state changes
> list of operator state changes
> 
> This tells me that the meaning of the term 'alarm state' varies
> throughout the document in a way I cannot predict, I cannot grasp.  I
> then struggle (fail?) to understand the I-D.
> 
> With the term 'event', prior art uses 'event' as a generic term with
> 'alarm' being that subset that indicates a fault. This says to me that
> if you want to give a different meaning to 'event', as you say below,
> then you should define 'event' (else - again - I get confused).
> 
> Tom Petch
> 
> 
> ----- Original Message -----
> From: "stefan vallin" <stefan@wallan.se>
> To: "tom petch" <ietfc@btconnect.com>
> Cc: <ccamp@ietf.org>
> Sent: Saturday, August 11, 2018 6:52 PM
> 
> Hi Tom!
> 
>> On 10 Aug 2018, at 13:53, tom petch <ietfc@btconnect.com> wrote:
>> 
>> Stefan
>> 
>> I find this I-D (too) hard to understand.
> Sad to hear, I spent some time on describing it...
> 
>> The problem I have is with
>> terminology which seems elastic.
> OK, I read you. I understand that I need to improve the basic definitions;
> that is important.
> Terminology is everything.
> 
>> 
>> Thus 'alarm state' is not defined as a term; it is in other alarm work
>> where the definition would fit with usage such as
>> 
>>  The operator state for an alarm can be: "none", "ack", "shelved", and
>>  "closed".
>> or
>> actual state of the alarms
>> or
>> The alarm list (/alarms/alarm-list) is a function from (resource,
>>  alarm type, alarm type qualifier) to the current alarm state.
>> 
>> But this meaning makes no sense to me when the term appears in
>> o  Alarm Instance: The alarm state for a specific resource and alarm
>> type.
>> or
>> o  Alarm Type: An alarm type identifies a possible unique alarm state
>> for a resource.
>> 
>> and since I cannot understand what you mean by these two terms, I think
>> I cannot understand the document.
> Oh oh, that is fundamental and I need to improve. Let me try a quick one:
> I think I need to improve the right side of the function
> (resource, alarm-type-id, alarm-type-qualifier)->(alarm state)
> The alarm state is really a composite state.
> 
> From pyang tree output:
> 
>     |  +--ro alarm* [resource alarm-type-id alarm-type-qualifier]
>     |     +--ro resource                 resource
>     |     +--ro alarm-type-id            alarm-type-id
>     |     +--ro alarm-type-qualifier     alarm-type-qualifier
>     |     +--ro alt-resource*            resource
>     |     +--ro related-alarm* [resource alarm-type-id alarm-type-qualifier]
>     |     |     ...
>     |     +--ro impacted-resource*       resource
>     |     +--ro root-cause-resource*     resource
>     |     +--ro time-created             yang:date-and-time
>     |     +--ro is-cleared               boolean
>     |     +--ro last-changed             yang:date-and-time
>     |     +--ro perceived-severity       severity
>     |     +--ro alarm-text               alarm-text
>     |     +--ro status-change* [time] {alarm-history}?
>     |     |     ...
>     |     +--ro operator-state-change* [time] {operator-actions}?
>     |     |     ...
>     |     +---x set-operator-state {operator-actions}?
>     |     |     ...
>     |     +---n operator-action {operator-actions}?
>     |           ...
> 
> This means:
> (resource, alarm-type-id, alarm-type-qualifier)->(time-created,
> is-cleared, last-changed, perceived-severity, alarm-text, status-change,
> operator-state-change)
> 
> So by alarm state I mean the composite state of an alarm, which comprises
> the alarm severity, whether it is cleared, the text, the list of resource
> alarm state changes, and the list of operator state changes.
> 
> This means that you can ask what the alarm state of (FastEthernet1/0,
> linkAlarm) is and get the answer: the current severity, whether it is
> cleared, the current operator state such as “ack”, etc.
> 
> 
>> 
>> Another example would be the use of 'event' which appears as
>> 
>> 1.  the definition focuses on leaving out events and logging information
>> in general.
>> 
>> This I-D does not define event; previous IETF work, e.g. RFC3877 does,
>> and makes it clear that an alarm (class) is a subset of an event which
>> would make no sense here.
> 
> I disagree; the focus of the definition in this draft is to prevent
> general events from appearing as alarms.
>> 
>> There is a lot of prior art in this field but this I-D seems to go
>> against it rather than build on it.
> Yes!
> I am well aware of prior work, having spent 25 years in the alarm industry,
> standards and systems.
> Prior is not equivalent to art by definition.
> 
> This draft stands on giants' shoulders (X.733, 3GPP Alarm IRP, RFC3877,
> etc.) but with improvements.
> 
> Your statement is very general and hard to comment on. Can you make a more
> specific statement? An example?
> I can mention some areas where I made design decisions that
> do not align with X.733, 3GPP Alarm IRP, etc.
> 
> * Most alarm standards are focused on a list of notifications; this
> draft is focused on the alarm list as a function (resource,
> alarm-type-id, alarm-type-qualifier)->(composite alarm state)
> 
> * Key for alarm / alarm notification
>  X.733 uses managed object (resource), event type, probable cause, and
> specific problem. The most relevant attribute is probable cause, a
> global flat enum.
>  3GPP Alarm IRP has confusing, redundant, overlapping keys: “alarmId”, an
> integer, and the X.733 tuple. The standard even shows an example where
> alarmId and the X.733 tuple are in conflict.
> 
> This draft simplifies this with the hierarchical alarm-type-id.
> 
> * Separation of resource life-cycle and operator life-cycle.
>  For example, 3GPP Alarm IRP has the notion of “manual-clear”, where an
> operator sets the alarm clearance state. This is confusing.
> 
> * Separating alarm clearance from alarm severity.
>  This alarm module separates the clearance state of an alarm from the
> alarm severity. X.733 and 3GPP do not.
> 
> And more...
> 
> Br Stefan
> 
> 
> 
>> 
>> Tom Petch
>> 
>> ----- Original Message -----
>> From: "stefan vallin" <stefan@wallan.se>
>> To: "Qin Wu" <bill.wu@huawei.com>
>> Cc: <ccamp@ietf.org>
>> Sent: Sunday, July 22, 2018 7:17 PM
>> 
> 
>