Re: [CCAMP] Second review of draft-ietf-ccamp-alarm-module-01

Qin Wu <bill.wu@huawei.com> Mon, 20 August 2018 03:26 UTC

From: Qin Wu <bill.wu@huawei.com>
To: stefan vallin <stefan@wallan.se>
CC: "ccamp@ietf.org" <ccamp@ietf.org>
Thread-Topic: Second review of draft-ietf-ccamp-alarm-module-01
Date: Mon, 20 Aug 2018 03:25:45 +0000
Archived-At: <https://mailarchive.ietf.org/arch/msg/ccamp/uypkbrXbGsyWDR7PDfYs88tn7ZY>

Stefan:
I have removed the comments on which we have reached agreement.
See my follow-up comments marked with [Qin-2].

From: stefan vallin [mailto:stefan@wallan.se]
Sent: 17 August 2018 17:10
To: Qin Wu
Cc: ccamp@ietf.org
Subject: Re: Second review of draft-ietf-ccamp-alarm-module-01

Hi!


[Qin]: One way to handle the reference to the alarming resource is to add alarm-name or alarm-serial-no as a field of the alarm list.
Then alarm-name or alarm-serial-no can be seen as an alias of the 3-tuple (resource, alarm-type-id, alarm-type-qualifier).
No, I will not do this. 3GPP used the notion of alarmId as a redundant alias (and even conflicting) key to the 3GPP triplet (mo, event-type, probable-cause).
This is really confusing, and the 3GPP specs showed inconsistent scenarios where you had the same (mo, event-type, probable-cause) and different alarmIds.

The spec said in its text that this is not allowed. Which database designer would have two conflicting, overlapping keys?


[Qin-2]: I am not sure that introducing alarmId is confusing in the 3GPP standards. As described in http://www.qtc.jp/3GPP/Specs/32111-2-340.pdf,
alarmId can be used to detect alarm loss, so it is still useful information.

Suppose you have two alarms generated on the same managed object: if you have alarmId, you can easily distinguish one from the other.
Without alarmId, you would just assume this is the same alarm raised again. I don’t understand why alarmId would introduce an overlapping key. With alarmId added to the 3-tuple, you can tell apart alarms of the same type generated on the same managed object (you use resource to specify each managed object).

When you say alarmSerialNo, this is really confusing. The notion of a serial number in alarm standards normally refers to individual notifications.

[Qin-2]: Yes, in some cases alarm-serial-no and alarm notifications have a one-to-one relationship; in such cases you may use alarm-serial-no to identify each notification. In the TMF814 standard, notificationId is defined to identify each notification.
In addition, I believe you haven’t addressed my follow-up comments posted at:
https://www.ietf.org/mail-archive/web/ccamp/current/msg18904.html
which are not specific to controller support. I would appreciate your response to those comments.
The 4 issues are highlighted below:

1.  Alarm-type-id supports a union of identity and string

I know that defining alarm-type-id as an identity makes it more extensible, but it wastes more space than using an enum.

I am wondering why not define alarm-type-id as a uint32, or as a string with an embedded format such as groupid-alarmid (e.g., “2310-36700394”); this would make it easier to manage millions of alarm types.

Defining alarm-type-id as an identity seems to waste a lot of space and makes it hard to deal with millions of alarm types at design time, since enumerating them requires a human to enter every alarm type in a YANG file.
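
To make the proposed string format concrete, here is a minimal Python sketch of parsing a “groupid-alarmid” value such as “2310-36700394”. It assumes both parts are decimal uint32 values separated by a single hyphen; the names NumericAlarmType and parse_alarm_type are hypothetical and do not come from the draft.

```python
from typing import NamedTuple

UINT32_MAX = 2**32 - 1


class NumericAlarmType(NamedTuple):
    """Hypothetical two-part numeric alarm type, e.g. "2310-36700394"."""
    group_id: int
    alarm_id: int


def _is_decimal(text: str) -> bool:
    # Accept only non-empty runs of ASCII digits.
    return text.isascii() and text.isdigit()


def parse_alarm_type(text: str) -> NumericAlarmType:
    """Split "groupid-alarmid" into two uint32 values; reject anything else."""
    group, sep, alarm = text.partition("-")
    if sep != "-" or not (_is_decimal(group) and _is_decimal(alarm)):
        raise ValueError(f"expected 'groupid-alarmid', got {text!r}")
    parsed = NumericAlarmType(int(group), int(alarm))
    if parsed.group_id > UINT32_MAX or parsed.alarm_id > UINT32_MAX:
        raise ValueError(f"component out of uint32 range in {text!r}")
    return parsed
```

Note that the parsed numbers carry no meaning by themselves; a separate lookup table would still be needed to show an operator what a given pair stands for.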
A) A flat enum does not work globally across enterprises and organisations; see the ITU failure with probable cause
B) Millions of alarm types??? No, that will not happen
[Qin]: That’s the reality we are facing. (:-
If you have this, you need to rethink. I will not buy your equipment if I need to train my operators in the NOC to learn how to manage millions of different alarm *types*.

C) uint32: that is meaningless for operators
[Qin]: That’s why we should have both alarm-name and alarm-serial-no; alarm-name provides meaning for operators.
Alarm-type is a YANG identity; it has a clear name. No need for a separate “name”.
Just give your identity a good name.

D) string: that will result in surprises for operators; developers will introduce strings in their code that suddenly show up in the NOC.
[Qin]: In essence, alarm-type-qualifier is a string qualifier, so do you believe that introducing alarm-type-qualifier will result in surprises for operators as well?
Read the RFC, it is explained there.


[Qin-2]: See the definition of alarm-type-qualifier:
“  typedef alarm-type-qualifier {
       type string;
       description
         "If an alarm type can not be fully specified at design time by
          alarm-type-id, this string qualifier is used in addition to
          fully define a unique alarm type.

          The definition of alarm qualifiers is considered being part
          of the instrumentation and out of scope for this module.
          An empty string is used when this is part of a key.";
     }
”
My point is that alarm-type-qualifier will have the same issue as a plain string.

E) I do not get your last comment “require human to enter all alarm types in yang file”.
     You have to design which alarm types your system has; that should not come as a surprise to the operator.
[Qin]: Entering 2 million alarm types in a YANG file is challenging for a human.
1) You are in deep trouble if your system has millions of alarm types
2) So without enumerating the alarm types in a YANG file (which could be generated from whatever source you have; it does not have to be “manual”), what are you arguing for?
Programmers entering them “manually” as strings in a printf? What is the difference?

There is huge value for operators in the alarm types being known in advance.


There are several benefits of hierarchical identities for alarm types:
- Alarm types can be parsed from YANG modules
- You can reason about “abstract” alarm types
- Extensibility, enterprises and organisations can extend previous identities
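
As a rough illustration of reasoning about “abstract” alarm types, the Python sketch below models an identity hierarchy as a child-to-base table and checks derivation by walking up the tree. The table stands in for what a tool might extract from YANG modules; the acme:* identities are made up, and treating alarm-type-id as the root identity is an assumption.

```python
from typing import Dict, Optional

# Hypothetical identity tree: each identity maps to its base (None for the root).
# Only "alarm-type-id" and "link-alarm" appear in this thread; the rest are invented.
BASES: Dict[str, Optional[str]] = {
    "alarm-type-id": None,
    "link-alarm": "alarm-type-id",
    "acme:link-flapping": "link-alarm",         # hypothetical vendor extension
    "acme:optical-power-low": "alarm-type-id",  # hypothetical vendor extension
}


def derived_from_or_self(identity: str, ancestor: str) -> bool:
    """Return True if identity equals ancestor or derives from it."""
    current: Optional[str] = identity
    while current is not None:
        if current == ancestor:
            return True
        current = BASES[current]  # step up to the base identity
    return False


# A filter on the abstract type "link-alarm" also matches the vendor extension.
link_related = [name for name in BASES if derived_from_or_self(name, "link-alarm")]
```

Here link_related contains link-alarm and acme:link-flapping but not acme:optical-power-low, so an operator can subscribe to an abstract type without knowing every vendor-specific refinement.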

[Qin-2]: I am not against defining alarm types as identities; what I would like to see is additional information that can be used to identify an alarm.
In the 3-tuple, resource is actually the managed object that generates the alarm, so using resource as part of the 3-tuple key to identify each alarm instance has limitations;
that is why we proposed adding alarm-name or alarm-id.



2.  Alarm-name or alarm-serial-no field support for alarm and alarm inventory

Suppose we have alarm-name or alarm-serial-no. I believe it is easier to identify each alarm instance by one field than by the 3-tuple (resource, alarm-type-id, alarm-type-qualifier).

Most importantly, this will simplify operation and management.
I think that
(GigabitEthernet0/15, link-alarm, "")

tells more than:
42

[Qin]: The limitation of the 3-tuple is that when the same alarm identified by (GigabitEthernet0/15, link-alarm, "")
is raised again, (GigabitEthernet0/15, link-alarm, "") cannot be used to distinguish the first raised alarm from the second.
By introducing an unsigned integer alarm-serial-no and a string alarm-name, this issue can be solved.
You are confusing the individual alarm notifications with the alarm itself.
Each notification is available in the status-change list for the alarm
       +--ro status-change* [time]
          +--ro time                    yang:date-and-time

The key being the time of the alarm-notification.
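
The distinction can be sketched in Python as follows. This is only an illustration, assuming an in-memory dictionary stands in for the alarm list; the class names, timestamps, and severity strings are hypothetical. The alarm is keyed by the 3-tuple, and each notification becomes one status-change entry keyed by time.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

# (resource, alarm-type-id, alarm-type-qualifier)
AlarmKey = Tuple[str, str, str]


@dataclass
class Alarm:
    # One entry per notification: time -> severity (values are illustrative).
    status_changes: Dict[str, str] = field(default_factory=dict)


class AlarmList:
    def __init__(self) -> None:
        self.alarms: Dict[AlarmKey, Alarm] = {}

    def notify(self, key: AlarmKey, time: str, severity: str) -> Alarm:
        """Record one notification against the alarm identified by key."""
        alarm = self.alarms.setdefault(key, Alarm())
        alarm.status_changes[time] = severity
        return alarm


alarms = AlarmList()
key = ("GigabitEthernet0/15", "link-alarm", "")
alarms.notify(key, "2018-08-17T10:00:00Z", "major")
alarms.notify(key, "2018-08-17T10:05:00Z", "cleared")
alarms.notify(key, "2018-08-17T11:30:00Z", "major")  # the same alarm raised again
```

After these three notifications the list still holds a single alarm, and the first and second raise remain distinguishable by their time keys in status_changes.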


[Qin-2]: See above; I am not confusing alarm-serial-no with the notification itself.

[Qin]: If you correlate an alarm instance with alarm-name or alarm-serial-no, it will be easier to look up each alarm instance by alarm-name or alarm-serial-no than by the 3-tuple (resource, alarm-type-id, alarm-type-qualifier).
Do not agree...

[Qin-2]: See above:
1. alarmId can be used to distinguish two alarms generated on the same managed object or resource, while the 3-tuple cannot.
2. Adding alarm-name and alarmId helps you manage more alarm types and provides finer-grained alarm control.