Re: [RTG-DIR] Rtgdir last call review of draft-ietf-ccamp-alarm-module-06
"Joel M. Halpern" <jmh@joelhalpern.com> Tue, 15 January 2019 22:38 UTC
To: stefan vallin <stefan@wallan.se>
Cc: rtg-dir@ietf.org, draft-ietf-ccamp-alarm-module.all@ietf.org, "CCAMP (ccamp@ietf.org)" <ccamp@ietf.org>, ietf@ietf.org
References: <154714089885.30812.1684533748546533450@ietfa.amsl.com> <55998B73-A581-4A47-8D23-B88E2607EFC8@wallan.se>
From: "Joel M. Halpern" <jmh@joelhalpern.com>
Message-ID: <68a25f22-5b92-2f7b-9104-0e7e9c580a9b@joelhalpern.com>
Date: Tue, 15 Jan 2019 17:38:20 -0500
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:60.0) Gecko/20100101 Thunderbird/60.4.0
MIME-Version: 1.0
In-Reply-To: <55998B73-A581-4A47-8D23-B88E2607EFC8@wallan.se>
Content-Type: text/plain; charset="utf-8"; format="flowed"
Content-Language: en-US
Content-Transfer-Encoding: 8bit
Archived-At: <https://mailarchive.ietf.org/arch/msg/rtg-dir/TNM3AOXwkJ-l-NrflvRsMUJh-AA>
Subject: Re: [RTG-DIR] Rtgdir last call review of draft-ietf-ccamp-alarm-module-06
Thank you Stefan. Your proposed clarifications sound like they will
address my concerns.

Yours,
Joel

On 1/15/19 4:52 PM, stefan vallin wrote:
> Hi Joel!
> Thanks for your review, really helpful!
> See inline:
>
>> On 10 Jan 2019, at 18:21, Joel Halpern <jmh@joelhalpern.com> wrote:
>>
>> Minor Issues:
>> The first paragraph of section 3.6 (Root Cause, Impacted Resources and
>> Related Alarms) has a confused "not", a missing preposition, and a typoed
>> conjunction, making it very hard to be sure what is intended. I believe
>> the first part of the sentence should read: "The recommendation is to have
>> a single alarm for the underlying problem and list …"
>
> That sentence is really broken in v6, my fat fat fingers. This was also
> pointed out by Gert Grammel.
> It should read:
>
>    The recommendation is to have a single alarm
>    for the underlying problem and list the affected resources in the
>    alarm, rather than having separate alarms for each resource
>
>> There is a larger issue about system behavior and root cause analysis that
>> I think should be discussed in this section. Root cause analysis and
>> side-effect analysis are not simple tasks. It is common for them to be
>> performed outside of network elements. When such is performed outside of a
>> network element, it is unclear what the implications are. Is it the intent
>> that network elements that cannot perform root cause analysis and impacted
>> resource determination should NOT support this YANG module? Or can /
>> should / may they support it even though they cannot perform this
>> analysis? There is a paragraph that seems to be trying to talk about this,
>> but I was left confused about what was expected. Part of my confusion is
>> that the text treats this inability as rare, whereas in my experience for
>> network elements such inability is common.
>
> The module does not mandate any root-cause analysis, impact analysis, or
> correlation capabilities.
> The purpose of this section is to describe optional leafs in the alarms
> that present the result of such analysis, if supported.
> If the system has no such capabilities, the optional leafs are not used
> and this section can be ignored.
> I will make that clear in this section. It would be fatal if readers
> avoided the module on the assumption that it puts requirements on
> correlation.
>
>> It took me a while to realize what the text in 3.7 (and 4.1.1) about not
>> generating notification is talking about. The problem is that with all
>> the effort to make clear that alarms are not notifications, I missed the
>> fact that an alarm being raised (or re-raised) does itself cause a
>> notification. And that it is this re-raise notification (and other
>> severity change, clearing, etc. notifications) that are suppressed by the
>> shelving. It seems to me that there needs to be better explanation of
>> this in or before 3.7.
>
> Ok, will improve description.
>
>> Reading the YANG for shelving alarms, it looks to me that while it can do
>> what is described earlier in the document, the conceptual structure is VERY
>> different. From the YANG, to shelve a specific alarm one has to create a
>> named shelf whose conditions identify the specific alarm. To shelve several
>> alarms that are related (for example, when the operator looks at a list and
>> selects several items to shelve) the system will likely have to create
>> multiple shelves, give each a unique name, and put the different alarm
>> identifiers in each one.
>
> The data-model uses a leaf-list for the resource, which makes it possible
> to define one shelf for several resources.
> However, your comments made us aware that alarm-type and alarm-qualifier
> are just leafs. As you point out, this may lead to situations where you
> need to configure several shelves for shelving different alarms relating
> to the same reason.
> We will change this so that several alarm-types/alarm-qualifiers can also
> be defined for one shelf.
> With this change, any arbitrary group of alarms can be configured as one
> shelf.
> Also, we will make the list ordered-by user, and add to the
> description that the first matching shelf is used.
>
> Thanks for pointing this out.
>
>> To unshelve alarms, one has to find the named
>> shelf which has caused the shelving. This seems very awkward. It seems
>> to have been designed to enable one to store the shelving reason separate
>> from the alarm itself. It introduces the odd effect that if the shelves
>> are used with conditions that can match more than one thing, then one could
>> have several shelves shelving the same alarm, and an effort to unshelve
>> might well not produce the desired result. Assuming that this complexity is
>> desired by the working group, I would ask that it be explicitly called out
>> in the descriptive portions of the document.
>
> See above; with the proposed change, it will always be one shelf.
> Finding the shelf by its shelf name should not be awkward.
> Note well that it is likely that there are several alarms that are shelved
> due to the same shelf configuration.
> Take the straightforward shelving of all alarms from a specific interface.
> Different alarms from that interface will then be shelved, and it is
> straightforward to delete the shelf configuration that says, “if X/Y/Z
> under test”.
> This is an important feature of shelving.
>
> See above about the change to ordered-by user; this will address your
> issue of several shelves addressing the same alarm.
> Again, thanks for your review making us aware of this issue!
>
>> Nits:
>> In section 4.4 (overview of The Alarm List) tree showing the components
>> of the purge-alarm operation, is there any way to make clear that the
>> enumeration called alarm-status is the enumeration of filter choices
>> related to whether the alarm is cleared? Maybe rename it
>> alarm-cleared-filter?
>
> Ok, will consider this
>
> br Stefan and Martin
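
The first-match behavior proposed above (an ordered-by user list of shelves, each able to name several resources and several alarm-type/alarm-qualifier pairs) can be sketched as follows. This is only a minimal illustration, not the module's definition: the names Alarm, Shelf, and find_shelf are hypothetical, and treating an empty criterion list as "matches anything" is an assumption made for the sketch.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Alarm:
    """Hypothetical alarm identity: resource plus type and qualifier."""
    resource: str
    alarm_type: str
    qualifier: str = ""


@dataclass
class Shelf:
    """Hypothetical shelf: a name plus lists of match criteria."""
    name: str
    resources: List[str] = field(default_factory=list)
    alarm_types: List[Tuple[str, str]] = field(default_factory=list)

    def matches(self, alarm: Alarm) -> bool:
        # Assumption: an empty criterion list places no restriction.
        resource_ok = not self.resources or alarm.resource in self.resources
        type_ok = (not self.alarm_types
                   or (alarm.alarm_type, alarm.qualifier) in self.alarm_types)
        return resource_ok and type_ok


def find_shelf(shelves: List[Shelf], alarm: Alarm) -> Optional[Shelf]:
    """Return the first shelf, in user-defined order, that matches the alarm."""
    for shelf in shelves:
        if shelf.matches(alarm):
            return shelf
    return None


# Order matters: the list is walked front to back.
shelves = [
    Shelf("if-under-test", resources=["X/Y/Z"]),
    Shelf("link-alarms", alarm_types=[("link-down", "")]),
]

hit = find_shelf(shelves, Alarm("X/Y/Z", "link-down"))
print(hit.name if hit else None)
```

Because exactly one shelf is selected for any alarm, unshelving means changing or deleting that one named shelf, even when a later shelf in the list would also have matched.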