Re: [Gen-art] Gen-art telechat review of draft-ietf-i2rs-traceability-09

Elwyn Davies <elwynd@dial.pipex.com> Tue, 10 May 2016 21:51 UTC

To: Joe Clarke <jclarke@cisco.com>, General area reviewing team <gen-art@ietf.org>
References: <572B4D51.10003@dial.pipex.com> <e8a0bf59-f1ea-3710-9b3b-2820ea1ef64b@cisco.com>
From: Elwyn Davies <elwynd@dial.pipex.com>
Message-ID: <d8db28ea-3560-ecd5-d4a4-4a8070b07af7@dial.pipex.com>
Date: Tue, 10 May 2016 22:51:09 +0100
Archived-At: <http://mailarchive.ietf.org/arch/msg/gen-art/wb13Atd2ZHQTMnxSKF6PMWXamIY>
Cc: draft-ietf-i2rs-traceability.all@ietf.org

Thanks for the rapid update.  This looks to be making good progress.

A couple of nits and comments:
s1, para 2: s/describes use cases/describe use cases/

s5.2, Event ID:
> An event can be a Client authenticating with the Agent, a Client to 
> Agent operation, or a Client disconnecting from an Agent.
This is a good thing, but I am not sure that the format provides a way 
to identify the authentication and disconnection events.

s5.2, Starting Timestamp:
[I don't understand 'three points of prevision'.] Maybe...
OLD:
Given that many I2RS operations can occur in rapid succession, the use 
of fractional seconds MUST be used to provide adequate granularity.  
Fractional seconds SHOULD be expressed with at least three points of 
prevision in second.microsecond format.
NEW:
Given that many I2RS operations can occur in rapid succession, the 
fractional seconds element of the timestamp MUST be used to provide 
adequate granularity.  Fractional seconds SHOULD be expressed with at 
least three [or more?] significant digits in second.microsecond format.
END
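
To illustrate the suggested wording, here is a minimal sketch of producing an RFC 3339 timestamp whose fractional seconds carry at least three digits. The function name `rfc3339_timestamp` and its `digits` parameter are hypothetical and not taken from the draft. The sketch reads "three significant digits" as three digits after the decimal point, which is an assumption, and it truncates rather than rounds.

```python
from datetime import datetime, timezone


def rfc3339_timestamp(moment: datetime, digits: int = 6) -> str:
    """Format an aware datetime as an RFC 3339 UTC timestamp.

    `digits` is the number of fractional-second digits to keep (3 to 6).
    Extra digits are truncated, not rounded.
    """
    if not 3 <= digits <= 6:
        raise ValueError("digits must be between 3 and 6")
    if moment.tzinfo is None:
        raise ValueError("moment must be timezone-aware")
    utc = moment.astimezone(timezone.utc)
    # Zero-pad microseconds to six digits, then keep the leading `digits`.
    fraction = f"{utc.microsecond:06d}"[:digits]
    return f"{utc.strftime('%Y-%m-%dT%H:%M:%S')}.{fraction}Z"
```

Including the full date, as in the s6 examples, keeps entries comparable when a log runs over more than one day.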

s5.2, Ending Timestamp:
See the comments on the Starting Timestamp - though I think you could 
just refer to the words in the Starting Timestamp and avoid duplication.

s7.4/s7.4.3: Given that the I2RS pub-sub access method is 
mandatory-to-implement, I think I-D.ietf-i2rs-pub-sub-requirements has 
to be a Normative Reference.

Regards,
Elwyn


On 06/05/2016 21:27, Joe Clarke wrote:
> Thank you for your review, Elwyn.  You raise some excellent points.
>
> I'm top-posting here as I think we've addressed all of your concerns. 
> We're still reviewing the new text among ourselves, but I wanted to 
> show you the SxS diffs to get your take.  Some of the changes 
> incorporate other AD/IESG comments as well.
>
> http://www.marcuscom.com/draft-ietf-i2rs-traceability.txt-from-09-10.diff.html 
>
>
> Joe
>
> On 5/5/16 09:40, Elwyn Davies wrote:
>> Possibly Major issues:
>> Trace model:  The tracing model seems to be a curious hybrid of state
>> recording and event logging.  The introduction seems to imply that the
>> tracing model records events.  Indeed it does but state entry events do
>> not appear to get recorded until the sequence transitions out of the
>> state.  I can see that the COMPLETED entries record the total processing
>> period, but this loses the detail of when actual processing of the event
>> starts (as opposed to becoming PENDING).  I was somewhat surprised that
>> a simple chained transition event model was not used (especially since
>> the tracing entries are actually chained together already).
>>
>> In particular if some sort of disaster occurs, it seems possible in this
>> model that events in the PENDING queue might never appear in the trace
>> log at all if the request hasn't started being processed. It also
>> doesn't record any preprocessing time before the request becomes
>> PENDING.  If there is a processing bottleneck this could be significant
>> information.
>>
>> I was also wondering whether this model traces the arrival and departure
>> of clients (and whether authentication/authorisation worked or not).
>> This may be covered by operation types in the architecture which I
>> haven't had time to read in detail.
>>
>> Minor issues:
>>
>> Nits/editorial comments:
>> s1:  The Intro should also contain a description of the intention of the
>> document - basically a slight reworking of the abstract.  It should also
>> outline the association of the framework with the interface (i2rs
>> client<->agent) to which the traceability applies.
>>
>> s3:
>>> The
>>>     ability to automate and abstract even complex policy-based controls
>>>     highlights the need for an equally scalable traceability function to
>>>     provide event-level granularity of the routing system compliant with
>>>     the requirements of I2RS (Section 5 of
>>>     [I-D.ietf-i2rs-problem-statement]).
>> The 'routing system' doesn't have an event-level granularity. Maybe
>> OLD:
>> provide event-level granularity of the routing system
>> NEW:
>> provide recording at event-level granularity of the evolution of the
>> routing system
>> END
>>
>> s4:  The section ends with this list of 'use cases':
>>>     As I2RS becomes increasingly pervasive in routing environments, a
>>>     traceability model offers significant advantages and facilitates the
>>>     following use cases:
>>>
>>>     1  Automated event correlation, trend analysis, and anomaly
>>>        detection;
>>>
>>>     2  Trace log storage for offline (manual or tools) analysis;
>>>
>>>     3  Improved accounting of routing system operations;
>>>
>>>     4  Standardized structured data format for writing common tools;
>>>
>>>     5  Common reference for automated testing and incident reporting;
>>>
>>>     6  Real-time monitoring and troubleshooting;
>>>
>>>     7  Enhanced network audit, management and forensic analysis
>>>        capabilities.
>> I have added numbers to facilitate these comments:
>> IMO #2 and #4 are either not use cases or are not phrased as use cases.
>> The automated testing is not really a use case as such. Having these
>> characteristics supports the implementation of the actual use cases.
>> Related to the data retention comment above, storing some or all of the
>> trace log - and knowing which bits might be critical to control data
>> retention - is a use case but the basic storage is just a necessary
>> prerequisite of doing other things.  I also might suggest a reordering
>> indicating importance perhaps.
>>
>> Thus I would suggest replacing this with something like:
>>
>>    As I2RS becomes increasingly pervasive in routing environments, a
>>    traceability model that supports controllable trace log retention
>>    using a standardized structured data format offers significant advantages,
>>    such as the ability to create common tools and support automated testing,
>>    and facilitates the following use cases:
>>
>>    o  Real-time monitoring and troubleshooting of router events;
>>
>>    o  Automated event correlation, trend analysis, and anomaly
>>       detection;
>>
>>    o  Offline (manual or tools-based) analysis of router state evolution
>>        from the retained trace logs;
>>
>>    o  Enhanced network audit, management and forensic analysis
>>        capabilities;
>>
>>    o  Improved accounting of routing system operations; and
>>
>>    o  Providing a standardized format for incident reporting and test
>>       logging.
>>
>> s5: This section is empty, and empty sections are not desirable.  A brief overview of
>> the following sub-sections should be added (or alternatively promote
>> s5.1 which actually describes the framework).
>>
>> s5.1, para 1:
>>> Some notable elements of the architecture are in
>>>     this section.
>> I don't understand this sentence.  If it implies that elements of the
>> architecture are defined in this section then one has to ask 'Why aren't
>> they defined in the architecture document?'  Since s5.1 contains the
>> whole framework, what other elements than the 'some notable' ones are
>> there?
>>
>> s5.1, para 2: The term 'northbound' is not defined (and isn't used in
>> the architecture).
>>
>> s5.2: The title is 'I2RS Trace Log Mandatory Fields' - nothing that
>> isn't mandatory is discussed.  Should there be some words about optional
>> extra fields?
>>
>> s5.2, timestamps:  The RFC3339 format doesn't tie up with 32 bit
>> resolution - there are hours and minutes etc and decimal representation
>> is used.  Things like the origin for timestamps need to be defined if they
>> are to be truly useful for comparison outside an individual enterprise
>> (as might be implied by the incident reporting use case).  If RFC 3339
>> format is really used, then the timestamps need to include the date as
>> well since logs will certainly run over more than one day.  I note that
>> the example in s6 shows full RFC 3339 date/time format examples.
>>
>> s5.2, Applied Operation Data:  Does the Operation Data Present flag
>> apply to this field?  Can this be present even if there is no Requested
>> Operation Data?
>>
>> s5.2, Result Code: Need to expand acronym RIB.
>>
>> s7.2:  One key point about timestamping (motivated by bitter experience)
>> is that timestamps need to be recorded at the point when the event
>> actually happens and not when the event is (potentially significantly
>> later) entered into the log.  Logging is (as indicated) often allocated
>> a low priority and event log writing may end up being postponed for a
>> considerable time.
>>
>> s11: I would consider I-D.ietf-i2rs-problem-statement and
>> I-D.ietf-i2rs-pub-sub-requirements to be Informative; and
>> I-D.ietf-i2rs-rib-info-model, RFC 3339 and possibly RFC 5424 to be
>> normative.
>
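
As an illustration of the s7.2 point in the quoted review (timestamps need to be taken when the event happens, not when the log entry is eventually written), here is a minimal sketch. The names `TraceEvent` and `DeferredTraceLog` are hypothetical and are not part of the draft or of any I2RS implementation. The only idea shown is that the timestamp is captured in `record()` and carried through to a later, low-priority `flush()`.

```python
import queue
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class TraceEvent:
    event_id: int
    occurred_at: datetime  # captured when the event happens
    description: str


class DeferredTraceLog:
    """Hypothetical log whose writes may be postponed at low priority."""

    def __init__(self):
        self._pending = queue.Queue()
        self.entries = []

    def record(self, event_id, description, now=None):
        # Take the timestamp here, at event time, not inside flush().
        occurred_at = now if now is not None else datetime.now(timezone.utc)
        event = TraceEvent(event_id, occurred_at, description)
        self._pending.put(event)
        return event

    def flush(self):
        # May run much later; the stored event-time timestamp is reused.
        written = 0
        while True:
            try:
                event = self._pending.get_nowait()
            except queue.Empty:
                return written
            self.entries.append(
                f"{event.occurred_at.isoformat()} {event.event_id} {event.description}"
            )
            written += 1
```

However long `flush()` is postponed, each entry carries the time at which `record()` was called.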