Re: [Gen-art] Review: draft-ietf-bmwg-protection-meth-09.txt

Rajiv Papneja <Rajiv.Papneja@huawei.com> Tue, 22 May 2012 18:30 UTC

From: Rajiv Papneja <Rajiv.Papneja@huawei.com>
To: "Joel M. Halpern" <jmh@joelhalpern.com>, "gen-art@ietf.org" <gen-art@ietf.org>
Thread-Topic: [Gen-art] Review: draft-ietf-bmwg-protection-meth-09.txt
Thread-Index: AQHM/j0YcQICZ4zzmkasgx8iN2o5AZbWR6dw
Date: Tue, 22 May 2012 18:28:00 +0000
Message-ID: <52B0D42F4BADB144B850705C4549F6EA017F1B97@dfweml511-mbs.china.huawei.com>
References: <4F5973CE.309@nostrum.com> <4F5A784D.4020401@joelhalpern.com>
In-Reply-To: <4F5A784D.4020401@joelhalpern.com>
Accept-Language: en-US, zh-CN
Content-Language: en-US
Content-Type: multipart/alternative; boundary="_000_52B0D42F4BADB144B850705C4549F6EA017F1B97dfweml511mbschi_"
MIME-Version: 1.0
Cc: "draft-ietf-bmwg-protection-meth.all@tools.ietf.org" <draft-ietf-bmwg-protection-meth.all@tools.ietf.org>, "bmwg@ietf.org" <bmwg@ietf.org>
Subject: Re: [Gen-art] Review: draft-ietf-bmwg-protection-meth-09.txt
Precedence: list

Hello Joel,



Thank you so much for your review and comments. Please see our responses inline. We will be posting the updated draft soon.



Regards,

-Rajiv



-----Original Message-----

From: Joel M. Halpern [mailto:jmh@joelhalpern.com]

Sent: Friday, March 09, 2012 4:38 PM

To: gen-art@ietf.org

Cc: bmwg@ietf.org; draft-ietf-bmwg-protection-meth.all@tools.ietf.org

Subject: [Gen-art] Review: draft-ietf-bmwg-protection-meth-09.txt



I am the assigned Gen-ART reviewer for this draft. For background on

Gen-ART, please see the FAQ at

<http://wiki.tools.ietf.org/area/gen/trac/wiki/GenArtfaq>.



Please resolve these comments along with any other Last Call comments

you may receive.



Document: draft-ietf-bmwg-protection-meth-09.txt

     Methodology for benchmarking MPLS protection mechanisms

Reviewer: Joel M. Halpern

Review Date: 9-March-2012

IETF LC End Date: 20-March-2012

IESG Telechat date: N/A



Summary: This document is almost ready for publication as an

Informational RFC.



Major issues:

     I find a definitions section (Section 3) that says

"This document also uses existing terminology defined in other BMWG work."

and follows this with examples, but neither a complete list of terms nor

a complete list of documents, to be an unclear approach.  If the reader

finds terms they do not know, they have no good indication as to what

document(s) they should read to repair the gap.



Response> Addressed this by removing the examples and adding the references, as follows.

The last line in Section 3 now reads:



"This document also uses existing terminology defined in other BMWG

   Work [Br91], [Ma98], [Po06]"

****************



    The description of the General Reference Topology (section 4) seems

unclear to me.  The text starts out discussing, and the diagram

explicitly shows, a Traffic Generator and a Traffic Analyzer.  In the

diagram these are two disparate devices, connected to routers R1 and R5

respectively.  So far, so good.  However, the text then talks about

"the Tester" as being made up of the Traffic Generator and the Traffic

Analyzer, and describes "the Tester" as being directly connected to the

Device Under Test.  It is exceedingly unclear whether this is supposed

to mean that the full collection of routers R1-R6 are the device under

test, or whether R1, R5, or some other specific router is the device

under test.



Response>

Current : A Tester is directly connected to the DUT.



New text:  A Tester is connected to the test network and depending upon the test case, the DUT may vary.

***************



     Could an effort be made to reword section 5.7?  First it says "one

or more traffic streams".  Then it says "16 flows".  Then it talks about

traffic spreading across some set of prefixes.  And the description of

the reason for not doing round-robin across the prefixes leaves me even

more confused about what one actually should set up.





Response>

Current: At least 16 flows should be used, and more if possible.

New text: For better accuracy, one may even consider provisioning 16 flows, or

      more if possible.



Current: It is suggested that there be one or more traffic streams as long as

   there is a steady and constant rate of flow for all the streams.



New text: It is suggested that there be three or more traffic streams as long as

   there is a steady and constant rate of flow for all the streams.



***********

     Section 5.8 describing the capabilities of "the Tester" seems to

contradict section 4, where "the Tester is comprised of" the Traffic

generator and the Traffic Analyzer.  The capabilities listed in section

5.8 go well beyond that.





Response>



Current: The Tester is comprised of a Traffic Generator (TG) & Test Analyzer (TA)



New : The Tester is comprised of a Traffic Generator (TG), Test Analyzer (TA) and an Emulator

***********



     The 8 scenarios shown in section 6 all have Mid-Point PLRs as far

as I can tell.  Section 7 says that the test it describes can be applied

to all the 8 cases from section 6.  But then it carefully describes

cases of Headend, Mid-Point, and Egress PLR.  But no examples of the

first or third have been shown.  Thus, I do not see how, for example, as

described in section 7.1, one can select a scenario from section 6 and

then establish a headend PLR.



Response>

Added the following in section 6

  UR - Upstream router, MID PLR, HE PLR



Updated the topologies accordingly - for R1 added UR in all setups, for R2 modified the current description to MID PLR, HE PLR



************



    This reviewer would like to verify that the test procedures

described produce a meaningful value for items like Failover Packet Loss

and Failover Time.  Is there a specific reference for these, since the

actual calculations are not described here?



     Finding the definition of the Failover Time calculation methods

hidden in the reporting format (section 8) was quite surprising.  Given

that these are important definitions for the meaning of the tests, they

should occur before the test descriptions, not in the reporting format.





Response> We agree to move the failover time measurements sections from the reporting section to section 5.9 -



Text remains the same.
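
As an illustration of the kind of calculation the reviewer asks about, a loss-derived failover time could be sketched as below. The formula (packets lost divided by a constant offered rate) and every name in the code are assumptions made for illustration; they are not quoted from the draft, whose own definitions should be taken as authoritative.

```python
def loss_based_failover_ms(packets_sent, packets_received, offered_rate_pps):
    """Estimate failover time in milliseconds from packet loss.

    Assumption (not taken from the draft): traffic is offered at a
    constant rate, so every lost packet accounts for 1/rate seconds
    of outage.
    """
    if offered_rate_pps <= 0:
        raise ValueError("offered rate must be positive")
    lost = packets_sent - packets_received
    if lost < 0:
        raise ValueError("received more packets than were sent")
    # Multiply before dividing to keep the arithmetic exact for round inputs.
    return lost * 1000.0 / offered_rate_pps


# Hypothetical run: 500 packets lost at 10,000 packets per second.
estimate_ms = loss_based_failover_ms(1_000_000, 999_500, 10_000)
```

With these hypothetical numbers the estimate is 50 ms. The estimate is only as good as the constant-rate assumption, which is why a steady rate across all streams matters.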



********************



Minor issues:

     As noted by id-nits, section 3 references TERM-ID as a document

defining terminology, but there is no such ID in the list of references.

  And why do the section headers for 5.1, 5.2, and 5.6 also have

"[TERM-ID]"?  Note that even if those section headers are defined terms,

it is stylistically unusual to put the reference into the section header.

(It almost looks like "TERM-ID" was a marker for things which still

needed a proper reference.)





Response> s/[Term-ID]/RFC6414



*********************



     In section 5.1 a set of example failure events is listed.  It is

unclear whether this list is the ones to be tested for, or just "some"

events.  In addition, it is unclear why there is inconsistency in the

coverage of the descriptions of the failures. The three different

monitoring methods are mentioned explicitly with the Interface Shutdown

failures, but are not even mentioned for the other failures.  And then

while most of the failures list local or remote side, the last two

failures do not indicate a side.  Why?



Response> Clarified the significance of the listed failover events. Added remote and local side applicability to Sub-interface failure and Parent interface failure.



*************************************





     Some of the abbreviations in section 6 are unclear.  For example,

since there is no real provider it is not clear which router(s) are

meant by PE as distinct from P routers.  Also, while I am familiar with

Layer3 VPN, I am not familiar with the usage "Layer2 VC".  Further, given

that VPNs have different label usages, I suspect that both "Layer3 VPN"

and "Layer2 VC" are insufficiently specific to match to a specific size

label stack.





     As an example of the above confusion, in the figure in section

6.1.2, the number of labels  in the Layer3VPN from the PE to the P

router is described as going from 2 to 3 upon failure.  The PE->P link

in the diagram is R1->R2, which is upstream of the failure.  So the

number of labels on that link won't change.  The number of labels on the

R6->R3 P->PE link (assuming I have properly guessed what PE is) does go

from 2 to 3.  But the lines refer simply to PE-P.



Response> The following changes have been made:



1.    Added a relationship between the general reference topology and the individual test setups. Suggested that each individual test setup is a subset of a larger topology.

2.    Removed TA/TG.

3.    Added notes clarifying the roles of the routers with respect to P/PE.



The following template is used for all the figures:

               +-------+  +--------+    +--------+
               |  R1   |  |   R2   | PRI|   R3   |
               |  UR/HE|--|  HE/MID|----| MP/TE  |
               |       |  |  PLR   |----|        |
               +-------+  +--------+ BKP+--------+

                             Figure 2.

          Traffic                Num of labels    Num of labels
                                 before failure   after failure
          IP TRAFFIC (P-P)             0                0
          Layer3 VPN (PE-PE)           1                1
          Layer3 VPN (PE-P)            2                2
          Layer2 VC (PE-PE)            1                1
          Layer2 VC (PE-P)             2                2
          Mid-point LSPs               0                0





Note: Please note the following:

For the P-P case, R2 and R3 act as P routers.

For the PE-PE case, R2 acts as a PE and R3 acts as a remote PE.

For the PE-P case, R2 acts as a PE router, R3 acts as a P router, and R5 acts as a

remote PE router (please refer to Figure 1 for the complete setup).

For the Mid-point case, R1, R2 and R3 act as shown in the figure: HE, Midpoint/PLR

and TE respectively.





***********************************



     Similarly, while I suspect that the numbers are accurate, it is

very hard to map the pre-failure label counts to the diagrams in a way

that explains the difference between the numbers in section 6.1.1/6.1.2

and  6.1.3 and onward.  Assuming PE-PE traffic is HE-TE traffic, then

the internal topology should not affect the label count on that.  So you

probably mean something else.  But I don't know what.



     It is unclear what section 7 means by "Select an overlay technology

(e.g. IGP, VPN, or VC)."  Please clarify.



Response> Clarified Section 7.1.1, step B



New text - Select or enable IP, Layer 3 VPN or Layer 2 VPN services with the DUT as the headend PLR



Updated this as necessary



For section 7.1.2 - test setup, "B" is not applicable - removed.



**********



     Why is section 7.1.3 (determining tailend performance) included in

the document, when no test cases include tailend failure?



Response> Removed 7.1.3



************



     Is it an issue that the timestamp based  method for determining the

failover time will, on average, overestimate the failure time by one

inter-packet interval?  (Based on assuming that on average the failure

and recovery are each uniformly distributed across the inter-packet

interval.)



RESPONSE> We suggest two other measurement methods, so it is up to the user to choose a preferred one.
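
The reviewer's estimate can be checked with a small sketch. It assumes, for illustration only, that the timestamp-based figure is the send time of the first packet delivered after recovery minus the send time of the last packet delivered before the failure, and that probes are sent at a fixed interval. The function names and the numbers are hypothetical.

```python
def timestamp_estimate_us(failure_us, outage_us, interval_us):
    """Failover time as inferred from packet timestamps (integer microseconds).

    Assumptions (not from the draft): probes are sent at 0, interval_us,
    2 * interval_us, ...; a probe is lost only if it is sent strictly
    after the failure and strictly before the recovery.
    """
    recovery_us = failure_us + outage_us
    # Send time of the last probe at or before the failure.
    last_before = (failure_us // interval_us) * interval_us
    # Send time of the first probe at or after the recovery (ceiling division).
    first_after = -(-recovery_us // interval_us) * interval_us
    return first_after - last_before


def mean_overestimate_us(outage_us, interval_us):
    """Average of (estimate - true outage) over every failure offset in one interval."""
    errors = [
        timestamp_estimate_us(offset, outage_us, interval_us) - outage_us
        for offset in range(interval_us)
    ]
    return sum(errors) / len(errors)


# Hypothetical case: 1 ms probe interval, true outage of 50.37 ms.
bias_us = mean_overestimate_us(outage_us=50_370, interval_us=1_000)
```

For this hypothetical case the average bias comes out within a microsecond of the 1000 us probe interval, which matches the reviewer's reasoning: about half an interval is added on each side of the outage. A shorter probe interval shrinks the bias proportionally.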



***********





Nits/editorial comments:

     Section 7.1.2 item A refers to 9 scenarios from section 6. There

are only 8.



RESPONSE>  Fixed



*********

     Section 7.4 decides to leave out the number of scenarios from

section 6, leading to a surprising, but otherwise probably meaningless,

difference in wording.



RESPONSE>  Fixed

**************
