[Gen-art] Review: draft-ietf-bmwg-protection-meth-09.txt

"Joel M. Halpern" <jmh@joelhalpern.com> Fri, 09 March 2012 21:38 UTC

Message-ID: <4F5A784D.4020401@joelhalpern.com>
Date: Fri, 09 Mar 2012 16:38:21 -0500
From: "Joel M. Halpern" <jmh@joelhalpern.com>
To: gen-art@ietf.org
References: <4F5973CE.309@nostrum.com>
In-Reply-To: <4F5973CE.309@nostrum.com>
Cc: draft-ietf-bmwg-protection-meth.all@tools.ietf.org, bmwg@ietf.org

I am the assigned Gen-ART reviewer for this draft. For background on 
Gen-ART, please see the FAQ at 

Please resolve these comments along with any other Last Call comments 
you may receive.

Document: draft-ietf-bmwg-protection-meth-09.txt
     Methodology for benchmarking MPLS protection mechanisms
Reviewer: Joel M. Halpern
Review Date: 9-March-2012
IETF LC End Date: 20-March-2012
IESG Telechat date: N/A

Summary: This document is almost ready for publication as an 
Informational RFC.

Major issues:
     The definitions section (Section 3) says
"This document also uses existing terminology defined in other BMWG work."
and follows this with examples, but gives neither a complete list of
terms nor a complete list of documents.  I find this approach unclear.
If the reader finds terms they do not know, they have no good indication
as to what document(s) they should read to fill the gap.

    The description of the General Reference Topology (section 4) seems 
unclear to me.  The text starts out discussing, and the diagram 
explicitly shows, a Traffic Generator and a Traffic Analyzer.  In the 
diagram these are two disparate devices, connected to routers R1 and R5 
respectively.  So far, so good.  However, the text then talks about 
"the Tester" as being made up of the Traffic Generator and the Traffic 
Analyzer, and describes "the Tester" as being directly connected to the 
Device Under Test.  It is exceedingly unclear whether this is supposed 
to mean that the full collection of routers R1-R6 are the device under 
test, or whether R1, R5, or some other specific router is the device 
under test.

     Could an effort be made to reword section 5.7?  First it says "one 
or more traffic streams".  Then it says "16 flows".  Then it talks about 
traffic spreading across some set of prefixes.  And the description of 
the reason for not doing round-robin across the prefixes leaves me even 
more confused about what one actually should set up.

     Section 5.8 describing the capabilities of "the Tester" seems to 
contradict section 4, where "the Tester is comprised of" the Traffic 
generator and the Traffic Analyzer.  The capabilities listed in section 
5.8 go well beyond that.

     The 8 scenarios shown in section 6 all have Mid-Point PLRs as far 
as I can tell.  Section 7 says that the test it describes can be applied 
to all the 8 cases from section 6.  But then it carefully describes 
cases of Headend, Mid-Point, and Egress PLR, even though no examples of 
the first or third have been shown.  Thus, I do not see how, as 
described in section 7.1, one can select a scenario from section 6 and 
then establish a headend PLR.

    This reviewer would like to verify that the test procedures 
described produce a meaningful value for items like Failover Packet Loss 
and Failover Time.  Is there a specific reference for these, since the 
actual calculations are not described here?

     Finding the definition of the Failover Time calculation methods 
hidden in the reporting format (section 8) was quite surprising.  Given 
that these are important definitions for the meaning of the tests, they 
should occur before the test descriptions, not in the reporting format.

Minor issues:
     As noted by id-nits, section 3 references TERM-ID as a document 
defining terminology, but there is no such ID in the list of references. 
  And why do the section headers for 5.1, 5.2, and 5.6 also have 
"[TERM-ID]"?  Note that even if those section headers are defined terms, 
it is stylistically unusual to put the reference into the section header.
(It almost looks like "TERM-ID" was a marker for things which still 
needed a proper reference.)

     In section 5.1 a set of example failure events is listed.  It is 
unclear whether this is the list of events to be tested, or just "some" 
events.  In addition, it is unclear why the descriptions of the failures 
are inconsistent in their coverage.  The three different monitoring 
methods are mentioned explicitly with the Interface Shutdown failures, 
but are not mentioned at all for the other failures.  And while most of 
the failures list local or remote side, the last two failures do not 
indicate a side.  Why?

     Some of the abbreviations in section 6 are unclear.  For example, 
since there is no real provider it is not clear which router(s) are 
meant by PE as distinct from P routers.  Also, while I am familiar with 
Layer3 VPN, I am not familiar with the usage "Layer2 VC".  Further, given 
that VPNs have different label usages, I suspect that both "Layer3 VPN" 
and "Layer2 VC" are insufficiently specific to match to a specific size 
label stack.
     As an example of the above confusion, in the figure in section 
6.1.2, the number of labels in the Layer3VPN from the PE to the P 
router is described as going from 2 to 3 upon failure.  The PE->P link 
in the diagram is R1->R2, which is upstream of the failure.  So the 
number of labels on that link won't change.  The number of labels on the 
R6->R3 P->PE link (assuming I have properly guessed what PE is) does go 
from 2 to 3.  But the lines refer simply to PE-P.
     Similarly, while I suspect that the numbers are accurate, it is 
very hard to map the pre-failure label counts to the diagrams in a way 
that explains the difference between the numbers in section 6.1.1/6.1.2 
and 6.1.3 and onward.  Assuming PE-PE traffic is HE-TE traffic, then 
the internal topology should not affect the label count on that.  So you 
probably mean something else.  But I don't know what.

     It is unclear what section 7 means by "Select an overlay technology 
(e.g. IGP, VPN, or VC)."  Please clarify.

     Why is section 7.1.3 (determining tailend performance) included in 
the document, when no test cases include tailend failure?

     Is it an issue that the timestamp-based method for determining the 
failover time will, on average, overestimate the failover time by one 
inter-packet interval?  (This assumes that the failure and the recovery 
are each uniformly distributed across the inter-packet interval.)

Nits/editorial comments:
     Section 7.1.2 item A refers to 9 scenarios from section 6. There 
are only 8.
     Section 7.4 decides to leave out the number of scenarios from 
section 6, leading to a surprising, but otherwise probably meaningless, 
difference in wording.