RE: draft-ietf-rtgwg-spf-uloop-pb-statement

<stephane.litkowski@orange.com> Wed, 23 May 2018 10:16 UTC

From: stephane.litkowski@orange.com
To: Chris Bowers <chrisbowers.ietf@gmail.com>, "draft-ietf-rtgwg-spf-uloop-pb-statement@ietf.org" <draft-ietf-rtgwg-spf-uloop-pb-statement@ietf.org>, RTGWG <rtgwg@ietf.org>
Subject: RE: draft-ietf-rtgwg-spf-uloop-pb-statement
Date: Wed, 23 May 2018 10:15:52 +0000
Archived-At: <https://mailarchive.ietf.org/arch/msg/rtgwg/66N8EYYKyVjG23Ao1HlsPKqDLOA>

Hi Chris,

I have uploaded a new revision. Let me know if it correctly addresses your comments.

Brgds,


From: Chris Bowers [mailto:chrisbowers.ietf@gmail.com]
Sent: Monday, April 16, 2018 22:02
To: draft-ietf-rtgwg-spf-uloop-pb-statement@ietf.org; RTGWG
Subject: draft-ietf-rtgwg-spf-uloop-pb-statement

As part of doing the shepherd write-up for this document, I did a review of the draft.

My comments are shown below as a diff on draft-ietf-rtgwg-spf-uloop-pb-statement-06.txt.

They can also be viewed at:
https://github.com/cbowers/outgoing-feedback-on-ietf-drafts-2018/commit/c1c5018f857e9c7c0f4123c3de1e87041178e387

Thanks,
Chris

=============

diff --git a/draft-ietf-rtgwg-spf-uloop-pb-statement-06.txt b/draft-ietf-rtgwg-spf-uloop-pb-statement-06.txt
index 353ce3c..3dff746 100644
--- a/draft-ietf-rtgwg-spf-uloop-pb-statement-06.txt
+++ b/draft-ietf-rtgwg-spf-uloop-pb-statement-06.txt
@@ -21,7 +21,16 @@ Abstract

    In this document, we are trying to analyze the impact of using
    different Link State IGP implementations in a single network in
-   regards of micro-loops.  The analysis is focused on the SPF triggers
+   regards of micro-loops.
+
+=======
+[CB]
+   In this document, we are trying to analyze the impact of using
+   different Link State IGP implementations in a single network, with
+   respect to micro-loops.
+
+========
+   The analysis is focused on the SPF triggers
    and SPF delay algorithm.

 Requirements Language
@@ -95,13 +104,39 @@ Table of Contents
    Link State IGP protocols are based on a topology database on which an
    SPF (Shortest Path First) algorithm like Dijkstra is implemented to
    find the optimal routing paths.
-
+
+   =====
+  [CB] proposed modified text since the Shortest Path First algorithm and
+   Dijkstra's algorithm are essentially synonymous.  Also propose to use
+   "consistent set of non-looping routing paths", since shortest path routing
+   is often not optimal from a traffic engineering perspective.
+
+   [proposed text]
+   Link State IGP protocols are based on a topology database on which the
+   SPF (Shortest Path First) algorithm is run to
+   find a consistent set of non-looping routing paths.
+
+   =====
+
    Specifications like IS-IS ([RFC1195]) propose some optimizations of
    the route computation (See Appendix C.1) but not all the
    implementations are following those not mandatory optimizations.

+============
+[CB]  [proposed text]
+but not all implementations follow those non-mandatory
+optimizations.
+=============
+
    We will call "SPF trigger", the events that would lead to a new SPF
    computation based on the topology.
+
+============
+[CB]  [proposed text]
+   We will call "SPF triggers", the events that would lead to a new SPF
+   computation based on the topology.
+=============
+

    Link State IGP protocols, like OSPF ([RFC2328]) and IS-IS
    ([RFC1195]), are using multiple timers to control the router behavior
@@ -118,11 +153,27 @@ Internet-Draft                spf-microloop                 January 2018

    Some of those timers are standardized in protocol specification, some
    are not especially the SPF computation related timers.
+
+============
+[CB] [proposed text]
+   Some of those timers are standardized in protocol specification, while some
+   are not.  The SPF computation related timers have generally remained
+   unspecified.
+=============

    For non standardized timers, implementations are free to implement it
    in any way.  For some standardized timer, we can also see that rather
    than using static configurable values for such timer, implementations
    may offer dynamically adjusted timers to help controlling the churn.
+
+============
+[CB] In the discussion above, it is unclear what "timer" means.
+Is it the numerical value of a timer?  Is it the trigger conditions and logic
+for a timer to start or be reset?  Is it the action taken when the timer expires?
+Perhaps the text could be clarified by referring to "timer behavior" and "timer values".
+
+=============
+

    We will call "SPF delay", the timer that exists in most
    implementations that specifies the required delay before running SPF
@@ -138,6 +189,17 @@ Internet-Draft                spf-microloop                 January 2018
    Some micro-loop mitigation techniques have been defined by IETF (e.g.
    [RFC6976], [I-D.ietf-rtgwg-uloop-delay]) but are not implemented due
    to complexity or are not providing a complete mitigation.
+
+==========
+[CB]
+This paragraph needs to be clearer.
+[proposed text]
+   Two micro-loop mitigation techniques have been defined by the IETF.
+   [RFC6976] has not been widely implemented, presumably due to the complexity
+   of the technique.  [I-D.ietf-rtgwg-uloop-delay] has been implemented.
+   However, it does not prevent all micro-loops that can occur
+   for a given topology and failure scenario.
+==========

    In multi-vendor networks, using different implementations of a link
    state protocol may favor micro-loops creation during the convergence
@@ -185,17 +247,24 @@ Internet-Draft                spf-microloop                 January 2018
    will forward the traffic to C through B, but as B as not converged
    yet, B will loop back traffic to A, leading to a micro-loop.

+========
+[CB]
+Figure 1 and Figure 4 show essentially the same topology, but the nodes
+have different names.  I think it would be much better for the reader of this
+document to consolidate the two figures into a single figure.
+========
+
    The micro-loop appears due to the asynchronous convergence of nodes
    in a network when an event occurs.

-   Multiple factors (and combination of these factors) may increase the
+   Multiple factors (or a combination of these factors) may increase the
    probability for a micro-loop to appear:

    o  the delay of failure notification: the more B is advised of the
       failure later than A, the more a micro-loop may have a chance to
       appear.

-   o  the SPF delay: most of the implementations supports a delay for
+   o  the SPF delay: most implementations support a delay for
       the SPF computation to try to catch as many events as possible.
       If A uses an SPF delay timer of x msec and B uses an SPF delay
       timer of y msec and x < y, B would start converging after A
@@ -204,8 +273,8 @@ Internet-Draft                spf-microloop                 January 2018
    o  the SPF computation time: mostly a matter of CPU power and
       optimizations like incremental SPF.  If A computes its SPF faster
       than B, there is a chance for a micro-loop to appear.  CPUs are
-      today faster enough to consider SPF computation time as
-      negligeable (order of msec in a large network).
+      today fast enough to consider SPF computation time as
+      negligible (on the order of milliseconds in a large network).

    o  the SPF computation order: an SPF trigger can be common to
       multiple IGP areas or levels (e.g., IS-IS Level1/Level2) or for
@@ -215,8 +284,8 @@ Internet-Draft                spf-microloop                 January 2018
       done in A and B for each area/level/topology/SPF-algorithm is
       different, there is a possibility for a micro-loop to appear.

-   o  the RIB and FIB prefix insertion speed or ordering: highly
-      implementation dependant.
+   o  the RIB and FIB prefix insertion speed or ordering.  This is highly
+      dependent on the implementation.



@@ -225,22 +294,21 @@ Litkowski, et al.         Expires July 28, 2018                 [Page 4]
 Internet-Draft                spf-microloop                 January 2018


-   This document will focus on analysis SPF delay (and associated
-   triggers).
+   This document will focus on analysis of the SPF delay behavior and the associated
+   triggers.

 3.  SPF trigger strategies

-   Depending of the change advertised in LSP/LSA, the topology may be
+   Depending on the change advertised in an LSPDU or LSA, the topology may be
    affected or not.  An implementation may avoid running the SPF
    computation (and may only run IP reachability computation instead) if
-   the advertised change is not affecting topology.
+   the advertised change does not affect the topology.

    Different strategies exists to trigger the SPF computation:

-   1.  An implementation may always run a full SPF whatever the change
-       to process.
+   1.  An implementation may always run a full SPF for any type of change.

-   2.  An implementation may run a full SPF only when required: e.g. if
+   2.  An implementation may run a full SPF only when required.  For example, if
        a link fails, a local node will run an SPF for its local LSP
        update.  If the LSP from the neighbor (describing the same
        failure) is received after SPF has started, the local node can
@@ -250,26 +318,28 @@ Internet-Draft                spf-microloop                 January 2018
    3.  If the topology does not change, an implementation may only
        recompute the IP reachability.

-   As pointed in Section 1, SPF optimizations are not mandatory in
-   specifications, leading to multiple strategies to be implemented.
+   As noted in Section 1, SPF optimizations are not mandatory in
+   specifications.  This has led to the implementation of
+   different strategies.

 4.  SPF delay strategies

    Implementations of link state routing protocols use different
-   strategies to delay the SPF computation.  We usually see the
-   following:
+   strategies to delay the SPF computation.  The two most
+   common SPF delay behaviors are the following.

-   1.  Two steps delay.
+   1.  Two phase delay.

    2.  Exponential backoff delay.

-   Those behavior will be explained in the next sections.
+   These behaviors are described in the following sections.

-4.1.  Two steps SPF delay
+4.1.  Two phase SPF delay

-   The SPF delay is managed by four parameters:
+   For the two phase SPF delay, the SPF delay is managed by four parameters:

-   o  Rapid delay: amount of time to wait before running SPF.
+   o  Rapid delay: amount of time to wait before running SPF, after the
+   initial SPF trigger event.



@@ -281,13 +351,13 @@ Litkowski, et al.         Expires July 28, 2018                 [Page 5]
 Internet-Draft                spf-microloop                 January 2018


-   o  Rapid runs: amount of consecutive SPF runs that can use the rapid
-      delay.  When the amount is exceeded the delay moves to the slow
+   o  Rapid runs: the number of consecutive SPF runs that can use the rapid
+      delay.  When the number is exceeded, the delay moves to the slow
       delay value .

    o  Slow delay: amount of time to wait before running SPF.

-   o  Wait time: amount of time to wait without events before going back
+   o  Wait time: amount of time to wait without receiving SPF trigger events before going back
       to the rapid delay.

    Example: Rapid delay = 50msec, Rapid runs = 3, Slow delay = 1sec,
@@ -308,7 +378,9 @@ Internet-Draft                spf-microloop                 January 2018
            |  |   |  | || |            |
                            < wait time >

-                   Figure 2 - Two steps delay algorithm
+                   Figure 2 - Two phase delay algorithm
+
+

 4.2.  Exponential backoff

@@ -394,13 +466,20 @@ Internet-Draft                spf-microloop                 January 2018


    for delaying PRC.  We consider that E is using a SPF trigger strategy
-   that always compute Full SPF and exponential backoff strategy for SPF
+   that always computes a Full SPF for any change, and uses the exponential backoff strategy for SPF
    delay (start=150ms, inc=150ms, max=1s)

    We also consider the following sequence of events (note : the time
    scale does not intend to represent a real router time scale where
    jitters are introduced to all timers) :

+==========
+[CB]
+This note about jitter and time scale (or timeline) is not clear.  I suggest describing
+it in more detail or deleting it.
+==========
+
+
    o  t0=0 ms: a prefix is declared down in the network.  We consider
       this event to happen at time=0.

@@ -487,12 +566,12 @@ Internet-Draft                spf-microloop                 January 2018
                     Route computation event time scale

    In the table above, we can see that due to discrepancies in the SPF
-   management, after multiple events (of a different type), the values
-   of the SPF delay are completely misaligned between nodes leading to
-   long micro-loops creation.
+   management, after multiple events of a different type, the values
+   of the SPF delay are completely misaligned between node S and node E,
+   leading to the creation of micro-loops.

-   The same issue can also appear with only single type of events as
-   displayed below:
+   The same issue can also appear with only a single type of event as
+   shown below:

    +--------+--------------------+------------------+------------------+
    |  Time  |   Network Event    | Router S events  | Router E events  |
@@ -587,6 +666,28 @@ Internet-Draft                spf-microloop                 January 2018

 6.  Proposed work items

+===============
+[CB]
+Since we are publishing this document after the SPF backoff algorithm
+draft is published, I think the list of three proposed work items below will be
+confusing.  Someone reading this RFC will wonder why the
+SPF backoff algorithm RFC (which will have an earlier RFC number)
+doesn't satisfy the list of proposed work items.
+
+Perhaps this section should be renamed something like
+"Benefits of standardized SPF delay behavior", and the list of proposed
+work items should be removed.
+
+It may also make sense to explicitly say that the
+SPF backoff algorithm draft/RFC is a solution that
+satisfies this problem statement.
+And that we are publishing the document in order to
+capture the reasoning that led to that draft.  Text to this
+effect should probably go in the introduction, instead
+of this section.
+
+===============
+
    In order to enhance the current Link State IGP behavior, authors
    would encourage working on standardization of some behaviours.

@@ -603,14 +704,23 @@ Internet-Draft                spf-microloop                 January 2018

    Using the same event sequence as in figure 2, we may expect fewer
    and/or shorter micro-loops using standardized implementations.
+
+===========
+[CB] I think the text should refer to one of the previous tables and not Figure 2.
+Figure 2 shows the two step delay algorithm.
+===========

    +--------+--------------------+------------------+------------------+
    |  Time  |   Network Event    | Router S events  | Router E events  |
    +--------+--------------------+------------------+------------------+
    |  t0=0  |    Prefix DOWN     |                  |                  |
    |  10ms  |                    | Schedule PRC (in | Schedule SPF (in |
-
-
+
+===========
+[CB]
+It seems like there is a typo here.  Presumably router E should schedule a
+PRC (not an SPF) at 10ms in this table.
+===========

 Litkowski, et al.         Expires July 28, 2018                [Page 11]
 ^L
@@ -677,13 +787,48 @@ Internet-Draft                spf-microloop                 January 2018
    +--------+--------------------+------------------+------------------+

                     Route computation event time scale
-
+
+=============
+[CB]
+I think the term "time scale" throughout this document is not the right one.
+Perhaps the term "timeline" would be better or the phrase "sequence of events".
+=============
+[CB]
+There are several different tables with the same caption
+"Route computation event time scale".
+Regardless of the replacement term for "time scale", it would be helpful to make a
+distinction between the tables with each caption.  For example, this last
+table could have a caption like "Route computation when S and E use the
+same standardized behavior".
+
+==========
    As displayed above, there could be some other parameters like router
    computation power, flooding timers that may also influence micro-
    loops.  In Figure 4, we consider E to be a bit slower than S, leading
-   to micro-loop creation.  Despite of this, we expect that by aligning
+   to micro-loop creation.
+
+=================
+[CB]
+There is nothing in Figure 4 that shows that E is slower than S.
+Perhaps it would be clearer to say something like:
+"In all of the
+examples in this document comparing the SPF timer behavior of
+router S and router E, we have made router E a bit slower than
+router S.  This can lead to micro-loops even when both S and E use
+a common standardized SPF behavior."
+=================
+
+
+   Despite of this, we expect that by aligning
    implementations at least on SPF trigger and SPF delay, service
    provider may reduce the number and the duration of micro-loops.
+===================
+[CB]
+"Despite of this" should read "In spite of this" or "Despite this".
+Or in this case "However" might be better.
+
+s/service provider/service providers/
+==================

 7.  Security Considerations
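
=============

As an aside for readers of this thread, the two phase SPF delay described
in section 4.1 of the diff above can be sketched roughly as follows.  This
is an illustration only and is not taken from the draft or from any
implementation.  The class and method names are hypothetical.  The sketch
assumes that a trigger arriving while an SPF run is already pending is
absorbed into that run, and it uses a wait time of 2000 ms purely as an
assumed value, since the wait time of the draft's example is not visible in
this diff.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TwoPhaseSpfDelay:
    """Hypothetical model of the two phase SPF delay (times in ms)."""

    rapid_delay_ms: int   # delay used for the first SPF runs
    rapid_runs: int       # number of consecutive runs allowed at rapid delay
    slow_delay_ms: int    # delay used once rapid_runs is exceeded
    wait_time_ms: int     # quiet period needed to return to rapid delay
    _runs_used: int = 0
    _last_trigger_ms: Optional[int] = None
    _pending_run_ms: Optional[int] = None

    def on_trigger(self, now_ms: int) -> int:
        """Return the absolute time at which SPF runs for this trigger."""
        # No trigger seen for at least wait_time: go back to the rapid delay.
        if (self._last_trigger_ms is not None
                and now_ms - self._last_trigger_ms >= self.wait_time_ms):
            self._runs_used = 0
        self._last_trigger_ms = now_ms

        # Assumption: a trigger received while a run is pending is absorbed.
        if self._pending_run_ms is not None and now_ms < self._pending_run_ms:
            return self._pending_run_ms

        if self._runs_used < self.rapid_runs:
            delay = self.rapid_delay_ms
        else:
            delay = self.slow_delay_ms
        self._runs_used += 1
        self._pending_run_ms = now_ms + delay
        return self._pending_run_ms


if __name__ == "__main__":
    # Rapid delay, rapid runs and slow delay follow the draft's example;
    # the wait time of 2000 ms is an assumed value.
    timer = TwoPhaseSpfDelay(50, 3, 1000, 2000)
    for t in (0, 100, 200, 300, 400, 5000):
        print(t, "->", timer.on_trigger(t))
```

With these values, the first three triggers are served with the 50 ms
rapid delay, the fourth moves to the 1 s slow delay, and a trigger that
follows a quiet period longer than the wait time returns to the rapid
delay.  Two routers running this logic with different parameter values
would schedule SPF at different times for the same event sequence, which
is the misalignment the draft describes.

=============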

