RE: shepherd feedback and idnits on draft-ietf-rtgwg-uloop-delay-05.txt

<stephane.litkowski@orange.com> Wed, 09 August 2017 11:37 UTC

From: stephane.litkowski@orange.com
To: Sikhivahan Gundu <sikhivahan.gundu@ericsson.com>, Chris Bowers <cbowers@juniper.net>
CC: "draft-ietf-rtgwg-uloop-delay@tools.ietf.org" <draft-ietf-rtgwg-uloop-delay@tools.ietf.org>, RTGWG <rtgwg@ietf.org>
Subject: RE: shepherd feedback and idnits on draft-ietf-rtgwg-uloop-delay-05.txt
Date: Wed, 09 Aug 2017 11:37:28 +0000

Hi,

Thanks for your feedback, please find some comments inline.

Brgds,

Stephane


From: Sikhivahan Gundu [mailto:sikhivahan.gundu@ericsson.com]
Sent: Wednesday, August 09, 2017 12:19
To: LITKOWSKI Stephane OBS/OINIS; Chris Bowers
Cc: draft-ietf-rtgwg-uloop-delay@tools.ietf.org; RTGWG
Subject: RE: shepherd feedback and idnits on draft-ietf-rtgwg-uloop-delay-05.txt

Hi,

Requesting a couple of clarifications.

>> If a new convergence occurs while ULOOP_DELAY_DOWN_TIMER is running, ULOOP_DELAY_DOWN_TIMER is stopped

Do we stop the timer if the "new convergence" is a result only of links coming
up, i.e., no links have failed? My interpretation of the old text, as well as the
revision, is that we don't, but in light of the discussion that this passage
triggered, it seems better to have the interpretation validated, as below:

[SLI] Let's say that you have a convergence triggered by a local link down; this convergence will apply the ULOOP_DELAY_DOWN_TIMER.
If a new topology change occurs while the timer is running (metric change, link up or down, whether local or remote), we need to update the FIB with the latest topology without any further delay.
If we do not do so, the local router will use an N-2 FIB version while the other routers start to use the latest version N, which could cause side effects.


Imagining the IGP router to be in one of two states:
-- NORMAL-UPDATE state (FIB updated "normally"), also the initial state,
-- and DELAYED-UPDATE state (FIB updated after ULOOP_DELAY_DOWN_TIMER units of time),

the draft seems to suggest the following state transitions. I'd greatly appreciate
validation.



+----------------+------------------------------------------+----------------+
| current state  | event                                    | next state     |
+----------------+------------------------------------------+----------------+
| NORMAL-UPDATE  |                                          | DELAYED-UPDATE |
+----------------+ one local link failure                   +----------------+
| DELAYED-UPDATE |                                          | NORMAL-UPDATE  |
+----------------+------------------------------------------+----------------+
| NORMAL-UPDATE  | one remote link failure                  |                |
+----------------+ OR                                       | NORMAL-UPDATE  |
| DELAYED-UPDATE | two or more (any kind of) link failures  |                |
+----------------+------------------------------------------+----------------+
| NORMAL-UPDATE  |                                          | NORMAL-UPDATE  |
+----------------+ no link failures (only link-ups)         +----------------+
| DELAYED-UPDATE |                                          | DELAYED-UPDATE |
+----------------+------------------------------------------+----------------+



[SLI] The last line should be current state DELAYED-UPDATE , next state NORMAL-UPDATE.
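
A minimal sketch of the transition table above with the [SLI] correction applied, so that any event received in DELAYED-UPDATE leads back to NORMAL-UPDATE. The names `State` and `next_state`, and the summary of an event as three counters, are assumptions made for illustration; they do not come from the draft.

```python
from enum import Enum


class State(Enum):
    NORMAL_UPDATE = "NORMAL-UPDATE"
    DELAYED_UPDATE = "DELAYED-UPDATE"


def next_state(current: State, local_down: int, remote_down: int,
               other_changes: int) -> State:
    """Return the state after one convergence event.

    The three counters are a hypothetical summary of the LSDB changes
    processed for this convergence: local link failures, remote link
    failures, and anything else (link-up, metric change).
    """
    single_local_failure = (
        local_down == 1 and remote_down == 0 and other_changes == 0
    )
    # The delay is armed only from NORMAL-UPDATE, and only when the
    # convergence reports exactly one local link failure and nothing else.
    if current is State.NORMAL_UPDATE and single_local_failure:
        return State.DELAYED_UPDATE
    # Every other case, including any event that arrives while the timer
    # is running, updates the FIB without delay.
    return State.NORMAL_UPDATE
```

With this reading, the only transition into DELAYED-UPDATE is the first row of the table; a link-up arriving in DELAYED-UPDATE gives `next_state(State.DELAYED_UPDATE, 0, 0, 1)`, which is `State.NORMAL_UPDATE`.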





Second: remote loops are illustrated as a non-applicable scenario for this
solution. How about local link failures that do not lead to (local) loops?
Applying the delay in such a case may result in packet loss if there is no
FRR backup.  OTOH, detecting that a local loop will form involves more
computation.



[SLI] I agree with you; that's why the draft encourages using the mechanism in combination with FRR. The draft does not prevent an implementation from detecting whether a loop exists before applying the mechanism.





Thanks,

Sikhi


From: rtgwg [mailto:rtgwg-bounces@ietf.org] On Behalf Of stephane.litkowski@orange.com
Sent: 08 August 2017 20:50
To: Chris Bowers <cbowers@juniper.net>
Cc: draft-ietf-rtgwg-uloop-delay@tools.ietf.org; RTGWG <rtgwg@ietf.org>
Subject: RE: shepherd feedback and idnits on draft-ietf-rtgwg-uloop-delay-05.txt

Thanks Chris, I will post a new revision with those changes.


From: Chris Bowers [mailto:cbowers@juniper.net]
Sent: Tuesday, August 08, 2017 16:05
To: LITKOWSKI Stephane OBS/OINIS
Cc: RTGWG; draft-ietf-rtgwg-uloop-delay@tools.ietf.org
Subject: RE: shepherd feedback and idnits on draft-ietf-rtgwg-uloop-delay-05.txt

Stephane,

See responses inline with [CB].

Chris

From: stephane.litkowski@orange.com [mailto:stephane.litkowski@orange.com]
Sent: Tuesday, August 8, 2017 8:25 AM
To: Chris Bowers <cbowers@juniper.net>
Cc: RTGWG <rtgwg@ietf.org>; draft-ietf-rtgwg-uloop-delay@tools.ietf.org
Subject: RE: shepherd feedback and idnits on draft-ietf-rtgwg-uloop-delay-05.txt

Hi Chris,

Thanks for the review. I'm updating the document to reflect your proposals.
Couple of comments:

-          s/"otherwise the standard IP convergence MUST be used."/ "otherwise the standard IP convergence MUST used". It does not sound right to me, but that may be because of an English grammar issue on my side. Could you confirm the change?

[CB]  You are correct.  That proposed change is a mistake on my part.



-          Regarding your main comment on section 1 and 2.1, I do not agree with your statement on RSVP-FRR. First, there are multiple deployment styles of RSVP FRR:

o   LDP tunneling

o   RSVP with no strict ERO

o   RSVP with CSPF at head end (strict ERO)
Your statement is true only for the third case, where an RSVP tunnel between S and D exists with its path computed by S => no uloop in that case for sure. But as soon as you rely on distributed convergence, you will fall into a loop even if you use RSVP-FRR. I will specify in the text that we are in an LDP scenario, for example. Here is a text proposal:
"In the Figure 2, we consider an IP/LDP routed network. An RSVP-TE tunnel T, provisioned on C and terminating on B, is used to protect the traffic against C-B link failure (IGP shortcut is activated on C)."
"The issue described here is completely independent of the fast-reroute mechanism involved (TE FRR, LFA/rLFA, MRT ...) when the primary path is an hop by hop defined path."


[CB] "when the primary path is an hop by hop defined path"  is somewhat ambiguous.
How about "when the primary path uses hop-by-hop routing" ?


For the LFA case, yes, there are some cases where there is no loop, but it is topology dependent. I'm not sure that we need to go into that level of detail: if the LFA is on the postconvergence path, this means that the postconvergence path is loop-free, so there will be no local microloop in any case.

[CB]  OK.


-          Regarding your comment on section 4.4, here is my new text proposal to fit your comment:

"Upon an adjacency/link down event, this document introduces a change
   in step 5 (<xref target="description-current"/>) in order to delay the local convergence compared to the
   network wide convergence. The new step 5 is described below:"
           5. Upon SPF_DELAY timer expiration, the SPF is computed. If the condition of a single local link-down event has been met and if the new convergence did not trigger a stop of the ULOOP_DELAY_DOWN_TIMER, then an update of the RIB and the FIB SHOULD be delayed for ULOOP_DELAY_DOWN_TIMER msecs. Otherwise, the RIB and FIB SHOULD be updated immediately.

If a new convergence occurs while ULOOP_DELAY_DOWN_TIMER is running, ULOOP_DELAY_DOWN_TIMER is stopped and the RIB/FIB SHOULD be updated as part of the new convergence event."

[CB]  This text seems clearer.
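
The step 5 text proposed above can be sketched roughly as follows. This is only an illustration under stated assumptions: the class `Router`, its fields, the integer topology versions, and the explicit `now_ms` clock are all hypothetical, and the timer is assumed to be stopped at the moment the SPF of the new convergence completes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Router:
    uloop_delay_down_ms: int                 # ULOOP_DELAY_DOWN_TIMER value
    fib_version: int = 0                     # topology version installed in RIB/FIB
    pending_version: Optional[int] = None    # SPF result held back by the timer
    timer_deadline_ms: Optional[int] = None  # None means the timer is not running

    def on_spf_done(self, now_ms: int, version: int,
                    single_local_link_down: bool) -> None:
        """Modified step 5: SPF_DELAY expired and the SPF has been computed."""
        if self.timer_deadline_ms is not None:
            # New convergence while the timer is running: stop the timer and
            # update RIB/FIB as part of this new convergence.
            self.timer_deadline_ms = None
            self.pending_version = None
            self.fib_version = version
        elif single_local_link_down:
            # Single local link-down event: delay the RIB/FIB update.
            self.pending_version = version
            self.timer_deadline_ms = now_ms + self.uloop_delay_down_ms
        else:
            # Any other event: update immediately.
            self.fib_version = version

    def tick(self, now_ms: int) -> None:
        """Install the delayed update once the timer has expired."""
        if self.timer_deadline_ms is not None and now_ms >= self.timer_deadline_ms:
            self.fib_version = self.pending_version
            self.pending_version = None
            self.timer_deadline_ms = None
```

For example, with `Router(uloop_delay_down_ms=2000)`, a single local link-down SPF at time 0 leaves `fib_version` unchanged until `tick(2000)`, whereas a second SPF completing at time 500 would install its result at once and cancel the timer.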

Brgds,

Stephane


From: Chris Bowers [mailto:cbowers@juniper.net]
Sent: Tuesday, August 08, 2017 03:01
To: LITKOWSKI Stephane OBS/OINIS; draft-ietf-rtgwg-uloop-delay@ietf.org
Cc: rtgwg@ietf.org
Subject: shepherd feedback and idnits on draft-ietf-rtgwg-uloop-delay-05.txt

Authors,

I'm in the process of doing the Shepherd write-up for draft-ietf-rtgwg-uloop-delay-05.txt.

In reading the latest version of the document, I wrote down some feedback.
A diff can be found at:
https://github.com/cbowers/outgoing-feedback-on-ietf-drafts-2017/commit/70f3fc5b2c89dc65f813b992921d685049a4a4bd

http://bit.ly/2vJqoq2

Most of the feedback is related to clarifying language and typos.  However, there
are a few comments that I think are more substantive, so I am
reproducing them below since they should probably be discussed on the list.

===========
[CB]  I find the examples presented in section 1 and section 2.1 to
be confusing.  The conclusion drawn in the last paragraph of section
2.1 does not seem to follow from these examples.

Section 1 (figure 1) shows an example of micro-loops occurring when shortest
path forwarding is used and the metrics are such that LFA and rLFA
produce no backup paths from the PLR.

Section 2.1 (figure 2) also shows an example of micro-loops occurring when
shortest path forwarding is used and the metrics are such that LFA and rLFA
produce no backup paths from the PLR.  However, in this example,
a one-hop RSVP tunnel is provisioned to provide link protection for one of
the links.  However, even with this one-hop RSVP tunnel the example
demonstrates that micro-loops can occur.

The last paragraph asserts that:
"The issue described here is completely independent of the fast-
reroute mechanism involved (TE FRR, LFA/rLFA, MRT ...)."

There are two problems with this assertion.

Problem 1) I don't think that the assertion is correct for RSVP TE-FRR in general.

For classical RSVP TE-FRR, there would be an RSVP-signaled LSP from S to D.
Before the failure of the link C-B, this LSP would follow the path
S-E-C-B-A-D.  Immediately after the failure of link C-B, the LSP would
follow the path S-E-C-E-A-B-A-D using the bypass LSP at C.  Once S is
made aware of the failure, S will resignal the LSP to take the path S-E-A-D.
At no time would looping occur.

I assume that it wasn't the initial intention to claim that RSVP TE-FRR suffers from
micro-looping, but the text currently reads that way.  The assertion of the last
paragraph should be qualified to talk about how microloops will still affect traffic
forwarded hop-by-hop over links protected with one-hop RSVP-signaled LSPs.

Problem 2) The assertion may be correct for LFA/rLFA and MRT, but it has not
been demonstrated with the examples provided.  I think it may instead be
the case that the assertion may not be true for local LFA in some circumstances.
In particular, if traffic to a given destination can be protected for a given
failure by the PLR using a local LFA that is the same as the post convergence
path, then that traffic will not be subject to microloops.

Perhaps the overall intention of the example in figure 2 using
links protected with one-hop RSVP-signaled LSPs was to say that no
matter how much flexibility you give yourself in building a backup path
from the PLR, if the PLR stops using the backup path before other routers
stop sending traffic to the PLR, then you can still have forwarding loops.
However, I think the complexity and detail of the example using one-hop
RSVP-signaled LSPs ends up confusing the matter.

The text should either work more systematically through examples to
substantiate the assertion, or the assertion should be scaled back.
Regardless, the assertion needs to be clarified with respect to RSVP-TE FRR.
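
To make the hop-by-hop case concrete, here is a small sketch of a forwarding loop that appears when the PLR switches to its post-convergence next hop before its neighbor has converged. The topology (routers N, P, X, D with link P-D failing) and the function `forward` are hypothetical and are not taken from the figures in the draft.

```python
from typing import Dict, List, Tuple

# Per-router FIB: destination -> next hop.
Fib = Dict[str, str]


def forward(fibs: Dict[str, Fib], src: str, dst: str) -> Tuple[List[str], bool]:
    """Follow next hops from src toward dst.

    Returns the routers visited and True if a router is visited twice
    (a forwarding loop), or False if the packet reaches dst.
    """
    path = [src]
    seen = {src}
    node = src
    while node != dst:
        node = fibs[node][dst]
        path.append(node)
        if node in seen:
            return path, True
        seen.add(node)
    return path, False


# Hypothetical links: N-P, P-D, N-X, X-D. Before the failure N reaches D via P.
# Link P-D fails. P has already installed its post-convergence next hop (N),
# but N still uses its pre-failure next hop (P).
transient = {"N": {"D": "P"}, "P": {"D": "N"}, "X": {"D": "D"}}

# Once N has converged as well, traffic follows N-X-D.
converged = {"N": {"D": "X"}, "P": {"D": "N"}, "X": {"D": "D"}}

print(forward(transient, "N", "D"))  # (['N', 'P', 'N'], True)
print(forward(converged, "P", "D"))  # (['P', 'N', 'X', 'D'], False)
```

Delaying the FIB update on P until N has converged avoids the transient state, provided P can keep forwarding over a backup path in the meantime.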

======
Section 4.4

[CB]  It would be good to write out exactly what the modified version of step 5
looks like so there is no confusion. Something like:

5.  Upon SPF_DELAY timer expiration, the SPF is computed.  If the condition
of a single local link-down event has been met, then an update of the
RIB and the FIB is scheduled in ULOOP_DELAY_DOWN_TIMER msecs.  Otherwise,
the RIB and FIB update is scheduled immediately.

=========

   Such a delay
   SHOULD only be introduced if all the LSDB modifications processed are
   only reporting a single local link down event (Section 4.3).  If a
   subsequent LSP/LSA is received/updated and a new SPF computation is
   triggered before the expiration of ULOOP_DELAY_DOWN_TIMER, then the
   same evaluation SHOULD be performed.

=========
[CB] What should one do if the evaluation of a subsequent LSP/LSA fails
at this point?  Do you go ahead and update the FIB with the forwarding
entries that you were waiting to do?  Or do you do a new SPF with the
new information?  Or is it up to the implementation?
=========

I also ran the idnits check, which shows the following issues.
Can you get rid of the unused references and move RFC 5715 from Normative to Informative so that idnits will run clean?
https://www.ietf.org/tools/idnits?url=https://www.ietf.org/archive/id/draft-ietf-rtgwg-uloop-delay-05.txt


Thanks,
Chris



_________________________________________________________________________________________________________________________



Ce message et ses pieces jointes peuvent contenir des informations confidentielles ou privilegiees et ne doivent donc

pas etre diffuses, exploites ou copies sans autorisation. Si vous avez recu ce message par erreur, veuillez le signaler

a l'expediteur et le detruire ainsi que les pieces jointes. Les messages electroniques etant susceptibles d'alteration,

Orange decline toute responsabilite si ce message a ete altere, deforme ou falsifie. Merci.



This message and its attachments may contain confidential or privileged information that may be protected by law;

they should not be distributed, used or copied without authorisation.

If you have received this email in error, please notify the sender and delete this message and its attachments.

As emails may be altered, Orange is not liable for messages that have been modified, changed or falsified.

Thank you.
