RE: draft-ietf-rtgwg-spf-uloop-pb-statement

"Les Ginsberg (ginsberg)" <ginsberg@cisco.com> Wed, 19 April 2017 03:05 UTC

From: "Les Ginsberg (ginsberg)" <ginsberg@cisco.com>
To: "bruno.decraene@orange.com" <bruno.decraene@orange.com>
CC: RTGWG <rtgwg@ietf.org>
Subject: RE: draft-ietf-rtgwg-spf-uloop-pb-statement
Date: Wed, 19 Apr 2017 03:05:18 +0000
Archived-At: <https://mailarchive.ietf.org/arch/msg/rtgwg/JziCWBRSfYoz5fCwR_U12TL4LZU>

Bruno -

We have no disagreement regarding the content of draft-ietf-rtgwg-backoff-algo - what I am concerned about is the intent of the draft. I see two possibilities.

1) This is an Informational draft which discusses what is needed to achieve both fast convergence and stability - and defines an example set of configuration parameters and a method of using them to achieve that goal.
It is non-normative and allows that other means of achieving the same result are both possible and permitted. If that were the case I would have no further concerns.

2) This is a Standards track draft whose intent is to - over time - require all implementations to use the exact mechanisms defined in the draft.

As the draft is positioned as a Standards track document I assume your intent is the latter. In that case we have to consider the existence of multiple long-lived implementations which use a different set of parameters/implementation (e.g. exponential backoff) to achieve the fast convergence behavior described in the paper you reference below. These implementations have proven successful. Before we require them to be changed I think it is prudent to demonstrate empirically - not merely by analysis - that a deployment where some implementations use the mechanisms defined in the draft and some use a proven alternative will result in significantly degraded convergence in real-world failure scenarios. If we do not do this then we are imposing requirements on vendors and customers alike which are not proven to yield improvement.

I therefore have suggested testing involving:

  o Timers in the range recommended by the draft - but also consistent with fast convergence (INITIAL delay sub-50 ms, SHORT_DELAY on the order of 100 ms or less)
  o Scenarios with multiple topology changes which trigger multiple SPFs in a short period of time
  o A mixture of routers with varying forwarding plane update speeds on the affected nodes in the network
  o Comparison of results when all nodes use the standardized algorithm versus when a mixture of vendor-specific algorithms and the standardized algorithm is used
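
As a rough illustration of the last bullet, here is a toy Python model of how SPF start times might diverge between two routers that see the same burst of IGP events. It is not measured data and is no substitute for the testing requested. One function loosely follows the three-state (QUIET, SHORT_WAIT, LONG_WAIT) design of draft-ietf-rtgwg-backoff-algo in simplified form; the other is a generic exponential backoff that does not represent any specific vendor implementation. The function names, the event times, and every timer value other than the 50 ms / 100 ms figures mentioned above are assumptions made for the example. Forwarding plane update time is deliberately not modelled.

```python
# Toy model only: all names and most timer values are assumptions.
def standard_spf_times(events, initial=50, short=100, long=5000,
                       learn=500, holddown=10000):
    """SPF start times (ms) under a simplified three-state backoff."""
    state, spf_at, learn_at, hold_at, runs = "QUIET", None, None, None, []
    for t in sorted(events):
        if spf_at is not None and spf_at <= t:    # pending SPF ran before t
            runs.append(spf_at)
            spf_at = None
        if hold_at is not None and hold_at <= t:  # quiet long enough: reset
            state, learn_at, hold_at = "QUIET", None, None
        elif state == "SHORT_WAIT" and learn_at <= t:
            state = "LONG_WAIT"                   # learning period is over
        if state == "QUIET":
            delay, state, learn_at = initial, "SHORT_WAIT", t + learn
        else:
            delay = short if state == "SHORT_WAIT" else long
        hold_at = t + holddown                    # each event restarts hold-down
        if spf_at is None:                        # else: batched into pending SPF
            spf_at = t + delay
    if spf_at is not None:
        runs.append(spf_at)
    return runs


def exponential_spf_times(events, initial=50, hold=100, max_wait=5000,
                          quiet=10000):
    """SPF start times (ms) under a hypothetical exponential backoff."""
    spf_at, last, n, runs = None, None, 0, []
    for t in sorted(events):
        if spf_at is not None and spf_at <= t:    # pending SPF ran before t
            runs.append(spf_at)
            spf_at = None
        if last is not None and t - last >= quiet:
            n = 0                                 # burst ended: reset backoff
        last = t
        if spf_at is None:
            # First SPF uses `initial`; later ones double `hold` up to a cap.
            delay = initial if n == 0 else min(hold * 2 ** (n - 1), max_wait)
            spf_at = t + delay
            n += 1
    if spf_at is not None:
        runs.append(spf_at)
    return runs


if __name__ == "__main__":
    burst = [0, 20, 120, 260]                     # assumed event times in ms
    std = standard_spf_times(burst)
    exp = exponential_spf_times(burst)
    skew = [abs(a - b) for a, b in zip(std, exp)]
    print("standard:", std, "exponential:", exp, "skew:", skew)
```

With these assumed inputs the two models agree on the first two SPF start times and differ only on the third (360 ms versus 460 ms). Whether a skew of that order matters once forwarding plane update times are included is exactly what the tests above would need to show.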

This is not an argument against the content of the draft - just a request to do due diligence.

Does anyone have such data?

   Les

> -----Original Message-----
> From: rtgwg [mailto:rtgwg-bounces@ietf.org] On Behalf Of
> bruno.decraene@orange.com
> Sent: Tuesday, April 18, 2017 9:41 AM
> To: Les Ginsberg (ginsberg)
> Cc: RTGWG
> Subject: RE: draft-ietf-rtgwg-spf-uloop-pb-statement
> 
> Les,
> 
> > From: Les Ginsberg (ginsberg) [mailto:ginsberg@cisco.com]
> > Sent: Tuesday, April 18, 2017 5:56 PM
> >
>  > Bruno -
>  >
>  > The discussion here is a pragmatic one.
> 
> That would be good indeed.
> 
> 
>  > As draft-ietf-rtgwg-backoff-algo is a Standards track document the implication of it becoming
>  > an RFC is that everyone SHOULD/MUST implement it.
> 
> No.
> I'm not aware that everyone SHOULD/MUST implement every IETF Standards
> Track RFC.
> 
> 
>  > Given that today vendors have implemented their own variations of SPF backoff, in order to
>  > justify requiring the use of a standardized algorithm it MUST be demonstrated that it makes
>  > a significant difference when used in real world deployments with timer values that are
>  > consistent with existing deployments.
> 
> You are mixing implementation considerations with IETF considerations.
> From an IETF standpoint, in order to maximize the chance of having all (link
> state) IGP nodes compute their SPF at the same time and using the same
> LSDB, we need nodes to have a consistent SPF delay algo.
> From an implementation standpoint, this draft has 3 implementations.
> 
> > I think we agree on the following:
>  >
>  > 1) For a single topology change only the initial delay comes into play, so no benefits are
>  > expected from having a standardized algorithm
> 
> I agree for a single link failure.
> I disagree for other single failures, e.g. node failures and SRLG failures, where
> multiple link state messages are involved from multiple nodes. Depending on
> differences in detection time, distance from the failure, flooding
> implementations along the way, SPF delay timers (..), multiple SPFs may be
> involved for a single topological change.
> 
> (by "topology change" I assume that you mean the real/physical topology of
> the network, not the possibly N topological changes that the IGP can see,
> depending on the configured SPF delays)
> 
> 
>  > 2) For multiple topology changes in a short period of time (i.e. within SHORT_SPF_DELAY as
>  > defined in the draft) the benefits of syncing the start of a second SPF in the control plane
>  > may be dwarfed by the time it takes to update the forwarding.
>  >
>  > In all the studies I am familiar with, the contribution of control plane work (routing update
>  > processing, SPF, RIB update) is under 30% of the total time required for convergence. The
>  > remainder is the time it takes to actually update the forwarding plane.
> 
> As already stated, I agree for some specific cases (e.g. first SPF) but not in the
> general case:
> 
> 1) FIB update time
> IMHO, the reference person for IGP Fast Convergence in the industry would
> be Clarence Filsfils. As a matter of luck for this discussion, both of you are in
> the same organization and working on the same platform, so this should make it
> easier to reach a similar point of view.
> - BTW, you can note that he co-authors this draft.
> - IMHO, a reasonable paper on this would be
> https://pdfs.semanticscholar.org/0deb/57cc07a36cf1ea5ba0b3482bf14f2e8bb60d.pdf
> (also from Clarence).
> 
> Note also that the paper is a bit old, and in the meantime both hardware and
> software have improved. But even at this time, FIB update time for 1000 IGP
> prefixes is/was around 300 ms (90th percentile).
> Note that what matters are the "important" IGP prefixes which attract the
> customer's traffic. In short, this means the loopback of the router (used as
> BGP Next-Hop). So here, I'm taking the example of an IGP with 1000 nodes,
> which is a significant scale.
> 
> 2) SPF delay
> As presented in slide 6 of the slides presented at IETF 90
> (https://www.ietf.org/proceedings/90/slides/slides-90-rtgwg-2.pdf), SPF
> delay can easily be higher than 300 ms.
> 
> 
>  > So, what I am asking is that BEFORE the draft becomes an RFC that real world data be
>  > gathered demonstrating the benefits of the standardized algorithm in the cases where it
>  > might matter.
> 
> I'm not aware that the RTGWG WG requires deployment data as a prerequisite
> for RFC publication. Even implementations are not a prerequisite (this was
> discussed by the chairs a few months ago), but here, we do have 3
> implementations.
> That being said, IMHO the above data is real.
> 
>  > I think the right set of test cases would include:
>  >
>  >   o Timers in the range recommended by the draft - but also consistent with fast
>  >     convergence (INITIAL delay sub 50 ms, SHORT_DELAY on the order of 100 ms or less)
>  >   o Multiple topology changes which trigger multiple SPFs
>  >   o A mixture of forwarding plane update speeds in the affected nodes in the network
>  >   o Comparison of results using all standardized algorithm and a mixture of vendor specific
>  >     algorithms
>  >
>  > Based on these results we can then determine whether it is beneficial to progress the draft.
> 
> Let me restate a question that you have not answered so far:
> - Do we agree that having different SPF delay algo across one network, is not
> a feature but a bug?
> 
> --Bruno
> 
>  >
>  >    Les
>  >
>  >
>  > > -----Original Message-----
>  > > From: bruno.decraene@orange.com [mailto:bruno.decraene@orange.com]
>  > > Sent: Tuesday, April 18, 2017 7:43 AM
>  > > To: Les Ginsberg (ginsberg)
>  > > Cc: RTGWG
>  > > Subject: draft-ietf-rtgwg-spf-uloop-pb-statement
>  > >
>  > > Changing the subject of the thread.
>  > >
>  > > Hi Les,
>  > >
>  > > As a follow up on the discussion
>  > >
>  > >  > From: Les Ginsberg (ginsberg)
>  > >  > Sent: Tuesday, April 18, 2017 2:56 AM
>  > >  >
>  > >  > In regards to the discussion regarding "draft-ietf-rtgwg-spf-uloop-pb-statement"
>  > >  > I am quoted as saying:
>  > >  >
>  > >  > "Les: most of the analysis that I am aware of - the largest contributor is
>  > >  > the control plane."
>  > >  >
>  > >  > In actuality what I said (or at least intended to say :-) ) was that the largest
>  > >  > contributor is the data plane (NOT the control plane).
>  > >  >
>  > >  > The point of the exchange between Bruno and myself was to emphasize the point that
>  > >  > demonstrating the real world benefits of the standardized backoff algorithm should
>  > >  > include cases where forwarding plane update speeds are different on different nodes
>  > >  > in the topology. It is possible that better synchronization of the control plane
>  > >  > execution times (which is what use of a consistent backoff algorithm is likely to
>  > >  > provide) may not mean much in cases where forwarding plane update speeds are
>  > >  > significantly different on different nodes and/or when forwarding plane update
>  > >  > speeds consume much more time than the control plane SPF/RIB updates. The latter
>  > >  > case is quite common.
>  > >
>  > > A few points/comments,
>  > >
>  > > - IMHO, your request seems more related to the problem statement draft.
>  > > https://tools.ietf.org/html/draft-ietf-rtgwg-spf-uloop-pb-statement-03
>  > > If you could comment on the draft in order to improve it, this would probably
>  > > speed up the discussion.
>  > > - You are right that the IGP fast convergence time, following a single failure, is
>  > > mostly due to the time needed to update the FIB on line cards. However, as
>  > > the SPF back-off algo kicks in, this changes, and differences in SPF delay
>  > > algos bring a significant delta. cf. slide 6 of the slides presented at IETF 90:
>  > > https://www.ietf.org/proceedings/90/slides/slides-90-rtgwg-2.pdf
>  > > You may also review the whole presentation; not because you would learn anything,
>  > > but maybe to ease the identification of the parts where we may have a
>  > > different opinion. (At this point, I'm not seeing real disagreement.)
>  > > - Do we agree that having different SPF delay algos across one network is not
>  > > a feature but a bug? IOW, there is value in standardizing one.
>  > >
>  > > Regards,
>  > > --Bruno
>  > >
>  > >
>  > >  >    Les
>  > >  >
>  > >  >
>  > >  > > -----Original Message-----
>  > >  > > From: rtgwg [mailto:rtgwg-bounces@ietf.org] On Behalf Of Jeff Tantsura
>  > >  > > Sent: Thursday, April 13, 2017 4:09 PM
>  > >  > > To: RTGWG
>  > >  > > Cc: rtgwg-chairs
>  > >  > > Subject: RTGWG minutes IETF98
>  > >  > >
>  > >  > > Hi,
>  > >  > >
>  > >  > > The minutes have been published at:
>  > >  > > https://datatracker.ietf.org/doc/minutes-98-rtgwg/
>  > >  > > Please provide your comments.
>  > >  > >
>  > >  > > Thanks!
>  > >  > > Jeff & Chris
>  > >  > >
>  > >  > >
>  > >  > >
>  > >  > > _______________________________________________
>  > >  > > rtgwg mailing list
>  > >  > > rtgwg@ietf.org
>  > >  > > https://www.ietf.org/mailman/listinfo/rtgwg
>  > >  >
>  > >
>  > >
> __________________________________________________________
>  > >
> __________________________________________________________
>  > > _____
>  > >
>  > > Ce message et ses pieces jointes peuvent contenir des informations
>  > > confidentielles ou privilegiees et ne doivent donc pas etre diffuses,
> exploites
>  > > ou copies sans autorisation. Si vous avez recu ce message par erreur,
> veuillez
>  > > le signaler a l'expediteur et le detruire ainsi que les pieces jointes. Les
>  > > messages electroniques etant susceptibles d'alteration, Orange decline
> toute
>  > > responsabilite si ce message a ete altere, deforme ou falsifie. Merci.
>  > >
>  > > This message and its attachments may contain confidential or privileged
>  > > information that may be protected by law; they should not be
> distributed,
>  > > used or copied without authorisation.
>  > > If you have received this email in error, please notify the sender and
> delete
>  > > this message and its attachments.
>  > > As emails may be altered, Orange is not liable for messages that have
> been
>  > > modified, changed or falsified.
>  > > Thank you.
> 
> 