RE: updated draft-gs-l3vpn-scaling & agenda request

"Rajiv Asati (rajiva)" <rajiva@cisco.com> Thu, 08 March 2012 16:50 UTC

Subject: RE: updated draft-gs-l3vpn-scaling & agenda request
Date: Thu, 08 Mar 2012 10:50:44 -0600
Message-ID: <067E6CE33034954AAC05C9EC85E2577C0795D3CE@XMB-RCD-111.cisco.com>
In-Reply-To: <9198_1331212584_4F58B128_9198_2057_5_4FC3556A36EE3646A09DAA60429F533507E94CD5@PUEXCBL0.nanterre.francetelecom.fr>
References: <DCC302FAA9FE5F4BBA4DCAD465693779173B637798@PRVPEXVS03.corp.twcable.com> <9198_1331212584_4F58B128_9198_2057_5_4FC3556A36EE3646A09DAA60429F533507E94CD5@PUEXCBL0.nanterre.francetelecom.fr>
From: "Rajiv Asati (rajiva)" <rajiva@cisco.com>
To: stephane.litkowski@orange.com, "George, Wes" <wesley.george@twcable.com>, l3vpn@ietf.org

> "First, it leads to an inconsistent
>    routing table footprint from one PE router to the next, and it can
>    change with every new customer turned up on the router"
> => Yes, but is this an issue? I agree that it could be better for all PEs to
> have the same set of routes, but even with current hardware, this doesn't scale ...

+1. Frankly, why load a router with routes that it doesn't need (because it has no customers for them)?
Per-PE inconsistency in the VPN routing table is desirable for sanity, IMO.
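
To make the point concrete, here is a minimal sketch of route-target based import filtering, assuming a PE keeps a VPN route only if it carries at least one RT imported by a locally configured VRF. The names VpnRoute and routes_to_keep and the sample RD/RT values are hypothetical, not taken from the draft or from any implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VpnRoute:
    rd: str                   # route distinguisher (illustrative value)
    prefix: str
    route_targets: frozenset  # RTs attached to the route


def routes_to_keep(received, local_import_rts):
    """Keep only routes carrying at least one RT imported by a local VRF."""
    return [r for r in received if r.route_targets & local_import_rts]


# Hypothetical routes as they might arrive from a route reflector.
received = [
    VpnRoute("65000:1", "10.1.0.0/16", frozenset({"65000:100"})),
    VpnRoute("65000:2", "10.2.0.0/16", frozenset({"65000:200"})),
    VpnRoute("65000:3", "10.3.0.0/16", frozenset({"65000:100", "65000:300"})),
]

pe1_kept = routes_to_keep(received, frozenset({"65000:100"}))  # one VRF
pe2_kept = routes_to_keep(received, frozenset({"65000:200"}))  # another VRF
pe3_kept = routes_to_keep(received, frozenset())               # no customers

print(len(pe1_kept), len(pe2_kept), len(pe3_kept))
```

Each PE ends up with a different footprint, and the PE with no customers carries nothing, which is the inconsistency argued for above.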


> "In addition, customers may request the use
>    of BGP multipath for faster failover or better load balancing, which
>    has the net effect of installing more active routes into the table,
>    rather than simply selecting the single best path."

True, but that's independent of L3VPN.  

Cheers,
Rajiv


> -----Original Message-----
> From: l3vpn-bounces@ietf.org [mailto:l3vpn-bounces@ietf.org] On Behalf
> Of stephane.litkowski@orange.com
> Sent: Thursday, March 08, 2012 8:16 AM
> To: George, Wes; l3vpn@ietf.org
> Subject: RE: updated draft-gs-l3vpn-scaling & agenda request
> 
> 
>  Hi George,
> 
> 
> Here is my feedback on your document:
> 
> 1) PE-CE BCP:
> 
> I largely agree with the content of the section.
> But I would like to add a few things; maybe we can make a recommendation,
> or at least an observation, on when to use static/IGP/BGP.
> I see it this way:
> - Some years ago, static routing could not detect failures dynamically
> for multiconnected sites where the PE-CE link layer is not able to detect
> a failure (Ethernet over a non-direct link ...), but today using BFD with
> static routing ensures detection in all cases, so static routing can be
> used for multiconnected sites.
> 
> - IGP: running an IGP between PE and CE, IMHO, makes sense only to extend
> the customer IGP between sites and make the SP network transparent.
> 
> - BGP: BGP is a well-designed protocol that can be used for two purposes:
> 	- dynamically advertising a large number of routes (it seems
> impossible to provision hundreds of static routes for an access!)
> 	- providing failure detection
> 
> I left the IGP case out of my analysis ...
> For a monoconnected site, static routing is fine where the number of
> routes is low (threshold to be defined: 10/15?) and where routes are not
> changing every day (otherwise the provisioning activity on the PE would be
> significant for this access); otherwise BGP would be preferred. In this
> case there is no need for fast failure detection, as it's a monoconnected
> site. So setting the minimum holdtime to a high value (180) to protect the
> PE may be fine (no need for BFD).
> 
> For a multiconnected site, whatever the protocol used, as you mentioned,
> it's better to rely on BFD or another connectivity detection mechanism for
> fast detection rather than tuning protocol timers. The main issue is that
> CPEs are sometimes low cost and do not support BFD :(
> Choosing the protocol is then just a matter of the number/dynamicity of
> routes on the access (same as for monoconnected sites) -> choose static
> when the number of routes is low, BGP otherwise. In the BGP case, if BFD is
> used, setting the minimum holdtime to a high value (180) to protect the PE
> may be fine (BFD ensures detection). If BFD is not available on the CPE,
> then for this specific session set the minimum holdtime to a protective,
> tested value (15) to protect the PE and, as you mention, the number of
> sessions with fast timers must be tracked.
> 
> Compared to what you mention, the main point to add to the BCP is the
> distinction that should be made between monoconnected sites (no BFD, no
> fast detection needed) and multiconnected sites (fast detection needed).
> Some SPs use the same rule for all BGP accesses, but that's not very
> good for scaling ...
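
The selection rule described in the quoted text can be sketched as below. This is only an illustration under stated assumptions: the threshold of 15 routes is one end of the "10/15?" range floated above, the 180 and 15 second holdtimes are the values quoted above, and the names Site, Plan and plan_pe_ce are hypothetical. The quoted text does not say what to do for a multiconnected site with few routes and no BFD on the CPE; the sketch assumes BGP with fast timers in that case.

```python
from dataclasses import dataclass
from typing import Optional

STATIC_ROUTE_THRESHOLD = 15  # assumption: upper end of the "10/15?" suggestion
HOLDTIME_RELAXED = 180       # seconds, value quoted in the mail
HOLDTIME_FAST = 15           # seconds, value quoted in the mail


@dataclass
class Site:
    multiconnected: bool
    route_count: int
    routes_change_often: bool
    cpe_supports_bfd: bool


@dataclass
class Plan:
    protocol: str                 # "static" or "bgp"
    use_bfd: bool
    min_holdtime: Optional[int]   # None when no BGP session is used
    track_fast_timers: bool       # count this session against a fast-timer budget


def plan_pe_ce(site: Site) -> Plan:
    few_stable_routes = (site.route_count <= STATIC_ROUTE_THRESHOLD
                         and not site.routes_change_often)
    if not site.multiconnected:
        # Monoconnected: no fast detection needed, so no BFD.
        if few_stable_routes:
            return Plan("static", False, None, False)
        return Plan("bgp", False, HOLDTIME_RELAXED, False)
    if site.cpe_supports_bfd:
        # Multiconnected with BFD: BFD detects failures, timers stay relaxed.
        if few_stable_routes:
            return Plan("static", True, None, False)
        return Plan("bgp", True, HOLDTIME_RELAXED, False)
    # Multiconnected without BFD (assumption for the low-route case):
    # fall back to BGP with a short, tested holdtime and track the session.
    return Plan("bgp", False, HOLDTIME_FAST, True)


print(plan_pe_ce(Site(True, 200, False, False)))
```

Keeping the monoconnected/multiconnected split as the first branch mirrors the point made above that the two cases should not share one rule.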
> 
> Another issue to deal with could be persistent flapping of PE-CE BGP
> sessions (link issue ... negotiation issue ...) => BGP DampPeerOscillations
> is needed there but is not well implemented
> 
> 
> 2) Network Event
> 
> As with the PE-CE issues, we can run into process prioritization issues
> here. When a PE loses a direct link, or when there is a link failure near
> a set of PEs, the PEs have to update their FIB and possibly hundreds of
> thousands of VPN routes because the change causes an interface change or
> an MPLS transport label change. We have already seen some routers with
> bad (old) FIB implementations and bad process prioritization go to 100%
> CPU updating ISIS routes in the routing table and FIB and then VPN routes
> in the routing table and FIB. While at 100% CPU, the router loses PE-CE
> BGP sessions or ISIS adjacencies because of bad process prioritization.
> Now with H-FIB implementations in the code, the issue is more hidden as it
> requires less processing :)
> 
> 
> 3) Route Scale
> 
> In the number of routes a PE must support, you can add ISIS routes, LDP
> FECs, and possibly TE tunnels, which impact global scaling too.
> 
> Here I propose splitting this section into PE scaling, ASBR scaling, and
> RR scaling, and addressing each part separately. You are mainly talking
> about PEs here, but ASBRs and RRs are bottlenecks too.
> 
> "Most PE routers use the absence of a
>    given VRF instance (or RD/RT filtering) to limit the number of routes
>    that they must actually carry, but this is sometimes of limited
>    utility for a couple of reasons.  First, it leads to an inconsistent
>    routing table footprint from one PE router to the next, and it can
>    change with every new customer turned up on the router.  This leads
>    to non-deterministic performance and scale."
> 
> => I don't agree with everything there. Limiting the routes imported by a
> PE clearly helps the control plane. If you import all routes, then the
> scaling impact is clearly implementation dependent: at the least, the
> router will consume more memory (so it should support millions of BGP
> routes!), and possibly more CPU too (more next-hop reachability
> computation, depending on how it's done...).
> As you mentioned, if a PE doesn't import all routes, it has to send a
> route-refresh to the RR each time a new VRF is provisioned. With millions
> of routes on the RR, this has an impact, as it can take some minutes
> (5-10-20 min) to receive the routes, and only a few routes will be
> accepted; all others will be denied. Formatting the RIB-OUT upon
> route-refresh is costly for the RR (it could impact transient update
> propagation time). In our case, we aggregate route-refresh requests at the
> RR level every x seconds so that the RR can format once and serve multiple
> PEs with the same update formatting action.
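
The aggregation idea in the quoted text can be sketched as a simple batching step. This is a rough illustration, not the deployed mechanism: the quoted text only says "every x seconds", so the fixed, aligned time windows are an assumption, as are the function name batch_route_refreshes and the sample timestamps. Address families are ignored for brevity.

```python
from collections import defaultdict


def batch_route_refreshes(requests, window_seconds):
    """Group (timestamp, pe_name) route-refresh requests into fixed windows.

    Returns a list of (window_start, sorted PE names), one entry per window,
    so that the RR formats its RIB-OUT once per window instead of once per
    request.
    """
    windows = defaultdict(set)
    for timestamp, pe_name in requests:
        window_start = int(timestamp // window_seconds) * window_seconds
        windows[window_start].add(pe_name)
    return [(start, sorted(pes)) for start, pes in sorted(windows.items())]


# Hypothetical arrival times in seconds; PE1 asks twice in the first window.
requests = [(1.0, "PE1"), (3.5, "PE2"), (4.0, "PE1"), (12.0, "PE3")]
batches = batch_route_refreshes(requests, window_seconds=10)

for start, pes in batches:
    print(f"window starting at {start}s: format once, serve {pes}")
```

Here four requests collapse into two formatting passes; a longer window saves more RR work at the cost of a longer wait for the newly provisioned VRF.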
> 
> RTC clearly helps there by formatting/sending only the requested routes,
> and I think there is no issue with using RTC (this is another debate!)
> 
> "First, it leads to an inconsistent
>    routing table footprint from one PE router to the next, and it can
>    change with every new customer turned up on the router"
> => Yes, but is this an issue? I agree that it could be better for all PEs to
> have the same set of routes, but even with current hardware, this doesn't scale ...
> 
> "This leads
>    to non-deterministic performance and scale." => The VPN footprint of
> customers can be really different. I agree that some customers span
> most PEs, but that is not the case for all. Based on our experience,
> PEs have different profiles in terms of VRFs.
> 
> 
> "In addition, customers may request the use
>    of BGP multipath for faster failover or better load balancing, which
>    has the net effect of installing more active routes into the table,
>    rather than simply selecting the single best path."
> => I think MP has a greater impact on the FIB than on the pure control
> plane, even if there are generally FIB structures in the control plane ...
> 
> 
> Regarding RD policy, I agree with your points, but the choice can now
> change, as there are solutions such as add-path, ORR, and best-external
> that could permit fast restoration with the same RD policy (as in a
> non-VPN environment) => but I agree that, although you do not increase the
> number of nets, you will increase the number of paths ...
> 
> 
> Regards,
> 
> Stephane
> 
> 
> -----Original Message-----
> From: l3vpn-bounces@ietf.org [mailto:l3vpn-bounces@ietf.org] On Behalf Of
> George, Wes
> Sent: Tuesday, March 6, 2012 22:27
> To: l3vpn@ietf.org
> Cc: rtgarea-ads@ietf.org; rtgwg-chairs@ietf.org
> Subject: updated draft-gs-l3vpn-scaling & agenda request
> 
> Hello all -
> 
> Rob and I have completed a revision of our draft discussing L3VPN scaling
> considerations. We've made some changes to the document structure to
> make it flow better, and we think that we've added enough of the body that
> it is ready for discussion during a WG meeting. However, since L3VPN is not
> meeting during IETF in Paris, we're wondering if we should perhaps ask for
> time in the Routing area open meeting/RTGAREA WG instead?
> 
> Either way, comments are still very welcome, especially if you can help us
> bolster the currently weak section on multicast VPN scale.
> 
> Abstract
> 
>    This document discusses scaling considerations unique to
>    implementation of Layer 3 (IP) Virtual Private Networks, discusses a
>    few best practices, and identifies gaps in the current tools and
>    techniques which are making it more difficult for operators to cost-
>    effectively scale and manage their L3VPN deployments.
> 
> http://tools.ietf.org/html/draft-gs-vpn-scaling-01
> 
> Thanks,
> 
> Wes George
> 
> 