Re: [sidr] BGPSec scaling (was RE: beacons and bgpsec)

"George, Wesley" <wesley.george@twcable.com> Mon, 12 September 2011 18:26 UTC

From: "George, Wesley" <wesley.george@twcable.com>
To: Christopher Morrow <morrowc.lists@gmail.com>, Randy Bush <randy@psg.com>
Date: Mon, 12 Sep 2011 14:28:17 -0400
Cc: "sidr@ietf.org" <sidr@ietf.org>
Subject: Re: [sidr] BGPSec scaling (was RE: beacons and bgpsec)

-----Original Message-----
From: christopher.morrow@gmail.com [mailto:christopher.morrow@gmail.com] On Behalf Of Christopher Morrow
Sent: Sunday, September 11, 2011 11:26 PM
To: Randy Bush; George, Wesley
Cc: Russ White; sidr@ietf.org
Subject: Re: [sidr] BGPSec scaling (was RE: beacons and bgpsec)

maybe what Wes is asking here is really:
"Could someone model the load on a router doing bgpsec, in a world of
bgpsec speaking devices?"

Something like, for a core network edge device (say sprint, C&W, TWTC,
UU/vzb,ATT an edge connecting device in their worst metro):
  o number of updates today/second (steady state and 'worst case')
  o projected growth of update stream (given historical data)
  o projected 'cost' (cpu cycles) of un-assisted bgpsec
  o projected RIB RAM size (use historical data to project forward)
  o projected beacons/second (which really just look like updates in
the update stream)
  o routing table size (projected forward from historical data)

It seems most of that data exists in one form or another, it seems
that running the math isn't "hard". There's a question of the validity
of the model... but that's always the case.
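As a rough illustration of "running the math" on the list above, here is a minimal Python sketch. It is not a validated model. It assumes, purely for illustration, that each received update costs one signature verification per AS hop in its path and that each update is re-signed once per outbound peer. The function name, its parameters, and every number in the example are hypothetical placeholders, not measurements.

```python
def bgpsec_cpu_fraction(updates_per_sec, avg_path_len, verify_us,
                        sign_us, peers_out):
    """Estimate the fraction of one CPU core spent on un-assisted
    BGPSec crypto (1.0 means one core fully busy).

    Assumed cost model (an illustration, not taken from any spec):
      - one verification per AS hop for every received update
      - one signing operation per outbound peer for every update
    verify_us and sign_us are per-operation costs in microseconds.
    """
    verify_cost_us = updates_per_sec * avg_path_len * verify_us
    sign_cost_us = updates_per_sec * peers_out * sign_us
    return (verify_cost_us + sign_cost_us) / 1_000_000


# Placeholder inputs only: 10 updates/sec steady state, 1000/sec worst case.
steady = bgpsec_cpu_fraction(10, 4, 1000, 500, 20)
worst = bgpsec_cpu_fraction(1000, 4, 1000, 500, 20)
print(f"steady state: {steady:.2f} cores, worst case: {worst:.2f} cores")
```

Beacons would simply be added to updates_per_sec, since they look like ordinary updates in the update stream. The open question is whether the assumed cost model holds, not the arithmetic.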

Wes, is this sort of thing what you're asking for?

[WEG] Yes, to some extent, but you're right that the model is the hard part, not the math. In trying to unwind a similar problem, namely how to characterize steady-state and peak CPU load on an L3VPN PE router so that there are real rules of thumb for capacity management and scaling, we discovered a couple of things:
1) (some) Vendors are quite bad at providing reasonably accurate multi-dimensional scaling models based on testing or real-world results. They tend to give a lot of single-dimension scale limits (e.g. with this knob turned to 11, you can get this value), but are very conservative and mumbly when it comes to what the actual real-life limits are, YMMV, etc. As a result, sometimes you end up finding out about the scaling cliff as you're falling over it, or you pay for hardware that you can never fully use because you stick to very conservative limitations.
2) a corollary: behavior at scale becomes increasingly non-deterministic the more variables you're working with simultaneously. Even worse, it's difficult to account in a model for things that work well enough at moderate scale, but are not efficient enough for high scale, or suffer some sort of secondary impact due to dependencies, etc.
3) some routers are very bad at providing useful data about critical scaling vectors (updates per sec, changes in multicast state, etc). Coupled with the fact that each router's numbers can be wildly different, it's difficult to characterize a "common" router, let alone a common network.
4) there are widely varying opinions among vendors and operators as to what is an acceptable level of performance at scale, e.g. time to convergence of last route, steady-state CPU utilization (how much headroom is enough), and stability during system or network events.

I think that what is coming up here are concerns in a couple of different categories:
1) Short-term hardware scale - is BGPSec supportable with what is realistically available today? For how long? Is that long enough?
2) Long-term hardware scale (5+ years) - What's the next breakthrough? How long does that buy us? Is that long enough? What does it do to our time remaining before we have to redesign the routing system to make it keep scaling? This is where we should be considering RFC 4984 and either updating or affirming the guidance there.
3) Cost for both - what is an acceptable assumption of the cost premium for BGPSec, in both capital and personnel?
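The "for how long?" questions above can be framed as a compound-growth crossover. This is a minimal sketch, assuming control-plane load and available hardware capacity each grow at a fixed annual rate. The function name and all inputs are hypothetical placeholders, and real growth is unlikely to be this smooth.

```python
import math


def years_of_headroom(load_now, capacity_now, load_growth, capacity_growth):
    """Years until load catches up with capacity, assuming both grow at
    fixed annual rates (0.5 means 50% per year).

    Solves load_now * (1 + load_growth)**t
         = capacity_now * (1 + capacity_growth)**t for t.
    Returns 0.0 if load already meets or exceeds capacity, and math.inf
    if load never catches up under these assumptions.
    """
    if load_now >= capacity_now:
        return 0.0
    if load_growth <= capacity_growth:
        return math.inf
    ratio = (1 + load_growth) / (1 + capacity_growth)
    return math.log(capacity_now / load_now) / math.log(ratio)


# Placeholder inputs: load at 25% of capacity today, load growing 50%/year,
# hardware capacity growing 20%/year.
print(f"{years_of_headroom(0.25, 1.0, 0.5, 0.2):.1f} years of headroom")
```

Whatever growth rates get plugged in are themselves the documented assumptions; the result is only as good as they are.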

On the hardware side, we're in a discussion that sounds a lot like predicting peak oil - when do we run out of scale growth on Moore's law with the current overall Internet architecture, and will BGPSec be just "one more gas-guzzler on the road" or the straw that broke the camel's back?

I don't know that we're going to get a definitive answer from modeling, and I'm not trying to bring on analysis paralysis either. Randy's guess (and mine, and everyone else's) may be BS, but even making a gut check based on what info we have available, and documenting the assumptions we're basing our decision on, would be a good thing.

Wes