Re: [sidr] Scaling properties of caching in a globally deployed RPKI / BGPSEC system

Russ White <russw@riw.us> Wed, 28 November 2012 12:40 UTC


> It is also important to note that a running RP fetches only *changed* RPKI objects 
> during each polling instance in normal (steady-state) operation 
> (i.e., whatever the delta happens to be since the previous polling instance). 
> The RP would download *all* RPKI objects rather rarely, e.g., when there is a restart 
> (i.e., failure recovery). So the common fetch delay seen at RPs would be 
> more like seconds or minutes as shown in the lower left corner of plot on slide 2 
> (or, top of the table in slide 1), corresponding to fetches of 100 to 10,000 changed rpki objects.
> 
> B. Estimate of RPKI repositories:

> It is known that 84% of all ASes are stub, and 16% are non-stub. So about 35,280 are stub ASes 
> and only 6,720 are non-stub. The stub ASes are likely to only run simplex bgpsec and 
> they would likely contract out publication point service to their upstream ISP or a third party 
> (Tim also observed this). Even many non-stub ASes can be expected to do the same. 
> So in the end, we think the total number of repositories (what you call SIAs) 
> will be O(100) or O(1000), much less than the 42,000 you have assumed.

Several points:

1. It is never a good idea to "just right" engineer a system to the
"best available numbers from today." Yes, I know the old saw about the
glass being neither half empty nor half full, but overengineered -- but I
also know that virtually every system that counts on a specific traffic
pattern, or a specific rate of usage, eventually finds those underlying
assumptions completely wrong.

2. If I ran a large network (like a big financial, or major traffic
sink), I would never hand my certificate advertisement over to my SP. If
I've taken the time and trouble to get my own addresses in order to
maintain some level of independence from my provider, I wouldn't undo my
independence by using that same provider as the point of advertisement
for my routes in a system that can take 4+ hours to synchronize (only in
SIDR do we talk about minutes and hours as if they are "short"
convergence times).

3. So, I would assume a much higher number than the O(1000) you're
assuming here, and bring the number closer to the 40,000 contained in
the original process.
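
To see how strongly the repository count depends on the outsourcing
assumption, here is a minimal sketch. It reproduces the stub/non-stub
split from the quoted text (84% / 16% of 42,000 ASes) and then varies the
fraction of ASes that run their own publication point. The function names
and the self-hosting fractions are hypothetical inputs chosen for
illustration, not measurements.

```python
TOTAL_ASES = 42_000   # total AS count used in the quoted estimate
STUB_SHARE = 0.84     # quoted: 84% of all ASes are stub


def split_ases(total=TOTAL_ASES, stub_share=STUB_SHARE):
    """Return (stub, non_stub) AS counts for the given split."""
    stub = round(total * stub_share)
    return stub, total - stub


def repositories(stub_self_hosted, non_stub_self_hosted,
                 total=TOTAL_ASES, stub_share=STUB_SHARE):
    """Estimate the number of publication points.

    Both fraction arguments are assumptions (0.0 to 1.0): the share of
    ASes in each class that run their own repository instead of
    contracting it out.
    """
    stub, non_stub = split_ases(total, stub_share)
    return round(stub * stub_self_hosted + non_stub * non_stub_self_hosted)


if __name__ == "__main__":
    print("stub, non-stub:", split_ases())
    # Hypothetical scenarios, from heavy outsourcing to none at all.
    for stub_frac, non_stub_frac in [(0.0, 0.1), (0.1, 0.5), (0.5, 1.0), (1.0, 1.0)]:
        print(stub_frac, non_stub_frac, repositories(stub_frac, non_stub_frac))
```

Only the heaviest outsourcing scenario stays near O(1000); once a
meaningful share of stub ASes self-host, the count moves into the tens of
thousands.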

> C. Number of router certs:
> 
> Observe that per-router certs are not required; certs can be per AS. It remains to be seen 
> what granularity (ranging from per AS to per router) will be used. But, for worse case, 
> if we assume per-router certs in all ASes, then a conservative estimate can be obtained as follows. 
> The stub ASes will have a low average # eBGPSEC routers (say, 2 per AS), 
> and the non-stub ASes will have a relatively larger value for the same (say, 20 per AS). 
> Then the total number of eBGPSEC routers can be estimated at 204,960 (= 20*6720 + 2*35280).  
> You had assumed 420,000 (10 x 42,000). Also, there should be 4 certs per *non-stub* eBGPSEC router; 
> one pair (current and next) for origination prefixes and another pair (current and next) for transit ASes 
> (please see draft-rogaglia-sidr-bgpsec-rollover and draft-sriram-replay-protection-design-discussion). 
> And there should be only 2 certs per *stub* eBGPSEC router since it does not have transit prefixes. 
> That gives us an estimate of 678,720 (=4*20*6720 + 2*2*35280) for the total # router certs. 
> We think this number is conservative (high) but reasonable for now.

I don't think this is right, either -- you're assuming a "stub AS" will
only have two routers, which is completely off base. The smallest stub
AS will have two routers; larger ones will have more as they scale
upwards. Several large enterprise networks I've worked on have at least
20 edge routers across a single AS, connecting into multiple providers.

The assumed size of a transit AS is also questionable -- I can't imagine
an average of only 20 edge routers for a transit provider. Are there
really so many 2 and 3 edge router transits that they would offset AT&T,
Level 3, and many others we know must be on the order of hundreds of edges?

204k eBGP speakers is a gross underestimate of the size of the Internet
at large.
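
The arithmetic in the quoted estimate is easy to parameterize, which
makes the sensitivity to the per-AS router averages visible. The sketch
below uses the AS counts and certs-per-router figures from the quoted
text; the function names are hypothetical, and the alternative averages
in the second scenario are illustrative assumptions only, not data.

```python
STUB_ASES = 35_280      # quoted: 84% of 42,000 ASes
NON_STUB_ASES = 6_720   # quoted: 16% of 42,000 ASes


def ebgp_routers(stub_per_as, non_stub_per_as):
    """Total eBGPSEC routers for the given per-AS averages."""
    return stub_per_as * STUB_ASES + non_stub_per_as * NON_STUB_ASES


def router_certs(stub_per_as, non_stub_per_as,
                 certs_per_stub_router=2, certs_per_non_stub_router=4):
    """Total router certs.

    Defaults follow the quoted text: 2 certs per stub router
    (current and next) and 4 per non-stub router (two such pairs).
    """
    return (certs_per_stub_router * stub_per_as * STUB_ASES
            + certs_per_non_stub_router * non_stub_per_as * NON_STUB_ASES)


if __name__ == "__main__":
    # Quoted assumptions: 2 routers per stub AS, 20 per non-stub AS.
    print("quoted:", ebgp_routers(2, 20), router_certs(2, 20))
    # Hypothetical larger averages, for illustration only.
    print("larger:", ebgp_routers(5, 100), router_certs(5, 100))
```

With the quoted averages this yields 204,960 routers and 678,720 certs,
matching the figures above. Both totals scale linearly with the per-AS
averages, so any error in those averages carries straight through to the
result.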

Russ

-- 
<><
riwhite@verisign.com
russw@riw.us