Re: [sidr] WGLC: draft-ietf-sidr-origin-ops

Shane Amante <shane@castlepoint.net> Mon, 14 November 2011 10:45 UTC

Return-Path: <shane@castlepoint.net>
X-Original-To: sidr@ietfa.amsl.com
Delivered-To: sidr@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 8899F11E8163 for <sidr@ietfa.amsl.com>; Mon, 14 Nov 2011 02:45:17 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -2.299
X-Spam-Level:
X-Spam-Status: No, score=-2.299 tagged_above=-999 required=5 tests=[AWL=-0.300, BAYES_00=-2.599, J_CHICKENPOX_22=0.6]
Received: from mail.ietf.org ([12.22.58.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id VQ3xZAYwL9mc for <sidr@ietfa.amsl.com>; Mon, 14 Nov 2011 02:45:16 -0800 (PST)
Received: from dog.tcb.net (dog.tcb.net [64.78.150.133]) by ietfa.amsl.com (Postfix) with ESMTP id 6112611E8156 for <sidr@ietf.org>; Mon, 14 Nov 2011 02:45:16 -0800 (PST)
Received: by dog.tcb.net (Postfix, from userid 0) id 18F78268063; Mon, 14 Nov 2011 03:45:16 -0700 (MST)
Received: from host2.tcb.net (64.78.235.218 [64.78.235.218]) (authenticated-user smtp) (TLSv1/SSLv3 AES128-SHA 128/128) by dog.tcb.net with SMTP; Mon, 14 Nov 2011 03:45:15 -0700 (MST) (envelope-from shane@castlepoint.net)
X-Avenger: version=0.7.8; receiver=dog.tcb.net; client-ip=64.78.235.218; client-port=50305; data-bytes=0
Mime-Version: 1.0 (Apple Message framework v1251.1)
Content-Type: text/plain; charset=us-ascii
From: Shane Amante <shane@castlepoint.net>
In-Reply-To: <m21utbfbhb.wl%randy@psg.com>
Date: Mon, 14 Nov 2011 18:45:09 +0800
Content-Transfer-Encoding: quoted-printable
Message-Id: <48A7C4A7-7FFB-44CB-ABCA-76E148AE0574@castlepoint.net>
References: <CAL9jLaaOm_=W85r3P990A6DtROTcQwSJ-KBRzAi9ugw1Bo1_cQ@mail.gmail.com> <E4B4DE52-BBB3-4FA0-A75A-B51824BA83E7@lacnic.net> <m2hb3a7uqp.wl%randy@psg.com> <m2fwiu7uji.wl%randy@psg.com> <CAL9jLabcaLnBbZXbNf7Lbv+ppm-h9yO+wBHunG4s1=emOyM6=w@mail.gmail.com> <805B0799-7026-4532-A53C-4CFE3E863A33@castlepoint.net> <m21utbfbhb.wl%randy@psg.com>
To: Randy Bush <randy@psg.com>
X-Mailer: Apple Mail (2.1251.1)
Cc: sidr wg list <sidr@ietf.org>
Subject: Re: [sidr] WGLC: draft-ietf-sidr-origin-ops
X-BeenThere: sidr@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: Secure Interdomain Routing <sidr.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/sidr>, <mailto:sidr-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/sidr>
List-Post: <mailto:sidr@ietf.org>
List-Help: <mailto:sidr-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/sidr>, <mailto:sidr-request@ietf.org?subject=subscribe>
X-List-Received-Date: Mon, 14 Nov 2011 10:45:17 -0000

Hi Randy,

Thanks for the response.  I think we're getting closer.  See below.

On Nov 14, 2011, at 2:45 PM, Randy Bush wrote:
>> 1)  From Section 3:
>> ---snip---
>>   A local valid cache containing all RPKI data may be gathered from the
>>   global distributed database using the rsync protocol, [RFC5781], and
>>   a validation tool such as rcynic [rcynic].
>> ---snip---
>> 
>> Would it be possible to mention and/or point to how the above process
>> is supposed to be bootstrapped?  IOW, is it expected that,
>> eventually?, the RIRs are going to publish to their end-users and
>> maintain URIs of RPKI publication points?  Since this is an Ops
>> guidelines document, some guidance and/or pointers are likely to save
>> [lots of] questions down the road.  I'm not expecting this to be a
>> tutorial document, but some idea on the theory of how a new SP
>> bootstraps their cache(s) would be helpful.
> 
> that is software dependent.  relying party software and how it decides
> to deal with the global rpki is very software dependent.  e.g. tim's
> varies from rob's, and we have some really wild ideas to deal with the
> issues of reliable distributed publication.
> 
> it has been suggested that you may be asking how the inter-publication
> stuff is to be bootstrapped at the global rpki level.  this is the sia
> pointer in the cert.  when i tell arin i want my space and will publish
> at uri://randy.foo, they put the cert in their pub point with an sia
> pointer to uri://randy.foo

Yes, I'm asking about the latter.  More specifically, what I've been trying to ask is how one configures, in one's _local_ RPKI cache (the one that syncs to the outside world), /where/ the RIRs' publication points are on Day 1.  Do I contact one RIR (which maintains a list of the other RIRs' publication points) -or- each RIR individually to ask what its publication point is?  (If you can help provide an answer as to what is expected of the operator, I can then potentially help provide text).
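For what it's worth, the way existing relying-party tools seem to handle Day 1 (as I understand it, and this is software dependent) is one Trust Anchor Locator (TAL) per trust anchor: a tiny file pairing the rsync URI of the trust anchor certificate with the base64 of its public key, with everything below that discovered via the SIA pointers Randy describes.  A rough sketch of parsing such a seed file (the URI and key below are made up):

```python
# Sketch: parsing a TAL-style seed file.  The URI and key material are
# invented for illustration; a real TAL pairs the rsync URI(s) of the trust
# anchor certificate with the base64 subjectPublicKeyInfo used to verify it.

import base64

def parse_tal(text):
    """Split a TAL blob into (uris, der_public_key_bytes)."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    uris = [ln for ln in lines if ln.startswith("rsync://")]
    b64 = "".join(ln for ln in lines if not ln.startswith("rsync://"))
    return uris, base64.b64decode(b64)

EXAMPLE_TAL = """
rsync://rpki.example-rir.net/repository/root.cer

MEgCQQCo9+BpMRYQ/OUC
"""

uris, key = parse_tal(EXAMPLE_TAL)
# The relying party would then rsync uris[0], verify that certificate
# against `key`, and walk its SIA pointers to find further publication
# points -- no per-RIR phone call required after the TALs are installed.
```

If that matches the intended operational model, a one-sentence pointer to it in the draft would answer my question.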


>> 2)  Given that, to my knowledge, the RPKI is [very] loosely
>>    synchronized in a "pull-only" fashion, shouldn't there be some
>>    text added below to that effect that: 
>>    a)  It may not be best to go more than, say, 2 levels of RPKI
>>    caches deep inside a single organization/ASN to avoid RPKI caches
>>    from being out of sync with each other?
> 
> considering the timings we are seeing in operation, like a few seconds
> to fetch from one cache to another, as compared to minutes to load from
> the global rpki (see rob's preso at iepg) and my comment above, this may
> be 180 degrees out.  inter-cache fetching may be far better.
>
> but code will vary.

I'm going to set this comment aside for the moment, since I think the issue below wrt the desire (or need?) to deploy RPKI caches across the whole network may be a better way to answer the above.


>>    b)  Operators should look at running more aggressive
>> synchronization intervals _internally_ within their organization/ASN,
>> from "children" (2nd-level) RPKI caches to the 'parent' (top-level)
>> RPKI cache in their organization/ASN, compared to more "relaxed"
>> synchronization intervals to RPKI caches external to their
>> organization (top-level RPKI caches in their ASN to RIR's)? 
> 
> kind of assumed in current text, but i can add something.

OK, thanks.
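To make (b) concrete, here is the sort of tiered schedule I have in mind, sketched in Python; the interval values and jitter are illustrative guesses on my part, not proposals for the draft:

```python
# Sketch of tiered synchronization intervals: aggressive toward the in-AS
# parent cache, relaxed toward the external/global RPKI.  All numbers are
# illustrative assumptions, not values from the draft.

import random

INTERVALS = {
    "intra-as": 10 * 60,      # child cache -> in-AS parent cache: 10 min
    "external": 4 * 60 * 60,  # top-level cache -> global RPKI: 4 h
}

def next_sync_delay(peer_kind, jitter=0.25, rng=random.random):
    """Return the base interval for this peer kind, +/- `jitter`, so that
    many caches don't all fetch at the same instant."""
    base = INTERVALS[peer_kind]
    return base * (1 + jitter * (2 * rng() - 1))

delay = next_sync_delay("intra-as")
assert 450 <= delay <= 750  # 10 min +/- 25%
```

The point being: the two knobs (internal vs. external interval) are independent, and the draft could simply say they should be set independently.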


>> ---snip---
>>   Validated caches may also be created and maintained from other
>>   validated caches.  Network operators SHOULD take maximum advantage of
>>   this feature to minimize load on the global distributed RPKI
>>   database.  Of course, the recipient SHOULD re-validate the data.
>> ---snip---
>> While I'm here, I don't think the text in Section 6, "Notes",
>>   addresses the above concerns, at all.  In fact, I find it extremely
>>   unhelpful to just dismiss this concern, out of hand, with the text:
>>   "There is no 'fix' for this, it is the nature of distributed data
>>   with distributed caches".  We know what the answer is here
> 
> for some values of "we."
> 
>>   you tune the synchronization intervals to strike the appropriate
>>   balance between [very] tight synchronization vs. increased load on
>>   the systems being synchronized.
> 
> could you reference an example, a paper, an algorithm, ...?  we seem to
> think differently about how easy this is.
> 
>>   I find it hard to believe a simple suggestion such as this is not
>>   proposed in the text, even including the phrase "the suggested
>>   values for such synchronization are outside the scope of this
>>   document, but will likely be subject to further studies to
>>   determine optimal values based on field experience".
> 
> ahhhh.  so we don't know.  i say that frequently!
> 
>    It is hoped that testing and deployment will produce advice on
>    relying party cache loading and timing.

OK.  How about the following tweak:
---snip---
It is hoped that testing and deployment will produce advice on
RPKI cache loading, intra-domain and inter-domain RPKI cache synchronization
timing, router-to-RPKI-cache synchronization timing, etc.
---snip---


>> 3)  Granted, the following text is only a "SHOULD", but the text
>> offers no reasoning as to why caches should be placed close to
>> routers, i.e.: are there latency concerns (for the RPKI <-> cache
>> protocol), or is it that a geographically distributed system is one
>> way to avoid a single-point-of-failure, or something else entirely?
>> As a start, just defining "close" would help, e.g.: same POP, same
>> (U.S.) state, same country, same timezone 
> 
> i seem to get in trouble every time i do so.  it's an ietf thing.
> 
>> but, then a statement as to any latency or resiliency requirement for
>> geographic deployment of RPKI caches would be useful.
> 
> so far, actual timings seem not to be very latency sensitive, probably
> because they have been done with freebsd on fat wires.
> 
> as i said on this thread on nanog, this is why we get the big bucks.  it
> is a multi-dimensional design issue.

This still hasn't answered the question.  Let me try again.

Money doesn't grow on trees.  Why is it 'advised' (SHOULD) that I deploy an RPKI cache in every single POP throughout my whole network?  Funny thing about the business folks who control the budget: they want an answer to this question other than "because".  Let me try to be more constructive.  Is this draft recommending an RPKI cache per POP:
1)  For [massive] redundancy/resiliency inside the AS;
2)  To keep CPU (or network) 'load' down on RPKI caches inside the AS -- even though, above, you've said that this hasn't been quantified yet?
3)  To keep latency down in order to ensure the fastest possible inter-cache sync time -or- router to RPKI cache fetching time?
4)  To avoid reliance on either IGP -or- BGP within the AS for a router to get access to its RPKI cache for data?
5)  Other?
6)  All of the above?

Ideally, you can narrow it down to just one answer from the above; if not, which of the above are the "most important" and driving this recommendation?

Furthermore, this is not just about the cost of the server hardware on which one would run RPKI caches inside the AS.  There are also much more substantial costs associated with [potentially violent] swings of traffic around the network as the "loosely synchronized" (out-of-band) RPKI caches push new RPKI data to routers -or- routers pull new RPKI data from a local RPKI cache that may differ from neighboring RPKI caches inside the same network.  In theory, one way to "solve" this problem is to deploy [many] fewer RPKI caches inside the AS, thus ensuring that a "large" collection of routers has a consistent view of the same RPKI data.


>>    Furthermore, given the [very] loosely synchronized nature of the
>> RPKI, should the text point out that the number of RPKI caches
>> (internal to the organization) be balanced against the potential need
>> of an organization to maintain a more tightly synchronized view,
>> across their entire network, of validated routing information?
> 
> i have no idea.  you seem to think inter-cache sync dominates timing far
> more than i do.  i suspect that publishing party timing dominates.
> 
> so send text.

See just below for what is my primary concern.


>> A concern might be that if routers in Continent A pull information
>> from their RPKI caches that tell them that ROA is not "Invalid", but
>> other routers in Continent B are still using 'older' information in
>> RPKI caches in Continent B that says the same ROA is either "Not
>> Found" or "Valid", then the result might be that BGP Path Selection
>> swings all traffic from Continent A to Continent B.  At a minimum,
>> this could lead to substantially increased latency or, at worst,
>> congestion, packet-loss, or an unintended DoS.
> 
> yep, that is a concern.
> 
> how deeply do we want to get into explaining all the issues surrounding
> distributed data?  at this point i am tempted to point to a basic cs
> text on the subject.

So, I think there are two issues here:
1)  Inter-RPKI cache synchronization _within_ an AS <-- something an individual operator does have a lot of control over; and,
2)  Inter-RPKI cache synchronization _between_ ASes (technically, among all the authoritative, global RPKI caches in the world) <-- something an individual operator has little, if any, control over.

FWIW, my comment above is in regard to #1.  More specifically, given the recommendation to deploy RPKI caches in every POP, globally, there is a [very] large risk (?) that those caches inside the network will be only "loosely synchronized".  Thus, different RPKI caches will be updating BGP policy on their associated routers at much (?) different time intervals.  The result is BGP path selection calculating a new best path and shifting traffic around the network until _all_ RPKI caches (intra-AS) have updated their associated routers, at which point BGP policy returns to an "optimal" state of load-sharing traffic across, say, all connections to a multi-homed customer or peer.
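A toy model of the #1 scenario, in case it helps make the concern concrete (validation here is deliberately simplified to exact prefix/origin matching, not full ROA semantics):

```python
# Toy illustration of intra-AS cache divergence: two caches in the same AS
# hold different ROA snapshots, so their routers disagree on the validity of
# the same announcement.  Validation is deliberately simplified to an exact
# prefix + origin match; it is not a full ROA implementation.

def validity(roas, prefix, origin_as):
    covering = [r for r in roas if r["prefix"] == prefix]
    if not covering:
        return "not-found"
    return "valid" if any(r["asn"] == origin_as for r in covering) else "invalid"

cache_pop_a = [{"prefix": "192.0.2.0/24", "asn": 64500}]  # has the new ROA
cache_pop_b = []                                          # hasn't synced yet

announcement = ("192.0.2.0/24", 64496)  # origin differs from the ROA's ASN

state_a = validity(cache_pop_a, *announcement)  # "invalid" -> deprefed/dropped
state_b = validity(cache_pop_b, *announcement)  # "not-found" -> still usable
# Routers behind the two caches now pick different best paths, and traffic
# shifts around the AS until cache B catches up.
```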

I also think there are substantial operational issues wrt #2, but I don't have a palatable suggestion to offer.


>>    In the short-term, the LOCAL_PREF Attribute may be used to carry
>> both the validity state of a prefix along with its Traffic
>> Engineering characteristic(s).  It is likely that the SP will have to
>> change their BGP policies such that they can encode these two,
>> separate characteristics in the same BGP attribute without negatively
>> impacting their existing use or leading to accidental privilege
>> escalation attacks.  
> 
> thanks for text
> 
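To illustrate the kind of policy change I had in mind there: one [hypothetical] approach is to pack the two characteristics into disjoint LOCAL_PREF bands, so TE can never promote a route across validity classes.  The band layout below is my invention, not draft text:

```python
# Sketch: packing validity state and TE preference into a single LOCAL_PREF
# value using disjoint bands, so a TE value can never lift a route across
# validity classes.  The band sizes are an illustrative assumption.

VALIDITY_BAND = {"valid": 300, "not-found": 200, "invalid": 100}
TE_RANGE = range(0, 100)  # existing TE policy must be squeezed into 0..99

def local_pref(validity_state, te_pref):
    """Combine validity band and TE preference into one LOCAL_PREF."""
    if te_pref not in TE_RANGE:
        raise ValueError("TE preference must fit below the validity bands")
    return VALIDITY_BAND[validity_state] + te_pref

# A "valid" route with the worst TE still beats a "not-found" route with
# the best TE, which is the privilege-separation property we'd want:
assert local_pref("valid", 0) > local_pref("not-found", 99)
```

The operational pain is exactly what the text says: existing LOCAL_PREF policies have to be renumbered into the TE band without breaking their current semantics.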
>> 5)  I have three comments on the below:
>>    a)  It's not clear, to me, what is meant by "internal metric"
>>    below.  Do you mean MED or IGP metric or something else?  I don't
>>    see IGP metric as being practical, so I'm assuming you mean
>>    additively altering MED (up|down) based on validity state.
>>    Regardless, I would recommend you state more precisely which BGP
>>    Path Attribute you're referring to below. 
>>    b)  Since MED is passed from one ASN to (only) a second, downstream ASN to influence ingress TE policy, is it "OK" from a security PoV that MED is a *trusted* means to convey ROA validity information from one ASN to a second?  Presumably, the answer should be "heck, no", right?  If that's the case, then wouldn't it be wise to state that:
>>        i)  MED's, encoded with any ROA validity information, should get reset on egress from an ASN to remove said validity information and only carry TE information, as appropriate; and,
>>        ii) MED's should not be trusted on ingress to convey any meaning with respect to validity information?
>>    c)  What is meant by the statement, "might choose to let AS-Path rule"?  Is your intent to state that an SP may choose to just use MED, which follows after LOCAL_PREF & AS_PATH in the BGP Path Selection Algorithm, as a means to determining validity of a particular prefix?  If so, then it would be much more clear if you just stated that, e.g.:
>> ====
>>    If LOCAL_PREF is not used to convey validity information, then MED
>>    is likely the next best candidate BGP Attribute that can be used
>>    to influence path selection based on the validity of a particular
>>    prefix.  As with LOCAL_PREF, care must be taken to avoid changing
>>    the MED attribute and creating privilege escalation attacks.
> 
>      Other providers may not want the RPKI validation result to be more
>      important than AS-path length -- these providers would need to map
>      RPKI validation result to some BGP attribute that is evaluated in
>      BGP's path selection process after AS-path is evaluated.  Routers
>      implementing RPKI-based origin validation MUST provide such
>      options to operators.

OK.
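Just to confirm I've read that text right: the alternative ranking it describes can be modeled as a comparator in which AS-path length is evaluated first and RPKI validity only breaks ties afterwards (a toy model of best-path selection, not router code):

```python
# Toy model of the ranking in the quoted text: AS-path length is compared
# first, and RPKI validity only breaks ties among equal-length paths.
# A lower key tuple wins, mirroring BGP's "fewer/better first" comparisons.

VALIDITY_RANK = {"valid": 0, "not-found": 1, "invalid": 2}

def path_key(route):
    """Sort key: shorter AS-path first, then better validity state."""
    return (len(route["as_path"]), VALIDITY_RANK[route["validity"]])

routes = [
    {"as_path": [64496], "validity": "not-found"},
    {"as_path": [64500, 64501], "validity": "valid"},
]
best = min(routes, key=path_key)
# The shorter not-found path wins over the longer valid one: validity only
# decides between paths whose AS-path lengths are equal.
assert best["as_path"] == [64496]
```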


>> 7) Is this document only intended (scoped?) to cover PE's that can
>> (or, eventually, will) speak the RPKI-RTR protocol for validation?
> 
> yes

OK.

-shane