Re: [sidr] WGLC: draft-ietf-sidr-origin-ops

Shane Amante <shane@castlepoint.net> Sun, 30 October 2011 05:38 UTC

I have some questions that pertain to this document, specifically around:
- whether it's intended or 'safe' to use BGP Attributes (MED, communities) to convey the validity of prefixes from one ASN to another ASN
- better guidance/recommendations around the number, placement and synchronization characteristics of RPKI caches within an SP.


1)  From Section 3:
---snip---
   A local valid cache containing all RPKI data may be gathered from the
   global distributed database using the rsync protocol, [RFC5781], and
   a validation tool such as rcynic [rcynic].
---snip---

Would it be possible to mention and/or point to how the above process is supposed to be bootstrapped?  IOW, is it expected that (eventually?) the RIRs are going to publish to their end-users, and maintain, URIs of RPKI publication points?  Since this is an Ops guidelines document, some guidance and/or pointers are likely to save [lots of] questions down the road.  I'm not expecting this to be a tutorial document, but some idea on the theory of how a new SP bootstraps their cache(s) would be helpful.

2)  Given that, to my knowledge, the RPKI is [very] loosely synchronized in a "pull-only" fashion, shouldn't some text be added below to the effect that:
    a)  It may not be best to go more than, say, 2 levels of RPKI caches deep inside a single organization/ASN, to keep RPKI caches from getting out of sync with each other?  IOW, there is likely a small set of 1st/top-level RPKI caches that speak externally to fetch RPKI cache information, (similar to 'hidden' authoritative DNS servers), then a second tier of RPKI caches that synchronize (only) from the top-level RPKI caches, (similar to external, anycast authoritative DNS servers).
    b)  Operators should look at running more aggressive synchronization intervals _internally_ within their organization/ASN, from "children" (2nd-level) RPKI caches to the 'parent' (top-level) RPKI cache in their organization/ASN, compared to more "relaxed" synchronization intervals to RPKI caches external to their organization (top-level RPKI caches in their ASN to RIRs)?
---snip---
   Validated caches may also be created and maintained from other
   validated caches.  Network operators SHOULD take maximum advantage of
   this feature to minimize load on the global distributed RPKI
   database.  Of course, the recipient SHOULD re-validate the data.
---snip---
While I'm here, I don't think the text in Section 6, "Notes", addresses the above concerns, at all.  In fact, I find it extremely unhelpful to just dismiss this concern, out of hand, with the text: "There is no 'fix' for this, it is the nature of distributed data with distributed caches".  We know what the answer is here: you tune the synchronization intervals to strike the appropriate balance between [very] tight synchronization vs. increased load on the systems being synchronized.  I find it hard to believe a simple suggestion such as this is not proposed in the text, even including the phrase "the suggested values for such synchronization are outside the scope of this document, but will likely be subject to further studies to determine optimal values based on field experience".
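
To make the trade-off in 2a/2b concrete, here is a rough back-of-the-envelope sketch.  It assumes each cache tier polls its parent on a fixed interval and that, in the worst case, a change waits a full interval at every hop.  The interval values are made up for illustration; they are not recommendations from the draft.

```python
def worst_case_staleness(intervals_s):
    """Upper bound, in seconds, on how old the data at the last tier can be.

    intervals_s lists the polling interval of each tier, ordered from the
    top-level cache (which pulls from external repositories) down to the
    cache the routers talk to.  Tiers poll independently, so in the worst
    case a change waits one full interval at every hop.
    """
    return sum(intervals_s)


def pulls_per_hour(interval_s):
    """Load that one child places on its parent, in fetches per hour."""
    return 3600 / interval_s


# Hypothetical values: relaxed external sync (hourly), aggressive internal
# sync (every 5 minutes).
two_tier = worst_case_staleness([3600, 300])
three_tier = worst_case_staleness([3600, 300, 300])

print(two_tier, three_tier, pulls_per_hour(300))
```

Each extra tier adds its own polling interval to the bound, which is the argument for keeping the hierarchy shallow and the internal intervals short.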

3)  Granted, the following text is only a "SHOULD", but the text offers no reasoning as to why caches should be placed close to routers, i.e.: are there latency concerns (for the RPKI <-> cache protocol), or is it that a geographically distributed system is one way to avoid a single-point-of-failure, or something else entirely?  As a start, just defining "close" would help, e.g.: same POP, same (U.S.) state, same country, same timezone … but, then a statement as to any latency or resiliency requirement for geographic deployment of RPKI caches would be useful.

    Furthermore, given the [very] loosely synchronized nature of the RPKI, should the text point out that the number of RPKI caches (internal to the organization) be balanced against the potential need of an organization to maintain a more tightly synchronized view, across their entire network, of validated routing information?  A concern might be that if routers in Continent A pull information from their RPKI caches that tells them that a ROA is now "Invalid", but other routers in Continent B are still using 'older' information in RPKI caches in Continent B that says the same ROA is either "Not Found" or "Valid", then the result might be that BGP Path Selection swings all traffic from Continent A to Continent B.  At a minimum, this could lead to substantially increased latency or, at worst, congestion, packet loss or an unintended DoS.
---snip---
   As RPKI-based origin validation relies on the availability of RPKI
   data, operators SHOULD locate caches close to routers that require
   these data and services.  A router can peer with one or more nearby
   caches.
---snip---
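
The split-view concern in 3) can be illustrated with a small sketch.  It assumes the usual covering-prefix / max-length / origin-AS matching for origin validation, with each cache reduced to a list of (prefix, max length, origin AS) tuples.  The prefixes, AS numbers and cache contents are invented for the example.

```python
import ipaddress


def validate(route_prefix, origin_as, vrps):
    """Return "Valid", "Invalid" or "NotFound" for one route.

    vrps is a list of (prefix, max_length, asn) tuples as a cache might
    serve them.  A route is "Valid" if some covering entry matches its
    origin AS and its length does not exceed max_length, "Invalid" if it
    is covered but nothing matches, and "NotFound" if nothing covers it.
    """
    route = ipaddress.ip_network(route_prefix)
    covered = False
    for vrp_prefix, max_length, asn in vrps:
        vrp = ipaddress.ip_network(vrp_prefix)
        if route.version != vrp.version or not route.subnet_of(vrp):
            continue
        covered = True
        if asn == origin_as and route.prefixlen <= max_length:
            return "Valid"
    return "Invalid" if covered else "NotFound"


# Hypothetical snapshots: the cache in Continent A has already fetched a new
# ROA for 192.0.2.0/24; the cache in Continent B has not.
fresh_cache = [("192.0.2.0/24", 24, 64500)]
stale_cache = []

# The same announcement (origin AS 64511) gets two different answers.
print(validate("192.0.2.0/24", 64511, fresh_cache))
print(validate("192.0.2.0/24", 64511, stale_cache))
```

Until the second cache catches up, routers fed by the two caches apply different policy to the same route, which is the window the synchronization guidance would need to bound.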

In Section 5, "Routing Policy":
4)  From a practical standpoint, LOCAL_PREF is already widely used to influence Traffic Engineering, both by an SP as well as by the SP's customers (through the use of "TE communities" sent by a downstream customer to the SP) -- the latter so that the customer can influence traffic from the SP toward themselves, (e.g.: a customer prefers that a circuit be 'backup' for another circuit only if their other SP is not announcing that same prefix).  In reality, I think that there will have to be significant re-work of an SP's existing BGP policies to encode dual meanings inside a single LOCAL_PREF attribute, (route validity + TE preference).  It may be good to acknowledge this by adding something like the following to the text:
====
    In the short-term, the LOCAL_PREF Attribute may be used to carry both the validity state of a prefix along with its Traffic Engineering characteristic(s).  It is likely that the SP will have to change their BGP policies such that they can encode these two, separate characteristics in the same BGP attribute without negatively impacting their existing use or leading to accidental privilege escalation attacks.
====
---snip---
Some may choose to use the large Local-Preference hammer.
---snip---
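
One possible way to picture the dual encoding in 4) is sketched below.  This is only an illustration under assumed values: the TE classes, band sizes and validity offsets are hypothetical, and the choice to let the TE band dominate (with validity acting as a tie-breaker inside a band) is one policy among several an SP could pick.

```python
# Hypothetical TE bands, selected by the SP or by a customer's TE community.
TE_BANDS = {"backup": 100, "default": 200, "preferred": 300}

# Hypothetical validity offsets, kept smaller than the gap between bands so
# that a validity change can never move a route into another TE band.
VALIDITY_OFFSET = {"Invalid": 0, "NotFound": 10, "Valid": 20}


def local_pref(te_class, validity):
    """Combine a TE class and a validity state into one LOCAL_PREF value."""
    return TE_BANDS[te_class] + VALIDITY_OFFSET[validity]


print(local_pref("default", "Valid"))
print(local_pref("backup", "Valid"), local_pref("default", "Invalid"))
```

With these numbers a customer-signalled 'backup' route stays below every 'default' route regardless of validity.  An SP that wants validity to dominate instead would swap the roles of band and offset, which is exactly the sort of policy re-work the proposed text warns about.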

5)  I have three comments on the below:
    a)  It's not clear, to me, what is meant by "internal metric" below.  Do you mean MED or IGP metric or something else?  I don't see IGP metric as being practical, so I'm assuming you mean additively altering MED (up|down) based on validity state.  Regardless, I would recommend you state more precisely which BGP Path Attribute you're referring to below.
    b)  Since MED is passed from one ASN to (only) a second, downstream ASN to influence ingress TE policy, is it "OK" from a security PoV that MED is a *trusted* means to convey ROA validity information from one ASN to a second?  Presumably, the answer should be "heck, no", right?  If that's the case, then wouldn't it be wise to state that:
        i)  MEDs, encoded with any ROA validity information, should get reset on egress from an ASN to remove said validity information and only carry TE information, as appropriate; and,
        ii) MEDs should not be trusted on ingress to convey any meaning with respect to validity information?
    c)  What is meant by the statement, "might choose to let AS-Path rule"?  Is your intent to state that an SP may choose to just use MED, which follows after LOCAL_PREF & AS_PATH in the BGP Path Selection Algorithm, as a means of determining validity of a particular prefix?  If so, then it would be much clearer if you just stated that, e.g.:
====
    If LOCAL_PREF is not used to convey validity information, then MED is likely the next best candidate BGP Attribute that can be used to influence path selection based on the validity of a particular prefix.  As with LOCAL_PREF, care must be taken to avoid changing the MED attribute and creating privilege escalation attacks.
====
---snip---
   […]  Others
   might choose to let AS-Path rule and set their internal metric, which
   comes after AS-Path in the BGP decision process.
---snip---
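
The handling suggested in 5b) could look roughly like the sketch below.  It assumes the TE-only MED and the validity state are stored separately per route, so the validity penalty exists only in the value used internally.  The penalty values and the Route structure are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical additive penalties; a lower MED is preferred.
MED_PENALTY = {"Valid": 0, "NotFound": 50, "Invalid": 100}


@dataclass
class Route:
    prefix: str
    te_med: int    # TE-only MED, as set by local policy
    validity: str  # locally computed: "Valid", "NotFound" or "Invalid"


def internal_med(route):
    """MED used inside the ASN: TE value plus the local validity penalty."""
    return route.te_med + MED_PENALTY[route.validity]


def egress_med(route):
    """MED advertised to a neighbor ASN: the TE value only."""
    return route.te_med


r = Route("192.0.2.0/24", te_med=10, validity="NotFound")
print(internal_med(r), egress_med(r))
```

On ingress the same rule applies in reverse: the validity field is always filled in from the local cache, never inferred from the MED a neighbor sent.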



Other Comments:
6)  Related to #5, above, BGP Communities are another transitive attribute that /might/ be used to convey validity information of a prefix, or lack thereof, from one ASN to a second ASN (or, more).  However, as we know, there is no means to authenticate BGP Attributes, from one ASN to the next.  So, from a security hygiene perspective, would it be best to say something along the lines of:
====
The validity state of routes MUST NOT be transmitted beyond the borders of an SP's ASN, since: a) there is no means to authenticate BGP Attributes; and, b) this would place hidden dependencies on the ability of the upstream ASN to validate routes and pass them along to others, which would increase the fragility of the overall system.  Finally, ASNs MUST NOT rely on BGP Attributes received on an eBGP session to convey any meaning with respect to validity of a particular prefix, for the reasons just stated.
====
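
As a sketch of what the proposed rule in 6) would mean in practice: an SP that tags validity with communities internally would filter those communities on every eBGP session, in both directions.  The community values below are hypothetical.

```python
# Hypothetical communities an SP might use internally to tag validity state.
VALIDITY_COMMUNITIES = frozenset({"64500:1001", "64500:1002", "64500:1003"})


def strip_validity(communities):
    """Drop validity-tagging communities and keep everything else.

    Meant to run on eBGP ingress (so a neighbor cannot assert validity) and
    on eBGP egress (so local validity state does not leak to neighbors).
    """
    return {c for c in communities if c not in VALIDITY_COMMUNITIES}


received = {"64500:1001", "64500:80"}
print(strip_validity(received))
```

After the ingress filter, validity is re-derived from the local cache only, so nothing a neighbor attaches can influence it.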

7)  Is this document only intended (scoped?) to cover PEs that can (or, eventually, will) speak the RPKI-RTR protocol for validation?  Or is it also intended to cover PEs that do not speak RPKI-RTR?  Those PEs would obviously need some other mechanism, (e.g.: periodically pushing an updated config to them based on RPKI validated data), so that the policy they apply to valid routes is consistent with other, more modern routers that do run the RPKI-RTR protocol.  If so, wouldn't it be good to suggest this, even if only as a means to increase the deployment speed?  Or, to at least let readers know that this needs to be considered during their deployment so that they can factor in the load on their [existing] systems that might do this work as well as the effects of the 'loosely synchronized' aspects of the RPKI?

-shane


On Oct 28, 2011, at 7:59 AM, Christopher Morrow wrote:
> Two folks seem to have given this a read-through, is that all the
> interest that exists? is documenting how originators of routes ought
> to think/use/abuse RPKI not something we should do here?
> 
> please chime in if you've given this a read and are onboard with it
> moving forward.
> 
> -chris
> 
> On Sat, Oct 15, 2011 at 12:22 AM, Randy Bush <randy@psg.com> wrote:
>>>> What's the rationale of this change from version 10 to 11?
>>> after much discussion with ops and security folk, it is the purpose of
>>> the whole exercise.  you wanna stop 7007?
>> 
>> fwiw, it has swung back and forth a few times
>> 
>> randy
>> _______________________________________________
>> sidr mailing list
>> sidr@ietf.org
>> https://www.ietf.org/mailman/listinfo/sidr
>> 