Re: [rrg] IRON-RANGER scalability and support for packets from non-upgraded networks
Robin Whittle <rw@firstpr.com.au> Thu, 18 March 2010 02:50 UTC
Message-ID: <4BA19503.1040008@firstpr.com.au>
Date: Thu, 18 Mar 2010 13:50:43 +1100
From: Robin Whittle <rw@firstpr.com.au>
Short version: Continuing discussions: Is OSPF suitable for the I-R overlay? BGP is highly decentralised, but I am not sure this is the case with OSPF. Fred suggests Virtual Router Redundancy Protocol (VRRP) RFC 5798 for the task of "DEL" routers registering their EID prefixes with a handful of VP routers. My initial impression is that this won't work because it requires multicast - which I think would be impossible or at least unscalable on a global overlay network as I-R requires. I give some names to the various roles I think IRON routers might have, and consider what combinations of roles might be valid. Hi Fred, Continuing our interesting conversation on the design of IRON-RANGER, you wrote: >> I assumed that these "DITR-like" routers were not necessarily VP routers. > > Correct; these routers (IDMs) may also be VP routers on > the IRON but need not be. So, we have three classes of > IRON routers: 1) VP routers, 2) IDMs, and 3) both. I understand the population of IRON routers can be classified according to their roles. I am giving some new names to these roles. I think it is important to invent names for concepts in a new design, otherwise we have to use phrases which are longer, and might be written in different ways when they really refer to the same thing. DEL Delivers packets to one or more nearby end-user networks, and so needs to register the one or more EID prefixes this involves with VP routers (VPRs). (Typically 2 VPRs, but I tend to think of 2, 3 or 4 or so for robustness.) LFR Local Forwarding Router. Advertises a few prefixes covering all I-R "edge" space in the local routing system of the network it is located in. For the purposes of discussion, I will assume this is an ISP network, but it could be the network of a large end-user network which has its own AS and participated in the interdomain routing system. This router is not advertising anything outside the ISP network - so it is not "advertising I-R edge space in the DFZ". 
Any packets addressed to I-R edge space (that is to any EID prefix used by an I-R-using end-user network) will go to this router rather than to the ISP's BRs. So this router needs to tunnel it to one of the VPRs for the VP this EID address is within. That VPR doesn't have to be the "closest" of the two or more VPRs, but if you use BGP in the overlay network then it typically will be the "closest" in BGP terms - which would be desirable, to reduce path lengths and delays for the one or more initial packets which go via the VPR. That VPR will tunnel the packet to the IRON router playing the DEL role. The VPR will also send mapping to this LFR-role router which will subsequently tunnel further traffic packets whose destination address matches the EID prefix in the mapping to one of the IRON routers which are playing the DEL role for this prefix. So this LFR role involves the IRON router knowing the address of at least one VPR for every VP. It finds this out, in the current design, from the BGP best path it gets from the I-R overlay BGP system. However, it doesn't tunnel the packet via that overlay system - it tunnels it via the Internet. (This means the VPR must use its Internet IP address for its VP advertisements on the overlay network.) IDM IRON Default Mapper. As for LFR, except this router is a BR of the ISP and advertises all I-R edge space to neighbouring ISPs - that is, to the DFZ. As you wrote below, IDMs also advertise "default" on the I-R overlay ("on the IRON") - but I don't understand the purpose of this. VPR Virtual Prefix Routers. These advertise one or more Virtual Prefixes (VPs) on the I-R overlay. Each VP covers multiple (thousands to millions in principle) individual EID prefixes, each of which is used by an end-user network via one or more IRON routers playing the DEL role. See LFR above for a description of the responsibilities of VPR routers regarding traffic packets. 
The VPR role also involves accepting registrations from DEL-role
routers for all the EIDs covered by the VP.  The mechanisms for
doing so are currently undefined, since we have been discussing
the inability of the BGP overlay system to tell each DEL router
the addresses of all VPRs for the VP which matches the DEL
router's EID prefix.  The new arrangements may involve (at least
I suggested this as a possibility) the VPRs for a given VP
working together to share registration information.  I figure
they will be run by the one organization, or at least this VPR
role for a given VP will be controlled by the organization which
runs the VP - so they will presumably be coordinated in some way.
This would probably mean they don't need to automatically
discover the other VPR-role routers which are handling a given
VP.

As best I understand it, any IRON router can perform the DEL role -
it's just a matter of somehow configuring it to initiate the
registration process for an EID prefix for an end-user network it can
deliver packets to.

As far as I know, IRON routers are typically not DFZ routers and are
(to a rough approximation) not BRs - so they typically perform the
LFR role as well.  (A BR could still perform the LFR role, by not
advertising the I-R edge space to other ASes - only within its own
network.)  It's just that a router which is not a BR can't advertise
the edge prefixes in the DFZ, and so can't perform the IDM role.

However, a subset of IRON routers are BRs and are also configured to
perform the IDM role.  While a router could perform purely this IDM
role and not advertise the edge prefixes locally, I will assume this
would not be typical.

A VPR need not be a BR.  It need not perform any other roles, but I
guess it typically would perform some, such as DEL.

Assuming that all IRON routers will, or could, perform the DEL role,
here are the various combinations:

       LFR?  IDM?  VPR?   BR?

   0    -     -     -     Maybe   Just playing the DEL role.

   1    -     -    VPR    Maybe   Also playing the VPR role.

   2    -    IDM    -     Yes     Just playing DEL and IDM roles -
                                  but for some non-obvious reason
                                  not advertising I-R edge space
                                  to local routing system.

   3    -    IDM   VPR    Yes     As for 2, but also VPR role.

   4   LFR    -     -     Maybe   DEL and accepting packets from
                                  the local network too.

   5   LFR    -    VPR    Maybe   As for 1, but also accepting
                                  packets from the local network.

   6   LFR   IDM    -     Yes     As for 2, but also accepting
                                  packets from the local network.

   7   LFR   IDM   VPR    Yes     As for 3, but also accepting
                                  packets from the local network.

>> Here is my understanding on what you just wrote:
>>
>>> The more I think about it, the more these specialized
>>> VP routers
>>
>> I think you mean the "DITR-like" routers are VP routers.  Later you
>> refer to these as "IRON Default Mappers (IDMs)".  I had assumed they
>> either were not VP routers, or that they need not be VP routers.
>
> The latter - IDMs need not also be VP routers, but they
> could be.

OK.

>> However, this part:
>>
>>> On the IRON, they advertise "default"
>>
>> makes no sense to me.  I don't recall any IRON router advertising
>> "default" on the IRON overlay network.  I understand that a VP router
>> advertises its one or more VPs.
>
> Yes; this is new. By having the IDMs connected to the DFZ
> advertise "default" on the IRON, other IRON routers that do
> not connect to the DFZ can discover a nearby IDM that can
> reach the non-upgraded IPv6 Internet.

Assuming all IRON routers are IPv6 routers, why would they need to
find another IRON router via the overlay network which could deliver
packets to any IPv6 address?

I think the reasoning for this must come from your mixed IPv4 / IPv6
plans, which I have tried to avoid thinking about so far.  Can you
explain more about your vision for this?

>>>> They are going to be busy, depending on where they are located, the
>>>> traffic patterns, how many of them there are etc.  So they need to
>>>> be able to handle the cached mapping of some potentially large number
>>>> of I-R end-user network prefixes.
>>> >>> In the case of IPv6, I think whether the IRON Default >>> Mappers (IDMs) will be very busy depends on how large >>> the IPv6 DFZ becomes. In my understanding, the IPv6 DFZ >>> is not very big yet. So, if most IPv6 growth occurs in >>> the IRON and not in the IPv6 DFZ the packet forwarding >>> load on the IDMs might not be so great. >> >> This would only be true if you could convince most networks adopting >> IPv6 to adopt I-R at the same time. > > Well, now is the time to put forward the case for > handling new IPv6 growth in the IRON instead of in > the IPv6 DFZ. Otherwise, once growth in the IPv6 > DFZ takes off and we start to see significant PI > addressing and multihoming, we will eventually > end up in the same boat we are in with the IPv4 > DFZ today. OK. But I still prefer Ivip for IPv6 since it will be able to give end-user networks, or their appointees, real-time control of tunneling behavior. This will be advantageous for real-time responsive inbound TE and for quickly getting all traffic packets to the newly selected TTR (Translating Tunnel Router) in TTR Mobility - so the MN can quickly drop the tunnel it made to the previous TTR. >>> The term "bubbles" came from teredo (RFC4380). Maybe we can >>> think of a better term to use for IRON-RANGER? >> >> OK. I don't think "bubbles" is appropriate for the registration >> methods you have described so far, or that I have suggested. > > OK. How about Channel Queries (CQs)? I don't see any "channels" and it doesn't look like a "query". In my nomenclature, it is a DEL router registering an EID prefix (I think this is the term you use in I-R) with a VPR because this VPR is one of the typically two or more VPRs which handle this VP. What about "EID Registration Message" - ERM? >>>> I am definitely not going to try to think about mixed IPv4/v6 >>>> implementations of I-R. I can handle thinking about purely IPv4 and >>>> purely IPv6. 
>>> >>> I choose to think of mixed IPv4/IPv6 for at least three >>> reasons: >>> >>> 1) We already have global deployment of IPv4, and that won't >>> go away overnight when IPv6 begins to deploy. >> >> I agree. >> >>> 2) IPv4 is fully built-out, so new growth will come via IPv6. >> >> I don't agree with this at all. I think there's plenty of scope for >> more growth in the IPv4 Internet. Fig. 11 at: >> >> http://www.potaroo.net/tools/ipv4/ >> >> shows 130 /8s worth of space is currently advertised. Fig. 5 shows >> this in more detail. Of the /8s to to 223, a handful can't be used >> (127, 0 maybe). There are still a bunch of /8s which are >> unadvertised. As time progresses, this space will be too valuable to >> use internally, probably inefficiently - so I expect quite a lot of >> that will be made available and advertised too. > > OK, but how bad would it be if we just let IPv4 address > depletion run out under the current system, then jack up > to IPv6 in parallel to handle PI addressing and multihoming? I am interested in exploring the IRON-RANGER design - including for mixed IPv4 and IPv6, because I find this stuff generally interesting and a good way to learn about scalable routing. Maybe at the end of this process I might think that IRON-RANGER is practical and in some ways desirable compared to Ivip or LISP, which are the only two I consider potentially practical or desirable at present. (msg06219) A likely outcome is that this process will prompt me to think of improvements to Ivip - since previous improvements to Ivip came from thinking about other proposals, not from a conscious effort to improve Ivip. However, I think it is wildly unrealistic to assume that IPv4 will die or become anything but *the* Internet everyone relies upon for a very long time, perhaps forever. I am not saying this is a good thing. If you can articulate your vision for mixed IPv4 and IPv6 IRON-RANGER operation, I can go along with it. 
But I don't believe at all that IPv6 will take over from IPv4 for most end-users before 2020. As I mentioned, there's still a lot of unused advertised space - and (I assume) unused unadvertised - global unicast IPv4 address space. I can't envisage a situation where it will be better to sell ordinary (non-mobile) users purely an IPv6 service, without even behind-NAT IPv4 connectivity, than to sell them a service which is either a single global unicast IPv4 address or behind-NAT IPv4. Mobile users could be different, since many functions and services suitable for hand-held cellphone-like devices could be done via IPv6 - and since there would always be an option to tunnel through IPv6 to an IPv4 NAT box so people can run client-style IPv4 applications on their MN when they want to. >> Then there are ways of using space more efficiently, as Ivip, LISP >> and probably IRON-RANGER could do, by slicing and dicing it into much >> smaller chunks than is possible with the /24 limit on prefixes in the >> DFZ. > > OK. So to me, a successful implementation of IRON-RANGER would be as good as Ivip or LISP in enabling really high levels of address utilization in IPv4. This will considerably extend the ability of IPv4 to handle new users, including new end-user networks which need real global unicast space (not behind-NAT) because they are running servers. >> I think that most growth in Internet usage will occur in the IPv4 >> Internet for at least the rest of this decade. The only time it >> would make sense to use IPv6 instead of direct IPv4 or IPv4 behind >> NAT would be for some service where it wasn't important to be able to >> connect to IPv4. At present, you couldn't sell any such service. 
I >> guess that it may be possible to do this for large IP cell-phone >> deployments where there are enough IPv6 services available to do a >> reasonable subset of what people want in a hand-held device, and >> where tunneling to a server which provides behind-NAT IPv4 >> connectivity would also be possible. > > I agree that the IPv4 Internet is not only not going away > but also continuing to grow. But, I still think that users > will want to have both IPv4 (behind NAT if necessary) and > IPv6 as we move forward from here. At present, there's only one scenario in which I can imagine there being a real demand among non-mobile customers for IPv6. Let's say that one or more large mobile phone companies decides to make their new, or existing, 3G systems work with each MN having its own global unicast IPv6 address (or perhaps /64). This would enable direct host-to-host connectivity between any of these MNs. (Though carriers typically want to avoid this, to stop people running VoIP and instead to use their voice call services, for which they charge more than they can for basic IP connectivity). Now let's say there are hundreds of millions or billions of these MNs, each with its own global unicast IPv6 address. That address could be stable as long as the MN is in the one carrier network. If it roams to another network, it would probably get another address. However, the TTR Mobility system would fix this - and give each MN its own permanent /64, no matter how it connected to the Net, as long as it is via IPv6. (I do not currently plan any connections between Ivip or TTR Mobility for IPv4 and IPv6 - best to keep them as separate systems.) In this situation, people on non-mobile networks would have a genuine reason to get native IPv6 connectivity. Firstly, they might want to sell or give services to these MN users. 
Secondly, from home, they might want to run a web-cam, file sharing,
VPN or whatever which the MN could access directly, on a host-to-host
basis, without mucking around with IPv4.

So I can imagine this trend happening - but only once there are a
substantial number of ordinary users with native IPv6 connectivity.
I guess this is most likely to occur with cell-phones.

>>> 3) IPv6 addresses can embed IPv4 addresses such that there
>>>    is stateless address mapping between an EID nexthop and
>>>    an RLOC.
>>
>> Can you explain this with an example?  I can't clearly envisage what
>> you mean.
>
> I mean, if the IPv6 EID FIB includes entries with a next-hop
> address such as: 'fe80::5efe:V4ADDR' (i.e., an IPv6 address
> with embedded IPv4 address), then V4ADDR can be statelessly
> extracted as the RLOC address of the ETR.

So the "mapping" which the LFR-role and IDM-role routers get from the
VP router is actually telling them to tunnel subsequent traffic
packets to an IPv4 address?

That would only work if every LFR-role and IDM-role router had IPv4
access - unless you were to establish special routers to act as
gateways for delivering to IPv4 addresses, which is not out of the
question.

Also, an IPv6 VPR would need to be able to do the same thing - tunnel
an IPv6 traffic packet to a DEL-role router which is actually on an
IPv4 address, but which is nonetheless delivering packets to an
end-user network which uses an IPv6 EID.

This could be done, I guess, but there are messy PMTUD problems to
solve.  I prefer not to think about such things, but for now can
imagine you might want to do this, and that you could devise a way of
doing it.

>>>> There are two reasons an IRON router M might need to know about which
>>>> other IRON routers A, B and C advertise a given VP:
>>>>
>>>> 1 -  When M has a traffic packet.  (M is either an ordinary IRON
>>>>      router and advertises the I-R "edge" space in its own network
>>>>      or it is a "DITR-like" router advertising this space in the
>>>>      DFZ.)
M needs to tunnel the packet to one of these VP routers. >>>> >>>> The VP router will tunnel it to the IRON router Z it chooses as >>>> the best one to deliver the packet to the destination network >>>> and will send a "mapping" packet to M which will cache this >>>> information and from then on tunnel packets matching the >>>> end-user network prefix in the "mapping" to Z (or some other >>>> IRON router like Z, if there were two or more in the "mapping"). >>>> >>>> In this case, M needs only the address of one of the A, B or C >>>> routers. Ideally it would have the address of the closest one - >>>> but it doesn't matter too much if it has the address of a more >>>> distant one. That would involve a somewhat longer trip to the >>>> VP router, and perhaps a longer or shorter trip from there to Z. >>>> (This would typically be shorter than the path taken through >>>> LISP-ALT's overlay network.) >>>> >>>> After M gets the "mapping", it tunnels traffic packets to Z - so >>>> the distance to the VP router no longer affects the path of >>>> traffic packets. >>>> >>>> In this case, BGP on the overlay would be perfectly good - since >>>> it provides the best path to one of A, B or C - typically that >>>> of the "closest" (in BGP terms). >>>> >>>> >>>> 2 - When M is one of potentially multiple IRON routers which >>>> delivers packets to a given end-user network - packets whose >>>> destination address matches a given end-user network prefix P. >>>> >>>> M needs to "blow bubbles" (highly technical term from this >>>> R&D phase of IRON-RANGER) to A, B and C. The most obvious >>>> way to do this is for M to be able to know, via the overlay >>>> network the addresses of all VP routers which advertise a given >>>> VP. There may be two or three or a few more of these. They >>>> could be anywhere in the world. 
>>>>
>>>>      BGP does not appear to be a suitable mechanism for this, since
>>>>      its "best path" basic functions would only provide M with
>>>>      the IP address of one of A, B and C.
>>>>
>>>>      You could do it with BGP, by having A, B and C all know about
>>>>      each other, and with all three sending everything they get to
>>>>      the others.  This is not too bad in scaling terms for two,
>>>>      three or four such VP routers.
>>>>
>>>>      Then, M sends its registration to one of them - whichever it
>>>>      gets the address of via the BGP of the overlay network - and
>>>>      A, B and C compare notes so they all get the registration.
>>>>
>>>>      I will call this the "VP router flooding system".
>>>
>>> This is a nice idea. If I get what you are suggesting, each
>>> IRON router that advertises the same VP (e.g., VP(x)) would
>>> need to engage in a routing protocol instance with one
>>> another to track all of the PI prefix registrations. The
>>> problem I have with it is that that would make for perhaps
>>> 10^5 or more of these little routing protocol instances as
>>> well as lots and lots of manually-configured peering
>>> arrangements between the IRON routers that advertise VP(x).
>>
>> Something like this - but I am not sure what you mean by "routing
>> protocol instance".  I understand that the two or three VP routers
>> for any one VP "P" do need to cooperate and share their various
>> registrations.  You could either create a fresh protocol to do this,
>> or push into service some existing protocol, including perhaps a
>> routing protocol.
>
> We haven't brought the Virtual Router Redundancy Protocol (VRRP)
> into discussion yet [RFC5798], but we might want to consider
> looking at this as a way of providing fault tolerance for VP
> routers. I'm not sure whether VRRP would also support load
> balancing between the multiple routers, but it seems like
> fault tolerance is the dominant consideration.
I agree - fault tolerance is more important than load balancing at
this stage of the design, though some form of load balancing might be
possible and desirable too.

I don't want to try to read this RFC in order to imagine how it might
work with I-R, so if you can describe how it would work, that would
be good.

> Using VRRP also reduces the "fanout" of VP-advertising routers
> to just a single RLOC address, and so makes for less complexity
> in ferrying CQs around the IRON.

But if all VPRs are on the one IP address, this would radically alter
the nature of the overlay network.  Also a single router might be VPR
for multiple VPs - so I can't see how this would work.

A quick look into this RFC:

  http://tools.ietf.org/html/rfc5798#section-5.1.1.2

indicates that it relies on multicast.  I think VRRP is intended for
multiple routers in a single local network, where multicast could be
done.  I can't imagine how you could scalably implement multicast on
the I-R overlay network.

I think this illustrates our differing design approaches.  I think
you tend to view the subsystems from a very high level - and if it
looks like one might do the trick, you consider it.  I immediately
want to know whether it is possible to do such things, and in this
case, it took me a few minutes with a protocol I had never heard of
to find a "lower level" detail which seems to preclude its use in the
way you intend.

I am not suggesting my approach is always the best - because I think
it is important to brainstorm ideas and think loosely for a while.
Too much "no, it can't be done" thinking too soon results in there
being nothing to explore.

>> You haven't specified anything other than manual configuration for
>> how an IRON router becomes a VP router.  VP routers have extra
>> workload, so whoever runs such a router must have a reason to do
>> this, probably involving payment of money in some way from the
>> end-user networks whose EID prefixes are covered by this VP.
>
> Yes.
End-users have to pay either a one-time or > recurring cost for their PI prefixes. OK - but what about the costs of running the IDMs, which will handle widely varying traffic loads from one EID to the next, with these loads generally having little correlation with the amount of space in the EID? >> If there are two or three IRON routers acting as VP routers for a >> given VP, then some organisation is responsible for that VP, is >> collecting payments as described above and is therefore the one >> organisation driving the existence of these two or three VP routers. >> So manual configuration seems OK to me - I don't think there needs >> to be a fancy automated system by which one VP router for a given VP >> "P" would auto-discover any other VP router for "P" in the whole I-R >> system. However, these VP routers for the one VP do need to work >> together to share registrations, and to quickly detect when one or >> more of the set becomes unreachable. > > VRRP maybe? Since it appears to involve multicast, maybe not. It shouldn't be too hard to develop a protocol by which a handful of VPRs work together. Maybe some existing protocols can be used as part of this. >>> For these reasons, I believe it is better for IRON router >>> M to know about all three of A, B and C and direct bubbles >>> to each of them. I think we can achieve this using OSPF >>> with the NBMA link model in the IRON overlay. >> >> OK - but I guess that means not running BGP. I don't know anything >> about OSPF or its scaling properties. BGP has no central >> coordination - something which is understandably attractive to many >> people. Does OSPF have central coordination, single points of >> failure etc.? > > In this case, central coordination would be through > maintenance of the domainname-to-RLOC mappings for > the FQDN "isatapv2.net". In other words, when a new > IDM comes into existence its RLOC address gets added > to the DNS RR's for "isatapv2.net". 
In the same way,
> when an existing IDM is decommissioned its RLOC address
> is removed.
>
> Currently, "isatapv2.net" is registered to me. Do you
> trust me to maintain it properly? :^}

Sure!  Whoever runs it needs to have some fancy way of recognising or
rejecting attempts to register whatever needs to be in this branch of
the DNS.

But how is OSPF structured?  BGP is flat and egalitarian, with links
between nearby routers all that is required - and of course care
about which prefixes are advertised.  You could chop the whole
BGP-based interdomain routing system into two or more pieces and they
would keep running, just fine - although of course each only with a
subset of the prefixes.

A quick look at:

  http://en.wikipedia.org/wiki/OSPF

and the IPv4 RFC:

  http://tools.ietf.org/html/rfc2328#page-19

indicates that a large OSPF network is organised into various areas.
How would you do this for the IRON-RANGER overlay network?  Don't
OSPF and ISIS require more centralised administration, such as to
structure the whole system into sub-systems and to give certain
routers particular roles, on which other routers depend?

I haven't read the OSPF article, but my impression is that it is a
valuable resource, with Wbenton:

  http://en.wikipedia.org/wiki/User:Wbenton-test

contributing many things, not least a formidable table and diagram of
interdependencies between RFCs.  The diagram looks like it needs its
own routing protocol!

>>> Please note: the EID-based IRON overlay is configured over
>>> the DFZ, which is using BGP to disseminate RLOC-based
>>> prefix information. So, it is BGP in the underlay and
>>> OSPF in the overlay - weird, but I think it works.
>>
>> Yes the DFZ uses BGP and the overlay uses . . . originally I-R used
>> BGP (a separate instance of BGP in each such router).
Also, IRON >> routers don't need to be DFZ routers and in many or most cases are >> not DFZ (BR) routers - but they all communicate via tunnels which are >> carried between networks via the ordinary Internet (using the DFZ). >> >> I guess these tunnels between IRON routers will need to be manually >> configured, since they are typically between physically and >> topologically nearby routers. > > No manual config needed; the IRON is just a gigantic NBMA > link, and can use automatic tunneling the same as for VET > and ISATAP. But it is important for IRON routers to run their new BGP instance with neighbouring IRON routers which are generally physically or topologically close. Otherwise, the "distance" metrics in the overlay network won't resemble the real "distance" to the other routers, and your routers playing the LFR or IDM role won't automatically discover the address of the "closest" VPR for a given VP. These tunnels surely need to be manually configured - and that defines the membership in the I-R overlay network and its structure for the purposes of its BGP (or OSPF?) control plane. >>>>>> Also, this is just for 10 minute registrations. I recall that the 10 >>>>>> minute time is directly related to the worst-case (10 minute) and >>>>>> average (5 minute) multihoming service restoration time, as per our >>>>>> previous discussions. I think that these are rather long times. >>>>> >>>>> Well, let's touch on this a moment. The real mechanism >>>>> used for multihoming service restoration is Neighbor >>>>> Unreachability Detection. Neighbor Unreachability >>>>> Detection uses "hints of forward progress" to tell if >>>>> a neighbor has gone unreachable, and uses a default >>>>> staletime of 30sec after which a reachability probe >>>>> must be sent. This staletime can be cranked down even >>>>> further if there needs to be a more timely response to >>>>> path failure. 
This means that the PI prefix-refreshing
>>>>> "bubbles" can be spaced out much longer - perhaps 1 every
>>>>> 10hrs instead of 10min. (Maybe even 1 every 10 days!)
>>>>
>>>> OK, I am not sure if I ever knew the details of "Neighbor
>>>> Unreachability Detection" - but shortening the time for these
>>>> mechanisms raises its own scaling problems.
>>>>
>>>> Can you give some examples of how this would work?
>>>
>>> I want to go back on this notion of extended inter-bubble
>>> intervals, and return to something shorter like 600sec
>>> or even 60sec. There needs to be a timely flow of bubbles
>>> in case one or a few IRON routers goes down and needs to
>>> have its PI prefix registrations refreshed.
>>
>> OK - I will stay tuned for further details.
>
> Bringing VRRP into the consideration could have a
> contributing factor to how long the bubble (er, CQ)
> interval needs to be.

I regard the whole question of registering EIDs with VPRs as being
undecided until you propose an exact mechanism.

>>>> At present, I can see these choices for this registration mechanism:
>>>>
>>>> 1 - Keep BGP as the overlay protocol and use my proposed "VP router
>>>>     flooding system".
>>>>
>>>> 2 - Retain your current plan of each IRON router like M needing to
>>>>     know the addresses of all the routers handling a given VP (A, B
>>>>     and C) which BGP can't do.  So you could:
>>>>
>>>>     2a - keep BGP and add some other mechanism.  Maybe M sends a
>>>>          message to the one of A, B or C it has a best path to,
>>>>          requesting the full list of all routers A, B and C which
>>>>          handle a given VP.  When M gets the list, it sends
>>>>          registration "bubbles" to the routers on the list.  This
>>>>          needs to be repeated from time-to-time to discover
>>>>          new VP routers.
>>>>
>>>>     2b - use something different from BGP which provides all the
>>>>          A, B and C router addresses to every IRON router, such as
>>>>          M.  This needs to dynamically change as A, B and C die and
>>>>          are restarted, or joined by others.
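Choice 1 in the list above (the "VP router flooding system") can be sketched roughly as follows. This is only an illustration of the idea, not part of any I-R specification: the class name VPRouter, its methods, and the example prefix and address are hypothetical, and real VPRs would exchange messages over the network instead of calling each other directly.

```python
class VPRouter:
    """Hypothetical VP router (VPR) that shares EID registrations with its peers."""

    def __init__(self, name):
        self.name = name
        self.peers = []            # other VPRs handling the same VP (manually configured)
        self.registrations = {}    # EID prefix -> set of DEL-role router addresses

    def register(self, eid_prefix, del_addr, from_peer=False):
        """Record a registration; if it came from a DEL router, relay it to all peers."""
        self.registrations.setdefault(eid_prefix, set()).add(del_addr)
        if not from_peer:
            # Relay once only: peers do not re-flood what they get from a peer.
            for peer in self.peers:
                peer.register(eid_prefix, del_addr, from_peer=True)


def make_vpr_set(names):
    """Create VPRs for one VP and peer each with all the others (full mesh)."""
    routers = [VPRouter(n) for n in names]
    for r in routers:
        r.peers = [p for p in routers if p is not r]
    return routers


# DEL router M only knows B (its BGP best path), yet A and C learn the EID too.
a, b, c = make_vpr_set(["A", "B", "C"])
b.register("2001:db8:1234::/48", "192.0.2.10")
```

With two, three or four VPRs per VP the full mesh stays small, which is the scaling assumption made in the quoted text. Expiry of stale registrations and detection of an unreachable peer are left out of this sketch.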
>>>
>>> Right - I am still leaning toward OSPF with its NBMA
>>> link model capabilities. The good news is that the
>>> IRON topology itself should be relatively stable, so
>>> not much churn due to dynamic updates.
>>
>> OK.  Since the IRON routers have their own IP addresses and are
>> generally in networks multihomed by existing BGP techniques, then any
>> outages don't affect the IRON routers' IP addresses or their
>> tunneling arrangements.  There would still be transitory breaks in
>> connectivity, before the BGP multihoming arrangements kick in.  If
>> you could ignore those by some means in the overlay's routing system
>> (BGP or OSPF) then yes, the IRON routers should be pretty stable.
>
> With VRRP, probably even moreso.

Or with your own purpose-designed protocol involving one, two or a few
more IRON routers in their DEL-roles registering the one EID with two
or maybe a few more VPRs.

  - Robin