Re: [homenet] Egress Routing Discussion: Baker model

Ray Hunter <v6ops@globis.net> Fri, 22 February 2013 10:01 UTC

Message-ID: <51273FE8.7050302@globis.net>
Date: Fri, 22 Feb 2013 10:52:40 +0100
From: Ray Hunter <v6ops@globis.net>
User-Agent: Postbox 3.0.7 (Macintosh/20130119)
MIME-Version: 1.0
To: "Fred Baker (fred)" <fred@cisco.com>
References: <472E7EB7-E262-46CE-A17E-DE4C45C70566@cisco.com> <8C48B86A895913448548E6D15DA7553B79DCE3@xmb-rcd-x09.cisco.com>
In-Reply-To: <8C48B86A895913448548E6D15DA7553B79DCE3@xmb-rcd-x09.cisco.com>
Content-Type: text/plain; charset="ISO-8859-1"
Content-Transfer-Encoding: quoted-printable
Cc: "homenet@ietf.org Group" <homenet@ietf.org>, "Abhay Roy (akr)" <akr@cisco.com>, "isis-chairs@tools.ietf.org" <isis-chairs@tools.ietf.org>, "ospf-chairs@tools.ietf.org" <ospf-chairs@tools.ietf.org>
Subject: Re: [homenet] Egress Routing Discussion: Baker model

I have read all of your drafts, and those of the other authors,
carefully, once. No doubt I'll have to re-read them.

This response is limited to high-level comments regarding the overall
approach, and is probably applicable to all 3 sets of authors:

1. Some drafts talk extensively about the need for an "extensible"
routing protocol, and often mention the desirability of TLV
(type-length-value) objects.

I agree that a TLV structure potentially solves many issues of how to
encode and transport new options between routing protocol speakers.

But given that route determination is a distributed algorithm, and that
Homenet devices will not always run the latest and greatest code,
what action should nodes that are running older code take regarding any
TLV options that they don't understand?

Isn't there a danger that extensibility will lead to more routing loops,
instability, and black holes?

Is there a need for all speakers to first agree on the (newest)
commonly-understood subset of options that all speakers in a Homenet
can/will honour before any extension options are transmitted?
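
To make the question concrete, here is a minimal sketch of the two behaviours asked about: a speaker that skips options it does not recognise, and a helper that computes the subset of options every speaker understands. The 1-byte type / 1-byte length layout and the type codes are assumptions for illustration only; they are not taken from any of the drafts.

```python
# Hypothetical option type codes understood by an "old" speaker.
KNOWN_TYPES = {1: "dest-prefix", 2: "metric"}


def parse_tlvs(data: bytes):
    """Split a byte string into (type, value) pairs.

    Assumed layout: 1-byte type, 1-byte length, then `length` value bytes.
    """
    out = []
    i = 0
    while i < len(data):
        if i + 2 > len(data):
            raise ValueError("truncated TLV header")
        tlv_type, length = data[i], data[i + 1]
        value = data[i + 2 : i + 2 + length]
        if len(value) != length:
            raise ValueError("truncated TLV value")
        out.append((tlv_type, value))
        i += 2 + length
    return out


def split_known(tlvs, known=KNOWN_TYPES):
    """Separate TLVs this speaker understands from the types it must skip."""
    understood = [(t, v) for t, v in tlvs if t in known]
    unknown_types = [t for t, _ in tlvs if t not in known]
    return understood, unknown_types


def common_options(*speaker_sets):
    """Option types that every speaker in the network claims to understand."""
    return set.intersection(*(set(s) for s in speaker_sets))
```

Skipping an unknown option keeps the old speaker running, but it then computes routes from less information than its neighbours, which is exactly where the loop and black-hole risk comes from. `common_options` is one way to express the "agree first" alternative.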

2. Aren't we forgetting the first hop?

Given a shared subnet/prefix/link with 2 CPE routers performing some
fancy new form of forwarding (based on PBR or SADR or whatever) that is
also shared by existing host implementations, how will the routers
signal these new default route semantics to end hosts?

Would we need a new prefix information option in ND?

Would we need an extension to RFC 4191 Section 2.3 Route Information
Option to include (source prefix,destination prefix) routes?

Would we need a new ICMPv6 redirect message to extend RFC 2461 Section
4.5 to include the possibility of (source,destination) redirects?

3. Limiting this discussion strictly to Homenet requirements: aren't we
forgetting the inter-provider management boundary?

My view of the Homenet is a network that is potentially a member of
multiple overlapping AS's simultaneously, without being an AS itself.
That's highly unusual in routing protocol terms.

I think that it is very unlikely that any operator will allow any
dynamic routing between a CPE managed by a customer and a PE managed by
the provider.

I think there's also a potential issue of anyone making any assumptions
about dynamic routing being available between a provider-managed CPE and
a customer-owned CPE. Software version control will not be trivial.

The current most likely source of external routing information is
DHCPv6-PD, used to locally autoconfigure a "floating static" default
route on the Homenet BR, pointing out of the upstream interface to "the
Internet".

As such, how will any routing information beyond a simple default route
(related to a single delegated prefix) be injected into the Homenet?

Why are we importing the extra complexity (related to data centres and
enterprises) into Homenet?

4. We're still planning on doing something about source address
selection, aren't we?
For all the complexity of the packet routing solutions
suggested so far, isn't the real "route" selection going to be performed
in the host?

5. Flow label routing: hasn't RFC 6437 scuppered any chance that the flow
labels themselves will directly carry any meaning that could be
realistically used to make deterministic forwarding decisions by
low/mid-powered routers, because you're essentially going to have to
reverse a 20 bit hash function to make a forwarding decision? Wouldn't
the requirements in RFC 6437 also suggest a routing table explosion,
because each individual flow would have to be associated with an
individual IS-IS route? Perhaps you could distribute some intermediate
result to end hosts and routers via IS-IS, which is deterministic and
related to policy based routing (such as including the tenant label in a
TLV), but which after applying some hash transform and adding some
entropy, would comply with the requirements of RFC 6437? That'd potentially
reduce the IS-IS routing table size dramatically. I've been waiting 15+
years for fully dynamic PBR and I fully expect to wait some time longer.
In any case, with all due respect, I don't think it's relevant for Homenet.
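
As a rough illustration of the concern, here is one way a host might derive a 20-bit flow label by hashing the flow's identifying fields together with a local secret. The choice of SHA-256, the field encoding, and the `secret` parameter are assumptions for this sketch, not something RFC 6437 mandates.

```python
import hashlib

FLOW_LABEL_BITS = 20  # width of the IPv6 flow label field


def flow_label(src, dst, sport, dport, proto, secret=b"local-secret"):
    """Derive a 20-bit label from the 5-tuple plus a host-local secret.

    Hypothetical construction: hash the fields, keep the low 20 bits.
    """
    material = "|".join(map(str, (src, dst, sport, dport, proto))).encode()
    digest = hashlib.sha256(material + secret).digest()
    return int.from_bytes(digest[:4], "big") & ((1 << FLOW_LABEL_BITS) - 1)
```

A router that sees only the resulting label has no practical way to recover the inputs or any policy intent from it, so a label chosen this way can only be matched flow by flow.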

6. Other potentially simpler approaches that might be faster to market:
I have provided detailed feedback to Ole & Lorenzo, suggesting how
Homenet could potentially work without modifying any routing protocols
at all, with multi-homing, without resorting to NPT, and with BCP38
ingress/egress filters (albeit with a hard link to some
yet-to-be-defined autoconfiguration protocol, and a limit that the
prefixes of any walled-gardens must be disjoint from other AS's directly
connected to this Homenet, and possibly some other limitations such as
dumb hosts should not be connected to dual routers from competing
providers).

Do I need to write a draft on this, or is this already clear?
Would someone be willing to help out/collaborate?

If I have other detailed comments on the individual drafts, I'll come
back with those.

regards,
RayH

Fred Baker (fred) wrote:
> On Feb 22, 2013, at 6:22 AM, Fred Baker <fred@cisco.com> wrote:
>
>> In Atlanta, Mark asked Lorenzo and me to put together a draft of an approach to source/destination, and especially egress, routing. I pulled together a plan of attack that I applied to both IPv4 and IPv6, and to both IS-IS and OSPF, and sought review from a limited list including Lorenzo; this includes capabilities for at least one other use case I'm looking at. Mark asked me to cut that down to make things separately implementable (which they would be anyway, but to make it apparent). So I have broken it into components.
>>
>> ......
>>
>> I have documented my approach, which provides the generalized concept, is built into the routing protocols IS-IS and OSPF, and also addresses at least one other "tagged routing" option, which is to look at flow labels.
>>
>> http://tools.ietf.org/html/draft-baker-ipv6-isis-automatic-prefix
>>  "Automated prefix allocation in IS-IS", Fred Baker, 18-Feb-13
>>
>> http://tools.ietf.org/html/draft-baker-ipv6-isis-dst-flowlabel-routing
>>  "Using IS-IS with Role-Based Access Control", Fred Baker, 17-Feb-13
>>
>> http://tools.ietf.org/html/draft-baker-ipv6-isis-dst-src-routing
>>  "IPv6 Source/Destination Routing using IS-IS", Fred Baker, 17-Feb-13
>>
>> http://tools.ietf.org/html/draft-baker-ipv6-ospf-dst-flowlabel-routing
>>  "Using OSPFv3 with Role-Based Access Control", Fred Baker, 17-Feb-13
>>
>> http://tools.ietf.org/html/draft-baker-ipv6-ospf-dst-src-routing
>>  "IPv6 Source/Destination Routing using OSPFv3", Fred Baker, 17-Feb-13
>>
>> http://tools.ietf.org/html/draft-baker-ipv6-ospf-extensible
>>  "Extensible OSPF LSAs", Fred Baker, 17-Feb-13
>>
>> I'll say why I think mine is the correct approach in another email, as I imagine my colleagues will for theirs. But the working group should consider the question.
>
>
> This is that separate email. As I start, may I make the most direct possible apology to my colleagues with other approaches. I'm going to say that I think you're "wrong", and say why. That is in the spirit of open discussion in which we have a technical disagreement. BTW, you should be aware that I have a fair set of people who are telling me I'm wrong as well, hopefully in the same spirit. Some of them are pretty irritated with me, and I have gotten irritated with them.
>
> Two parts of the question here relate to "what problem do we think we're solving" and "how do we model the network". A third will be "how do we implement it in a router?"
>
> Homenet, I believe, thinks it's solving "how do I provide egress routing with zero, or perhaps epsilon, configuration, using open source code in the cheapest possible implementation." That's laudable; that said, I do believe that egress routing is a requirement for any IPv6 network involving multiple provider-allocated prefixes, and that as a result this is something to be built into the base protocol. In OSPF terminology, it is sufficient to build on an AS-external-LSA if that's the only source/destination route to be considered (it is certainly a principal use case); for a general network, however, we don't generally carry a lot of AS-external information around. So we need, I think, the ability to use native default routes as well as AS-external default routes, which means that we also need to look at the intra-area-prefix-LSA and the inter-area-prefix-LSA. The problem, I argue, is how to tag a route with a set of attributes and apply a route policy to it. One possible attribute is the source prefix that might be used by systems communicating with this destination, and there are other possible attributes. The route policy is that traffic is forwarded to a stated destination if and only if it also matches the other specified attributes. So, when we want to source/destination route to a specified egress, the destination in question is a default route (::/0), the source is the relevant PA prefix, and we are asking the network to build routes in the way it natively does so for traffic that has the specified set of attributes.
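
The route policy described above (forward toward a destination only if the packet also matches the route's other attributes) can be sketched as follows. The `TaggedRoute` type, the prefixes, the next-hop names, and the tie-break rule (longest destination match first, then longest source match) are assumptions made for illustration; the drafts may specify this differently.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TaggedRoute:
    dst: ipaddress.IPv6Network
    next_hop: str
    # None means "no source attribute": the route matches any source.
    src: Optional[ipaddress.IPv6Network] = None

    def matches(self, pkt_src, pkt_dst):
        if pkt_dst not in self.dst:
            return False
        return self.src is None or pkt_src in self.src


def select(routes, src, dst):
    """Pick the next hop for a packet, or None if no route applies."""
    src = ipaddress.ip_address(src)
    dst = ipaddress.ip_address(dst)
    matching = [r for r in routes if r.matches(src, dst)]
    if not matching:
        return None
    best = max(
        matching,
        key=lambda r: (r.dst.prefixlen, r.src.prefixlen if r.src else -1),
    )
    return best.next_hop


net = ipaddress.ip_network
ROUTES = [
    # Egress defaults, each tagged with the PA prefix of its provider.
    TaggedRoute(net("::/0"), "isp-a", src=net("2001:db8:a::/48")),
    TaggedRoute(net("::/0"), "isp-b", src=net("2001:db8:b::/48")),
    # Ordinary destination route inside the home, no source attribute.
    TaggedRoute(net("2001:db8:a:1::/64"), "lan-1"),
]
```

With this table, a packet sourced from provider A's prefix leaves via `isp-a`, one sourced from provider B's prefix leaves via `isp-b`, and a packet with a source in neither prefix has no egress route at all.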
>
> As to "how we model it", there are at least two possible approaches. When Jon Moy was first building what we now call OSPF, one of the problems that the Internet was dealing with was congestive collapse. There were several experiments, one of which I was involved in with Neil Bierbaum of NASNET, in which we wanted to send interactive traffic using paths that had relatively low delay, and high volume file transfer traffic on other paths that had high throughput. It was not unusual for those "high throughput" paths to also have long delay, such as through a geosynchronous satellite. The technology for doing this was originally documented, at least publicly, in RFCs 1131 and 1247, commented on in RFC 1812, and has evolved to RFC 4915's Multi-topology Routing; it was developed in parallel in IS-IS as well. The multi-topology model is based on the premise that routers and/or links differ from each other in some important way, such as capacity, delay, or capability (do they support IPv6? Do they have a stated diffserv configuration?), and that it is useful to segregate them into sub-topologies and build routes within those topologies. The other possible model, which I am following, presumes that one is not routing "to a destination", although arriving at an intended destination is an important facet, or "through a specific topology"; ultimately, one is routing a class of traffic, as I described in http://tools.ietf.org/html/draft-baker-fun-routing-class. That class of traffic (as seen by a given router) consists of the datagrams that will pass through a stated router and have some set of attributes, which might include destination address, source address, a flow label, a DSCP, or several other characteristics. For a DSCP, topology is important - I need to know whether the links have the attributes I am asking about. Those other attributes are not, in my view, topology-relevant - they simply help define the class of traffic. 
>
> So I'm modeling the routing of a class of traffic identified by a set of attributes in what is otherwise a single topology.
>
> When you looked at my list of documents, you no doubt thought "the working group is focused on OSPF as the routing protocol; why IS-IS?" I looked at IS-IS first because it is the simplest to update in this way. The way IS-IS works is that each router advertises an at-least-in-concept single Link State PDU (LSP) that contains information about itself and its neighborhood. The information is stored in type-length-value (TLV) objects, and the LSP is in essence a list of them describing a single router. Each router starts with its own LSP, determines how it will reach its neighbors, and then their neighbors, and so on until it has figured out how to reach every router in the network (including LANs, which are modeled as virtual routers called pseudonodes). Along the way, it also discovers that routers advertise reachability of various kinds: "if you have a route to me, I can deliver data to <>", where <> might be an AppleTalk subnet, an IPv4 subnet (RFC 1195), an IPv6 subnet (RFC 5308), or any of a number of other things. RFC 5308 specifies such a TLV, which specifies an IPv6 prefix with some attributes like how it got into the network. It also specifies the ability to have sub-TLVs, TLVs that specify additional attributes for the larger TLV, although it doesn't define any. Those are for "future extension". Welcome to the future.
>
> So, from my perspective, in the routing protocol, we need to specify sub-TLVs for any additional attributes, such as source prefixes, that we want to include, and IS-IS has given us that capability. What's necessary next is a FIB that the routes can be stored into. I'll come to that later.
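
A minimal sketch of the idea of carrying a source prefix as a sub-TLV nested inside a destination-prefix TLV. The type codes and the byte layout here are hypothetical and deliberately simplified; this is not the RFC 5308 wire format nor the encoding used in the drafts.

```python
import ipaddress

T_PREFIX = 200         # hypothetical TLV type: destination prefix
ST_SOURCE_PREFIX = 1   # hypothetical sub-TLV type: source prefix


def encode_tlv(tlv_type: int, value: bytes) -> bytes:
    if not 0 <= tlv_type <= 255 or len(value) > 255:
        raise ValueError("type or value does not fit in one byte")
    return bytes([tlv_type, len(value)]) + value


def prefix_bytes(network) -> bytes:
    """Prefix length byte followed by only the significant prefix bytes."""
    n = (network.prefixlen + 7) // 8
    return bytes([network.prefixlen]) + network.network_address.packed[:n]


def parse_prefix(data: bytes, offset: int = 0):
    plen = data[offset]
    n = (plen + 7) // 8
    raw = data[offset + 1 : offset + 1 + n]
    addr = ipaddress.IPv6Address(raw + bytes(16 - n))
    return ipaddress.IPv6Network((addr, plen)), offset + 1 + n


def encode_route(dst: str, src: str) -> bytes:
    sub = encode_tlv(ST_SOURCE_PREFIX, prefix_bytes(ipaddress.ip_network(src)))
    return encode_tlv(T_PREFIX, prefix_bytes(ipaddress.ip_network(dst)) + sub)


def decode_route(tlv: bytes):
    body = tlv[2 : 2 + tlv[1]]
    dst, off = parse_prefix(body)
    src = None
    while off < len(body):
        sub_type, sub_len = body[off], body[off + 1]
        if sub_type == ST_SOURCE_PREFIX:
            src, _ = parse_prefix(body[off + 2 : off + 2 + sub_len])
        off += 2 + sub_len  # sub-TLVs of unknown type are skipped
    return dst, src
```

For example, `encode_route("::/0", "2001:db8:a::/48")` produces a default route whose only sub-TLV names the PA prefix it applies to, and `decode_route` recovers both prefixes.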
>
> Of course, IS-IS isn't going to be very useful in homenet without a way to allocate prefixes to LANs. draft-baker-ipv6-isis-automatic-prefix is in essence draft-ietf-ospf-ospfv3-autoconfig for IS-IS. Shoot me.
>
> But we wanted to do this in OSPF. Here, I have a problem. OSPF is designed with fixed LSAs, not extensible LSAs, so adding an attribute to an LSA is difficult without fundamentally changing the LSA. I started by defining a set of extensible LSAs in draft-baker-ipv6-ospf-extensible; they are direct counterparts to RFC 5340's LSAs, and intended to be backward-compatible with them. I'm not stuck on exactly that approach; I have since found that an extensible LSA was considered and set aside by the working group in 2006/2007, in http://tools.ietf.org/html/draft-ietf-ospf-mt-ospfv3. Give me an approach that gives me the ability to add a sub-TLV to an OSPF LSA such as I did for IS-IS, and I'm pleased as punch. draft-baker-ipv6-ospf-dst-src-routing is a direct counterpart to draft-baker-ipv6-isis-dst-src-routing, and draft-baker-ipv6-ospf-dst-flowlabel-routing is a direct counterpart to draft-baker-ipv6-isis-dst-flowlabel-routing; they add attributes to an extensible LSA. The theory is the same.
>
> Which brings us, I suppose, to the FIB. There are many ways a FIB can be designed, and I would expect every vendor to do it their own way; this is not a matter for standardization. However, to have the discussion, it seems important to at least mention some possible approaches. In appendices in the drafts, I mention two. They are far from the only two; they are two.
>
> Mark felt it was important to specifically state how these might play with the Waikato source address routing extensions in Linux, so I addressed that. I generalized it a little bit; there is a trade-off, and I wanted to comment on it. The premise in Waikato is that one identifies the source prefix and uses that to select a destination FIB. In terms of writing code quickly, that's a reasonable approach; the FIB itself is already designed, the question is how to have more than one of them. My question is how many FIBs one might have, and what memory impacts that might imply. In a generalized model, one could imagine having quite a number of FIBs. But now, what routes do we expect to put into those FIBs? I expect that, at least in a small-to-medium sized network, one has some number of PA prefixes and as many egress default routes; most of the routes, whether few or many, are traditional destination routes within the domain. So, I can have a FIB per PA prefix, put egress default routes only into the FIBs for their source prefixes, and put every single destination route into each FIB. That results in a potentially-large number of FIBs, each of which contains mostly-identical information. If memory is a concern, I throw out a second option; I could have a destination-only FIB and a set of PA prefix-specific FIBs, which are usually relatively small, perhaps only a single route. When you do a route lookup, you look in both the destination-only FIB, which will find local routes, and the PA-relevant FIB, which will get you a default route, and use the one that is most specific.
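
The second option above (one destination-only FIB plus a small FIB per PA prefix, taking the most specific match across both) can be sketched as below. The dictionaries, prefixes, and next-hop names are illustrative assumptions, and a linear scan stands in for whatever lookup structure a real FIB would use.

```python
import ipaddress


def longest_match(fib, dst):
    """Return (prefix, next_hop) of the most specific entry covering dst."""
    best = None
    for prefix, next_hop in fib.items():
        if dst in prefix and (best is None or prefix.prefixlen > best[0].prefixlen):
            best = (prefix, next_hop)
    return best


def lookup(dest_fib, pa_fibs, src, dst):
    """Consult the destination-only FIB and the FIB for the packet's PA prefix."""
    src = ipaddress.ip_address(src)
    dst = ipaddress.ip_address(dst)
    candidates = [longest_match(dest_fib, dst)]
    for pa_prefix, fib in pa_fibs.items():
        if src in pa_prefix:
            candidates.append(longest_match(fib, dst))
    candidates = [c for c in candidates if c is not None]
    if not candidates:
        return None
    # Use whichever table produced the most specific match.
    return max(candidates, key=lambda c: c[0].prefixlen)[1]


net = ipaddress.ip_network
DEST_FIB = {
    net("2001:db8:a:1::/64"): "lan-1",
    net("2001:db8:b:1::/64"): "lan-2",
}
PA_FIBS = {
    net("2001:db8:a::/48"): {net("::/0"): "isp-a"},
    net("2001:db8:b::/48"): {net("::/0"): "isp-b"},
}
```

Assuming one egress default per PA prefix, the first option stores roughly (number of PA prefixes) x (destination routes + 1) entries, while this layout stores (destination routes) + (number of PA prefixes). With 2 PA prefixes and 50 internal routes that is 102 entries versus 52.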
>
> That said, I think there is a much simpler approach, although it may take some time to explain. A PATRICIA trie is a structure originally discussed in 1968 for searching through ASCII text, and used in route tables. It allows you to build a set of discontiguous bit strings, and do a radix search through the trie. For the details, I'll refer you to discussions of PATRICIA tries on the web and the appendix in the document. 
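
For readers unfamiliar with the structure, here is a sketch of a radix search over bit strings. Note that this is a plain, uncompressed binary trie rather than a true PATRICIA trie (it does not skip over single-child runs of bits), and representing keys as strings of "0" and "1" is a simplification chosen for readability.

```python
class Node:
    __slots__ = ("children", "value")

    def __init__(self):
        self.children = [None, None]  # index 0 for bit "0", 1 for bit "1"
        self.value = None             # payload if a prefix ends at this node


class BitTrie:
    def __init__(self):
        self.root = Node()

    def insert(self, bits: str, value):
        """Store value under the prefix given as a string of '0'/'1'."""
        node = self.root
        for b in bits:
            i = int(b)
            if node.children[i] is None:
                node.children[i] = Node()
            node = node.children[i]
        node.value = value

    def longest_match(self, bits: str):
        """Return the value of the longest stored prefix of `bits`."""
        node = self.root
        best = node.value
        for b in bits:
            node = node.children[int(b)]
            if node is None:
                break
            if node.value is not None:
                best = node.value
        return best
```

Inserting the empty string gives a default entry; every lookup then walks at most one node per key bit and remembers the last node that carried a value.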
>
> So there are at least two possible FIB approaches; there are no doubt other approaches, and they might be better. To each his or her own.