[RADIR] proposed changes to Section 3

Thomas Narten <narten@us.ibm.com> Tue, 29 April 2008 14:59 UTC


Going back through Section 3, here is what I'd propose we do:

Better define "superlinear", i.e., make clear that we don't know
exactly what the curve is, but that it appears to be
quadratic/polynomial/something. The fact that it is not linear is
problematic because, historically, technology has been able to keep up
with linear growth. Some definitions:

When multiple processors are used, you get speedup greater than the
number of processors. I.e., you get 3.5X improvement with 3 processors
vs. just one.

Another phrasing of the same idea: parallelizing a function produces a
greater performance improvement than the number of processing cores
applied.
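
To make the parallel-processing sense of the term concrete, here is a
minimal sketch. The function names and the timings are made up for
illustration; the timings are chosen only so that they reproduce the
3.5X-with-3-processors example.

```python
def speedup(t_serial: float, t_parallel: float) -> float:
    """Ratio of the single-processor run time to the multi-processor run time."""
    return t_serial / t_parallel


def is_superlinear(t_serial: float, t_parallel: float, processors: int) -> bool:
    """True when the measured speedup exceeds the number of processors used."""
    return speedup(t_serial, t_parallel) > processors


# Hypothetical timings: 7.0 s on one processor, 2.0 s on three.
print(speedup(7.0, 2.0))            # 3.5
print(is_superlinear(7.0, 2.0, 3))  # True
```

Note that this is a statement about speedup, not about growth in cost,
so it is probably not the sense we want for routing tables.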

Superlinear/linear is also defined in the context of a converging
series, though I'm not sure how to apply that to routing updates...

So, do we mean something like: the cost of (or resources needed for)
processing/managing routing updates goes up at a rate greater than
linear in the number of prefixes?

(Though this doesn't account for the rate of updates, which is also a
factor.)
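
One rough way to express "greater than linear in the number of
prefixes" is a doubling test: if doubling the prefix count more than
doubles the cost, the cost is growing faster than linearly. The sketch
below assumes two hypothetical cost functions (one linear, one
quadratic) purely for contrast; we do not actually know the shape of
the real curve.

```python
from typing import Callable


def doubling_factor(cost: Callable[[int], float], n: int) -> float:
    """How much the cost grows when the number of prefixes doubles from n to 2n."""
    return cost(2 * n) / cost(n)


def linear_cost(n: int) -> float:
    """Hypothetical cost proportional to the number of prefixes."""
    return 3 * n


def quadratic_cost(n: int) -> float:
    """Hypothetical cost proportional to the square of the number of prefixes."""
    return n * n


print(doubling_factor(linear_cost, 100_000))     # 2.0
print(doubling_factor(quadratic_cost, 100_000))  # 4.0
```

A factor of exactly 2 is linear; anything consistently above 2 is
superlinear in this sense. The update rate would need to enter as a
second input.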


>    o  The overall rate of routing updates is increasing, requiring
>       routers to process updates at an increased rate or converge more
>       slowly if they cannot.  The rate increase is driven by a number of
>       factors (discussed below).  It should be noted that the overall
>       routing update rate is dependent on two factors: the number of
>       individual prefixes and the mean per-prefix update rate.  While it
>       is clear that the overall number of prefixes is increasing super-
>       linearly, further study is needed to determine whether the mean
>       per-prefix update rate is increasing as well [1].

I think we should dig a bit deeper into the last point and engage
folks (Geoff?) to find references and/or to encourage work to be done
here.
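
The two-factor decomposition in the quoted bullet (number of prefixes
times mean per-prefix update rate) can be sketched as follows. The
function names and all of the numbers are placeholders for
illustration, not measurements.

```python
def overall_update_rate(num_prefixes: int, mean_updates_per_prefix: float) -> float:
    """Overall update rate as the product of the two factors in the text."""
    return num_prefixes * mean_updates_per_prefix


def combined_growth(prefix_growth: float, per_prefix_growth: float) -> float:
    """Fractional growth in the overall rate when both factors grow."""
    return (1 + prefix_growth) * (1 + per_prefix_growth) - 1


# Hypothetical: 250,000 prefixes, each updated 2 times per day on average.
print(overall_update_rate(250_000, 2.0))  # 500000.0

# Hypothetical: prefixes grow 25%; per-prefix rate is flat vs. growing 50%.
print(combined_growth(0.25, 0.0))  # 0.25
print(combined_growth(0.25, 0.5))  # 0.875
```

The point is that if the per-prefix rate is also rising, the overall
rate grows faster than the prefix count alone suggests, which is why
it matters that we pin this down.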

>    This super linear growth presents a scalability challenge for current
>    and/or future routers.  There are two aspects to the challenge.  The
>    first one is purely technical: can we build routers (i.e., hardware &
>    software) actually capable of handling the control plane load, both
>    today and going forward?  The second challenge is one of economics:
>    is the cost of developing, building and deploying such routers
>    economically sustainable, given current and realistic business models
>    that govern how ISPs operate as businesses?

Tony has won me over and I think we need to collapse the two
points. We can make pigs fly; the issue is at what cost. It is the
cost factor that is the issue. Or rather, the cost factor will loom
large if we really are bumping into technical challenges.

> 3.1.  Technical Aspects
> 
>    The technical challenge of building routers relates to the resources
>    needed to process a larger and increasingly dynamic amount of routing
>    information.  More specifically, routers must maintain an increasing
>    amount of associated state information in the RIB, they must be
>    capable of populating a growing FIB, they must perform forwarding
>    lookups at line rates (while accessing the FIB) and they must be able
>    to initialize the RIB and FIB at boot time.  Moreover, this activity
>    must take place within acceptable time frames (i.e., paths for
>    individual destinations must converge and stabilize within an
>    acceptable time period).  Finally, the hardware needed to achieve
>    this cannot have unreasonable power consumption or cooling
>    demands.

Reword slightly to just list what routers have to do (technically) and
how bigger tables/faster updates increase the challenge.

> 3.2.  Business Aspects
> 
>    Even if it is technically possible to build routers capable of
>    meeting the technical and operational requirements, it is also
>    necessary that the overall cost to build, maintain and deploy such
>    equipment meet reasonable business expectations.  ISPs, after all,
>    are run as businesses.  As such, they must be able to plan, develop
>    and construct viable business plans that provide an acceptable return
>    on investment (i.e., one acceptable to investors).

Reword to just say that the key issue is "at what cost". (Note: by
saying "even if it is technically possible" we are probably hitting
the hot buttons of folks who do not doubt it can be done.)
 
>    While the IETF does not (and cannot) concern itself with business
>    models or the profitability of the ISP community, the cost of running
>    the routing subsystem as a whole is directly influenced by the
>    routing architecture of the Internet, which clearly is the IETF's
>    business.  Further, because cost implications are part of each and
>    every engineering decision, controlling or limiting the overall cost
>    of running the routing subsystem (through architectural decisions) is
>    part of the IETF's fundamental charter.  Consequently, having the
>    IETF continue with an architectural model that places unbounded cost
>    requirements on critical infrastructure represents an undue risk to
>    the future of the Internet as a whole.
> 
>    One aspect of planning concerns the assumptions made about the
>    expected usable lifetime of purchased equipment.  Businesses
>    typically expect that once deployed, equipment can remain in use for
>    some projected amount of time (e.g., 3-5 years).  Upgrading equipment
>    earlier than planned is more easily justified (as an unplanned
>    expense) when a new business opportunity is enabled as a result of an
>    upgrade.  For example, an upgrade might be justified by an ability to
>    support increased traffic or an increase in the number of customer
>    connections, etc., where the upgrade can translate into increased
>    revenue.  In contrast, it is more difficult to justify unplanned
>    upgrades in the absence of corresponding customer benefit (and
>    revenue) to cover the upgrade cost.  It is generally desired that
>    deployed equipment remain usable over its planned lifetime.  An
>    increase in the resources required to support larger or more dynamic
>    routing tables is viewed as a sort of "unfunded mandate", in that
>    customers do not expect to have to pay more just to retain the same
>    level of service as before, i.e., having all destinations be
>    reachable as was the case in the past.  This undermining of planning
>    is particularly problematic when the increase in routing demand
>    originates external to the ISP, and the ISP has no way to control or
>    limit it (e.g., the increased demand comes from being part of the
>    DFZ).
> 
>    From a business perspective, it is desirable to maintain or increase
>    the useful lifespan of routing equipment, by improving the scaling
>    properties of the routing and addressing system.

Actually, let me suggest we scrap this entire section and try
again. (I suspect this section is the one that people are the most
unhappy with.)

How about something like:

While it seems likely that it will remain technically feasible to
build routers that meet the technical and operational requirements of
operating within the DFZ, the more important questions are at what
cost, and what the actual usable lifetime of such routing equipment
would be.

One cost is the capital cost to purchase a router that can adequately
participate in DFZ routing. As the cost rises, smaller (or struggling)
ISPs may find themselves priced out of the market and unable to fully
participate in DFZ routing. Should the cost of such routers be high
enough, we may find that only a small number of the largest ISPs can
operate within the DFZ. At some point, we may find that global routing
is effectively controlled by a small number of operators, with a high
barrier to entry for newcomers. This would be a significant change
from how routing within the Internet has historically been managed and
may result in a reduction in the innovation that has historically
fueled Internet growth.

Another cost relates to how long a given piece of equipment (i.e.,
hardware configuration) is sufficient to fully participate in routing.
Hardware purchases are made assuming that the equipment will be
usable for a fixed amount of time (e.g., 3-5 years) before needing to
be upgraded or replaced. But increased load on routers stemming from
increased routing updates external to an ISP is a special case, as
such updates are not under the control of the ISP and hence are
difficult to predict and plan for. Should the routing load increase
too quickly, ISPs will need to replace routers earlier than predicted
and budgeted. For businesses that are not growing, i.e., that are not
expanding service or capacity to existing customers or increasing the
size of their customer base, replacing hardware earlier than planned
can be problematic, as they cannot easily pass such "unplanned" costs
on to their customers, who do not see increased value from a price
increase. (Selling customers on a potential service reduction if they
don't pay more is a difficult business model to sustain.)
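
To illustrate the lifetime argument, here is a back-of-the-envelope
sketch. It assumes, purely for illustration, that table size grows at
a fixed compound annual rate and that a router has a hard capacity
limit; the function name and every number below are hypothetical.

```python
import math


def years_until_full(current_entries: int, capacity: int, annual_growth: float) -> float:
    """Years until a table growing at a compound annual rate reaches capacity."""
    if current_entries >= capacity:
        return 0.0
    if annual_growth <= 0:
        return math.inf  # a table that does not grow never fills up
    return math.log(capacity / current_entries) / math.log(1 + annual_growth)


# Hypothetical router bought with room for twice today's table size.
slow = years_until_full(250_000, 500_000, 0.15)  # 15% growth per year
fast = years_until_full(250_000, 500_000, 0.30)  # 30% growth per year

print(f"{slow:.1f} years at 15% growth")
print(f"{fast:.1f} years at 30% growth")
```

Under these assumptions the slower rate lands close to the end of a
3-5 year planning window, while the faster rate exhausts the same
router in under 3 years, i.e., earlier than budgeted.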

> 3.3.  Alignment of Incentives
> 
>    Today's growth pattern is influenced by the scaling properties of the
>    current system.  If the system had better scaling properties, we
>    would be able support and enable more widespread usage of certain
>    applications such as multihoming and traffic engineering.  Currently
>    the system does not allow everyone to multihome, as there are some
>    barriers to multihoming due to operational practices that try to
>    strike a balance between the amount of multihoming and preservation
>    of routing slots.  It is desirable that the routing and addressing
>    system exert the least possible back pressure on end user
>    applications and deployment scenarios, to enable the broadest
>    possible use of the Internet.

I'd suggest being blunt here and adding a line: If everyone who
potentially wanted to multihome were given PI space, the routing
system would simply collapse. Hence, there is a need to say "no".

>    One aspect of the current architecture is a misalignment of cost and
>    benefit.  Injecting individual prefixes into the DFZ creates a small
>    amount of "pain" for those routers that are part of the DFZ.  Each
>    individual prefix has a small cost, but the aggregate sum of all
>    prefixes is significant, and leads to the core problem at hand.
>    Those that inject prefixes into the DFZ do not generally pay the cost
>    associated with the individual prefix -- it is carried by the routers
>    in the DFZ.  But the originator of the prefix receives the benefit.
>    Hence, there is misalignment of incentives between those receiving
>    the benefit and those bearing the cost of providing the benefit.
>    Consequently, incentives are not aligned properly to produce a
>    natural balance between the cost and benefit of maintaining routing
>    tables.
> 
> 3.4.  Table Growth Targets
> 
>    A precise target for the rate of table size or routing update
>    increase that should reasonably be supported going forward is
>    difficult to state in quantitative terms.  One target might simply be
>    to keep the growth at a stable, but manageable growth rate so that
>    the increased router functionality can roughly be covered by
>    improvements in technology (e.g., increased processor speeds,
>    reductions in component costs, etc.).

Say that "something close to linear growth" would be ideal.

>    However, it is highly desirable to significantly bring down (or even
>    reverse) the growth rate in order to meet user expectations for
>    specific services.  As discussed below, there are numerous pressures
>    to deaggregate routes.  These pressures come from users seeking
>    specific, tangible service improvements that provide "business-
>    critical" value.  Today, some of those services simply cannot be
>    supported to the degree that future demand can reasonably be expected
>    because of the negative implications on DFZ table growth.  Hence,
>    valuable services are available to some, but not all potential
>    customers.  As the need for such services becomes increasingly
>    important, it will be difficult to deny such services to large
>    numbers of users, especially when some "lucky" sites are able to use
>    the service and others are not.

Thoughts?

Thomas