DPA replacement?

"John G. Scudder" <jgs@ieng.com> Sun, 25 August 1996 00:55 UTC

Message-Id: <v03007803ae44ecd551c6@[152.160.213.42]>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Date: Sat, 24 Aug 1996 17:50:58 -0700
To: bgp@ans.net
From: "John G. Scudder" <jgs@ieng.com>
Subject: DPA replacement?
Sender: owner-idr@merit.edu
Precedence: bulk

A little while ago I was wrestling with the current DPA drafts and figured
out the the respective drafts and RFCs (the DPA draft, the DPA usage draft,
and RFC 1771) form kind of a contradiction loop.  This seems to stem from
the fact that the current DPA draft wants to treat DPA as a global
administratively defined metric (which still carries around AS numbers for
some reason even though there's not a reasonable way to use them) or maybe
(it isn't clear) as an arbitrary tag (even though Communities does this
just fine).  It's quite clear in either case that the DPA should be used to
bias or set localpref.  The usage draft, on the other hand, keeps the
earlier idea of making DPA a MED-like metric which is only comparable when
the AS part of the DPA matches.  But, RFC 1771 says you can't do elections
when biasing or setting localpref.

This got me thinking about what DPA really wants to do and I ended up with
the following ideas, which are basically tricks to try to make a global
metric useful.  Your feedback is solicited.  Comments from network
operators as to the real-world usefulness of any mechanism such as this (or
the current DPA draft for that matter) would be particularly interesting.

This is still pretty drafty; there are a number of unresolved notes in
[square brackets].  Also, I didn't spend any time trying to come up with a
clever TLA, this leads to some textual awkwardness where I refer to "the
attribute" or "the metric" generically.  Sorry.

--John


Introduction
------------

The current DPA proposal <draft-ietf-idr-bgp-dpa-05.txt> differs from
earlier versions of the document in that it eliminates the requirement that
only DPAs with like source ASes be comparable.  This makes the AS# portion
of the DPA largely redundant; it also increases the possibility of
conflicts in scenarios such as that shown in section 3 of
<draft-ietf-idr-dpa-application-02.txt>.

Also, scenarios such as the aforementioned suffer from DPA's inability to
select between multiple branching paths.  The application document suggests
using Local_Pref to overcome this limitation.  This proposal outlines
another approach, involving a globally significant metric which is refined
in smaller and smaller increments as it propagates further from its
originator.

The proposed algorithm allows the preference of each AS in the path to be
reflected in the metric.  Total ordering is preserved in many cases, but
see the "Limitations" section for notable exceptions.

Path Attribute
--------------

This document proposes an optional transitive path attribute of fixed
length.  The attribute is a range of integers represented by a pair <lower,
upper>.  Both values are four octet, non-negative integers.  The attribute
may be modified by ASes other than the orginator, according to the
following algorithm.  The attribute must not be replaced other than
according to the following algorithm.  [How about removing it?]

[Originator's AS# could also be included if anyone can think of a good use
for it.]

Route Selection Process
-----------------------

For the purposes of route selection, <lower> should be considered to be the
metric.  Routes with a higher metric should be preferred to routes with a
lower metric.  [This could be by biasing the local_pref, or by including
the comparison after the local_pref comparison.]  Routes with any metric
should be preferred over similar routes without any metric.

Operation
---------

When an AS originates routes for which it wishes to bias upstream selection
preference, it should set the metrics according to the following algorithm:

Call the received attribute <rcv_lower, rcv_upper>
range = rcv_upper - rcv_lower
n = number of distinct preferences with which the route is to be offered
p = ordinal preference of the route in question, with the most preferred
route having the highest preference, and the least preferred route having
preference ordinal 1.  Note that several routes could have the same
preference.

For a route with preference p,

        lower = rcv_lower + (p-1) * range/n
        upper = rcv_lower +   p   * range/n

When originating a route, rcv_lower is defined as 0 and rcv_upper is
defined as 2^32-1.  [Or rcv_lower could be defined as 1; this could be
convenient since 0 could be used as a magic cookie meaning "no metric"
which would also sort correctly per the rules above.]  Otherwise, when
propagating a route, rcv_lower and rcv_upper are taken from the received
attribute.

If the AS propagating the route has no wish to bias upstream selection
preference, it need not (and should not) modify the attribute.

The intent of this algorithm is to divide the original sender's preference
up into a set of non-overlapping ranges whose ordering reflects the
preferences of all transit ASes which have such preference.  [But see
"Limitations" for cases where ranges can overlap.]

[Hmm, what do we do when we work range sizes all the way down to zero?
This won't take long in principle, but in practice we might actually have a
low enough diameter to make this a non-issue?]

Example
-------

Here is an example, with a comparison to DPA:

        A-------B----------------D----+
        |                        |\   |
        +-------C--------E-------+ \  |
                |\                  | |
                | -------F----------+ |
                |                     |
                +--------G------------+

Consider A-G to be ASes.  Suppose that A prefers to receive packets via B.
Likewise, C prefers to receive packets for A via E over F and F over G.

Using DPAs, A could send DPA <A, 2> to B and <A, 1> to C.  C could send DPA
<C, 3> to E, <C, 2> to F and <C, 1> to G.  D would have no way within the
context of DPA of choosing between B and E.  This could be fixed using
Local_Pref, but the amount of coordination required seems undesirable if
another workable option is available.

Using the algorithm above, A would originate two attributes.  To B, it
would send <2147483648, 4294967295>.  To C it would send <0, 2147483648>.
C would originate three attributes.  To E it would send <1431655765,
2147483648>.  To F it would send <715827883, 1431655765>.  To G it would
send <0, 715827883>.  D would then be able to determine the desired
ordering (B beats E beats F beats G).

Limitations
-----------

This scheme preserves a total ordering on the metric so long as each AS
where branching occurs has a preference and thus modifies the attribute
(such as in the example).  In the event that a branching occurs with no
modification of the preference, followed by a reconvergence of paths, some
of the ordering property is lost.  That is, ranges may overlap.  For
example:

	B--------C--------E--------J
	|	 |		   |
	|	 +--------F--------|
	|			   |
	+--------D--------G--------|
		 |		   |
		 +--------H--------+

Suppose B has no preference between C or D, but C prefers E over F, and D
prefers G over H.  When J receives B's routes from E, F, G and H it will be
able to determine that it should prefer (E or G) over (F or H) but will not
be able to determine by examining the metric whether to prefer E or G.

As long as all "primary" paths are up, this behaves as desired, by allowing
J to choose whichever preferred path it likes.  However, if one primary
path (E, for example) goes down, traffic for B will be "attracted" to the
remaining "primary" path (G in this case).  This might not be desirable.

In a similar vein, consider this case, similar to the previous but with the
addition of AS K.  D prefers G over H over K:

	B--------C--------E--------J
	|	 |		   |
	|	 +--------F--------|
	|			   |
	+--------D--------G--------|
		 |		   |
		 |--------H--------|
		 |		   |
		 +--------K--------+

In this case J will prefer (E or G) over (F or H or K).  However, we also
now have H preferred over F preferred over K.  This is just an artifact of
the algorithm and doesn't really reflect any particular preference of any
downstream AS.

--
John Scudder                        email:  jgs@ieng.com
Internet Engineering Group, LLC     phone:  (313) 669-8800
122 S. Main, Suite 280              fax:    (313) 669-8661
Ann Arbor, MI  41804                www:    http://www.ieng.com


- - - - - - - - - - - - - - - - -

Re: DPA replacement? Tony Bates
DPA replacement? John G. Scudder
Re: DPA replacement? Curtis Villamizar