Re: [rrg] Compact Routing - critique (v2)

Robin Whittle <rw@firstpr.com.au> Fri, 26 February 2010 03:03 UTC

Message-ID: <4B873A82.6090400@firstpr.com.au>
Date: Fri, 26 Feb 2010 14:05:38 +1100
From: Robin Whittle <rw@firstpr.com.au>
Organization: First Principles
User-Agent: Thunderbird 2.0.0.23 (Windows/20090812)
MIME-Version: 1.0
To: RRG <rrg@irtf.org>
References: <4B80A6B1.7020103@firstpr.com.au> <26E5D1C5D5365D47B147E5E62FC735853F91F9@FIESEXC035.nsn-intra.net>
In-Reply-To: <26E5D1C5D5365D47B147E5E62FC735853F91F9@FIESEXC035.nsn-intra.net>
Content-Type: text/plain; charset="ISO-8859-1"
Content-Transfer-Encoding: 7bit
Subject: Re: [rrg] Compact Routing - critique (v2)

Hello Hannu,

Thanks for your message.  Below is a revised version of the critique
which I hope will be OK.  It is still well over 500 words, so at the
very end is a cut-down version for the RRG Report.  It is difficult,
if not impossible, to do a decent critique in 500 words.

If you want to massage the 494 word version a little before Tony
includes it, then please do so.  If you have further suggestions
for the longer version, please respond on the list and I will
revise it.  The longer version will go in my forthcoming omnibus
of critiques which didn't make it in full into the RRG Report:
draft-whittle-rrg-critiques.

I will soon take a 5 day break from RRG matters, so I can't do any
more changes to the ~500 word version myself.

You wrote:

> Thank you Robin for spending your time reading the proposal!
> Good and thoughtful comments indeed.
> 
> Below are my responses and remarks.
> 
> Best regards
> Hannu 

>> The "Compact routing in locator identifier mapping system" 
>> proposal - hereafter "CRM" - is most easily understood as an 
>> alteration to the routing structure of the LISP-ALT mapping 
>> overlay system.  It is not a complete proposal, and therefore 
>> cannot be considered as a scalable routing solution suitable 
>> for further development.
> 
> This is true, but its use is not limited to LISP, or LISP+ALT.
> It is more generic than that. It can serve as a basis for any system
> that needs a mapping system that is relying a packet or packets to
> the destination. Eventually BGP could just take the role of
> topology discovery between the landmark nodes.
>
> However, I am ok that the critique is in the context of LISP/LISP+ALT
> as this was used in my write up as the main reference.

OK, I updated the long critique to reflect this.


>> ALT is a global query server system by which ITRs or Map 
>> Resolvers (MRs) can request mapping for a given "edge" address 
>> (EID) address.  The authoritative query servers are either 
>> ETRs or Map Servers (MSs).  The query traverses the ALT 
>> network and is delivered to the MS or ETR - which sends it 
>> reply directly to the MR or ITR via the Internet.
>> Despite its name, LISP is not a Locator / Identifier Separation
>> architecture.   LISP is a Core-Edge Separation architecture.  Core-Edge
>> Elimination architectures such as ILNP and HIP implement the 
>> Locator / Identifier Separation naming model.
> 
> Yes. Agreed.

OK.

>> The ALT network consists of routers running a BGP system quite 
>> separate from that of the Internet, communicating with each 
>> other via tunnels.
>> ALT can be used to deliver the initial packets an ITR has no 
>> mapping for, to deliver them to the destination network and 
>> for them to function as map requests.  CRM includes this as a 
>> specific aim:
>>
>>  ... particularly when the mapping system is also a message relaying
>>  service, i.e. the first packet is delivered through the mapping
>>  system to its destination.
>>
>> Since fractions of a second to perhaps several seconds may 
>> elapse before the ITR receives the map reply, it is possible 
>> that more than one packet may be sent over the ALT overlay 
>> network.  This use of the overlay network for potentially long 
>> and/or numerous traffic packets, rather than short map 
>> requests, significantly increases the load on the data plane 
>> of the overlay network.
> 
> Right, but the same system of landmarks could deliver more than just
> the first packet. This is a capacity issue (CPU, BW).

OK.


>> CRM is concerned with optimising the path length taken by these "query"
>> packets (which may be traffic packets) through a significantly 
>> modified version of the ALT network - by altering or adding to 
>> the network's BGP control plane.  Compact Routing principles 
>> are discussed in this regard, with the aim of reducing the 
>> control plane load on any one router while also generally 
>> reducing typical or maximum paths taken by the query packets.
> 
> Right.

OK.


>> While Compact Routing principles may be able to achieve these 
>> goals, compared to whatever, as-yet unspecified, structure the 
>> ALT network would achieve (to minimise path lengths while 
>> remaining robust against single points of failure) there are 
>> two objections to this approach.
>>
>> Firstly, a CRM-modified ALT structure would still be a global 
>> query server system.  No matter how ALT's path lengths and 
>> delays are optimised, there is a fundamental problem with a 
>> querier - which could be anywhere in the world - relying on 
>> mapping information from one or ideally two or more 
>> authoritative query servers, which could also be anywhere in 
>> the world.  The delays and risks of packet loss which are 
>> inherent in such a system constitute a fundamental problem for 
>> any CES or CEE system which relies on it.  The only solution 
>> is to employ local full database query servers, or some other 
>> arrangement in which there are larger numbers of authoritative 
>> query servers, with one or more typically being located quite 
>> close (say one or two thousand kilometers) from any querier.
> 
> I agree with the problem. And the solution that you offer works
> in the ideal world where large databases can be synchronized without 
> any delays. Replication of data is likely to be a problem in the real
> world case as so many times earlier. 

Yes - I have a new arrangement for Ivip in which mapping information
is replicated in real time on a global scale, but only within a single
organisation: whoever runs the multiple DITR (Default ITR in the DFZ)
sites (PTRs in LISP).  These sites are required anyway, and would
ideally be widely dispersed around the Net:

   DRTM - Distributed Real Time Mapping for Ivip & LISP
   http://www.ietf.org/mail-archive/web/rrg/current/msg06128.html
   Later to be in: draft-whittle-ivip-drtm


>> Secondly, the alterations contemplated in this proposal 
>> involve the roles of particular nodes in the network being 
>> dynamically assigned - as part of its self-organizing nature.  
>> Therefore, at one point in time, a physical node may be 
>> responsible for aggregating routes to a given set of 
>> authoritative query servers, while at a later time, this role 
>> may be moved to another node.
> 
> Exactly.

OK.


>> The discussion of Clustering in the middle of page 4 also 
>> indicates that particular nodes are responsible for 
>> registering EIDs from typically far-distant ETRs, all of which 
>> are handling closely related EIDs which this node can 
>> aggregate.  Since MSes are apparently nodes within the compact 
>> routing system, and the process of an MS deciding whether to 
>> accept EID registrations is determined as part of the 
>> self-organising properties of the system [1], there are 
>> concerns about how the vital function of EID registration can 
>> be performed securely, when no particular physical node is 
>> responsible for it.
> 
> Very correct. There was an off line comment that pointed out the same
> issue. I haven't addressed this yet, but this is not unsolvable as ISPs
> create trust relationships in the internet exchange points and with
> peering arrangements. Currently this is done by filtering out "unwanted"
> announcements. The trust between the landmarks (mapping servers) can be
> built based on the current BGP relationships. However, this needs more
> from my part.

OK.


>> Since individual nodes cost money, and are presumably run by 
>> individual participants in the entire system, it is not clear 
>> how those nodes which have the greatest traffic and the most 
>> responsibilities are to have their running costs paid for by 
>> those who benefit from its activities.
>> Furthermore, those who benefit also rely on this node in terms 
>> of reliability and security - and it does not seem tenable to 
>> trust this to some dynamically chosen node, whose security and 
>> reliability cannot be reliably ascertained.
> 
> The situation is the same with BGP in DFZ. And true, it needs to
> be evaluated for other proposals as well. 

Indeed - the economic aspect of the routing scaling problem is that
every DFZ router has to do more work for some obscure end-user
network for which it may never carry packets.

I added a note to this effect.


>> There are simpler solutions to the mapping problem than having 
>> an elaborate network of routers.  If a global-scale query 
>> system is still preferred, then it would be better to have 
>> ITRs use local MRs, each of which is dynamically configured to 
>> know the IP address of the million or so authoritative Map 
>> Server (MS) query servers - or two million or so
>> assuming they exist in pairs for redundancy.   This 10^6 figure is
>> mentioned on page 3.  Whether it is a realistic figure is 
>> uncertain, since neither CRM nor ALT have any clear set of 
>> design objectives.
>>
>> Neither ALT nor a CRM-enhanced ALT network can be suitable 
>> mapping solutions for CES or CEE architectures due to their 
>> global nature.  The solution to this problem involve bringing 
>> authoritative query servers closer to the queriers.
>>
>>
>> [1] "Accepting a map registration from an xTR, the map server 
>> also accepts to take a role to create a cluster of 
>> neighborhood, and to act as a cluster head for these xTRs. To 
>> summarize map servers forming the clusters of compact routing 
>> should be selected based on their capability to aggregate 
>> EIDs. This capability can be concluded from BGP announcements. 
>> For boot strapping an initial seed set of EIDs could be 
>> delegated to all or some map servers. However, this set will 
>> change as the registration situation evolves."
> ----------------------
> 
> Thank you for your thorough comments.

OK - thanks very much for your appreciative reply.

  - Robin



Critique of "Compact routing in locator identifier mapping system"
(Long version v2 - see "<<<<")
------------------------------------------------------------------

The "Compact routing in locator identifier mapping system" proposal -
hereafter "CRM" - is most easily understood as an alteration to the
routing structure of the LISP-ALT mapping overlay system.  Hannu         <<<<
Flinck describes it more generally: "It can serve as a basis for any
system that needs a mapping system and which relies on sending
packets to the destination. Eventually BGP could just take the role
of topology discovery between the landmark nodes."

This is not a complete proposal, and therefore cannot be considered
for further development by the IETF as a scalable routing solution.

ALT is a global query server system by which ITRs or Map Resolvers (MRs)
can request mapping for a given "edge" address (EID).  The
authoritative query servers are either ETRs or Map Servers (MSs).  The
query traverses the ALT network and is delivered to the MS or ETR -
which sends its reply directly to the MR or ITR via the Internet.
Despite its name, LISP is not a Locator / Identifier Separation
architecture.   LISP is a Core-Edge Separation architecture.  Core-Edge
Elimination architectures such as ILNP and HIP implement the Locator /
Identifier Separation naming model.

The ALT network consists of routers running a BGP system quite separate
from that of the Internet, communicating with each other via tunnels.
ALT can also carry the initial packets for which an ITR has no mapping,
delivering them to the destination network, where they also function
as map requests.  CRM includes this as a specific aim:

  ... particularly when the mapping system is also a message relaying
  service, i.e. the first packet is delivered through the mapping
  system to its destination.

Since fractions of a second to perhaps several seconds may elapse before
the ITR receives the map reply, it is possible that more than one packet
may be sent over the ALT overlay network.  This use of the overlay
network for potentially long and/or numerous traffic packets, rather
than short map requests, significantly increases the load on the data
plane of the overlay network.
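
As a rough illustration of this load (a sketch only - the packet rate,
packet size, waiting time and 100 byte map request size are assumed
figures, not taken from the CRM or ALT documents), the volume of traffic
carried by the overlay while an ITR waits for a map reply might be
estimated like this:

```python
# Hypothetical estimate of overlay load while an ITR awaits a map reply.
# All numeric inputs are illustrative assumptions, not measured values.

MAP_REQUEST_BYTES = 100  # assumed size of a short, dedicated map request


def overlay_bytes_before_reply(packets_per_s: float,
                               packet_bytes: int,
                               reply_delay_s: float) -> float:
    """Bytes of traffic packets sent over the overlay before the map reply arrives."""
    return packets_per_s * reply_delay_s * packet_bytes


def load_ratio(packets_per_s: float,
               packet_bytes: int,
               reply_delay_s: float) -> float:
    """How many times more bytes are carried than by one short map request."""
    carried = overlay_bytes_before_reply(packets_per_s, packet_bytes, reply_delay_s)
    return carried / MAP_REQUEST_BYTES


if __name__ == "__main__":
    # Example: a flow of 50 packets per second, 1500 bytes each, 0.5 second wait.
    print(overlay_bytes_before_reply(50, 1500, 0.5))
    print(load_ratio(50, 1500, 0.5))
```

With these assumed inputs the overlay carries a few hundred times the
bytes of a single short map request, for one flow from one ITR.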

CRM is concerned with optimising the path length taken by these "query"
packets (which may be traffic packets) through a significantly modified
version of the ALT network - by altering or adding to the network's BGP
control plane.  Compact Routing principles are discussed in this regard,
with the aim of reducing the control plane load on any one router while
also generally reducing typical or maximum paths taken by the query
packets.

Compact Routing principles may be able to achieve these goals better
than whatever as-yet unspecified structure the ALT network would
otherwise adopt (to minimise path lengths while remaining robust against
single points of failure).  However, there are two objections to this
approach.

Firstly, a CRM-modified ALT structure would still be a global query
server system.  No matter how ALT's path lengths and delays are
optimised, there is a fundamental problem with a querier - which could
be anywhere in the world - relying on mapping information from one or
ideally two or more authoritative query servers, which could also be
anywhere in the world.  The delays and risks of packet loss which are
inherent in such a system constitute a fundamental problem for any CES
or CEE system which relies on it.  The only solution is to employ local
full database query servers, or some other arrangement in which there
are larger numbers of authoritative query servers, with one or more
typically being located quite close to any querier (say within one or
two thousand kilometers).
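
To put rough numbers on the distance argument, the following sketch
computes a lower bound on query round-trip time.  It assumes a round
figure of 200,000 km/s for signal propagation in optical fibre (about
two thirds of the speed of light in a vacuum) and ignores router
queuing, indirect cable paths and the extra hops of an overlay, all of
which add further delay.  The function name is hypothetical:

```python
# Lower bound on round-trip time between a querier and a query server.
# Assumption: signals propagate through fibre at roughly 200,000 km/s.
FIBRE_KM_PER_S = 200_000.0


def min_rtt_ms(distance_km: float) -> float:
    """Minimum round-trip time in milliseconds for a given one-way distance."""
    one_way_s = distance_km / FIBRE_KM_PER_S
    return 2.0 * one_way_s * 1000.0


if __name__ == "__main__":
    # A nearby server versus one on the far side of the world.
    for km in (1_000, 2_000, 20_000):
        print(f"{km:>6} km: at least {min_rtt_ms(km):.0f} ms round trip")
```

Under this assumption a server within 2,000 km has a floor of about
20 ms, while one 20,000 km away cannot respond in under about 200 ms,
and any lost query or reply adds a retransmission timeout on top.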

Secondly, the alterations contemplated in this proposal involve the
roles of particular nodes in the network being dynamically assigned - as
part of its self-organizing nature.  Therefore, at one point in time, a
physical node may be responsible for aggregating routes to a given set
of authoritative query servers, while at a later time, this role may be
moved to another node.

The discussion of Clustering in the middle of page 4 also indicates that
particular nodes are responsible for registering EIDs from typically
far-distant ETRs, all of which are handling closely related EIDs which
this node can aggregate.  Since MSes are apparently nodes within the
compact routing system, and the process of an MS deciding whether to
accept EID registrations is determined as part of the self-organising
properties of the system [1], there are concerns about how the vital
function of EID registration can be performed securely, when no
particular physical node is responsible for it.

Since individual nodes cost money, and are presumably run by individual
participants in the entire system, it is not clear how those nodes which
have the greatest traffic and the most responsibilities are to have
their running costs paid for by those who benefit from their activities.
Furthermore, those who benefit also rely on such a node in terms of
reliability and security - and it does not seem tenable to trust this to
some dynamically chosen node, whose security and reliability cannot be
reliably ascertained.

These problems of unfair cost burdens, trust and security already exist     <<<<
in today's interdomain routing system.  Costs are borne by DFZ router
operators who do not benefit from the extra activities their routers
perform - since their customers may never send packets to or receive
them from some of the PI-using end-user networks which advertise prefixes
in the DFZ.  In today's system, all network operators and Internet users
place a high level of trust in every aspect of the routing system -
such as trusting it not to forward packets addressed to one network to
another.

There are simpler solutions to the mapping problem than having an
elaborate network of routers.  If a global-scale query system is still
preferred, then it would be better to have ITRs use local MRs, each of
which is dynamically configured to know the IP addresses of the million
or so authoritative Map Server (MS) query servers - or two million or so
assuming they exist in pairs for redundancy.   This 10^6 figure is
mentioned on page 3.  Whether it is a realistic figure is uncertain,
since neither CRM nor ALT has any clear set of design objectives.
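
A minimal sketch of such a local MR follows.  It is an illustration
only: the class name, the method names and the example prefixes and
addresses (drawn from private and documentation ranges) are all
hypothetical, and nothing here is specified by LISP, ALT or CRM.  The MR
holds a table of EID prefixes, each mapped to the address of its
authoritative MS, and finds the MS for an EID by longest-prefix match:

```python
import ipaddress
from typing import Dict, Optional, Union

Network = Union[ipaddress.IPv4Network, ipaddress.IPv6Network]


class LocalMapResolver:
    """Hypothetical local MR: a table of EID prefix -> authoritative MS address."""

    def __init__(self) -> None:
        self._ms_by_prefix: Dict[Network, str] = {}

    def configure(self, eid_prefix: str, ms_address: str) -> None:
        """Install or update the MS address for an EID prefix."""
        self._ms_by_prefix[ipaddress.ip_network(eid_prefix)] = ms_address

    def find_ms(self, eid: str) -> Optional[str]:
        """Return the MS address for the longest configured prefix covering eid."""
        addr = ipaddress.ip_address(eid)
        # Try the most specific prefix length first, then shorter ones.
        for prefix_len in range(addr.max_prefixlen, -1, -1):
            candidate = ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)
            ms_address = self._ms_by_prefix.get(candidate)
            if ms_address is not None:
                return ms_address
        return None  # no MS is configured for this EID


mr = LocalMapResolver()
mr.configure("10.1.0.0/16", "192.0.2.1")
mr.configure("10.1.2.0/24", "198.51.100.7")
print(mr.find_ms("10.1.2.3"))    # covered by the /24 entry
print(mr.find_ms("10.1.9.9"))    # falls back to the /16 entry
print(mr.find_ms("172.16.0.1"))  # no entry covers this EID
```

Each lookup costs at most 33 dictionary probes for an IPv4 EID (129 for
IPv6) and involves no overlay network at all: the MR would then send its
query directly to the MS address it finds.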

Neither ALT nor a CRM-enhanced ALT network can be a suitable mapping
solution for CES or CEE architectures, due to their global nature.  The
solution to this problem involves bringing authoritative query servers
closer to the queriers.

[1] "Accepting a map registration from an xTR, the map server also
accepts to take a role to create a cluster of neighborhood, and to act
as a cluster head for these xTRs. To summarize map servers forming the
clusters of compact routing should be selected based on their capability
to aggregate EIDs. This capability can be concluded from BGP
announcements. For boot strapping an initial seed set of EIDs could be
delegated to all or some map servers. However, this set will change as
the registration situation evolves."





494 word version for the RRG Report
-----------------------------------

"Compact routing in locator identifier mapping system" (CRM) is most
easily understood as an elaboration of the routing overlay structure
of LISP-ALT - but one which is applicable to other packet-routing
based global query server systems.

This is not a complete proposal, and therefore cannot be considered
for further development by the IETF as a scalable routing solution.

A CRM system is intended to deliver initial traffic packets to their
destination networks, where they also function as map requests.
These packets may be long and numerous in the fractions of a second
to perhaps several seconds which may elapse before the ITR receives
the map reply.

Compact Routing principles are used to reduce the control plane load
on any one router while also generally reducing the total lengths of
paths taken by the packets.  Some objections to this approach
include the following:

A CRM-modified ALT structure would still be a global query server
system.  No matter how path lengths and delays are reduced, there is
a fundamental problem with a querier - which could be anywhere in
the world - relying on mapping information from one or ideally two
or more authoritative query servers, which could also be anywhere in
the world.  The delays and risks of packet loss which are inherent
in such a system constitute a fundamental problem for any CES or CEE
system which relies on it.

The alterations contemplated in this proposal involve the roles of
particular nodes in the network being dynamically assigned - as part
of its self-organizing nature.

The discussion of Clustering in the middle of page 4 indicates that
particular nodes are responsible for registering EIDs from typically
far-distant ETRs, all of which are handling closely related EIDs
which this node can aggregate.  Since MSes are nodes within the
compact routing system, and the process of an MS deciding whether to
accept EID registrations is determined as part of the
self-organising properties of the system, there are concerns about
how EID registration can be performed securely, when no particular
physical node is responsible for it.

There are also concerns about individually owned nodes performing
work for other organisations.  Such problems of trust and of
responsibilities and costs being placed on those who do not directly
benefit already exist in the interdomain routing system, and are a
challenge for any scalable routing solution.

There are simpler solutions to the mapping problem than having an
elaborate network of routers.  If a global-scale query system is
still preferred, then it would be better to have ITRs use local MRs,
each of which is dynamically configured to know the IP addresses of
the million or so authoritative query servers.

No global query server system can be an optimal choice for CES or CEE
architectures, given the inherently greater delays and risks of
packet loss.  The solution appears to involve a greater number of
widely distributed authoritative query servers, one or more of which
will be acceptably close to each querier.