Re: [rrg] A Revised critique for LISP

jnc@mercury.lcs.mit.edu (Noel Chiappa) Thu, 18 February 2010 19:26 UTC

To: rrg@irtf.org
Message-Id: <20100218192753.4018C6BE61F@mercury.lcs.mit.edu>
Date: Thu, 18 Feb 2010 14:27:53 -0500
From: jnc@mercury.lcs.mit.edu
Cc: jnc@mercury.lcs.mit.edu
Subject: Re: [rrg] A Revised critique for LISP
Precedence: list

    > From: Lixia Zhang <lixia@cs.ucla.edu>

    > here is a revision of LISP critique

This looks OK; I have comments on a few points (below).

Note: these are places where I think the critique says something incorrect -
disagreements based on engineering judgement of course belong in the rebuttal.


    > Thus "EID" in LISP is not a host identifier, but IP addresses used
    > within a site for packet delivery.

We keep going around and around on this.

The LEID is used to identify the host, and does so across a global scope - in
both senses of the term - i.e. it is the same everywhere in the network, and
it is also used for that purpose everywhere in the network. Hosts _everywhere_
in the Internet will _always_ use that identifier for communicating with the
host named by that LEID.

So it _is_ a host identifier, and is used for _more_ than just 'packet
delivery ... within a site'; the wording above is, at best, misleading.

In _some_ cases it _also_ has location semantics, with local scope (i.e. that
location information is _not_ useful outside that local scope - even though
it _is_ unique). However, in other cases it doesn't.

The LEID does not _cleanly_ correspond to either an endpoint identifier or an
interface locator because, depending on the circumstances, and because of the
need for backwards compatibility, it can in some cases have mixed semantics. Asking
whether an LEID is 'an identifier, or a locator' is thus akin to asking if an
octagon is a square or a circle.

If you absolutely _have_ to stuff it into either the round hole or the square
hole, I'd say the 'endpoint identifier' one is a better fit, because it
_always_ has identification semantics on a global scope, whilst on the other
hand, i) in _no_ circumstances does an LEID ever have global location
semantics (which is what I think of a locator as having), and ii) in _some_
circumstances it has no location semantics _at all_.

How can a thing which _always_ has endpoint identity semantics, and sometimes
has _no_ interface location semantics, be an 'address'?

Since fully and clearly exploring this point would use all the space
available, why don't you just say that an LEID is an 'identifier which,
because of the requirements of backwards compatibility, has complex
semantics', and let it go at that?


    > LISP's most serious challenges are due to the fact that it effectively
    > divides today's routed IP address space into two, edges and the core,
    > which comes with all the challenges that such a grand division brings;
    > the list below attempts to capture the major ones that have been
    > identified

You seem to have modelled this on my earlier text:

  LISP's most serious challenges are due to the fact that it is effectively a
  new packet-switching layer, with all the challenges (neighbour liveness
  detection, etc) that such layers bring - but with a much larger fan-out
  than is typical in packet-switching systems, since any ITR might
  communicate with any ETR.

First, it's not clear if by the "divi[sion of] today's routed IP address
space into two", you mean to refer to i) the fact that a particular 32-bit
(or 128-bit for IPv6) value might be either an EID or an RLOC (or a legacy
address), or ii) generic architectural challenges from the use of two
separate namespaces (location and identity) instead of one (addresses).
Or perhaps you meant both?

The first possible meaning is a significant one, but I am pretty sure that it
is not, in architectural terms, as significant a challenge as the one I
mentioned. The second (separate namespaces) one is certainly significant, but
it is very different from the one I mentioned.

I am not at all sure that the list of problems below is all due to the
division of the address space into two (either possible meaning). For
instance, the 'RLOC liveness' problem has nothing to do with the division of
the namespaces, but is _entirely_ due to the 'new packet switching layer'
architectural problem (and the large fan-out exacerbator).

So I would think the text should both clarify the meaning of the 'separation
of the address space into two' issue, and retain the mention of the 'new
packet switching layer', as that is a separate issue, architecturally.


    > The first question is whether, or how, one can draw a clear boundary to
    > sort existing networks into the core and edges. 

In LISP, _networks_ aren't sorted into 'core' and 'edge' - _pieces of the
namespace_ are sorted into RLOCs (names of forwarding boxes) and EIDs (names
of endpoints). Big difference...


    > the mapping database that captures connectivity between an edge site

But the mapping doesn't _really_ capture 'connectivity' (which is the cause
of one of the weaknesses listed, the inability of a distant site to make
certain that the destination EID is reachable, not just its RLOC). I'm not
sure offhand what term would be appropriate - it's something like 'an
administrative indication of the ETRs which can provide access to a given
EID', or something like that.

    > its TRs

LISP uses the term 'xTR' for a device which is both an ITR and an ETR.


    > the question of how to handle packets while ITR waiting for the mapping
    > information. The current decision (dropping packets) favors simplicity
    > at the cost of data performance. Would be feasible to buffer the
    > packets? How deep such a buffer could be? Such questions need future
    > research.

Recent discussion has indicated that this is in fact a non-issue; that there
is an efficient, simple, workaround. (I mention this because of the space
limit; you might not wish to spend space covering something which is not a
problem.)

Any reasonable deployment plan _has_ to provide access to LISP sites from
'legacy' sites, and that will involve some machine(s) advertizing those
destinations into the DFZ, and being able to forward traffic to those
destinations when it turns up at those machines. It turns out that there are
lots of Bad Dudes out there who are busy scanning the address space, and it
has been observed (on the LISP testbed) that that activity results in
the devices which do that advertizing (the PITRs) winding up with caches
completely filled in. So the 'packet at an ITR which has no mapping for the
destination EID on hand' can be handled by simply sending the packet to a
PITR instead. This will result in a small amount of path stretch until the
ITR gets the mapping, but that should not have any adverse effects.
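
A minimal sketch of the workaround described above, in Python, purely for
illustration. The class name `Itr`, its fields, and the addresses are
hypothetical and not taken from any LISP specification or implementation;
the `pending` set merely stands in for whatever mechanism actually requests
the mapping.

```python
import ipaddress
from dataclasses import dataclass, field


@dataclass
class Itr:
    """Toy ITR: forwards to a cached ETR RLOC, else falls back to a PITR."""
    pitr_rloc: str                                  # assumed RLOC of some PITR
    map_cache: dict = field(default_factory=dict)   # EID prefix -> ETR RLOC
    pending: set = field(default_factory=set)       # EIDs awaiting a mapping

    def lookup(self, dst_eid):
        """Longest-prefix match of dst_eid against the local map cache."""
        addr = ipaddress.ip_address(dst_eid)
        matches = [p for p in self.map_cache if addr in p]
        if not matches:
            return None
        return self.map_cache[max(matches, key=lambda p: p.prefixlen)]

    def next_hop(self, dst_eid):
        """Return the RLOC to encapsulate towards; never drops the packet."""
        rloc = self.lookup(dst_eid)
        if rloc is not None:
            return rloc
        # Cache miss: note that a mapping is needed, and meanwhile send the
        # packet via the PITR (some path stretch, but no loss or buffering).
        self.pending.add(dst_eid)
        return self.pitr_rloc

    def install(self, eid_prefix, etr_rloc):
        """Install a mapping once it arrives; later packets go direct."""
        net = ipaddress.ip_network(eid_prefix)
        self.map_cache[net] = etr_rloc
        self.pending = {e for e in self.pending
                        if ipaddress.ip_address(e) not in net}
```

In this sketch, `next_hop` returns the PITR's RLOC only until `install` is
called for a covering EID prefix, after which packets go straight to the ETR.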


    > Because of this caching effect, and the fact that the ETR to a
    > multihomed destination site is chosen at ITR, LISP design also faces
    > challenges of response to component failures. LISP cannot easily test
    > reachability of ultimate destinations (e.g. behind an ETR).

The last sentence here does not follow from the first. The difficulty LISP,
or any other system (e.g. the current BGP routing), would face in "test[ing]
reachability of ultimate destinations" has nothing to do with either
i) caching of bindings, or ii) 'early' binding of ETR to a destination
EID at the ITR.

If you want to mention the 'cannot easily test reachability of ultimate
destinations (e.g. behind an ETR)', it needs to be a separate bullet point.
It's not due to _either_ of the architectural issues identified earlier (two
separate namespaces, and the new packet-switching layer), as can be seen from
the fact that the current BGP system also has this issue.


    > One would not be able to see global routing table size reduction
    > unless/until LISP has been adopted by significant number of networks.

As Dino has already mentioned, this (as written) is not correct. It depends
on whether one reads this as 'any global routing table size reduction', or
'massive global routing table size reduction' - and the text as written is
ambiguous.

If read the former way, this is not correct, because the first site that
replaces use of a number of more-specifics (for traffic control purposes)
with LISP, and withdraws the more-specifics, will result in a reduction in
the global routing table size. Each additional site to so convert will result
in a further reduction in the global routing table size.
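
The arithmetic can be sketched in a few lines of Python. This is a toy model
only: it assumes the site keeps advertizing its covering aggregate and
withdraws just the more-specifics, and the function name and prefixes
(documentation ranges) are made up for the example.

```python
import ipaddress


def withdraw_more_specifics(dfz_routes, site_aggregate):
    """Return the routes left once a site's more-specifics are withdrawn.

    Assumes the site's covering aggregate itself stays in the table.
    """
    agg = ipaddress.ip_network(site_aggregate)
    kept = []
    for route in dfz_routes:
        net = ipaddress.ip_network(route)
        is_more_specific = (net.version == agg.version
                            and net != agg
                            and net.subnet_of(agg))
        if not is_more_specific:
            kept.append(route)
    return kept


# One site using two /25s for traffic control, plus an unrelated route.
table = ["203.0.113.0/24", "203.0.113.0/25", "203.0.113.128/25",
         "198.51.100.0/24"]
after = withdraw_more_specifics(table, "203.0.113.0/24")
```

Here a single site converting takes the toy table from four entries to two;
how this scales to the real table depends on the factors discussed next.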

So, to make this accurate, you need to go into more detail, because how much
reduction will result is a complex question, with many dependencies (e.g. how
many sites which want to multi-home will use LISP to do so, instead of a
globally-advertized PI route).


	Noel
