[lisp] Comments on lisp-architecture

Albert Cabellos <acabello@ac.upc.edu> Sat, 03 November 2012 13:02 UTC

From: Albert Cabellos <acabello@ac.upc.edu>
Date: Sat, 03 Nov 2012 14:00:34 +0100
Message-Id: <AD8D31C0-CA49-424E-9ACF-E32526670936@ac.upc.edu>
To: Noel Chiappa <jnc@mercury.lcs.mit.edu>
Cc: lisp@ietf.org
Subject: [lisp] Comments on lisp-architecture

Hi all,

Here are my comments on draft-chiappa-lisp-architecture-01.

Thanks!

Albert


> LISP Working Group                                         J. N. Chiappa
> Internet-Draft                              Yorktown Museum of Asian Art
> Intended status: Informational                             July 16, 2012
> Expires: January 17, 2013
> 
> 
>                 An Architectural Perspective on the LISP
>                   Location-Identity Separation System
>                    draft-chiappa-lisp-architecture-01

[snip]

> 2.2. Deployment of New Namespaces
> 
>    Once the mapping system is widely deployed and available, it should
>    make deployment of new namespaces (in the sense of new syntax, if not
>    new semantics) easier. E.g. if someone wishes in the future to
>    devise a system which uses native MPLS [RFC3031] for a data carriage
>    system joining together a large number of xTRs, it would easy enough
>    to arrange to have the mappings for destinations attached to those
>    xTRs abe some sort of MPLS-specific name.

Once -> I suggest removing this word, given that the Mapping System is already there.

MPLS -> Although it is a good analogy, I don't think that MPLS is a good example, given that with LISP we can't stack labels.

>    More broadly, the existence of a binding layer, with support for
>    multiple namespace built into the interface on both sides (see
>    Section 5) is a tremendously powerful evolutionary tool; one can
>    introduce a new namespace (on one side) more easily, if it is mapped
>    to something which is already deployed (on the other). Then, having
>    taken that step, one can invert the process, and deploy yet another
>    new namespace, but this time on the other.
> 
> 2.3. Future Development of LISP
> 
>    Speculation about long-term future developments which are enabled by
>    the deployment of LISP is not really proper for this document.
>    However, interested readers may wish to consult [Future] for one
>    person's thoughts on this topic.
> 
> 3. Architectual Perspectives
> 
>    This section contains some high-level architectural perspectives
>    which have proven useful in a number of ways for thinking about LISP.
>    For one, when trying to think of LISP as a complete system, they
>    provide a conceptual structure which can aid analysis of LISP. For
>    another, they can allow the application of past analysis of, and
>    experience with, similar designs.
> 
> 3.1. Another Packet-Switching Layer

Weak suggestion: it may be worth mentioning Van Jacobson's view that the Internet was an overlay on the Public Switched Telephone Network; in the same sense, LISP is an overlay on the Internet.

>    When considering the overall structure of the LISP system at a high
>    level, it has proven most useful to think of it as another packet-
>    switching layer, run on top of the original internet layer - much as
>    the Internet first ran on top of the ARPANET.
> 
>    All the functions that a normal packet switch has to undertake - such
>    as ensuring that it can reach its neighbours, and they they are still
>    up - the devices that make up the LISP overlay also have to do, along
>    the 'tunnels' which connect them to other LISP devices.
> 
>    There is, however, one big difference: the fanout of a typical LISP
>    ITR will be much larger than most classic physical packet switches.
>    (ITRs only need to be considered, as the LISP tunnels are all
>    effectively unidirectional, from ITR to ETR - an ETR needs to keep no
>    per-tunnel state, etc.)
> 
>    LISP is, fundamentally, a 'tunnel' based system. Tunnel system
>    designs do have their issues (e.g. the high inter-'switch' fan-out),
>    but it's important to realize that they also can have advantages,
>    some of which are listed below.

[snip]

> 4.2. Need for a Mapping System
> 
>    LISP does need to have a mapping system, which brings design,
>    implementation, configuration and operational costs. Surely all
>    these costs are a bad thing?  However, having a mapping system have
>    advantages, especially when there is a mapping layer which has global
>    visibility (i.e. other entities know that it is there, and have an
>    interface designed to be able to interact with it). This is unlike,
>    say, the mappings in NAT, which are 'invisible' to the rest of the
>    network.

Typo? "Surely all these costs are a bad thing."
Typo: However, having a mapping system *has*...

>    In fact, one could argue that the mapping layer is LISP's greatest
>    strength. Wheeler's Axiom* ('Any problem in computer science can be
>    solved with another level of indirection') indicates that the binding
>    layer available with the LISP mapping system will be of great value.
>    Again, it is not the job of this document to list them all - and in
>    any event, there is no way to forsee them all.
> 
>    The author of this document has often opined that the hallmark of
>    great architecture is not how well it does the things it was designed
>    to do, but how well it does things it was never expected to have to
>    handle. Providing such a powerful and generic binding layer is one
>    sure way to achieve the sort of lasting flexibility and power that
>    leads to that outcome.
> 
>    [Footnote *: This Axiom is often mis-attributed to Butler Lampson,
>    but Lampson himself indicated that it came from David Wheeler.]
> 
> 4.3. Piggybacking of Control on User Data
> 
>    LISP piggybacks control transactions on top of user data packets.
>    This is a technique that has a long history in data networking, going
>    back to the early ARPANET. [McQuillan] It is now apparently regarded
>    as a somewhat dubious technique, the feeling seemingly being that
>    control and user data should be strictly segregated.
> 
>    It should be noted that _none_ of the piggybacking of control
>    functionality in LISP is _architecturally fundamental_ to LISP. All
>    of the functions in LISP which are performed with piggybacking could
>    be performed almost equally well with separate control packets.
> 
>    The "almost" is solely because it would cause more overhead (i.e.
>    control packets); neither the response time, robustness, etc would
>    necessarily be affected - although for some functions, to match the
>    response time observed using piggybacking on user data would need as
>    much control traffic as user data traffic.
> 
>    This technique is particularly important, however, because of the
>    issue identified at the start of this section - the very large fanout
>    of the typical LISP switch. Unlike a typical router, which will have
>    control interactions with only a few neighbours, a LISP switch could
>    eventually have control interactions with hundreds, or perhaps even
>    thousands (for a large site) of neighbours.
> 
>    Explicit control traffic, especially if good response times are
>    desired, could amount to a very great deal of overhead in such a
>    case.

Maybe it's worth mentioning a specific example of control piggybacked on user data: echo-nonce.
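
As a rough illustration of why echo-nonce counts as piggybacking, here is a minimal sketch of the exchange as I understand it from the LISP specification: the sender marks an ordinary data packet with a nonce and an "echo request" flag, and the peer copies that nonce into the next data packet it sends back. No separate control packet is needed. The class and field names (LispHeader, EchoNoncePeer, n_bit, e_bit) and the 24-bit nonce width are assumptions made for this sketch, not taken from any implementation.

```python
import secrets
from dataclasses import dataclass
from typing import Optional


@dataclass
class LispHeader:
    """Hypothetical, simplified view of the flags carried on a data packet."""
    n_bit: bool = False          # a nonce is present
    e_bit: bool = False          # the sender asks for the nonce to be echoed
    nonce: Optional[int] = None


class EchoNoncePeer:
    """Echo-nonce state kept by one xTR towards one remote RLOC (sketch)."""

    def __init__(self) -> None:
        self.pending_nonce: Optional[int] = None   # nonce we asked the peer to echo
        self.nonce_to_echo: Optional[int] = None   # nonce the peer asked us to echo
        self.reachable: Optional[bool] = None      # unknown until an echo arrives

    def encapsulate(self, request_echo: bool = False) -> LispHeader:
        """Build the header for the next outgoing user-data packet."""
        if self.nonce_to_echo is not None:
            # Echo the peer's nonce back on our own data traffic.
            hdr = LispHeader(n_bit=True, e_bit=False, nonce=self.nonce_to_echo)
            self.nonce_to_echo = None
            return hdr
        if request_echo:
            self.pending_nonce = secrets.randbits(24)  # assumed nonce width
            return LispHeader(n_bit=True, e_bit=True, nonce=self.pending_nonce)
        return LispHeader()

    def decapsulate(self, hdr: LispHeader) -> None:
        """Process the header of an incoming user-data packet."""
        if not hdr.n_bit:
            return
        if hdr.e_bit:
            self.nonce_to_echo = hdr.nonce
        elif self.pending_nonce is not None and hdr.nonce == self.pending_nonce:
            # Our nonce came back: the path to the peer and back works.
            self.reachable = True
            self.pending_nonce = None
```

The point for section 4.3 is that both the request and the reply ride on packets that would have been sent anyway, so the reachability check adds header bits but no packets.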

[snip]

> 5.3. Overlapping Uses of Existing Namespaces
> 
>    It is in theory possible to have a block of IPvN namespace used as
>    both EIDs and RLOCs. In other words, EIDs from that block might map
>    to some other RLOCs, and that block might also appear in the DFZ as
>    the locators of some other ETRs.
> 
>    This is obviously potentially confusing - when a 'bare' IPvN address
>    from one of these blocks, is it the RLOC, or the EID?  Sometimes it
>    it obvious from the context, but in general one could not simply have
>    a (hypothetical) table which assigns all of the address space to
>    either 'EID' or 'RLOC'.
> 
>    In addition, such usage will not allow interoperation of the sites
>    named by those EIDs with legacy sites, using the PITR mechanism
>    ([Introduction], Section "Proxy Devices"), since that mechanisms
>    depends on advertizing the EIDs into the DFZ, although the LISP-NAT
>    mechanism should still work ([Introduction], Section "LISP-NAT").
> 
>    Nevertheless, as the IPv4 namespace becomes increasingly used up,
>    this may be an increasingly attractive way of getting the 'absolute
>    last drop' out of that space.

I think there might be some issues when overlapping namespaces are combined with the LISP-TE recursion mechanism. In that scenario the mapping system is queried using RLOCs. If a given address has meaning both as an EID and as an RLOC, the lookup could return the wrong kind of entry.

> 5.4. LCAFs
> 
>    {{To be written.}}
> 
>    --- Key-ID
>    --- Instance-IDs
> 
> 6. Scalability
> 
>    As with robustness, any global communication system must be scalable,
>    and scalable up to almost any size. As previously mentioned (xref
>    target="Perspectives-Packet"/), the large fanouts to be seen with
>    LISP, due to its 'overlay' nature, present a special challenge.
> 
>    One likely saving grace is that as the Internet grows, most sites
>    will likely only interact with a limited subset of the Internet; if
>    nothing else, the separation of the world into language blocks means
>    that content in, say, Chinese, will not be of interest to most of the
>    rest of the world. This tendency will help with a lot of things
>    which could be problematic if constant, full, N^2 connectivity were
>    likely on all nodes; for example the caching of mappings.

I suggest replacing the 'Chinese' example with a more general sentence, something like "the separation of the world into language blocks might suggest that users speaking a given language do not typically access content written in other languages". Besides this, many measurements show that, because of port scans, networks reach the entire Internet, so this might not be strictly true.
 
> 6.1. Demand Loading of Mappings
> 
>    One question that many will have about LISP's design is 'why demand-
>    load mappings - why not just load them all'?  It is certainly true
>    that with the growth of memory sizes, the size of the complete
>    database is such that one could reasonably propose keeping the entire
>    thing in each LISP device. (In fact, one proposed mapping system for
>    LISP, named NERD, did just that. [NERD])
> 
>    A 'pull'-based system was chosen over 'push' for several reasons; the
>    main one being that the issue is not just the pure _size_ of the
>    mapping database, but its _dynamicity_. Depending on how often
>    mappings change, the update rate of a complete database could be
>    relatively large.
> 
>    It is especially important to realize that, depending on what
>    (probably unforseeable) uses eventually evolve for the
>    identity->location mapping capability LISP provides, the update rate
>    could be very high indeed. E.g. if LISP is used for mobility, that
>    will greatly increase the update rate. Such a powerful and flexible
>    tool is likely be used in unforseen ways (Section 4.2), so it's
>    unwise to make a choice that would preclude any which raise the
>    update rate significantly.
> 
>    Push as a mechanism is also fundamentally less desirable than pull,
>    since the control plane overhead consumed to load and maintain
>    information about unused destinations is entirely wasted. The only
>    potential downside to the pull option is the delay required for the
>    demand-loading of information.
> 
>    (It's also probably worth noting that many issues that some people
>    have with the mapping approach of LISP, such as the total mapping
>    database size, etc are the same - if not worse - for push as they are
>    for pull.)
> 
>    Finally, for IPv4, as the address space becomes more highly used, it
>    will become more fragmented - i.e. there will tend to be more,
>    smaller, entries. For a routing table, which every router has to
>    hold, this is problematic. For a demand-loaded mapping table, it is
>    not bad. Indeed, this was the original motivation for LISP
>    ([RFC4984]) - although many other useful and desirable uses for it
>    have since been enumerated (see [Introduction], Section
>    "Applications").
> 
>    For all of these reasons, as long as there is locality of reference
>    (i.e. most ITRs will use only a subset of the entire set), it makes
>    much more sense to use the a pull model, than the classic push one
>    heretofore seen widely at the internetwork layer (with a pull
>    approach thus being somewhat novel - and thus unsettling to many - to
>    people who work at that layer).
> 
>    It may well be that some sites (e.g. large content providers) may
>    need non-standard mechanisms - perhaps something more of a 'push'
>    model. This remains to be determined, but it is certainly feasible.
> 
> 6.2. Caching of Mappings
> 
>    It should be noted that the caching spoken of here is likely not
>    classic caching, where there is a fixed/limited size cache, and
>    entries have to be discarded to make room for newly needed entries.
>    The economics of memory being what they are, there is no reason to
>    discard mappings once they have been loaded (although of course
>    implementations are free to chose to do so, if they wish to).

I am not sure I understood this last paragraph. AFAIK, in high-speed routers the map-cache must be implemented in TCAM, which is expensive and hence has a fixed/limited size.
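
To make the concern concrete, here is a minimal sketch of a map-cache with a hard capacity, as a TCAM-backed table would have. Once the table is full, installing a new mapping forces an eviction. LRU is used here only as one possible policy, the name BoundedMapCache is hypothetical, and lookups are exact-match on the prefix string rather than longest-prefix match, to keep the example short.

```python
from collections import OrderedDict
from typing import List, Optional


class BoundedMapCache:
    """Fixed-capacity EID-prefix -> RLOC-list cache with LRU eviction (sketch)."""

    def __init__(self, capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._entries: "OrderedDict[str, List[str]]" = OrderedDict()
        self.evictions = 0

    def lookup(self, eid_prefix: str) -> Optional[List[str]]:
        """Return the RLOCs for a prefix, or None on a miss."""
        rlocs = self._entries.get(eid_prefix)
        if rlocs is not None:
            self._entries.move_to_end(eid_prefix)  # mark as recently used
        return rlocs

    def install(self, eid_prefix: str, rlocs: List[str]) -> None:
        """Install a mapping, evicting the least recently used one if full."""
        if eid_prefix in self._entries:
            self._entries.move_to_end(eid_prefix)
        elif len(self._entries) >= self.capacity:
            self._entries.popitem(last=False)  # drop the LRU entry
            self.evictions += 1
        self._entries[eid_prefix] = rlocs
```

With such a table, every eviction can later turn into a miss and a new Map-Request, which is why the cache size matters even if general-purpose memory is cheap.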

>    This leads to another point about the caching of mappings: the
>    algorithms for management of the cache are purely a local issue. The
>    algorithm in any particular ITR can be changed at will, with no need
>    for any coordination. A change might be for purposes of
>    experimentation, or for upgrade, or even because of environmental
>    variations - different environments might call for different cache
>    management strategies.
> 
>    The local, unsynchronized replacability of the cache management
>    scheme is the architectural aspect of the design; the exact
>    algorithm, which is engineering, is not.

Related paper: http://ebookbrowse.com/an-analytical-model-for-the-lisp-cache-size-pdf-d362171631

[snip]

> 7.2. Design Guidance
> 
>    In designing the security, there are a small number of key points
>    that will guide the design:
> 
>    - Design lifetime
>    - Threat level
> 
>    How long is the design intended to last?  If LISP is successful, a
>    minimum of a 50-year lifetime is quite possible. (For comparison,
>    IPv4 is now 34 at the time of writing this, and will be around for at
>    least several decades yet, if not longer; DNS is 28, and will
>    probably last indefinitely.)

"will probably last indefinitely" is a strong statement; I suggest removing it.

>    How serious are the threats it needs to meet?  As mentioned above,
>    the Internet can bring the worst crackers from anywhere to any
>    location, in a flash. Their sophistication level is rising all the
>    time: as the easier holes are plugged, they go after others. This
>    will inevitably eventually require the most powerful security
>    mechanisms available to counteract their attacks.
> 
>    Which is not to say that LISP needs to be that secure _right away_.
>    The threat will develop and grow over a long time period. However,
>    the basic design has to be capable of being _securable_ to the
>    expanded degree that will eventually be necessary. However,
>    _eventually_ it will need to be as securable as, say, DNS - i.e. it
>    _can_ be secured to the same level, although people may chose not to
>    secure their LISP infrastructure as well as DNSSEC potentially does.
>    [RFC4033]
> 
>    In particular, it should be noted that historically many systems have
>    been broken into, not through a weakness in the algorithms, etc, but
>    because of poor operational mechanics. (The well-known 'Ultra'
>    breakins of the Allies were mostly due to failures in operational
>    procedure. [Welchman]) So operational capabilities intended to
>    reduce the chance of human operational failure are just as important
>    as strong algorithms; making things operationally robust is a key
>    part of 'real' security.
> 
> 7.2.1. Security Mechanism Complexity
> 
>    Complexity is bad for several reasons, and should always be reduced
>    to a minimum. There are three kinds of complexity cost: protocol
>    complexity, implementation complexity, and configuration complexity.
>    We can further subdivide protocol complexity into packet format
>    complexity, and algorithm complexity. (There is some overlap of
>    algorithm complexity, and implementation complexity.)
> 
>    We can, within some limits, trade off one kind of complexity for
>    others: e.g. we can provide configuration _options_ which are simpler
>    for the users to operate, at the cost of making the protocol and
>    implementation complexity greater. And we can make initial (less
>    capable) implementations simpler if we make the protocols slightly
>    more complex (so that early implementations don't have to implement
>    all the features of the full-blown protocol).
> 
>    It's more of a question of some operational convenience/etc issues -
>    e.g. 'How easy will it be to recover from a cryptosystem
>    compromise'. If we have two ways to recover from a security
>    compromise, one which is mostly manual and a lot of work, and another
>    which is more automated but makes the protocol more complicated, if
>    compromises really are very rare, maybe the smart call _is_ to go
>    with the manual thing - as long as we have looked carefully at both
>    options, and understood in some detail the costs and benefits of
>    each.
> 
> 7.3. Security Overview
> 
>    First, there are two different classes of attack to be considered:
>    denial of service (DoS, i.e. the ability of an intruder to simply
>    cause traffic not to successfully flow) versus exploitation (i.e. the
>    ability to cause traffic to be 'highjacked', i.e. traffic to be sent
>    to the wrong location).
> 
>    Second, one needs to look at all the places that may be attacked.
>    Again, LISP is a relatively simple system, so there are not that many
>    parts to examine. The following are the things we need to secure:
> 
>    - Lookups
>    - Indexing
>    - Mappings

I suggest citing LISP-SEC.

[snip]

> 11.1.1. Missing Mapping Packet Queueing
> 
>    Currently, some (all?)  ITRs discard packets when they need a
>    mapping, but have not loaded one yet, thereby causing the applicaton
>    to have to retransmit their opening packet. True, many ARP
>    implementations use the same strategy, but the average APR cache will
>    only ever contain a few mappings, so it will not be so noticeable as
>    with the mapping cache in an ITR, which will likely contain
>    thousands.
> 
>    Obviously, they could queue the packets while waiting to load the
>    mapping, but this presents a number of subtle implementation issues:
>    the ITR must make sure that it does not queue too many packets, etc.
> 
>    In particular, if such packets are queued, this presents a potential
>    DoS attack vector, unless the code is carefully written with that
>    possibility in mind.

A packet whose mapping is missing can also be forwarded to a PETR, which avoids both dropping and buffering it.
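
The three options (drop, queue, forward to a PETR) can be sketched as a single miss-handling policy. This is only an illustration under assumptions: the names MissHandler, MissAction, max_queued and petr_rloc are hypothetical, and the global queue limit stands in for whatever DoS protection a real ITR would need.

```python
from collections import deque
from enum import Enum
from typing import Deque, Dict, List, Optional


class MissAction(Enum):
    DROP = "drop"
    QUEUE = "queue"
    FORWARD_TO_PETR = "forward_to_petr"


class MissHandler:
    """Decides what an ITR does with a packet that has no mapping yet (sketch)."""

    def __init__(self, max_queued: int = 64, petr_rloc: Optional[str] = None) -> None:
        self.max_queued = max_queued   # global bound, limits memory under attack
        self.petr_rloc = petr_rloc     # if set, misses are forwarded, not held
        self.pending: Dict[str, Deque[bytes]] = {}
        self.queued = 0

    def on_miss(self, dst_eid: str, packet: bytes) -> MissAction:
        if self.petr_rloc is not None:
            # The caller encapsulates the packet towards the configured PETR.
            return MissAction.FORWARD_TO_PETR
        if self.queued < self.max_queued:
            self.pending.setdefault(dst_eid, deque()).append(packet)
            self.queued += 1
            return MissAction.QUEUE
        return MissAction.DROP

    def on_mapping_loaded(self, dst_eid: str) -> List[bytes]:
        """Release, in arrival order, the packets that waited for this EID."""
        packets = list(self.pending.pop(dst_eid, ()))
        self.queued -= len(packets)
        return packets
```

With a PETR configured, the ITR keeps no per-miss state at all, so the queue-exhaustion attack mentioned in the draft does not apply to that path.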

> 11.1.2. Mapping Cache Management Algorithm
> 
>    Relatively little work has been done on sophisticated mapping cache
>    management algorithms; in particular, the issue of which mapping(s)
>    to drop if the cache reaches some maximum allowed size.
> 
>    This particular issue has also been identified as another potential
>    DoS attack vector.

We have a technical report (not yet published, but public) that discusses and evaluates precisely this aspect. If needed, we can share the document.