Re: [rrg] Comments on 'draft-whittle-ivip-arch'

Robin Whittle <rw@firstpr.com.au> Wed, 24 March 2010 09:44 UTC

Hi Fred,

Thanks very much for taking a look at:

  http://tools.ietf.org/html/draft-whittle-ivip-arch-04

and for your questions.

> I haven't seen anyone commenting on your Ivip proposal
> ('draft-whittle-ivip-arch').

Mohamed (Med) Boucadair wrote some comments on the 03 version, which
was a fresh rewrite:

  http://www.ietf.org/mail-archive/web/rrg/current/msg05661.html

I responded (msg05714) to some of his comments, but have not yet
replied to all of them, because I have concentrated on critiquing
other proposals and on developing the DRTM approach to Ivip, which
ivip-arch-04 refers to.

> I myself am just now getting
> around to taking a cursory look, but here are some initial
> observations. I'm sorry to say that I was only able to get
> through to the end of Section 5 on this pass. I will try
> to review more later.
>
> Fred


> --- begin comments ---
>
> 1) In your abstract, you say: "Both involve modifying the
>    IP header and require most DFZ routers to be upgraded."
>    Somehow, it seems like a very big assumption to think
>    that most or even some existing DFZ routers could be
>    upgraded to recognize new IP header formats. Indeed,
>    it goes against one of the RFC1955 principles of "no
>    changes to *most* routers". I will check in the rest of
>    the document for what you mean by this, but I am
>    concerned when I see what looks like a non-starter in
>    the abstract.

This is only for the two "Modified Header Forwarding" approaches
which are alternatives to encapsulation:

  ETR Address Forwarding  (IPv4)
  http://tools.ietf.org/html/draft-whittle-ivip-etr-addr-forw-00

    I may alter this to make a whole new header, of the same
    length as the IPv4 header, but with a different type.  That
    way I can use the "evil" bit and so get 31 bits of ETR
    address instead of 30.  Perhaps I could find another bit
    to get all 32.

  Prefix Label Forwarding (IPv6)
  http://www.firstpr.com.au/ip/ivip/PLF-for-IPv6/

The idea is for these to be used in the long-term - 10 to 20 years
from now.  If we could use them earlier, such as at introduction, that
would be great - but I am assuming this won't be possible.

It's best to keep these in mind as long-term upgrades to Ivip - since
over such a long period it will be easy for all DFZ and other routers
to gain the new functionality.

Most of the Ivip documentation assumes encapsulation.  To-do: take
this stuff out of the abstract, to shorten it and to avoid people
mistakenly thinking that Ivip requires upgrades to all DFZ routers.


> 2) In the Introduction, it seems that a significant amount
>    of the architecture actually appears in other documents.
>    I will check further to see if the base document conveys
>    enough without having to dig up all of the others.

I tried to make it so you can read Ivip-arch and then Ivip-drtm,
referring to Ivip-glossary if you need to.

The Modified Header Forwarding stuff doesn't matter - it is best to
regard that as a long-term efficiency upgrade.


> 3) 2nd paragraph in Section 2 says: "...each with a common
>    mapping to a single ETR". Why can't the SPI's map to
>    *multiple* ETRs?

  http://tools.ietf.org/html/draft-whittle-ivip-arch-04#page-7

For multihoming:

Once the end-user network (or some other organization it appoints)
has real-time control of the tunneling of all the ITRs which are
handling traffic for a given micronet of SPI space, there's no need
for the ITRs to be given a list of multiple ETRs and then to have to
choose between them.

All the CES (Core-Edge Separation) architectures other than Ivip
(LISP, APT, TRRP, TIDR and IRON-RANGER) assume that this real-time
control can't be achieved.  Then, in order to achieve multihoming
service restoration faster than the ability of the mapping system to
get new mapping information to all the ITRs which need it, it is
necessary for the mapping to be more complex.  This requires more
storage in the ITR - but the worst problem is that the ITRs are now
required to choose between ETR addresses on the basis of which ETRs
enable the destination network to be reached.  This scales badly and
is difficult or impossible for an ITR to test, unless it has a
specific address within the destination network to test - which none
of these architectures currently provide.

Also, real-time direct control of ITR tunneling behaviour from
outside the Ivip system externalises all the testing and
decision-making.  It requires end-user networks
to detect reachability and make multihoming service restoration
decisions - which means they can do it however they like, without any
need to be restricted by whatever limited functionality we can build
into all ITRs.  A typical multihomed end-user network would pay a
dedicated "Multihoming monitoring" company to test reachability via
its two or more ETRs and to control the mapping accordingly in
real-time.

Another advantage is in scaling.  An ETR might have 10k ITRs sending
packets to it, but it doesn't want to have to tell all 10k ITRs in
some way that the packets have been delivered.  With Ivip, it is up
to the end-user network what the nature of the reachability testing
is, so there are no scaling problems at all.
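
To make that division of labour concrete, below is a minimal sketch
(in Python) of the monitoring loop described above.  Every name,
address and interval in it is hypothetical - Ivip does not specify
how a Multihoming Monitoring company probes or submits mapping
changes - it only illustrates that the ITRs see nothing but a
single-ETR mapping, with all testing and decision-making happening
outside them.

  import time

  # Hypothetical helpers - Ivip does not specify either of these.
  def probe_reachability(etr_addr, probe_target):
      """Return True if probe_target (an address inside the EUN) answers
      when reached via the ETR at etr_addr."""
      ...

  def push_mapping_change(micronet, new_etr):
      """Submit one real-time mapping change for the whole micronet, using
      credentials the EUN delegated to the monitoring company."""
      ...

  MICRONET = "44.44.44.44/32"           # the EUN's micronet of SPI space
  ETRS = ["192.0.2.1", "203.0.113.1"]   # ETR-A (ISP A) and ETR-B (ISP B)
  PROBE_TARGET = "44.44.44.44"

  current = ETRS[0]                     # mapping normally points at ETR-A
  while True:
      if not probe_reachability(current, PROBE_TARGET):
          for candidate in ETRS:
              if candidate != current and probe_reachability(candidate,
                                                             PROBE_TARGET):
                  push_mapping_change(MICRONET, candidate)
                  current = candidate
                  break
      time.sleep(5)                     # probe interval is the company's choice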


For load sharing:

You could have multiple ETR addresses for the one micronet.  That
would involve variable-length mapping and extra stuff to do with
giving each ETR some percentage of the total traffic, on average.   I
could do this, but at present I am keeping the mapping and the ITRs
simple by having a single ETR address.  Load sharing via real-time
inbound traffic engineering can still be done, provided the load is
spread over two or more micronets - which will typically be easy to
achieve.  Even a single host could have two IP addresses, and each
could be covered by a different micronet, which could be mapped to a
different ETR address.  Real-time changes to mapping of these
micronets will directly steer traffic via the two or more ETRs and
therefore two or more ISPs.

   (To-do: explain this approach to inbound traffic engineering
    more clearly and make it easier to find in Ivip-arch.)
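
A purely illustrative example of that approach (the addresses, and
showing the mapping as a Python dictionary, are mine - not from any
Ivip document):

  # One host with two addresses, each in its own single-address micronet,
  # and each micronet mapped to a single ETR at a different ISP.
  mapping = {
      "44.44.44.44/32": "192.0.2.1",     # via ISP A's ETR
      "44.44.44.45/32": "203.0.113.1",   # via ISP B's ETR
  }

  # Real-time inbound TE: steer the traffic arriving for the first address
  # onto ISP B simply by changing that micronet's single ETR address.
  mapping["44.44.44.44/32"] = "203.0.113.1"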


> 4) 4th paragraph of Section 2, it appears that MAB in
>    Ivip means the same thing as VP in IRON-RANGER?

They are approximately equivalent.  Both a MAB and a VP are sections
of "edge" space which are administered by a single organization, and
which typically cover the space used by many end-user networks.

In I-R, I think, at least for IPv6, you would have all the edge space
in a single short prefix.  Then your IR(GW) routers (the equivalent
of Ivip's DITRs and LISP's PTRs) would advertise just this short
prefix into the DFZ.  But this raises the question of each such
IR(GW) router doing work for multiple beneficiaries - so who runs the
IR(GW) routers?  (As I discussed in my recent message: "IRON-RANGER -
Route Reflectors & DNS?")

In IPv6 Ivip, the SPI space and so the MABs could be intermixed with
ordinary prefixes.  There's no need to make an existing prefix into a
MAB, since there's oodles of fresh space to use instead.

For Ivip with IPv4 and IPv6, generally a single DITR will advertise a
subset of the MABs - because whoever operates this DITR is being paid
to do so by a subset of the total number of MABOCs (MAB Operating
Companies).

With IRON-RANGER in IPv4, you would have your VPs scattered all over
the place, like Ivip MABs, I think, so it would be important not to
have so many VPs, each of which needs to be separately advertised in
the DFZ, as to overly burden the DFZ control plane.

If there were ever 100k MABs, that would be a considerable burden on
the DFZ control plane - but as long as each MAB was serving the needs
of many end-user networks, this is still achieving routing
scalability.  I intend a typical MAB in IPv4 to be a /16 or so and to
provide space for tens of thousands of end-user networks, each with
one or more micronets of one or more IPv4 addresses each.


> 5) 5th paragraph Section 2, it appears that a MABOC is
>    the same as an IRON-RANGER VP provider?

Yes.



> 6) 6th paragraph Section 2, "Multihoming end-user networks
>    would typically contract a separate company...". Why
>    would they not instead directly negotiate the mappings
>    with the MABOC themselves?

The end-user network (EUN) has ultimate authority over the mapping of
its SPI space - and how that space is broken into individual micronets.

If the EUN has a single site with two ISPs, and therefore two ETRs,
ETR-A and ETR-B, for its one or more micronets, then in theory it
could manually or automatically control the mapping of these
micronets to achieve multihoming service restoration.  That will be
allowed, of course, but I expect it would be easier and better to pay
another organisation - a "Multihoming Monitoring" company - to
continually probe reachability of its network via the two ETRs, and
to change the mapping from the usually used ETR-A to ETR-B in the
event that the network was not reachable via ETR-A.  This company's
automatic systems could also switch the mapping back when ETR-A was
working again.

For the EUN itself to detect reachability reliably and then to change
its own mapping may be possible, but when there are network failures
going on, it is probably easiest to let another company not affected
by those failures control the mapping.  Also, the multihoming
monitoring company can test reachability from its network of probing
servers all over the world, which would be better than what most EUNs
could do on their own.

Of course the MABOC might provide these "Multihoming Monitoring and
Mapping Control" services, as part of the package by which the EUN
leases its SPI space.  But this doesn't have to be done by the MABOC
- it can be done by any organisation the EUN gives proper credentials
to.  The EUN would have an arrangement with the MABOC to get a
username and password which enable the Multihoming Monitoring company
to control mapping for a defined subset (perhaps all) of its SPI
space - and the EUN could easily revoke these credentials.


> 7) 7th paragraph, Section 2, "...which is typically a
>    subset of all MABs in the Ivip system." IRON-RANGER
>    gateways each advertise the full set of VPs, but I
>    suppose could also be configured to advertise only
>    a partial set as well. Is it not possible for the
>    DITR to advertise the full set, or do you for some
>    reason think it would be best only to advertise a
>    partial set?

A DITR could advertise the whole set of MABs.  Someone running a DITR
would probably only want to do so if they had a contract with all the
MABOCs so they were paid for the traffic the DITR handled for each
MAB.  In order for the MABOCs to be happy with this, the usage
information would need to be broken down into traffic per micronet,
so the MABOC could in turn charge their EUN customers according to
the traffic which was being handled.

This probably would not need to be accounting for every byte or
packet - I guess a sampling system would be generally good enough.


> 8) 8th paragraph, Section 2, with the query service, it
>    looks like Ivip is standing up its own SPI to RLOC
>    mapping service apart from any routing systems or
>    the DNS? Would deploying this service be on the
>    same level of complexity as for the global DNS?

With Ivip before DRTM (Distributed Real Time Mapping) - which I only
finalised two weeks ago - there was a single, globally coordinated,
but still somewhat distributed mapping system.

With DRTM, there is no one mapping system.  Each MABOC needs to run
DITRs - either directly, or by paying some other company to run DITRs
for it.  The DITRs typically need to be scattered around the Net.
It's up to the MABOC how those DITRs get their real-time mapping
changes.  This need not be part of Ivip, but there could also be
standards on at least one way of doing it.  Since it is an internal
matter for the MABOC, and for any company it hires to run its DITRs,
the problem of secure, real-time distribution of the full set of
mapping changes for each MAB is assumed to be solvable without too
much fuss.  There are only one or two organisations involved, and a
finite number (5 to 50 or so) of DITR sites to get the mapping to.
This is a challenge, but not so difficult as to present a fundamental
objection to Ivip or a barrier to any companies which wanted to do this.

The Internet can get a packet from almost anywhere to almost anywhere
else in about 0.2 seconds, with a pretty high reliability - such as
99% or more on average.  This is pretty good material for making a
real-time mapping distribution system, especially just within one or
two companies.  Also, for this finite number of sites, a DITR-site
company might want to use private network links in order to avoid
potential problems with DoS flooding or any other problems which
make the Internet difficult for traffic which must be highly
reliable.  Private network links would be more expensive, but there
is only a finite number of DITR sites and they are all money-making
concerns.

Where Ivip standards are required is for ISPs and other large
networks which want to run their own caching ITRs.  They will run one
or more QSRs (Resolving Query Servers), and we need a standardised
protocol by which QSRs can query the new QSAs (Authoritative Query
Servers) which will run at each DITR site.   Each QSA only handles
queries for whatever subset of MABs the DITR site handles - but it is
"full-database" for those MABs, getting real-time updates and so not
requiring any delays to look up the mapping anywhere else.  Each QSR
may need to query hundreds or perhaps even thousands of these QSAs,
and this protocol obviously needs to be standardised so all ISPs can
use the one protocol with all the QSAs of all the DITR-companies.

Ivip standardises the protocol between these QSRs and QSAs - and
likewise the protocol between ITRs and QSRs, which can optionally
involve one or more levels of intermediate QSCs (Caching Query Servers).
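
A rough sketch of the QSR side of this query path - with query_qsa()
standing in for the yet-to-be-standardised QSR-to-QSA protocol, and
the MAB, QSA addresses and cache structure all hypothetical:

  import ipaddress

  # For each MAB, the two or so nearby QSAs this QSR chose for it during
  # its boot-time DNS walk (sketched further below).
  qsas_for_mab = {
      ipaddress.ip_network("44.44.0.0/16"): ["198.51.100.10", "198.51.100.20"],
  }
  cache = {}      # micronet (ip_network) -> ETR address already learned

  def query_qsa(qsa_addr, spi_addr):
      """Stand-in for the QSR-to-QSA query protocol.  Returns the covering
      micronet (as a string) and its single ETR address."""
      ...

  def resolve(spi_addr):
      """Answer a mapping query from an ITR (or intermediate QSC)."""
      addr = ipaddress.ip_address(spi_addr)
      for micronet, etr in cache.items():
          if addr in micronet:
              return etr
      for mab, qsas in qsas_for_mab.items():
          if addr in mab:
              micronet_str, etr = query_qsa(qsas[0], spi_addr)
              cache[ipaddress.ip_network(micronet_str)] = etr
              return etr
      return None     # not covered by any MAB this QSR knows about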

In a full deployment, the entire system of ITRs, QSCs, QSRs and QSAs
- with real-time pushing of all mapping changes to QSAs, and
real-time pushing of mapping updates to whichever querying devices
(QSRs, QSCs and ITRs) need them - would be at least as complex and
important as the DNS.  It would be at least as
distributed too.  There is no central system within the whole thing,
except for the DNS arrangements by which each QSR finds two or more
(typically) nearby QSAs for each MAB.

I just suggested you use DNS for a similar purpose in IRON-RANGER.



> 9) 10th paragraph, Section 2, "Each QSR uses a DNS-based
>    mechanism and an additional protocol to discover...".
>    Does this mean that the QSR uses the MAB as an FQDN
>    in a DNS query? That could mean quite a few MABs in
>    the DNS. I think the main difference between this and
>    the IRON-RANGER use of BGP to disseminate VP to RLOC
>    mappings is that BGP can more readily cope with dynamic
>    updates if some RLOCs fail and/or if the VP moves to
>    a different RLOC. Also, each IR has full knowledge of
>    all VPs in its RIB, so there are no per-packet lookups
>    needed. There are probably more differences as well.
>
>    That said, there may well be a use for keeping the
>    MABs/VPs in *both* the BGP RIB *and* the DNS. The
>    BGP RIB could be used as the full information data
>    base that can be accessed in real time, and DNS could
>    be used as a check to ensure that the MAB/VP is actually
>    "registered" to the ETR thatclaims to own it.

I won't try to fully describe the DNS-based mechanism here.  I
haven't had time to write it up in draft-whittle-ivip-drtm-01, so
please refer to it in the RRG archives:

   http://www.ietf.org/mail-archive/web/rrg/current/msg06128.html

   "Stage 2 needs a DNS-based system so TRs (QSRs) can find
   DITR-Site-QSDs (QSAs)"

On boot up, each QSR walks through the DNS quickly discovering the
start and end of each MAB in order.  If this was going to take a
while, it could start at four places in the address range and move
forward from each until the whole global unicast space was covered.
The DNS arrangements I have planned will enable a single query to
find that a given address is either the start of a MAB, and then what
the end address is, or that it is the start of a MAB-free zone,
and then what the starting address of the next MAB (or potentially
the next MAB-free-zone) is.  So the QSR should be able to walk
through these pretty quickly.

The DNS reply for each MAB will also provide addresses of all the
QSAs which handle this MAB.  The QSR will choose two or so of these -
ideally based on closeness, which can be done with a little probing
with ping or traceroute.
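
A sketch of that walk, with lookup_boundary() standing in for the
planned (and, as noted, not yet written up) DNS arrangement:

  def lookup_boundary(addr):
      """Stand-in for the planned single DNS query: report whether a MAB
      starts at addr, where the next boundary is, and - if it is a MAB -
      which QSAs serve it.  Returns (is_mab_start, next_boundary, qsas)."""
      ...

  def walk_mabs(start, end):
      """Walk global unicast space from start to end, collecting each MAB
      and the QSAs which serve it.  (To speed this up, a QSR could run
      several such walks in parallel from different starting points.)"""
      mabs = {}
      addr = start
      while addr < end:
          is_mab_start, next_boundary, qsas = lookup_boundary(addr)
          if is_mab_start:
              mabs[(addr, next_boundary)] = qsas  # MAB covers [addr, next_boundary)
          addr = next_boundary                    # jump straight to the next boundary
      return mabs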


> 10) 11th paragraph, Section 2, "MABOCs will typically
>    charge their customers for each mapping change." This
>    arrangement seems like it might be very costly to highly
>    mobile SPI prefix holders - why not just a flat-rate?

Everyone still seems to think that mobility involves frequent mapping
changes.  This is true for LISP-MS - but that won't be practical for
a number of reasons.  (See the end of:
http://www.firstpr.com.au/ip/ivip/lisp-links/#critiques )

TTR Mobility does not require a mapping change whenever the MN gets a
new address or access network:

  http://www.firstpr.com.au/ip/ivip/TTR-Mobility.pdf

The mapping might stay unchanged for a year.  It is not absolutely
necessary to change the mapping, but if the MN moves more than about
1000km or so, then it would be good to tunnel to a TTR which is
closer to its new locality.  Once that is done, the mapping can be
changed to the new TTR.
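
A minimal sketch of that rule of thumb - the 1000km figure is from
above, while distance_km(), nearest_ttr() and change_mapping() are
hypothetical helpers:

  MAX_TTR_DISTANCE_KM = 1000            # rough threshold discussed above

  def distance_km(a, b):
      """Hypothetical great-circle distance between two locations."""
      ...

  def nearest_ttr(location):
      """Hypothetical lookup of the closest TTR to a location."""
      ...

  def change_mapping(micronet, ttr_address):
      """Hypothetical real-time mapping change for the MN's micronet."""
      ...

  def maybe_move_ttr(mn_location, current_ttr, micronet):
      if distance_km(mn_location, current_ttr.location) > MAX_TTR_DISTANCE_KM:
          new_ttr = nearest_ttr(mn_location)
          # The MN first establishes its two-way tunnel to the new TTR;
          # only then is the micronet's mapping changed to point at it.
          change_mapping(micronet, new_ttr.address)
          return new_ttr
      return current_ttr    # otherwise the mapping can stay unchanged for months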


> 11) 12th paragraph, Section 2, I am not (yet) seeing an
>    exact mechanism whereby the real-time updated full mapping
>    database is managed.

This is described in:

   http://tools.ietf.org/html/draft-whittle-ivip-drtm

Version 01 doesn't have everything, but it has most of the DRTM
system clearly described.  This is explained in terms of the staged
development of services.  DRTM is not a single thing which has to be
built before anything can happen.  It is an approach to providing
portability, multihoming and inbound TE with essentially no ISP
involvement or investment (ISPs only need to forward packets with SPI
source addresses, since EUNs can run their own ETRs).  For TTR
Mobility, DRTM involves mobility without ISPs doing anything at all.


> 12) 16th paragraph, Section 2, I think in your description
>     here you are assuming that EIDs are routable within the
>     same scope as RLOCs. That would mean a pure-IPv4 or
>     pure-IPv6 deployment and not an IPv6/IPv4 one?

I am currently planning Ivip to be two completely independent systems
for IPv4 and for IPv6.  Closer to the time of deployment, we would
obviously consider some kinds of linkage, interdependence etc.  Maybe
there will be a role for Ivip in supporting interworking, transition
etc.


> 13) 18th paragraph, Section 2, I will wait until it comes
>    up in later sections, but the notion of modifying the IP
>    header and changing *most* DFZ routers seems onerous.

Modified Header Forwarding is not at all required for Ivip to be
widely used.  It should be a long-term option to be adopted whenever
we can upgrade all the relevant routers.


> 14) Section 4, paragraph 1, I have doubts about the
>    ability to deploy what is being called "Modified Header
>    Forwarding".

In the long-term future - 10 to 20 years - it shouldn't be hard to
have new protocol (or modified IPv4 header) functionality built into
routers.  The new approaches are not particularly complex, and for
IPv4 only concern the FIB.  For IPv6, there is also some involvement
of the RIB.  In neither case is there any change to the BGP behaviour
of the router, or to the BGP control plane.


> 15) Section 4, paragraph 1, I have strong doubts about the
>    use of the sending host's address as the outer headers
>    source address. First, ICMP messages coming from within
>    the tunnel would not be recognized by the sending host.
>    Second, it only works when both the inner and outer IP
>    protocols are the same version, i.e., it doesn't work
>    for mixed IPv6/IPv4. Also, the BR source address
>    filtering can still be done if the inner source is
>    different than the outer source, because the outer
>    source indicates the previous hop. This is very useful
>    for IPv6-within-IPv4 encapsulation, since the previous
>    hop can be constructed as an IPv6 link-local address
>    that embeds the IPv4 source address.

I don't understand the last sentence.  I am not at present trying to
mix IPv4 and IPv6.

This is just the summary of architectural choices.  There is more
explanation at:

  http://tools.ietf.org/html/draft-whittle-ivip-arch-04#section-7.7

The idea is to support existing ISP BR source address filtering
without significant expense or configuration in each ETR.  This
should work nicely, and it would be impossible to do it in any
reasonable fashion if the outer address was that of the ITR.

I explore how LISP might do it, and it seems to be impossible:

  http://www.ietf.org/mail-archive/web/rrg/current/msg06219.html
  17.6.9 - ETR support for ISP border router source address filtering

assuming an ETR in an ISP is also receiving packets tunneled from
ITRs inside that ISP - in which case the filtering applied by the
ISP's BR should not be applied to the decapsulated packets, since
they don't arrive from outside the ISP's network.  The same would be
true for IRON RANGER, I think.

(More below on ETRs supporting source address filtering.)

The sending host won't recognise PTBs from routers in the tunnel -
but nothing short of host modifications could make that work.  My
approach to Path MTU Discovery is intended to cope with all this:

  http://www.firstpr.com.au/ip/ivip/pmtud-frag/


> 16) Section 4, paragraph 2, a large part of the motivation
>    for MHF seems to be to avoid PMTUD issues. But, PMTUD
>    issues can IMHO be handled through the use of SEAL
>    with segmentation/reassembly turned off, but with the
>    ability to discover and weed out degenerate links.

I don't intend to do fragmentation and reassembly for any traffic
packets - other than the few an ITR sends in a special dual packet
format for probing PMTU to a given ETR.  I don't intend to support
DF=1 IPv4 packets longer than some figure - 1440 bytes or a little
less - which we could actually encapsulate and send between any ITR
and ETR without fear of fragmentation.  This will involve restricting
the placement of all IPv4 ITRs and ETRs so that they have a PMTU of
at least some chosen figure (below 1500 bytes) from the DFZ, which is
assumed to have some other PMTU somewhat below 1500 bytes.

I think the encapsulation overhead, particularly with IPv6, is more
than sufficient motivation to avoid encapsulation in the long-term.
Being able to forget about PMTUD problems is an additional motivation.


> 17) Section 5.3, paragraph 2, it surprises me to see that
>    aspects of the mapping system are outside the scope of
>    Ivip. Does it mean that there needs to be some adjunct
>    protocols or mechanisms at work in a manner that can
>    plug into Ivip? I guess I don't really understand this
>    business of separation.

At present, there is nothing in Ivip to say how a MABOC and any
company it hires to run its DITRs - and therefore the QSAs for its
MABs - can or must get mapping in real-time to each DITR site.
There's no point in specifying how it "must" happen, since these one
or more companies can always make private arrangements which are not
visible to anyone else.

It would be desirable to have at least one standardised method of
transmitting the mapping in real-time to DITR sites, and for ways
that a MABOC could transmit its mapping to one or more DITR-site
companies.  It would be highly desirable to provide a standardised
protocol by which any EUN could interact with its MABOC (directly or
through some other organisation) to split its SPI space into
micronets, control the mapping of those micronets and get a username
and password (or use some other delegation system) so another
organization it appoints could control the mapping of one or more
micronets.

Over time, I expect these protocols could be developed.  Regarding
real-time mapping within a system of DITR sites, perhaps the
"Replicator" system I developed for the original version of Ivip
could be used:

  http://tools.ietf.org/html/draft-whittle-ivip-fpr-01

None of this is a part of the main Ivip proposal.

The key point is that with DRTM, the totality of the real-time
mapping system is broken into any number of parallel systems - one
for each set of DITR sites - with no need to interwork between them,
so it is reasonable to assume each such system can be realised
without any prohibitive security or scaling problems.

DRTM enables the entire system to be made of a potentially very large
number of independent pieces, such that there is no need for a single
system at all - thereby avoiding a bunch of scaling problems and
concerns over single points of failure, centralized administration etc.


> 18) Section 5.4, paragraph 1, the path MTU determination
>    mechanism seems to require actively looking for a reply
>    to an explicit probe before the MTU can be adjusted.
>    SEAL uses all packets as implicit probes, so there is
>    no need to wait for an explicit probe reply and MTU
>    limitations can be discovered asynchronously.

Yes, but these probes - a long and a short packet - are only sent at
particular instances where the traffic packet would be in the Zone of
Uncertainty between the currently established upper and lower limits
of what the ITR knows about the PMTU to a given ETR.  Each such probe
will typically reduce the gap between these, until there is no gap.
Then, there is no need for a probe, except occasionally - such as
after a few to ten minutes - to see if the PMTU has changed.
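
A rough sketch of the per-ETR state this implies in the ITR (the
names, and exactly what happens outside the Zone of Uncertainty, are
my shorthand rather than anything from ivip-arch or pmtud-frag):

  class EtrPmtuState:
      """Per-ETR state kept by an ITR: 'lower' is the largest packet size
      known to get through to this ETR, 'upper' the smallest size known
      (or assumed) not to.  The gap between them is the Zone of
      Uncertainty."""

      def __init__(self, lower, upper):
          self.lower = lower
          self.upper = upper

      def handle(self, packet, forward, send_probe_pair):
          """forward() and send_probe_pair() stand in for the ITR's normal
          encapsulate-and-send path and for the long+short probe pair."""
          if len(packet) <= self.lower:
              forward(packet)           # safely within the known-good size
          elif len(packet) >= self.upper:
              ...                       # too big: handled by the PMTUD scheme
                                        # (e.g. a PTB back to the sender)
          else:
              send_probe_pair(packet)   # Zone of Uncertainty: the probe result
                                        # raises 'lower' or reduces 'upper'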



> 19) Section 5.4, paragraph 3, "...and ETRs do not
>    communicate at all." - This seems to preclude the
>    ability for the ETR to inform the ITR of any error
>    conditions, e.g., if the ITR thinks that the ETR has
>    the correct mapping when in fact it really doesn't.
>    So, there seems to be a need for the ETR to at least
>    convey error information to the ITR.

There may need to be a system by which an ETR or any other device can
complain that it is getting a bunch of packets, apparently tunneled
by an ITR, which it doesn't want.  The correct response, if any,
would be for the EUN whose micronet this is - or whoever is currently
authorised to control the mapping - to change the mapping accordingly.

That wouldn't be done by the ITR - but I agree there needs to be some
mechanism for manual or automatic reporting of what appear to be
erroneously tunneled packets.


> 20) Section 5.4, final paragraph, the use of vanilla
>    IP-in-IP encapsulation may not interact well with
>    load balancing and/or ECMP in the network.

I agree - I need to research this more.  If this is a concern, then
Ivip's encapsulation approach would use IP and then UDP in some way,
still with the outer source address being that of the original
sending host, but with some UDP port arrangement to keep the ECMP/LAG
systems happy.  This would increase the encapsulation overhead somewhat.

I mention this in section 17.6.8 of:

  http://www.ietf.org/mail-archive/web/rrg/current/msg06219.html

and it is a To-do to add this to the Ivip-arch ID.
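
For what it's worth, a common way of keeping ECMP/LAG hashing happy
with an outer UDP header is to derive the outer UDP source port from
a hash of the inner flow, so one flow stays on one path while
different flows are spread across paths.  If Ivip went the IP+UDP
route, the port selection might look something like this - purely an
assumption on my part, not anything Ivip currently specifies:

  import zlib

  def outer_udp_source_port(inner_src, inner_dst, proto, sport, dport):
      """inner_src and inner_dst as dotted-quad strings; returns a port in
      the ephemeral range, constant per inner flow."""
      key = f"{inner_src}|{inner_dst}|{proto}|{sport}|{dport}".encode()
      return 49152 + (zlib.crc32(key) % 16384)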


> 21) Section 5.5, this section seems to assume no
>    "recursive re-encapsulation", i.e., where there could
>    be a path "ITR(1)->ETR(1)->ITR(2)->ETR(2)-> ... etc."

There won't be anything like this.  If there was, the packet would
come back to ETR(1).

There are two basic ways an ETR could handle the decapsulated packet:

  1 - The ETR will recognise the destination address of the tunneled
      packet as one which it has special knowledge of - and will have
      some means of delivering the packet to the destination network
      without it going to an ITR.  This may be via some direct link
      or a tunnel.  If the ETR is at the site of the destination
      network, then it is only expecting to get tunneled traffic
      packets addressed to this network.

  2 - If the ETR is at an ISP and doesn't have a specific mechanism
      to get the packets to the destination network, it hands them
      to the ISP's routing system.  If that routing system has a
      more-specific route for the address space of this end-user
      network (which would need to be covered by a prefix, since
      routes are always for prefixes rather than arbitrary integer
      numbers of IPv4 addresses), the packets will be forwarded
      according to that more-specific route, despite also matching
      the shorter, less-specific prefix of the MAB which is also
      advertised in that routing system (or, if no MABs are
      advertised, the default route which leads to the DFZ).

If the ETR is a TTR, then it tunnels the traffic packet by whatever
means to the MN.  The MN establishes a two-way tunnel to the TTR,
including from behind NAT or from an SPI address.
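
In pseudo-Python, the decision the ETR makes for each decapsulated
packet looks like this - the method names are hypothetical, and only
summarise the two cases above plus the TTR case:

  def handle_decapsulated(etr, packet):
      dest = packet.inner_dst
      if etr.has_direct_delivery(dest):
          # Case 1: the ETR has special knowledge of this destination
          # network (it is at the site, or has a link or tunnel to it).
          etr.deliver_directly(packet)
      elif etr.is_ttr and etr.has_mn_tunnel(dest):
          # TTR case: tunnel the packet onward to the mobile node.
          etr.tunnel_to_mn(packet)
      else:
          # Case 2: hand the packet to the ISP's routing system; a
          # more-specific route for the EUN's prefix (failing that, the
          # MAB prefix or the default route towards the DFZ) carries it.
          etr.forward_via_routing_system(packet)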


> 22) Section 5.7, I am highly skeptical of the MHF approach
>    and/or the use of a non-IP protocol type. I just don't
>    see it reasonable to expect the majority of DFZ routers
>    to update.

As mentioned above, over a decade or two this could be done with very
little cost or trouble.


> 23) Section 5.8, I think the goal of unmodified hosts is
>    correct, but should be tempered by "hosts are unaffected
>    by the use of an Ivip-like routing service in the network".
>    I say this, because it is possible that hosts will want
>    to upgrade to support adjunct mechanisms (e.g., HIP,
>    MIP) that are unrelated to the network routing service.

OK - HIP or some other Core-Edge Elimination (Locator / Identifier
Separation) architecture would not be affected by anything such as
Ivip, which deals just with IP addresses.


> 24) Section 5.11, paragraph 3, IRON-RANGER also does
>    do source address filtering at the ETR. That is why
>    each potential ITR needs to securely prove to the ETR
>    that it is authorized to source packets from a particular
>    EID prefix.

I don't recall this being part of our IRON-RANGER discussions.  I am
not sure how you could do this scalably and securely.


> 25) Section 5.11, paragraph 4, I do not believe the "inner
>    same as outer" source arrangement makes any difference
>    if the source address can be spoofed.


More on BR source address filtering and ETR support for this:

Let's say there is an ISP with various prefixes of its own and
customer networks which advertise their PI space solely through this
ISP.  One of these prefixes is 44.44.44.0/20.

The ISP sets up its BRs to drop any packet which arrives from another
AS with a source address matching any one of these prefixes -
including this one.  I understand this is typically done with TCAM
(tens or hundreds of thousands of parallel address comparators on a
single expensive, power-hungry, chip).

Let's say an attacker wants to send a packet to some end-user network
(EUN) which is using an ETR inside this ISP's network (perhaps even
an ETR at the EUN's site), and that the attacker wants to make it
seem as if the packet came from 44.44.44.44, or any other address in
one of the prefixes the ISP has BR source address filtering for.

The attacker is assumed to be outside the ISP's network.  The BR
filtering stops the attack succeeding for any ordinary packet
entering the ISP's network from another AS ("from the DFZ").

With Ivip's arrangement, the attacker can't succeed, because if it
sends the packet to the ETR with an outer source address other than
the inner source address 44.44.44.44, then the ETR will drop the
packet.  If the attacker tries to send a packet to the ETR with the
outer source address being 44.44.44.44, then it will never get to the
ETR, since the packet will be dropped by the BR.  This will work
fine, and will still allow the ETR to handle packets tunneled from
ITRs inside the ISP's network, such as those which really do
originate from the 44.44.44.0/20 network.
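
The check the ETR applies is tiny - a sketch, with the function name
being mine:

  def accept_decapsulated(outer_src, inner_src):
      """The ETR delivers a decapsulated packet onward only if the outer
      source address - the one the ISP's BRs filtered on - matches the
      inner source address."""
      return outer_src == inner_src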

In LISP, the BR filtering doesn't look at the inner packet, so the
ETR receives it from the attacker, in a form which is
indistinguishable from a packet arriving from any ITR outside the
ISP's network.  To gain the same protection on decapsulated packets
which the BR filtering achieves on conventional packets, each LISP
ETR would need to separately filter on the source address of the
decapsulated packet, which is inordinately expensive for more than a
handful of prefixes - and then only apply this to packets whose outer
addresses were not from any of the addresses inside the ISP's
network.  The first task is too expensive and the second is equally
expensive and probably impossible, since it may not be possible to
know all the prefixes within the ISP on which an ITR could reside.


>    This can be fixed
>    by a new mechanism in SEAL whereby the ITR and ETR
>    synchronize sequence numbers so that there is no
>    chance of an off-path spoofing.

Yes, but how can the ETR function in IRON-RANGER - the IR(EID) role -
know that any particular ITR is "authorised" to tunnel packets whose
source address is of any particular network?  I can't imagine a
scalable way of doing this, much less a scalable way where the
IR(EID) function could do this quickly, without delaying the delivery
of the initial packets, and without making a CPU- and bandwidth-
intensive target for a DoS attack.


>    Note also that the
>    "inner same as outer" approach only works for same
>    inner and outer IP protocol,

Yes, it wouldn't work with IPv4 packets being tunneled in IPv6
packets or vice-versa - but Ivip doesn't involve such arrangements.

>    and only works when
>    both addresses are globally routable.

I think the nature of the source address doesn't really matter.  If
the BR filter accepts an outer source address, then as long as the
inner source address is the same, the ETR will deliver the packet to
the destination network.  Of course BRs doing such filtering would
typically drop a packet if its source address was not globally
routable - so the system I have for Ivip will automatically support
this, without the ETR needing to decide what is globally routable or
not.  Likewise, it will allow packets sent from a non-globally
routable address inside the ISP network, which is exactly what it
should do.


> 26) Section 5.12, about full or partial knowledge of MABs,
>    if the number of MABs can be kept manageable (e.g.,
>    order of 100k or less) there should be no reason for
>    each DITR to not keep full knowledge. The number of
>    expected MABs may be different for IPv4 (where a
>    significant portion of the address space has already
>    been delegated) than for IPv6 (where a significant
>    portion of the address space has not already been
>    delegated).

I agree.  The IPv4 MABs are going to pop up in all sizes all over the
address space, because that address space is already crowded.  In IPv6,
there can be MABs, if desired, all in some currently unused short
prefix, and even if there were a million of them, each one could
support billions of end-user networks.

In this way, perhaps each MABOC would have a single MAB.  So there
might be a few hundred or a few thousand at most.

With IPv4, there could be 100k MABs, but there are unlikely to be
anywhere near that number of MABOCs.  There are definitely not going
to be thousands or even hundreds of separate global sets of DITRs,
even if there were tens of thousands of MABOCs, because economies of
scale would quickly kick in and it would be cheaper and more
effective for almost all these MABOCs to pay an existing DITR company
to handle their one or more MABs, rather than for each one to
establish its own global set of DITRs.

Thanks again for your questions.

  - Robin