Re: [homenet] Multilink subnet routing (MLSRv2)

Curtis Villamizar <curtis@occnc.com> Mon, 10 October 2011 20:50 UTC

To: Ole Troan <ot@cisco.com>
From: Curtis Villamizar <curtis@occnc.com>
In-reply-to: Your message of "Mon, 10 Oct 2011 00:38:28 +0200." <F5641488-CF63-4331-A471-3C79EF6450C4@cisco.com>
Date: Mon, 10 Oct 2011 13:50:13 -0700
Cc: Erik Nordmark <nordmark@cisco.com>, homenet@ietf.org
Subject: Re: [homenet] Multilink subnet routing (MLSRv2)

In message <F5641488-CF63-4331-A471-3C79EF6450C4@cisco.com>
Ole Troan writes:
 
> Erik, et al,
>  
> to expand on the ideas I presented on MLSR (or rather MLSRv2 as it
> hasn't really been described anywhere) as a method for numbering a
> routed home. please let me be clear that I'm not convinced this is a
> good idea. i.e. why not just get < /64?  I do think we could get
> something working though.

Having missed the meeting: is there anything more recent on MLSR than this?

  draft-ietf-ipv6-multilink-subnets-00
  Multi-link Subnet Support in IPv6 (2002-07-08, Expired)

  draft-thaler-ipngwg-multilink-subnets-02
  Multi-link Subnet Support in IPv6 (2001-11-29, Expired)

The only reference to MLSR in an RFC is in an IAB RFC, RFC4903:

   There was also a proposal to define multi-link subnets [MLSR] for
   IPv6.  However, this notion was abandoned by the IPv6 WG due to the
   issues discussed in this memo, and that proposal was replaced by a
   different mechanism that preserves the notion that a subnet spans
   only one link [RFC4389].

As to the second comment regarding /64: The magic /64 boundary should
be interpreted as follows:

  Anything shorter than or equal to /64 can be globally routable.

  Anything longer than /64 may not be globally routable.

For core routers the usual behaviour (or at least goal) is:

  Anything shorter than or equal to /64 can be forwarded at full
  speed, even for long bursts of very small packets.

  For anything matching a 64 bit "local subnet" address, look at the
  bottom 64 bits and still forward at full speed, even for long bursts
  of very small packets.

  For everything else (ie: not in a local subnet, and longer than /64
  prefix) forward anyway, but maybe not at full speed.
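
A rough sketch of the three cases above, in Python purely for
illustration.  The function name, the two class labels, and the
2001:db8::/32 documentation prefixes are assumptions made for the
example; this is not how any particular router exposes its lookup.

```python
import ipaddress

FAST = "full speed"
SLOW = "forwarded, maybe not at full speed"

def forwarding_class(route, local_subnets):
    """Classify a matched route into one of the three cases.

    route: the longest-prefix-match result for the destination.
    local_subnets: /64 prefixes directly attached to this router
                   (a hypothetical input for this sketch).
    """
    route = ipaddress.IPv6Network(route)
    if route.prefixlen <= 64:
        return FAST          # case 1: /64 or shorter
    for subnet in local_subnets:
        if route.subnet_of(ipaddress.IPv6Network(subnet)):
            return FAST      # case 2: inside a local /64, use the low 64 bits
    return SLOW              # case 3: longer than /64 and not local

print(forwarding_class("2001:db8:1::/48", []))
print(forwarding_class("2001:db8:0:2::/112", ["2001:db8:0:1::/64"]))
```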

The whole purpose of this is that longer LPM lookups take time, and at
high speeds like 10 Gb/s or 100 Gb/s (or more) it is really hard
to go faster without a high cost in silicon die area and power (and
worse yet for anything using external TCAM: two external lookups).

The /64 boundary was rather arbitrary.  (And IMNSHO further evidence
that 128 bit addresses were a boneheaded choice and much too long,
but it is way too late by about a decade or more to change that).

There is no reason that homenets (generally supporting 1 Gb/s or
less) can't further subnet a /64.
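
To make the arithmetic concrete, here is a small sketch using Python's
standard ipaddress module.  The 2001:db8:0:1::/64 prefix is a
documentation prefix, and the /72 split is an arbitrary choice made
for the example.

```python
import ipaddress

# Documentation prefix standing in for the delegated site /64 (assumed).
site = ipaddress.IPv6Network("2001:db8:0:1::/64")

# Splitting the /64 into /72s gives 2**(72 - 64) = 256 subnets.
subnets = list(site.subnets(new_prefix=72))
print(len(subnets), subnets[0], subnets[1])

# A /112 leaves 16 bits for hosts, i.e. 2**16 addresses.
small = ipaddress.IPv6Network("2001:db8:0:1::/112")
print(small.num_addresses)
```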

> routers can be in an arbitrary topology. all routers running a routing
> protocol.  the site prefix (/64) is either advertised in the IGP with
> a new LSA or proxying of RA messages is done (split horizon).  a
> router advertises the same /64 prefix (in a PIO) on all of its
> interfaces. L bit is 0.
>  
> the link model here is that all hosts are off link from each
> other. link-local scope is restricted to only the physical
> link. multicast link-local scope as well.

Not a good assumption.  There will still be dumb Ethernet switches and
bridging-only WiFi hubs out there for a long time, so there will be
multiple hosts on a subnet/link.

> a host uses SLAAC (or DHCP) to create an address, then does DAD as
> normal. the first hop router uses its routing topology database to
> check for conflicts. similar mechanisms described in SAVI are used to
> glean address information from the host. the SAVI binding database is
> then used to inject host routes into the IGP.

Neither SLAAC (rfc2462) nor DHCP6 (rfc3315) supports allocation of a
non-link-local prefix for the purpose of subdividing a prefix.  For
example, in DHCP6, the IA_TA and IA_NA are assumed to be link-local.
The IA-PD added this capability to DHCP6, but there is no
corresponding addition to SLAAC, so DHCP6 is preferred for routers.

Where multiple routers exist, all will initially be sending an IA-PD
request to each other.  If there is no connectivity to a service
provider who responds to an IA-PD, and none of the routers are
configured, then none should respond (can't sub allocate what you
don't have).

If the provider sends an IA-PD response with a /64, then that border
router can respond with suballocations.  There is no guidance in
rfc3633 as to what to initially send in an IA-PD, but I suggest the
following (which lacks loop suppression):

  send 0/112 (65K addresses) initially

  send 0/[len+8] if all of the following are true:
    the router has received an IA-PD request from another router, and
    the router already has a prefix assignment, and
    the existing prefix assignment is too small,
    where len is the longest prefix length it has

What is needed then is request loop suppression.  A new IA-PD would be
needed.  One possibility is a new IA-PD option in which the set of
IA-PD requestors is listed.  This creates a path vector loop
suppression, much like BGP using AS numbers.  If a router gets two
replies, prefer the one with the shorter path.
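
A minimal sketch of that path vector idea, in Python for illustration
only.  No such IA-PD option exists today; the PdRequest class, the
requestors field, and the string router IDs are all hypothetical names
invented for the example.

```python
from dataclasses import dataclass, field

@dataclass
class PdRequest:
    # Hypothetical option: IDs of the routers that have already
    # originated or relayed this request, in order.
    requestors: list = field(default_factory=list)

def relay(request, my_id):
    """Return the request to pass on, or None if it has looped back."""
    if my_id in request.requestors:
        return None  # our own ID is already in the path: suppress
    return PdRequest(request.requestors + [my_id])

def best_reply(replies):
    """Given (prefix, path) replies, prefer the one with the shorter path."""
    return min(replies, key=lambda reply: len(reply[1]))

req = relay(relay(PdRequest(), "A"), "B")
print(req.requestors)    # path accumulated so far
print(relay(req, "A"))   # router A sees its own ID and drops the request
```

This is the same check BGP makes when it finds its own AS number in
the AS path.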

Having set up one or more prefixes, a router can then select a
router-Id (non-trivial) and start running a link state IGP (or link
state IGP with specialized metric for LLN or whatever).  That will
have to be the topic of another thread (please).

Another issue brought up is a multihomed connection.  If the provider
is smart, both of the connections will respond with the same /64 to an
IA-PD request.  That would be set up at service provisioning time.  If
the client was also smart, the border routers would not even bother
with an IA-PD request and would have the /64 configured.  Absent this,
the homenet would have two /64s to allocate from and would
allocate from both.

There is no harm in a subnet having more than one prefix allocated to
it.  If a routing protocol is running, then any arbitrary fault, short
of a partition, would maintain connectivity.  With a partition, bad
things can happen.  Do we solve that?

> this requires no flooding of ND, or any other changes to on-link
> protocols for loop detection. no changes in hosts either.  only
> downside is that it requires a host to have sent a packet of some form
> for the SAVI binding to be initiated.  it might also be possible to
> support host mobility within the home with this mechanism.

The other down side is you essentially have host routes.

I'm definitely not in favor of making homenets run more like bridged
networks, which is what MLSR does (afaik).

ND Proxy (rfc4389) also makes a network look more like a bridged
network (no traceroute support, does not always use shortest path,
less than robust zero configuration loop suppression, etc.).  IMHO ND
Proxy should also not be considered.

> cheers,
> Ole

cheers,

Curtis