Re: [nvo3] Reachability and Locator Path Liveness

David Allan I <david.i.allan@ericsson.com> Tue, 10 April 2012 22:47 UTC

From: David Allan I <david.i.allan@ericsson.com>
To: "david.black@emc.com" <david.black@emc.com>, "dmm@1-4-5.net" <dmm@1-4-5.net>
Date: Tue, 10 Apr 2012 18:47:09 -0400
Cc: "nvo3@ietf.org" <nvo3@ietf.org>
Subject: Re: [nvo3] Reachability and Locator Path Liveness

I don't think you can actually achieve geographic independence, aggregatable addresses, mobility, and locality in the same breath, let alone in the same network.

Currently, L3 over L2 in some form of overlay is the level of indirection that decouples what I would call macro locality (where in the network) from micro locality (where in the cloud). Aggregatable addresses directly over aggregatable addresses, without that level of indirection, looks problematic to me and requires, at first blush, excessive gymnastics. It also eliminates any potential value, since it reduces to mere subsetting: you might as well concatenate the prefixes together.
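
To make that indirection concrete, here is a toy Python sketch (the names and the mapping table are made up, not any particular scheme): the identifier stays stable while a mapping binds it to an aggregatable locator, so neither address has to encode both scopes.

    # Toy locator/identifier indirection (illustrative names only).
    mapping = {  # identifier -> current locator (micro -> macro locality)
        "vm-42": "rack7-switch",
    }

    def send(identifier: str, payload: str) -> str:
        locator = mapping[identifier]   # the one level of indirection
        return f"deliver {payload!r} via {locator}"

    # Without the mapping, one address would have to carry both scopes,
    # which is effectively concatenating the prefixes, as noted above.
    print(send("vm-42", "hello"))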

In that regard, I think we are saying the same thing...

Cheers
D

-----Original Message-----
From: nvo3-bounces@ietf.org [mailto:nvo3-bounces@ietf.org] On Behalf Of david.black@emc.com
Sent: Tuesday, April 10, 2012 3:29 PM
To: dmm@1-4-5.net
Cc: nvo3@ietf.org
Subject: Re: [nvo3] Reachability and Locator Path Liveness

> > To take a specific IPv4 example, it should be fine to have a bunch 
> > of NVEs in 10.1.1.0/24 and expect OSPF or the like to deal with that 
> > /24 and its kin as opposed to a /32 per NVE.  This example also 
> > applies to NVEs in end systems
> > - the IGP can still work with /24s and the like and does not need 
> > /32s (in this case, each /24 points to an L2 domain with potentially many end system NVEs).
> 
> Just because there is a covering prefix doesn't mean that any of the 
> longer prefixes that are covered are reachable. It is just this loss 
> of information (in this case due to aggregation) that causes the problem.

That is an assumption that can be made in a data center for the NVEs (which don't move) based on the IGP in the underlying network (e.g., every NVE in
10.1.1.0/24 is reachable via 10.1.1.0/24).  The initial endpoint identifiers for nvo3 are L2 MACs, which are inherently non-aggregatable, and I agree that aggregation assumptions for IP address endpoint identifiers are potentially problematic.
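
For concreteness, a toy longest-prefix-match sketch (made-up addresses, not any real router's RIB) of what the aggregation assumption amounts to: the lookup succeeds for any host under the /24 whether or not that particular NVE is up, so per-NVE liveness has to come from the IGP's guarantee that the aggregate is sound for non-moving NVEs.

    import ipaddress

    # Toy underlay RIB: only the covering aggregate, no per-NVE /32s.
    rib = {ipaddress.ip_network("10.1.1.0/24"): "next-hop-A"}

    def lookup(dst: str):
        """Longest-prefix match over the toy RIB."""
        addr = ipaddress.ip_address(dst)
        matches = [net for net in rib if addr in net]
        return rib[max(matches, key=lambda n: n.prefixlen)] if matches else None

    # Succeeds for every host under the /24 - the aggregate hides liveness.
    print(lookup("10.1.1.7"))   # "next-hop-A" even if NVE 10.1.1.7 is down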

Thanks,
--David


> -----Original Message-----
> From: nvo3-bounces@ietf.org [mailto:nvo3-bounces@ietf.org] On Behalf 
> Of David Meyer
> Sent: Tuesday, April 10, 2012 12:20 PM
> To: Black, David
> Cc: nvo3@ietf.org
> Subject: Re: [nvo3] Reachability and Locator Path Liveness
> 
> David,
> 
> 
> > Backing up ...
> >
> >> > Looking at draft-meyer-loc-id-implications-01:
> >> >
> >> >    http://tools.ietf.org/html/draft-meyer-loc-id-implications-01
> >> >
> >> > I would suggest that for initial nvo3 work, reachability between
> >> > all NVEs in a single overlay instance should be assumed, as there
> >> > will be an IGP routing protocol (e.g., OSPF with ECMP) running on
> >> > the underlying data center network which will handle link failures.
> >>
> >> That may be a reasonable assumption, but the fact that an IGP is 
> >> running doesn't ease the "locator liveness" problem unless the 
> >> routing system is carrying /32s (or /128s) and the corresponding 
> >> /32 (/128) isn't injected into the IGP unless the decapsulator is 
> >> up (that in and of itself might not be sufficient as we've learned 
> >> from our experiences with anycast DNS overlays). In any event what 
> >> we would be doing in this case is using the routing system to 
> >> signal a live path to the decapsulator. Of course, carrying such 
> >> long prefixes has its own set of problems.
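
For concreteness, a small sketch of the conditional /32 injection described above (the igp object is a hypothetical interface, not any real routing API): the decapsulator's host route exists in the routing system only while decapsulation is actually up.

    def sync_decap_route(igp, decap_addr: str, decap_is_up: bool) -> None:
        host_route = decap_addr + "/32"
        if decap_is_up:
            igp.advertise(host_route)  # routing system signals a live path
        else:
            igp.withdraw(host_route)   # the covering aggregate alone no
                                       # longer implies reachability
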
> >
> > I strongly disagree with that statement.
> >
> > First, what may be an important difference is that when the NVE is
> > not in the end system (e.g., NVE in hypervisor softswitch or
> > top-of-rack switch), the locator (outer IP destination address)
> > points at the NVE (tunnel endpoint, decapsulation location), not the
> > end system.  The end system is beyond the NVE, so the NVE decaps the
> > L2 frame and forwards based on that frame's L2 destination (MAC)
> > address (so the NVE serves multiple end systems).
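
A toy sketch of that two-stage forwarding (table contents are illustrative): the underlay delivers on the outer locator, then the NVE forwards the decapsulated frame on its inner destination MAC, scoped by the virtual network.

    mac_table = {
        # (virtual network, inner dst MAC) -> local attachment port
        (5001, "00:00:5e:00:53:01"): "vport-3",
        (5001, "00:00:5e:00:53:02"): "vport-7",
    }

    def forward_inner(vni: int, inner_dst_mac: str) -> str:
        """One NVE locator fronts many end systems; the inner MAC,
        scoped by the virtual network, selects among them."""
        return mac_table.get((vni, inner_dst_mac), "flood-or-drop")
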
> 
> Well, the endpoint I was talking about is the decapsulator (e.g., a 
> LISP ETR). So maybe you can get away with just injecting /32s for 
> those (if one decided to do it this way).
> >
> > To take a specific IPv4 example, it should be fine to have a bunch
> > of NVEs in 10.1.1.0/24 and expect OSPF or the like to deal with that
> > /24 and its kin as opposed to a /32 per NVE.  This example also
> > applies to NVEs in end systems - the IGP can still work with /24s
> > and the like and does not need /32s (in this case, each /24 points
> > to an L2 domain with potentially many end system NVEs).
> 
> Just because there is a covering prefix doesn't mean that any of the 
> longer prefixes that are covered are reachable. It is just this loss 
> of information (in this case due to aggregation) that causes the 
> problem.
> 
> Dave
> 
> >
> >> > That suggests that a starting point for whether different tunnel
> >> > encapsulation types should be supported in a single data center
> >> > could be "if they don't have an NVE node in common, they can be
> >> > made to work" and optimizations can be considered later.
> >>
> >> Agree with this latter statement.
> >
> > Thank you.
> >
> > Thanks,
> > --David
> >
> >
> >> -----Original Message-----
> >> From: David Meyer [mailto:dmm@1-4-5.net]
> >> Sent: Tuesday, April 10, 2012 7:55 AM
> >> To: Black, David
> >> Cc: nvo3@ietf.org
> >> Subject: Re: [nvo3] Requirements + some non-requirement suggestions
> >>
> >> On Mon, Apr 9, 2012 at 1:51 PM,  <david.black@emc.com> wrote:
> >> >> Dave McDyson asked about three classes of connectivity:
> >> >>
> >> >> 1) a single data center
> >> >> 2) a set of data centers under control of one administrative 
> >> >> domain
> >> >> 3) multiple sets of data centers under control of multiple
> >> >>       administrative domains
> >> >>
> >> >> Which of these do we *need* to address in NVO3?
> >> >
> >> > I agree with a number of other people that we have to start with
> >> > 1), and then I'd suggest addressing as much of 2) as "makes sense"
> >> > without significantly affecting the design and applicability of
> >> > the result to data centers.  For example, a single instance of an
> >> > overlay that spans data centers "makes sense", at least to me, or
> >> > as Thomas Narten described it:
> >> >
> >> >> two DCs are under the same administrative domain, but are 
> >> >> interconnected by some (existing) VPN technology, of which we 
> >> >> don't care what the details are. The overlay just tunnels 
> >> >> end-to-end and couldn't care less about the existence of a VPN 
> >> >> interconnecting parts of the network.
> >> >
> >> > Looking at draft-meyer-loc-id-implications-01:
> >> >
> >> >    http://tools.ietf.org/html/draft-meyer-loc-id-implications-01
> >> >
> >> > I would suggest that for initial nvo3 work, reachability between
> >> > all NVEs in a single overlay instance should be assumed, as there
> >> > will be an IGP routing protocol (e.g., OSPF with ECMP) running on
> >> > the underlying data center network which will handle link failures.
> >>
> >> That may be a reasonable assumption, but the fact that an IGP is 
> >> running doesn't ease the "locator liveness" problem unless the 
> >> routing system is carrying /32s (or /128s) and the corresponding 
> >> /32 (/128) isn't injected into the IGP unless the decapsulator is 
> >> up (that in and of itself might not be sufficient as we've learned 
> >> from our experiences with anycast DNS overlays). In any event what 
> >> we would be doing in this case is using the routing system to 
> >> signal a live path to the decapsulator. Of course, carrying such 
> >> long prefixes has its own set of problems.
> >>
> >> > Specifically, for initial nvo3 work I'd suggest assuming that the
> >> > underlying network handles reachability of the NVE at the other
> >> > side of the overlay (other end of the tunnel) that does the
> >> > decapsulation.  In terms of that draft, within a single data
> >> > center (and hence for the scope of initial nvo3 work), I'd suggest
> >> > that the underlying network be responsible for handling the
> >> > Locator Path Liveness problem.
> >>
> >> Not sure what you mean here by underlying network. In the LISP 
> >> case, does the underlying network handle this problem? In any event 
> >> can you be a bit more explicit in what you mean here?
> >> >
> >> > This suggestion also applies to the Multi-Exit problem, although
> >> > on a related note, I think it's a good idea to make sure that nvo3
> >> > doesn't turn any crucial NVE into a single point of failure.
> >> > Techniques like VRRP address this in the absence of nvo3, so this
> >> > could be mostly a matter of paying attention to ensure that
> >> > they're applicable to NVE failure.  Regardless of whether things
> >> > work out that way, I'd suggest that availability concerns be in
> >> > scope.
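
For illustration, a toy sketch of VRRP-style takeover applied to an NVE pair (a simplified state check, not the actual VRRP protocol): the backup claims the shared virtual locator when the master's advertisements stop arriving.

    def should_own_virtual_ip(adverts_fresh: bool, my_priority: int,
                              master_priority: int) -> bool:
        """Return True if this NVE should own the shared virtual IP."""
        if not adverts_fresh:        # master's adverts timed out -> failover
            return True
        return my_priority > master_priority  # preemption by higher priority
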
> >> >
> >> > Turning to the topic of IGP metrics and "tromboning", I'd suggest
> >> > that having nvo3 add a full IGP routing protocol (or even an IGP
> >> > metrics infrastructure for one that has to be administered) beyond
> >> > what's already running in the underlying network is not a good
> >> > idea.  It seems like a large portion of the "tromboning" concerns
> >> > could be resolved by techniques that distribute the default
> >> > gateway in a virtual network so that moving a VM (virtual machine)
> >> > automatically sends traffic to the locally-applicable instance of
> >> > the default gateway for the VM's new location, based on the same
> >> > L2 address. There are multiple examples of this sort of approach -
> >> > OTV and draft-raggarwa-data-center-mobility-02 are among them.
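
A toy sketch of that distributed-gateway idea (addresses are illustrative): every NVE answers for the same gateway MAC, so a VM that moves keeps its ARP entry and is routed at its new local NVE instead of tromboning back to the old site.

    GATEWAY_MAC = "00:00:5e:00:01:01"   # identical on every NVE (made up)
    GATEWAY_IP = "192.0.2.1"

    def handle_frame(local_nve: str, dst_mac: str) -> str:
        if dst_mac == GATEWAY_MAC:
            return "route at " + local_nve   # first-hop routing stays local
        return "bridge within the virtual network"
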
> >> >
> >> > One more - Peter Ashwood-Smith writes:
> >> >
> >> >> Is it a requirement to support different tunnel encapsulation
> >> >> types in the same DC?
> >> >>
> >> >> It would seem that a very large DC could well end up with several
> >> >> different kinds of tunnel encapsulations that would need to
> >> >> somehow be bridged if they terminate VMs in the same subnet.
> >> >
> >> > I'd suggest that the latter scenario be out of scope and that
> >> > crossing virtual networks initially involve routing in preference
> >> > to bridging, so that an NVE receiving an unencapsulated packet can
> >> > determine the overlay and encapsulation by knowing which virtual
> >> > network the packet belongs to.  An implication is that I'd suggest
> >> > figuring out how to optimize the following structure into a single
> >> > network node later (or at least as a cleanly separable work
> >> > effort):
> >> >
> >> > ... (Overlay1)---- NVE1 ---- (VLANs) ---- NVE2 ---- (Overlay2) ...
> >> >
> >> > In the above, NVE1 and NVE2 are separate nodes, and the
> >> > parenthesized terms are the means of virtual network separation.
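
A toy sketch of the classification step this implies (table contents and encapsulation names are illustrative): virtual-network membership alone selects the overlay and its encapsulation for an unencapsulated packet.

    vn_table = {
        # ingress (port, VLAN) -> (overlay instance, encapsulation)
        ("eth0", 100): ("overlay-1", "vxlan"),
        ("eth1", 200): ("overlay-2", "nvgre"),
    }

    def classify(port: str, vlan: int):
        """Map an unencapsulated packet's virtual network to its overlay;
        different encapsulation types never share an NVE in this model."""
        return vn_table.get((port, vlan))
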
> >> >
> >> > That suggests that a starting point for whether different tunnel
> >> > encapsulation types should be supported in a single data center
> >> > could be "if they don't have an NVE node in common, they can be
> >> > made to work" and optimizations can be considered later.
> >>
> >> Agree with this latter statement.
> >>
> >> Dave
> >

_______________________________________________
nvo3 mailing list
nvo3@ietf.org
https://www.ietf.org/mailman/listinfo/nvo3