Re: [nvo3] Reachability and Locator Path Liveness

<david.black@emc.com> Tue, 10 April 2012 22:29 UTC

From: david.black@emc.com
To: dmm@1-4-5.net
Date: Tue, 10 Apr 2012 18:29:10 -0400
Thread-Topic: [nvo3] Reachability and Locator Path Liveness
Thread-Index: Ac0XNdTsG2RuX7GfS5aqxaKmRME9EAAMtDDQ
Message-ID: <8D3D17ACE214DC429325B2B98F3AE7120534E9@MX15A.corp.emc.com>
References: <8D3D17ACE214DC429325B2B98F3AE712034117@MX15A.corp.emc.com> <CAHiKxWiVQX=H23gFL7Z4bqKAadWqNBWnYuLz=DeGD7JVZODpYQ@mail.gmail.com> <8D3D17ACE214DC429325B2B98F3AE712034170@MX15A.corp.emc.com> <CAHiKxWhTOnDKhCz1GQ4p3igWJ8rySK7zAAw=EkqonwErhiE6kQ@mail.gmail.com>
In-Reply-To: <CAHiKxWhTOnDKhCz1GQ4p3igWJ8rySK7zAAw=EkqonwErhiE6kQ@mail.gmail.com>
Accept-Language: en-US
Content-Language: en-US
Cc: nvo3@ietf.org
Subject: Re: [nvo3] Reachability and Locator Path Liveness
X-BeenThere: nvo3@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: "L2 \"Network Virtualization Over l3\" overlay discussion list \(nvo3\)" <nvo3.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/nvo3>, <mailto:nvo3-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/nvo3>
List-Post: <mailto:nvo3@ietf.org>
List-Help: <mailto:nvo3-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/nvo3>, <mailto:nvo3-request@ietf.org?subject=subscribe>
X-List-Received-Date: Tue, 10 Apr 2012 22:29:44 -0000

> > To take a specific IPv4 example, it should be fine to have a bunch of NVEs
> > in 10.1.1.0/24 and expect OSPF or the like to deal with that /24 and its kin
> > as opposed to a /32 per NVE.  This example also applies to NVEs in end systems
> > - the IGP can still work with /24s and the like and does not need /32s (in this
> > case, each /24 points to an L2 domain with potentially many end system NVEs).
> 
> Just because there is a covering prefix doesn't mean that any of the longer
> prefixes that are covered are reachable. It is just this loss of information
> (in this case due to aggregation) that causes the problem.

That assumption can be made in a data center for the NVEs (which don't move),
based on the IGP in the underlying network (e.g., every NVE in 10.1.1.0/24 is
reachable via the 10.1.1.0/24 route).  The initial endpoint identifiers for
nvo3 are L2 MAC addresses, which are inherently non-aggregatable, and I agree
that aggregation assumptions for IP-address endpoint identifiers are
potentially problematic.
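
To make the aggregation point concrete, here is a minimal, purely illustrative
longest-prefix-match sketch in Python (the prefix and host values are invented):
a lookup for a failed NVE still matches the covering /24, so the aggregate by
itself says nothing about whether that particular NVE is alive.

    # Illustrative only: an underlay RIB that carries just the aggregate.
    import ipaddress

    rib = [ipaddress.ip_network("10.1.1.0/24")]   # covering prefix, no per-NVE /32s
    dead_nve = ipaddress.ip_address("10.1.1.7")   # hypothetical NVE that has failed

    matches = [p for p in rib if dead_nve in p]
    best = max(matches, key=lambda p: p.prefixlen, default=None)
    print(best)   # 10.1.1.0/24 -- the lookup "succeeds" even though the NVE is down

Within a single data center the IGP keeps the /24 itself accurate, which is why
the assumption above is reasonable for NVEs that don't move.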

Thanks,
--David


> -----Original Message-----
> From: nvo3-bounces@ietf.org [mailto:nvo3-bounces@ietf.org] On Behalf Of David
> Meyer
> Sent: Tuesday, April 10, 2012 12:20 PM
> To: Black, David
> Cc: nvo3@ietf.org
> Subject: Re: [nvo3] Reachability and Locator Path Liveness
> 
> David,
> 
> 
> > Backing up ...
> >
> >> > Looking at draft-meyer-loc-id-implications-01:
> >> >
> >> >        http://tools.ietf.org/html/draft-meyer-loc-id-implications-01
> >> >
> >> > I would suggest that for initial nvo3 work, reachability between all NVEs in a
> >> > single overlay instance should be assumed, as there will be an IGP routing protocol
> >> > (e.g., OSPF with ECMP) running on the underlying data center network which will
> >> > handle link failures.
> >>
> >> That may be a reasonable assumption, but the fact that an IGP is
> >> running doesn't ease the "locator liveness" problem unless the routing
> >> system is carrying /32s (or /128s) and the corresponding /32 (/128)
> >> isn't injected into the IGP unless the decapsulator is up (that in and
> >> of itself might not be sufficient as we've learned from our
> >> experiences with anycast DNS overlays). In any event what we would be
> >> doing in this case is using the routing system to signal a live path
> >> to the decapsulator. Of course, carrying such long prefixes has its
> >> own set of problems.
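
(A rough sketch of the conditional-injection idea in the paragraph above, added
here for illustration only: the health check and the advertise/withdraw hooks
below are placeholders for what a routing daemon on the NVE would actually do,
not a real routing API, and the addresses are invented.)

    # Sketch only: tie a locator /32's presence in the IGP to decapsulator liveness.
    import socket

    HEALTH_ADDR = ("127.0.0.1", 8080)   # hypothetical health port of the local decapsulator
    NVE_LOOPBACK = "10.1.1.7/32"        # hypothetical locator the NVE would originate

    def decapsulator_alive(addr, timeout=1.0):
        """Crude liveness probe: can we open a TCP connection to the health port?"""
        try:
            with socket.create_connection(addr, timeout=timeout):
                return True
        except OSError:
            return False

    def advertise(prefix):   # placeholder for "originate the /32 into the IGP"
        print("advertise", prefix)

    def withdraw(prefix):    # placeholder for "withdraw the /32 from the IGP"
        print("withdraw", prefix)

    if decapsulator_alive(HEALTH_ADDR):
        advertise(NVE_LOOPBACK)
    else:
        withdraw(NVE_LOOPBACK)

As the quoted text notes, even this may not be sufficient (the anycast DNS
experience), and carrying a /32 per decapsulator has its own costs.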
> >
> > I strongly disagree with that statement.
> >
> > First, what may be an important difference is that when the NVE is not in
> > the end system (e.g., NVE in hypervisor softswitch or top-of-rack switch),
> > the locator (outer IP destination address) points at the NVE (tunnel endpoint,
> > decapsulation location), not the end system.  The end system is beyond the
> > NVE, so the NVE decaps the L2 frame and forwards based on that frame's L2
> > destination (MAC) address (so the NVE serves multiple end systems).
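
(Restating that paragraph as a toy model, everything here being illustrative
rather than an actual NVE implementation: the outer IP selects the NVE, and
after decapsulation the NVE forwards on the inner L2 destination MAC, so one
locator fronts many end systems.)

    from dataclasses import dataclass

    @dataclass
    class EncapFrame:
        outer_dst_ip: str    # locator: points at the NVE, not the end system
        inner_dst_mac: str   # identifies the end system behind the NVE
        payload: bytes

    class ToyNVE:
        def __init__(self, locator):
            self.locator = locator
            self.mac_table = {}              # inner MAC -> local port

        def learn(self, mac, port):
            self.mac_table[mac] = port

        def receive(self, frame):
            if frame.outer_dst_ip != self.locator:
                return None                  # not addressed to this tunnel endpoint
            # Decapsulate, then forward on the inner L2 destination address.
            return self.mac_table.get(frame.inner_dst_mac)

    nve = ToyNVE("10.1.1.7")
    nve.learn("00:00:5e:00:53:01", "vport-3")
    print(nve.receive(EncapFrame("10.1.1.7", "00:00:5e:00:53:01", b"...")))  # -> vport-3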
> 
> Well, the endpoint I was talking about is the decapsulator (e.g., a LISP ETR).
> So maybe you can get away with just injecting /32s for those (if one decided
> to do it this way).
> >
> > To take a specific IPv4 example, it should be fine to have a bunch of NVEs
> > in 10.1.1.0/24 and expect OSPF or the like to deal with that /24 and its kin
> > as opposed to a /32 per NVE.  This example also applies to NVEs in end systems
> > - the IGP can still work with /24s and the like and does not need /32s (in this
> > case, each /24 points to an L2 domain with potentially many end system NVEs).
> 
> Just because there is a covering prefix doesn't mean that any of the longer
> prefixes that are covered are reachable. It is just this loss of information
> (in this case due to aggregation) that causes the problem.
> 
> Dave
> 
> >
> >> > That suggests that a starting point for whether different tunnel encapsulation types
> >> > should be supported in a single data center could be "if they don't have an NVE
> >> > node in common, they can be made to work" and optimizations can be considered later.
> >>
> >> Agree with this latter statement.
> >
> > Thank you.
> >
> > Thanks,
> > --David
> >
> >
> >> -----Original Message-----
> >> From: David Meyer [mailto:dmm@1-4-5.net]
> >> Sent: Tuesday, April 10, 2012 7:55 AM
> >> To: Black, David
> >> Cc: nvo3@ietf.org
> >> Subject: Re: [nvo3] Requirements + some non-requirement suggestions
> >>
> >> On Mon, Apr 9, 2012 at 1:51 PM,  <david.black@emc.com> wrote:
> >> >> Dave McDysan asked about three classes of connectivity:
> >> >>
> >> >> 1) a single data center
> >> >> 2) a set of data centers under control of one administrative domain
> >> >> 3) multiple sets of data centers under control of multiple
> >> >>       administrative domains
> >> >>
> >> >> Which of these do we *need* to address in NVO3?
> >> >
> >> > I agree with a number of other people that we have to start with 1), and then I'd
> >> > suggest addressing as much of 2) as "makes sense" without significantly affecting
> >> > the design and applicability of the result to data centers.  For example, a single
> >> > instance of an overlay that spans data centers "makes sense", at least to me, or
> >> > as Thomas Narten described it:
> >> >
> >> >> two DCs are under the same administrative domain, but are
> >> >> interconnected by some (existing) VPN technology, of which we don't
> >> >> care what the details are. The overlay just tunnels end-to-end and
> >> >> couldn't care less about the existence of a VPN interconnecting parts
> >> >> of the network.
> >> >
> >> > Looking at draft-meyer-loc-id-implications-01:
> >> >
> >> >        http://tools.ietf.org/html/draft-meyer-loc-id-implications-01
> >> >
> >> > I would suggest that for initial nvo3 work, reachability between all NVEs in a
> >> > single overlay instance should be assumed, as there will be an IGP routing protocol
> >> > (e.g., OSPF with ECMP) running on the underlying data center network which will
> >> > handle link failures.
> >>
> >> That may be a reasonable assumption, but the fact that an IGP is
> >> running doesn't ease the "locator liveness" problem unless the routing
> >> system is carrying /32s (or /128s) and the corresponding /32 (/128)
> >> isn't injected into the IGP unless the decapsulator is up (that in and
> >> of itself might not be sufficient as we've learned from our
> >> experiences with anycast DNS overlays). In any event what we would be
> >> doing in this case is using the routing system to signal a live path
> >> to the decapsulator. Of course, carrying such long prefixes has its
> >> own set of problems.
> >>
> >> > Specifically, for initial nvo3 work I'd suggest assuming
> >> > that the underlying network handles reachability of the NVE at the other side of
> >> > the overlay (other end of the tunnel) that does the decapsulation.  In terms
> >> > of that draft, within a single data center (and hence for the scope of initial
> >> > nvo3 work), I'd suggest that the underlying network be responsible for handling
> >> > the Locator Path Liveness problem.
> >>
> >> Not sure what you mean here by underlying network. In the LISP case,
> >> does the underlying network handle this problem? In any event can you
> >> be a bit more explicit in what you mean here?
> >> >
> >> > This suggestion also applies to the Multi-Exit problem, although on a related
> >> > note, I think it's a good idea to make sure that nvo3 doesn't turn any crucial
> >> > NVE into a single point of failure. Techniques like VRRP address this in the
> >> > absence of nvo3, so this could be mostly a matter of paying attention to ensure
> >> > that they're applicable to NVE failure.  Regardless of whether things work out
> >> > that way, I'd suggest that availability concerns be in scope.
> >> >
> >> > Turning to the topic of IGP metrics and "tromboning", I'd suggest that having
> >> > nvo3 add a full IGP routing protocol (or even an IGP metrics infrastructure for
> >> > one that has to be administered) beyond what's already running in the underlying
> >> > network is not a good idea.  It seems like a large portion of the "tromboning"
> >> > concerns could be resolved by techniques that distribute the default gateway in
> >> > a virtual network so that moving a VM (virtual machine) automatically sends
> >> > traffic to the locally-applicable instance of the default gateway for the VM's
> >> > new location, based on the same L2 address. There are multiple examples of this
> >> > sort of approach - OTV and draft-raggarwa-data-center-mobility-02 are among them.
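
(A tiny sketch of the distributed-default-gateway idea above, added for
illustration with invented addresses: every NVE answers locally for the same
gateway IP and MAC, so a VM that moves keeps its ARP entry and its first-hop
traffic stays local instead of tromboning back to the original site.)

    GATEWAY_IP = "192.0.2.1"
    GATEWAY_MAC = "00:00:5e:00:53:ff"    # the same virtual MAC at every NVE

    class LocalGateway:
        """One instance per NVE/site; all instances present identical IP and MAC."""
        def __init__(self, site):
            self.site = site

        def answer_arp(self, target_ip):
            return GATEWAY_MAC if target_ip == GATEWAY_IP else None

    print(LocalGateway("site-A").answer_arp(GATEWAY_IP))   # resolved locally at site A
    print(LocalGateway("site-B").answer_arp(GATEWAY_IP))   # same answer after the VM moves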
> >> >
> >> > One more - Peter Ashwood-Smith writes:
> >> >
> >> >> Is it a requirement to support different tunnel encapsulation types in the same DC?
> >> >>
> >> >> It would seem that a very large DC could well end up with several different kinds of
> >> >> tunnel encapsulations that would need to somehow be bridged if they terminate VMs in
> >> >> the same subnet.
> >> >
> >> > I'd suggest that the latter scenario be out of scope and that crossing virtual
> >> > networks initially involve routing in preference to bridging, so that an NVE
> >> > receiving an unencapsulated packet can determine the overlay and encapsulation by
> >> > knowing which virtual network the packet belongs to.  An implication is that I'd
> >> > suggest figuring out how to optimize the following structure into a single
> >> > network node later (or at least as a cleanly separable work effort):
> >> >
> >> > ... (Overlay1)---- NVE1 ---- (VLANs) ---- NVE2 ---- (Overlay2) ...
> >> >
> >> > In the above, NVE1 and NVE2 are separate nodes, and the parenthesized terms are
> >> > the means of virtual network separation.
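
(Sketching the classification step implied above, with invented table contents:
the virtual network an unencapsulated frame arrives on, a VLAN in this
structure, is enough for the NVE to pick both the overlay instance and the
encapsulation to use.)

    VN_MAP = {
        # VLAN id -> (overlay instance, encapsulation type); values are illustrative
        100: ("Overlay1", "encap-type-A"),
        200: ("Overlay2", "encap-type-B"),
    }

    def classify(vlan_id):
        try:
            return VN_MAP[vlan_id]
        except KeyError:
            raise ValueError("unknown virtual network: VLAN %d" % vlan_id)

    print(classify(100))   # ('Overlay1', 'encap-type-A')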
> >> >
> >> > That suggests that a starting point for whether different tunnel encapsulation types
> >> > should be supported in a single data center could be "if they don't have an NVE
> >> > node in common, they can be made to work" and optimizations can be considered later.
> >>
> >> Agree with this latter statement.
> >>
> >> Dave
> >
> _______________________________________________
> nvo3 mailing list
> nvo3@ietf.org
> https://www.ietf.org/mailman/listinfo/nvo3