Re: [nvo3] Reachability and Locator Path Liveness

Dino Farinacci <dino@cisco.com> Tue, 10 April 2012 17:31 UTC

X-Files: lisp-ietf-arn-lra.ppt, lisp-ietf-arn-rloc-probing.ppt : 1072128, 672256
From: Dino Farinacci <dino@cisco.com>
In-Reply-To: <4F844068.4060004@joelhalpern.com>
Date: Tue, 10 Apr 2012 10:31:11 -0700
Message-Id: <35B6FF1A-0A70-43C0-913B-D7F7B4E44FC0@cisco.com>
References: <8D3D17ACE214DC429325B2B98F3AE712034117@MX15A.corp.emc.com> <CAHiKxWiVQX=H23gFL7Z4bqKAadWqNBWnYuLz=DeGD7JVZODpYQ@mail.gmail.com> <8D3D17ACE214DC429325B2B98F3AE712034170@MX15A.corp.emc.com> <4F844068.4060004@joelhalpern.com>
To: "Joel M. Halpern" <jmh@joelhalpern.com>
Cc: dmm@1-4-5.net, david.black@emc.com, nvo3@ietf.org
Subject: Re: [nvo3] Reachability and Locator Path Liveness

Here are two presentations on the subject I gave at the Stockholm LISP WG meeting in July 2009.

Read the lisp-ietf-arn-lra.ppt attachment first; it leads you into the second one. It turns out we put RLOC-probing and Echo-Noncing in the LISP spec as possible solutions with different scaling tradeoffs.
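
A minimal sketch of the idea behind the two mechanisms (the class, timer values, and names below are illustrative assumptions, not the LISP spec's encoding or state machine):

import time

PROBE_INTERVAL = 10   # seconds between RLOC-probes (illustrative value)
PROBE_TIMEOUT = 30    # declare the locator down after this long with no evidence

class RlocLiveness:
    """Per-locator liveness state an encapsulator might keep (simplified)."""

    def __init__(self, rloc, send_probe):
        self.rloc = rloc
        self.send_probe = send_probe        # callback that sends a probe Map-Request
        self.reachable = True
        self.last_probe_sent = 0.0
        self.last_evidence = time.time()    # last proof the path to the RLOC worked
        self.outstanding_nonce = None       # nonce we asked the other side to echo

    def tick(self):
        # RLOC-probing: periodic control messages addressed to the locator itself;
        # replies prove both the locator and the path to it are up.
        now = time.time()
        if now - self.last_probe_sent >= PROBE_INTERVAL:
            self.send_probe(self.rloc)
            self.last_probe_sent = now
        if now - self.last_evidence >= PROBE_TIMEOUT:
            self.reachable = False          # stop encapsulating to this locator

    def on_probe_reply(self):
        self.last_evidence = time.time()
        self.reachable = True

    def nonce_for_data(self, nonce):
        # Echo-noncing: piggyback a nonce on outgoing data packets and ask the
        # other side to echo it in its data packets -- no extra control traffic.
        if self.outstanding_nonce is None:
            self.outstanding_nonce = nonce
        return self.outstanding_nonce

    def on_data_received(self, echoed_nonce):
        if echoed_nonce is not None and echoed_nonce == self.outstanding_nonce:
            self.last_evidence = time.time()  # our nonce came back: the path works
            self.reachable = True
            self.outstanding_nonce = None

Roughly: probing costs control traffic per locator, while echo-noncing rides on data packets but only proves liveness when traffic flows both ways; that is the scaling tradeoff.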

Dino




On Apr 10, 2012, at 7:15 AM, Joel M. Halpern wrote:

> In concluding that there is no locator liveness problem, is your assumption that there is always only one locator that can be used?
> 
> The reason I ask is that there are two basic, related, causes that drive the locator liveness issue.
> First, if an end-point can be reached with two locators, it is important to know that one of them is no longer working, so that you switch to using the other.
> Second, if an end-point can change locators, one of the usual techniques for handling this is to have the original locator send the new information.  If one relies on this, then one needs to know when the original locator is dead, as that is the cue to refetch the information.
> 
> If neither of those applies, then maybe there is no locator liveness problem for VMs.
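
Just to make the two cases concrete, a tiny sketch with made-up names (the liveness test itself is left abstract):

def pick_locator(locators, is_alive, refetch_mapping):
    # Case 1: more than one locator -- switch to one that still works.
    live = [loc for loc in locators if is_alive(loc)]
    if live:
        return live[0]
    # Case 2: every locator we knew about is dead -- that is the cue to go
    # refetch the mapping and learn the new locator.
    return refetch_mapping()
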
> 
> With regard to site exit, the issues seem more complex than your initial description painted.  It may be possible to solve.  But it is not free if one is using the locator of the site exit, instead of underlying routing to the ultimate destination.
> 
> Yours,
> Joel
> 
> On 4/10/2012 9:41 AM, david.black@emc.com wrote:
>> Dave,
>> 
>> In reverse order ...
>> 
>>>> Specifically, for initial nvo3 work I'd suggest assuming
>>>> that the underlying network handles reachability of the NVE at the other side of
>>>> the overlay (other end of the tunnel) that does the decapsulation.  In terms
>>>> of that draft, within a single data center (and hence for the scope of initial
>>>> nvo3 work), I'd suggest that the underlying network be responsible for handling
>>>> the Locator Path Liveness problem.
>>> 
>>> Not sure what you mean here by underlying network. In the LISP case,
>>> does the underlying network handle this problem? In any event can you
>>> be a bit more explicit in what you mean here?
>> 
>> The underlying network is what connects the NVEs (encap/decap locations) in
>> the data center, and it runs an IGP (e.g., OSPF with ECMP) for its routing.
>> That IGP is going to be configured with relatively conventional IP address
>> blocks (e.g., /24s for IPv4 if that's what the data center is already using).
>> 
>> Backing up ...
>> 
>>>> Looking at draft-meyer-loc-id-implications-01:
>>>> 
>>>>        http://tools.ietf.org/html/draft-meyer-loc-id-implications-01
>>>> 
>>>> I would suggest that for initial nvo3 work, reachability between all NVEs in a
>>>> single overlay instance should be assumed, as there will be an IGP routing protocol
>>>> (e.g., OSPF with ECMP) running on the underlying data center network which will
>>>> handle link failures.
>>> 
>>> That may be a reasonable assumption, but the fact that an IGP is
>>> running doesn't ease the "locator liveness" problem unless the routing
>>> system is carrying /32s (or /128s) and the corresponding /32 (/128)
>>> isn't injected into the IGP unless the decapsulator is up (that in and
>>> of itself might not be sufficient as we've learned from our
>>> experiences with anycast DNS overlays). In any event what we would be
>>> doing in this case is using the routing system to signal a live path
>>> to the decapsulator. Of course, carrying such long prefixes has its
>>> own set of problems.
>> 
>> I strongly disagree with that statement.
>> 
>> First, what may be an important difference is that when the NVE is not
>> in the end system (e.g., NVE in hypervisor softswitch or top-of-rack switch),
>> the locator (outer IP destination address) points at the NVE (tunnel endpoint,
>> decapsulation location), not the end system.  The end system is beyond the NVE,
>> so the NVE decaps the L2 frame and forwards based on that frame's L2 destination
>> (MAC) address (so the NVE serves multiple end systems).
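
A rough sketch of that forwarding step (field and function names are invented for illustration):

from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class InnerFrame:
    dst_mac: str      # inner Ethernet destination, i.e. the end system
    payload: bytes

def nve_forward(frame: InnerFrame,
                mac_table: Dict[str, Callable[[InnerFrame], None]],
                flood: Callable[[InnerFrame], None]) -> None:
    # The locator's job ended when the outer header delivered the packet to
    # this NVE; which end system gets the frame is decided purely by the
    # inner destination MAC, so one locator fronts many end systems.
    deliver = mac_table.get(frame.dst_mac)
    if deliver is not None:
        deliver(frame)        # known MAC: hand to the local port/vNIC
    else:
        flood(frame)          # unknown MAC: flood within the virtual network
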
>> 
>> To take a specific IPv4 example, it should be fine to have a bunch of NVEs
>> in 10.1.1.0/24 and expect OSPF or the like to deal with that /24 and its kin
>> as opposed to a /32 per NVE.  This example also applies to NVEs in end systems
>> - the IGP can still work with /24s and the like and does not need /32s (in this
>> case, each /24 points to an L2 domain with potentially many end system NVEs).
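
A trivial illustration using only the Python standard library (addresses below are made up):

import ipaddress

# The IGP carries the aggregate the NVEs sit in, not a host route per NVE.
advertised = ipaddress.ip_network("10.1.1.0/24")

# Locators (outer destination addresses) of a few NVEs behind that aggregate.
for locator in ("10.1.1.7", "10.1.1.42", "10.1.1.200"):
    addr = ipaddress.ip_address(locator)
    # A longest-prefix match on the /24 is all the underlay needs in order to
    # deliver encapsulated traffic to the NVE.
    assert addr in advertised
    print(f"{addr} covered by {advertised}; no /32 needed")
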
>> 
>>>> That suggests that a starting point for whether different tunnel encapsulation types
>>>> should be supported in a single data center could be "if they don't have an NVE
>>>> node in common, they can be made to work" and optimizations can be considered later.
>>> 
>>> Agree with this latter statement.
>> 
>> Thank you.
>> 
>> Thanks,
>> --David
>> 
>> 
>>> -----Original Message-----
>>> From: David Meyer [mailto:dmm@1-4-5.net]
>>> Sent: Tuesday, April 10, 2012 7:55 AM
>>> To: Black, David
>>> Cc: nvo3@ietf.org
>>> Subject: Re: [nvo3] Requirements + some non-requirement suggestions
>>> 
>>> On Mon, Apr 9, 2012 at 1:51 PM, <david.black@emc.com> wrote:
>>>>> Dave McDyson asked about three classes of connectivity:
>>>>> 
>>>>> 1) a single data center
>>>>> 2) a set of data centers under control of one administrative domain
>>>>> 3) multiple sets of data centers under control of multiple
>>>>>       administrative domains
>>>>> 
>>>>> Which of these do we *need* to address in NVO3?
>>>> 
>>>> I agree with a number of other people that we have to start with 1), and
>>>> then I'd suggest addressing as much of 2) as "makes sense" without
>>>> significantly affecting the design and applicability of the result to data
>>>> centers.  For example, a single instance of an overlay that spans data
>>>> centers "makes sense", at least to me, or as Thomas Narten described it:
>>>> 
>>>>> two DCs are under the same administrative domain, but are
>>>>> interconnected by some (existing) VPN technology, of which we don't
>>>>> care what the details are. The overlay just tunnels end-to-end and
>>>>> couldn't care less about the existence of a VPN interconnecting parts
>>>>> of the network.
>>>> 
>>>> Looking at draft-meyer-loc-id-implications-01:
>>>> 
>>>>        http://tools.ietf.org/html/draft-meyer-loc-id-implications-01
>>>> 
>>>> I would suggest that for initial nvo3 work, reachability between all NVEs
>>>> in a single overlay instance should be assumed, as there will be an IGP
>>>> routing protocol (e.g., OSPF with ECMP) running on the underlying data
>>>> center network which will handle link failures.
>>> 
>>> That may be a reasonable assumption, but the fact that an IGP is
>>> running doesn't ease the "locator liveness" problem unless the routing
>>> system is carrying /32s (or /128s) and the corresponding /32 (/128)
>>> isn't injected into the IGP unless the decapsulator is up (that in and
>>> of itself might not be sufficient as we've learned from our
>>> experiences with anycast DNS overlays). In any event what we would be
>>> doing in this case is using the routing system to signal a live path
>>> to the decapsulator. Of course, carrying such long prefixes has its
>>> own set of problems.
>>> 
>>>> Specifically, for initial nvo3 work I'd suggest assuming that the
>>>> underlying network handles reachability of the NVE at the other side of
>>>> the overlay (other end of the tunnel) that does the decapsulation.  In
>>>> terms of that draft, within a single data center (and hence for the scope
>>>> of initial nvo3 work), I'd suggest that the underlying network be
>>>> responsible for handling the Locator Path Liveness problem.
>>> 
>>> Not sure what you mean here by underlying network. In the LISP case,
>>> does the underlying network handle this problem? In any event can you
>>> be a bit more explicit in what you mean here?
>>>> 
>>>> This suggestion also applies to the Multi-Exit problem, although on a
>>>> related note, I think it's a good idea to make sure that nvo3 doesn't turn
>>>> any crucial NVE into a single point of failure.  Techniques like VRRP
>>>> address this in the absence of nvo3, so this could be mostly a matter of
>>>> paying attention to ensure that they're applicable to NVE failure.
>>>> Regardless of whether things work out that way, I'd suggest that
>>>> availability concerns be in scope.
>>>> 
>>>> Turning to the topic of IGP metrics and "tromboning", I'd suggest that
>>>> having nvo3 add a full IGP routing protocol (or even an IGP metrics
>>>> infrastructure for one that has to be administered) beyond what's already
>>>> running in the underlying network is not a good idea.  It seems like a
>>>> large portion of the "tromboning" concerns could be resolved by techniques
>>>> that distribute the default gateway in a virtual network so that moving a
>>>> VM (virtual machine) automatically sends traffic to the locally-applicable
>>>> instance of the default gateway for the VM's new location, based on the
>>>> same L2 address.  There are multiple examples of this sort of approach -
>>>> OTV and draft-raggarwa-data-center-mobility-02 are among them.
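
One way to picture the distributed-gateway idea (a sketch only; this is not how OTV or that draft actually signal it): each location answers ARP for the same gateway address with the same virtual MAC, so a moved VM's next hop is always the local instance.

from typing import Optional

GATEWAY_IP = "192.0.2.1"             # default gateway the VM is configured with
GATEWAY_VMAC = "00:00:5e:00:01:01"   # shared virtual MAC (example value)

def local_gateway_arp(requested_ip: str, location: str) -> Optional[str]:
    # Every site's first-hop router/NVE owns the same gateway IP and MAC, so
    # the VM's ARP cache stays valid after a move and traffic exits locally
    # instead of tromboning back to the old site.
    if requested_ip == GATEWAY_IP:
        print(f"{location}: answering ARP for {GATEWAY_IP} with {GATEWAY_VMAC}")
        return GATEWAY_VMAC
    return None

# The VM moves from site A to site B; nothing in the VM changes.
local_gateway_arp(GATEWAY_IP, "site-A")
local_gateway_arp(GATEWAY_IP, "site-B")
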
>>>> 
>>>> One more - Peter Ashwood-Smith writes:
>>>> 
>>>>> Is it a requirement to support different tunnel encapsulation types in
>>>>> the same DC?
>>>>> 
>>>>> It would seem that a very large DC could well end up with several
>>>>> different kinds of tunnel encapsulations that would need to somehow be
>>>>> bridged if they terminate VMs in the same subnet.
>>>> 
>>>> I'd suggest that the latter scenario be out of scope and that crossing
>>>> virtual networks initially involve routing in preference to bridging, so
>>>> that an NVE receiving an unencapsulated packet can determine the overlay
>>>> and encapsulation by knowing which virtual network the packet belongs to.
>>>> An implication is that I'd suggest figuring out how to optimize the
>>>> following structure into a single network node later (or at least as a
>>>> cleanly separable work effort):
>>>> 
>>>> ... (Overlay1)---- NVE1 ---- (VLANs) ---- NVE2 ---- (Overlay2) ...
>>>> 
>>>> In the above, NVE1 and NVE2 are separate nodes, and the parenthesized
>>>> terms are the means of virtual network separation.
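
A sketch of the per-NVE state that structure implies (the VNIs and VLAN IDs below are invented for illustration):

# NVE1 terminates Overlay1 and hands frames to a VLAN; NVE2 picks the same
# VLAN up and encapsulates into Overlay2.  Each box only knows its own side.
NVE1_MAP = {5001: 100}   # Overlay1 virtual network ID -> interconnect VLAN
NVE2_MAP = {100: 7002}   # interconnect VLAN -> Overlay2 virtual network ID

def nve1_decap_to_vlan(vni: int) -> int:
    return NVE1_MAP[vni]      # decapsulate from Overlay1, tag onto the VLAN

def nve2_encap_from_vlan(vlan: int) -> int:
    return NVE2_MAP[vlan]     # take the VLAN-tagged frame into Overlay2

# End to end, a frame from Overlay1 VNI 5001 comes out in Overlay2 VNI 7002
# without either NVE needing to understand the other's encapsulation.
assert nve2_encap_from_vlan(nve1_decap_to_vlan(5001)) == 7002
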
>>>> 
>>>> That suggests that a starting point for whether different tunnel
>>>> encapsulation types should be supported in a single data center could be
>>>> "if they don't have an NVE node in common, they can be made to work" and
>>>> optimizations can be considered later.
>>> 
>>> Agree with this latter statement.
>>> 
>>> Dave
>> 