Re: [lisp] Questions about draft-saucez-lisp-itr-graceful-03

Marc Binderberger <marc@sniff.de> Thu, 20 February 2014 00:40 UTC

Date: Wed, 19 Feb 2014 16:40:03 -0800
From: Marc Binderberger <marc@sniff.de>
To: Dino Farinacci <farinacci@gmail.com>
Message-ID: <20140219164003436846.18c08f74@sniff.de>
In-Reply-To: <70C508D5-17D3-4C62-9CDE-802D05AA8D9D@gmail.com>
References: <20140218144825842648.087ffc67@sniff.de> <7DFCF6EA-9F05-468D-B51F-7AB7DEC149C8@inria.fr> <20140219111747519985.d46b87a8@sniff.de> <C7979A6D-4636-45EF-82A7-AE35F1269F36@gmail.com> <20140219115305183057.3957d484@sniff.de> <70C508D5-17D3-4C62-9CDE-802D05AA8D9D@gmail.com>
Archived-At: http://mailarchive.ietf.org/arch/msg/lisp/iAMLehDRlF8cJQyuD27gBFb0CQg
Cc: Damien Saucez <damien.saucez@inria.fr>, Clarence Filsfils <cf@cisco.com>, Luigi Iannone <luigi.iannone@telecom-paristech.fr>, LISP mailing list list <lisp@ietf.org>
Subject: Re: [lisp] Questions about draft-saucez-lisp-itr-graceful-03

Hello Dino et al.,

> Yes, what you describe can work. But once you deflect, the other ITR 
> still needs to send Map-Requests for all the new EIDs that are not 
> cached in the map-cache.

True. I see two options:

(a) Rate-limit the Map-Requests from the just-reloaded ITR. The only 
cost is that traffic to some EIDs is deflected a bit longer.

(b) as Darrel explained it to me: if the MR/MS/mapping system cannot 
handle this from a single site then it's probably too weak and not fit 
for the job :-)
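
As a rough illustration of option (a), here is a minimal Python sketch. 
It is not taken from the draft or from any implementation: the names 
MapRequestLimiter and handle_packet, the token-bucket approach, and the 
exact-match map-cache keyed on an EID string (a real ITR would do a 
longest-prefix match) are all assumptions made for the example. A cache 
miss always deflects the packet; a Map-Request is only triggered while 
the bucket still has a token.

```python
class MapRequestLimiter:
    """Token bucket capping Map-Requests per second (hypothetical helper)."""

    def __init__(self, rate_per_sec, burst):
        self.rate = rate_per_sec      # tokens added per second
        self.burst = burst            # maximum number of stored tokens
        self.tokens = float(burst)    # start with a full bucket
        self.last = 0.0               # time of the previous refill

    def allow(self, now):
        # Refill for the elapsed time, capped at the burst size.
        elapsed = now - self.last
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def handle_packet(dst_eid, now, map_cache, pending, limiter):
    """Return 'encap' on a cache hit, otherwise 'deflect'.

    On a miss, the EID is added to `pending` (standing in for sending a
    Map-Request) only if it is not already pending and the limiter allows it.
    """
    if dst_eid in map_cache:
        return "encap"
    if dst_eid not in pending and limiter.allow(now):
        pending.add(dst_eid)  # assumption: a Map-Request would be sent here
    return "deflect"
```

Under these assumptions, with a rate of 100 Map-Requests per second and 
10,000 uncached EIDs, the last Map-Request goes out after roughly 100 
seconds, which is about how long the longest deflection would last.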


Regards, Marc



> 
> Dino
> 
> On Feb 19, 2014, at 11:53 AM, Marc Binderberger <marc@sniff.de> wrote:
> 
>> Hello Dino et al.,
>> 
>>>> but then the "Traffic deflection to other ITRs (or a PxTR)" could be 
>>>> used to fill the cache of the 2nd ITR (the one that is not reloaded). 
>>> 
>>> Then you get sub-optimal routing.
>>> 
>>>> You turn it on on ITR2 (off on ITR1), change your IGP to send all LISP 
>>>> data to remote sites to ITR2, "wait a bit", then ITR2 should be ready, 
>>> 
>>> This is easier said than done. That means you have to inject *all 
>>> remote EID-prefixes* into your IGP. That is a non-starter.
>> 
>> Maybe I am thinking too simply. Assume you have two xTRs connecting 
>> your site to the LISP cloud, and both originate a default route into 
>> your site IGP. You then e.g. increase the metric of ITR1's default 
>> route or remove the default it originates into the site IGP. Routing 
>> out of the site (to another EID) then moves to ITR2.
>> 
>> Ingress is a different story: you probably need to reduce the TTL for 
>> registrations sent from ITR1, so that ingress traffic ends up using 
>> ITR2 only (?).
>> 
>> Then you are ready to reload ITR1.
>> 
>> 
>> Long story short: using the "Traffic deflection to other ITRs" plus the 
>> right operational procedure may solve the problem?
>> 
>> 
>> Regards, Marc
>> 
>> 
>> 
>> On Wed, 19 Feb 2014 11:41:19 -0800, Dino Farinacci wrote:
>>>> Hello Damien,
>>>> 
>>>> thanks for the reply!
>>>> 
>>>>> If you have a solution to continuously synchronise ITRs caches, we
>>>>> would be very happy to look at them and integrate them in the proposed
>>>>> solution.
>>>> 
>>>> And I was curious to see a light-weight protocol extension from you :-)
>>>> Seriously, was wondering if you see an elegant, light way to implement 
>>>> this in the LISP protocol (?). 
>>> 
>>> Light-weight reads as non-robust and non-scalable. If you want 
>>> robustness and scalability, you have to do it right. And then you 
>>> have implemented BGP. 
>>> 
>>> One reason people like LISP is because it is reasonably easy to 
>>> understand and employs *less protocol machinery* rather than more.
>>> 
>>>> 
>>>>> the purpose of the document is to deal with planned restart of routers
>>>>> meaning that we know exactly when the router will go down and come back up
>>>>> (it is controlled by the operator).
>>>> 
>>>> but then the "Traffic deflection to other ITRs (or a PxTR)" could be 
>>>> used to fill the cache of the 2nd ITR (the one that is not reloaded). 
>>> 
>>> Then you get sub-optimal routing.
>>> 
>>>> You turn it on on ITR2 (off on ITR1), change your IGP to send all LISP 
>>>> data to remote sites to ITR2, "wait a bit", then ITR2 should be ready, 
>>> 
>>> This is easier said than done. That means you have to inject *all 
>>> remote EID-prefixes* into your IGP. That is a non-starter.
>>> 
>>>> you turn off deflection on ITR2 and reload ITR1. Then turn on 
>>>> deflection on ITR1 and bring the IGP routing back to active-active (or 
>>>> whatever the setup was before).
>>> 
>>> Dino
>>> 
>>>> 
>>>> 
>>>> Regards, Marc
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> On Wed, 19 Feb 2014 09:38:54 +0100, Damien Saucez wrote:
>>>>> Hello Marc,
>>>>> 
>>>>> On 18 Feb 2014, at 23:48, Marc Binderberger <marc@sniff.de> wrote:
>>>>> 
>>>>>> Hello Damien/Olivier/Luigi/Clarence & LISP experts,
>>>>>> 
>>>>>> had a look at draft-saucez-lisp-itr-graceful-03. And wonder if
>>>>>> there is more to come?
>>>>> 
>>>>> Thank you for the interest.  We are indeed thinking about ways to extend
>>>>> the document and provide more details on the ways the solutions could
>>>>> be implemented.
>>>>> 
>>>>> 
>>>>>> Somehow section 4 feels a bit "short".
>>>>>> 
>>>>>> What I mean: if you try to solve the problem of the _two_ cache-miss
>>>>>> storms - first on the 2nd ITR (ITR2) when your restarting ITR (ITR1)
>>>>>> goes down, then on the restarting ITR1 when it picks up traffic
>>>>>> again - then section 4 would probably need to talk about a permanent
>>>>>> cache synchronization (?). Unless you want to solve a planned restart
>>>>>> only. But for a failure of the ITR1 I don't see how the solution you
>>>>>> describe would work
>>>>>> 
>>>>>> o  ITR cache synchronization: upon startup, the ITR synchronizes its
>>>>>>    cache with the other ITRs in its synchronization set.  The ITR is
>>>>>>    marked as available only after the cache is synchronized.
>>>>>> 
>>>>>> as ITR2 would trigger the cache-miss storm for the traffic after ITR1 
>>>>>> failure.
>>>>>> 
>>>>>> Or if you want to solve only the cache-miss storm when ITR1 comes back
>>>>>> into the traffic stream then the ITR deflection has the advantage of
>>>>>> not requiring any cache-synchronization protocol, IMHO. The rate of
>>>>>> Map-Requests could be throttled to turn the storm into a breeze. The
>>>>>> method for transporting traffic to ITR2 could be one of many - a
>>>>>> direct LAN, GRE, LISP.
>>>>>> 
>>>>>> 
>>>>>> So my question in short: are you planning to add some words about a 
>>>>>> permanent cache synchronization?
>>>>>> 
>>>>> 
>>>>> For now we don't have acceptable techniques to keep caches
>>>>> synchronised in a permanent way but I don't think it is a big issue as
>>>>> the purpose of the document is to deal with planned restart of routers
>>>>> meaning that we know exactly when the router will go down and come back up
>>>>> (it is controlled by the operator).
>>>>> 
>>>>> If you have a solution to continuously synchronise ITRs caches, we
>>>>> would be very happy to look at them and integrate them in the proposed
>>>>> solution.
>>>>> 
>>>>> Thank you,
>>>>> 
>>>>> Damien Saucez
>>>>> 
>>>>>> 
>>>>>> Thanks & Regards,
>>>>>> Marc
>>>>> 
>>>> 