Re: [lisp] Questions about draft-saucez-lisp-itr-graceful-03

Dino Farinacci <farinacci@gmail.com> Wed, 19 February 2014 22:25 UTC

From: Dino Farinacci <farinacci@gmail.com>
In-Reply-To: <20140219115305183057.3957d484@sniff.de>
Date: Wed, 19 Feb 2014 13:46:39 -0800
Content-Transfer-Encoding: quoted-printable
Message-Id: <70C508D5-17D3-4C62-9CDE-802D05AA8D9D@gmail.com>
References: <20140218144825842648.087ffc67@sniff.de> <7DFCF6EA-9F05-468D-B51F-7AB7DEC149C8@inria.fr> <20140219111747519985.d46b87a8@sniff.de> <C7979A6D-4636-45EF-82A7-AE35F1269F36@gmail.com> <20140219115305183057.3957d484@sniff.de>
To: Marc Binderberger <marc@sniff.de>
X-Mailer: Apple Mail (2.1827)
Archived-At: http://mailarchive.ietf.org/arch/msg/lisp/8GlUsyiLtBzojjFEyTN_4yjPobM
Cc: Damien Saucez <damien.saucez@inria.fr>, Clarence Filsfils <cf@cisco.com>, Luigi Iannone <luigi.iannone@telecom-paristech.fr>, LISP mailing list list <lisp@ietf.org>
Subject: Re: [lisp] Questions about draft-saucez-lisp-itr-graceful-03

Yes, what you describe can work. But once you deflect, the other ITR still needs to send Map-Requests for every new EID that is not already in its map-cache.

Dino
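
To illustrate the point about deflection, here is a minimal sketch of why the other ITR still takes cache misses for every EID-prefix it has not seen, and how a rate limit on Map-Requests (as suggested earlier in the thread) spreads them out. This is a toy model, not taken from the draft or from any LISP implementation: the `Itr` class, `deflect` helper, the prefixes, and the per-tick limit are all hypothetical, and each Map-Request is assumed to be answered immediately.

```python
from dataclasses import dataclass, field


@dataclass
class Itr:
    """Toy ITR: a map-cache plus a per-tick Map-Request budget (hypothetical model)."""
    name: str
    map_cache: set = field(default_factory=set)  # EID-prefixes with a cached mapping
    pending: list = field(default_factory=list)  # cache misses awaiting a Map-Request
    max_requests_per_tick: int = 2               # assumed rate limit, not from any spec

    def forward(self, eid_prefix: str) -> bool:
        """Return True on a cache hit; on a miss, queue one Map-Request for the prefix."""
        if eid_prefix in self.map_cache:
            return True
        if eid_prefix not in self.pending:
            self.pending.append(eid_prefix)
        return False

    def tick(self) -> list:
        """Send at most max_requests_per_tick Map-Requests; assume each is answered at once."""
        sent = self.pending[: self.max_requests_per_tick]
        self.pending = self.pending[self.max_requests_per_tick:]
        self.map_cache.update(sent)
        return sent


def deflect(flows, itr):
    """Deflect the given flows to itr and return the number of cache misses."""
    return sum(0 if itr.forward(eid) else 1 for eid in flows)


# ITR2 already knows two prefixes; ITR1's traffic covers three it has never seen.
itr2 = Itr("ITR2", map_cache={"10.1.0.0/16", "10.2.0.0/16"})
itr1_flows = ["10.2.0.0/16", "10.3.0.0/16", "10.4.0.0/16", "10.5.0.0/16"]

misses = deflect(itr1_flows, itr2)  # deflection alone does not fill the cache
ticks = 0
while itr2.pending:                 # throttled Map-Requests drain the backlog
    itr2.tick()
    ticks += 1
print(misses, ticks)
```

In this model the deflected traffic causes one miss per uncached prefix, and the rate limit only changes how long the backlog takes to drain, not how many Map-Requests are needed.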

On Feb 19, 2014, at 11:53 AM, Marc Binderberger <marc@sniff.de> wrote:

> Hello Dino et al.,
> 
>>> but then the "Traffic deflection to other ITRs (or a PxTR)" could be 
>>> used to fill the cache of the 2nd ITR (the one that is not reloaded). 
>> 
>> Then you get sub-optimal routing.
>> 
>>> You turn it on on ITR2 (off on ITR1), change your IGP to send all LISP 
>>> data to remote sites to ITR2, "wait a bit", then ITR2 should be ready, 
>> 
>> This is easier said than done. That means you have to inject *all 
>> remote EID-prefixes* into your IGP. That is a non-starter.
> 
> Maybe I am thinking too simply. Assume you have two xTRs that connect your 
> site to the LISP cloud. They both originate a default route into your 
> site IGP. You then e.g. increase the metric of ITR1's default route or 
> remove the default originated into the site IGP. Routing out of the 
> site (to another EID) then moves to ITR2.
> 
> Ingress is a different story; you probably need to reduce the TTL of the 
> registrations sent from ITR1, so that ingress traffic ends up using 
> ITR2 only (?).
> 
> Then you are ready to reload ITR1.
> 
> 
> Long story short: using the "Traffic deflection to other ITRs" plus the 
> right operational procedure may solve the problem?
> 
> 
> Regards, Marc
> 
> 
> 
> On Wed, 19 Feb 2014 11:41:19 -0800, Dino Farinacci wrote:
>>> Hello Damien,
>>> 
>>> thanks for the reply!
>>> 
>>>> If you have a solution to continuously synchronise ITR caches, we
>>>> would be very happy to look at it and integrate it in the proposed
>>>> solution.
>>> 
>>> And I was curious to see a light-weight protocol extension from you :-)
>>> Seriously, was wondering if you see an elegant, light way to implement 
>>> this in the LISP protocol (?). 
>> 
>> Light-weight reads as neither robust nor scalable. If you want those 
>> things, you have to do it right. And then you have implemented BGP. 
>> 
>> One reason people like LISP is because it is reasonably easy to 
>> understand and employs *less protocol machinery* rather than more.
>> 
>>> 
>>>> the purpose of the document is to deal with planned restart of routers
>>>> meaning that we know exactly when the router will go down then up
>>>> (it is controlled by the operator).
>>> 
>>> but then the "Traffic deflection to other ITRs (or a PxTR)" could be 
>>> used to fill the cache of the 2nd ITR (the one that is not reloaded). 
>> 
>> Then you get sub-optimal routing.
>> 
>>> You turn it on on ITR2 (off on ITR1), change your IGP to send all LISP 
>>> data to remote sites to ITR2, "wait a bit", then ITR2 should be ready, 
>> 
>> This is easier said than done. That means you have to inject *all 
>> remote EID-prefixes* into your IGP. That is a non-starter.
>> 
>>> you turn off deflection on ITR2 and reload ITR1. Then turn on 
>>> deflection on ITR1 and bring the IGP routing back to active-active (or 
>>> whatever the setup was before).
>> 
>> Dino
>> 
>>> 
>>> 
>>> Regards, Marc
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> On Wed, 19 Feb 2014 09:38:54 +0100, Damien Saucez wrote:
>>>> Hello Marc,
>>>> 
>>>> On 18 Feb 2014, at 23:48, Marc Binderberger <marc@sniff.de> wrote:
>>>> 
>>>>> Hello Damien/Olivier/Luigi/Clarence & LISP experts,
>>>>> 
>>>>> had a look at draft-saucez-lisp-itr-graceful-03. And wonder if there is 
>>>>> more to come?
>>>> 
>>>> Thank you for the interest.  We are indeed thinking about ways to extend
>>>> the document and provide more details on the ways the solutions could
>>>> be implemented.
>>>> 
>>>> 
>>>>> Somehow section 4 feels a bit "short".
>>>>> 
>>>>> What I mean: if you try to solve the problem of the _two_ cache-miss 
>>>>> storms - first on the 2nd ITR (ITR2) when your restarting ITR (ITR1) 
>>>>> goes down, then on the restarting ITR1 when it picks up traffic again - 
>>>>> then section 4 would probably need to talk about a permanent cache 
>>>>> synchronization (?). Unless you want to solve a planned restart only. 
>>>>> But for a failure of the ITR1 I don't see how the solution you describe 
>>>>> would work
>>>>> 
>>>>> o  ITR cache synchronization: upon startup, the ITR synchronizes its
>>>>>    cache with the other ITRs in its synchronization set.  The ITR is
>>>>>    marked as available only after the cache is synchronized.
>>>>> 
>>>>> as ITR2 would trigger the cache-miss storm for the traffic after ITR1 
>>>>> failure.
>>>>> 
>>>>> Or if you want to solve only the cache-miss storm when ITR1 comes back 
>>>>> into the traffic stream then the ITR deflection has the advantage of 
>>>>> not requiring any cache-synchronization protocol, IMHO. The rate of 
>>>>> Map-Requests could be throttled to turn the storm into a breeze. The 
>>>>> method used to transport traffic to ITR2 could be one of many - a direct 
>>>>> LAN, GRE, LISP.
>>>>> 
>>>>> 
>>>>> So my question in short: are you planning to add some words about a 
>>>>> permanent cache synchronization?
>>>>> 
>>>> 
>>>> For now we don't have acceptable techniques to keep caches
>>>> synchronised in a permanent way but I don't think it is a big issue as
>>>> the purpose of the document is to deal with planned restart of routers
>>>> meaning that we know exactly when the router will go down then up
>>>> (it is controlled by the operator).
>>>> 
>>>> If you have a solution to continuously synchronise ITR caches, we
>>>> would be very happy to look at it and integrate it in the proposed
>>>> solution.
>>>> 
>>>> Thank you,
>>>> 
>>>> Damien Saucez
>>>> 
>>>>> 
>>>>> Thanks & Regards,
>>>>> Marc
>>>> 
>>> 
>>