Re: [lisp] Questions about draft-saucez-lisp-itr-graceful-03

"Lev Shvarts (lshvarts)" <lshvarts@cisco.com> Tue, 11 March 2014 01:01 UTC

From: "Lev Shvarts (lshvarts)" <lshvarts@cisco.com>
To: Florin Coras <fcoras@ac.upc.edu>, "lisp@ietf.org" <lisp@ietf.org>, "Marc Binderberger" <marc@sniff.de>
Date: Tue, 11 Mar 2014 01:01:03 +0000
Archived-At: http://mailarchive.ietf.org/arch/msg/lisp/qwZrbolVP7xtJqJ4qW9OnCc1lFk
Subject: Re: [lisp] Questions about draft-saucez-lisp-itr-graceful-03

I think the general feeling is that introducing more complexity into the
protocol, where the same can be achieved in other ways, is unnecessary;
however, I agree there should be a general approach to handling the xTR
restart. To be honest, I like the second approach better, in the case
where all the ITRs in the synchronization set are reachable via the same
anycast IP. This would mean that all the flows are load-balanced
within the site, which naturally synchronizes the map-caches across
the xTRs. A gracefully restarted ITR can define a GR timer during which
packets arriving at it trigger a redirect to one of the ITRs in the
synchronization set and a [rate-limited] map-request for cache rebuilding.
This approach wouldn't solve the map-request storm problem posed, but that
can be solved with more rate-limiting! :-)
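
To make the GR-timer idea concrete, here is a rough Python sketch of how a
restarted ITR might treat cache misses during the GR window. It is only an
illustration: the class names (TokenBucket, RestartedItr), the round-robin
choice of redirect peer, and the token-bucket limiter are my assumptions,
not anything taken from the draft or from an implementation.

```python
class TokenBucket:
    """Simple token-bucket limiter (hypothetical helper, not from the draft)."""

    def __init__(self, rate, burst):
        self.rate = rate        # tokens added per second
        self.burst = burst      # maximum number of tokens held
        self.tokens = burst     # start with a full bucket
        self.last = 0.0         # time of the previous refill

    def allow(self, now):
        # Refill according to elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class RestartedItr:
    """Cache-miss handling on a gracefully restarted ITR (illustrative only)."""

    def __init__(self, gr_timer, peers, bucket, started_at=0.0):
        self.gr_deadline = started_at + gr_timer
        self.peers = peers          # other ITRs in the synchronization set
        self.bucket = bucket        # limits map-requests used for cache rebuild
        self.map_cache = {}         # destination EID -> RLOC
        self._next_peer = 0

    def install_mapping(self, eid, rloc):
        self.map_cache[eid] = rloc

    def handle_packet(self, dst_eid, now):
        """Return (action, target, send_map_request)."""
        if dst_eid in self.map_cache:
            return ("encapsulate", self.map_cache[dst_eid], False)
        # Cache miss: ask for a mapping only if the limiter allows it.
        send_request = self.bucket.allow(now)
        if now < self.gr_deadline:
            # Inside the GR window: redirect to a peer ITR, round-robin.
            peer = self.peers[self._next_peer % len(self.peers)]
            self._next_peer += 1
            return ("redirect", peer, send_request)
        # GR timer expired: fall back to ordinary cache-miss handling.
        return ("cache-miss", None, send_request)
```

With a bucket of rate 1/s and burst 2, the first two misses at the same
instant each trigger a map-request, while a third is only redirected; once
a mapping is installed the packet is encapsulated locally again.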

The problem I see with the map-cache synchronization within a site is that
a mapping is not retrieved from an authoritative source, so cache errors
can propagate throughout the whole site.

The LISP-specific part is a proper calculation of the GR timer. It can
simply be a knob, like Marc suggested, or a generally agreed-upon timer
value. Further implementation-specific bits could be based on the size of
the local EID space (configured), or on the rate of the
packets that need to be encapsulated.
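
One possible way to turn this into a number, purely as an assumption on my
side: size the GR timer as the time needed to rebuild the cache at the
configured map-request rate limit, clamped between a floor and a ceiling.
The function name, the parameters, and the default bounds below are all
hypothetical.

```python
def estimate_gr_timer(expected_cache_entries, map_request_rate,
                      floor=5.0, ceiling=300.0):
    """Estimate a GR timer in seconds (illustrative heuristic only).

    expected_cache_entries: mappings the ITR expects to re-learn (assumed input)
    map_request_rate: rate limit for map-requests, per second (assumed input)
    floor, ceiling: hypothetical bounds on the resulting timer
    """
    if map_request_rate <= 0:
        raise ValueError("map_request_rate must be positive")
    rebuild_time = expected_cache_entries / map_request_rate
    # Clamp so that tiny caches still get a short grace period and
    # huge caches do not keep the ITR deflecting indefinitely.
    return max(floor, min(ceiling, rebuild_time))
```

For example, 1000 expected entries at 50 map-requests per second gives a
20-second timer under these assumptions.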

Thank you,
Lev.




On 2/19/14, 6:13 PM, "Florin Coras" <fcoras@ac.upc.edu> wrote:

>On 02/20/2014 01:40 AM, Marc Binderberger wrote:
>> Hello Dino et al.,
>>
>>> Yes, what you describe can work. But once you deflect, the other ITR
>>> still needs to send Map-Requests for all the new EIDs that are not
>>> cached in the map-cache.
>> True. Two options I see
>>
>> (a) rate-limit the map-requests from the just-reloaded ITR. All this
>> does is keep some EIDs deflected a bit longer
>>
>> (b) as Darrel explained it to me: if the MR/MS/mapping system cannot
>> handle this from a single site then it's probably too weak and not fit
>> for the job :-)
>
>What Dino meant, I think, is that all active, egressing flows passing
>through the ITR to be reloaded (ITR1 in your example) will cache miss in
>the backup ITR. I don't know if the problem you try to solve is
>theoretical or practical, but in the latter case, maybe it would be
>easier just to provision a local caching Map-Resolver, close to the two
>ITRs.
>
>Florin
>>
>> Regards, Marc
>>
>>
>>
>>> Dino
>>>
>>> On Feb 19, 2014, at 11:53 AM, Marc Binderberger <marc@sniff.de> wrote:
>>>
>>>> Hello Dino et al.,
>>>>
>>>>>> but then the "Traffic deflection to other ITRs (or a PxTR)" could be
>>>>>> used to fill the cache of the 2nd ITR (the one that is not
>>>>>> reloaded).
>>>>> Then you get sub-optimal routing.
>>>>>
>>>>>> You turn it on on ITR2 (off on ITR1), change your IGP to send all
>>>>>> LISP data to remote sites to ITR2, "wait a bit", then ITR2 should be
>>>>>> ready,
>>>>> This is easier said than done. That means you have to inject *all
>>>>> remote EID-prefixes* into your IGP. That is a non-starter.
>>>> maybe I think too simple. Assuming you have two xTRs to connect your
>>>> site to the LISP cloud. They both originate a default route into your
>>>> site IGP. You then e.g. increase the metric of ITR1's default route or
>>>> remove the default originated into the site IGP. Routing out of the
>>>> site (to another EID) then moves to ITR2.
>>>>
>>>> Ingress is a different story, probably you need to reduce TTL for
>>>> registrations sent from ITR1, so that ingress traffic ends up using
>>>> ITR2 only (?).
>>>>
>>>> Then you are ready to reload ITR1.
>>>>
>>>>
>>>> Long story short: using the "Traffic deflection to other ITRs" plus
>>>> the right operational procedure may solve the problem?
>>>>
>>>>
>>>> Regards, Marc
>>>>
>>>>
>>>>
>>>> On Wed, 19 Feb 2014 11:41:19 -0800, Dino Farinacci wrote:
>>>>>> Hello Damien,
>>>>>>
>>>>>> thanks for the reply!
>>>>>>
>>>>>>> If you have a solution to continuously synchronise ITR caches, we
>>>>>>> would be very happy to look at it and integrate it in the
>>>>>>> proposed solution.
>>>>>> And I was curious to see a light-weight protocol extension from you :-)
>>>>>> Seriously, was wondering if you see an elegant, light way to
>>>>>> implement this in the LISP protocol (?).
>>>>> Light-weight reads as neither robust nor scalable. If you want those
>>>>> things, you have to do it right. And then you have implemented BGP.
>>>>>
>>>>> One reason people like LISP is because it is reasonably easy to
>>>>> understand and employs *less protocol machinery* rather than more.
>>>>>
>>>>>>> the purpose of the document is to deal with planned restart of
>>>>>>> routers, meaning that we know exactly when the router will go down
>>>>>>> and come back up (it is controlled by the operator).
>>>>>> but then the "Traffic deflection to other ITRs (or a PxTR)" could be
>>>>>> used to fill the cache of the 2nd ITR (the one that is not
>>>>>> reloaded).
>>>>> Then you get sub-optimal routing.
>>>>>
>>>>>> You turn it on on ITR2 (off on ITR1), change your IGP to send all
>>>>>> LISP data to remote sites to ITR2, "wait a bit", then ITR2 should be
>>>>>> ready,
>>>>> This is easier said than done. That means you have to inject *all
>>>>> remote EID-prefixes* into your IGP. That is a non-starter.
>>>>>
>>>>>> you turn off deflection on ITR2 and reload ITR1. Then turn on
>>>>>> deflection on ITR1 and bring the IGP routing back to active-active
>>>>>> (or whatever the setup was before).
>>>>> Dino
>>>>>
>>>>>>
>>>>>> Regards, Marc
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Wed, 19 Feb 2014 09:38:54 +0100, Damien Saucez wrote:
>>>>>>> Hello Marc,
>>>>>>>
>>>>>>> On 18 Feb 2014, at 23:48, Marc Binderberger <marc@sniff.de> wrote:
>>>>>>>
>>>>>>>> Hello Damien/Olivier/Luigi/Clarence & LISP experts,
>>>>>>>>
>>>>>>>> had a look at draft-saucez-lisp-itr-graceful-03. And wonder if
>>>>>>>> there is more to come?
>>>>>>> Thank you for the interest.  We are indeed thinking about ways to
>>>>>>> extend the document and provide more details on the ways the
>>>>>>> solutions could be implemented.
>>>>>>>
>>>>>>>
>>>>>>>> Somehow section 4 feels a bit "short".
>>>>>>>>
>>>>>>>> What I mean: if you try to solve the problem of the _two_
>>>>>>>> cache-miss storms - first on the 2nd ITR (ITR2) when your
>>>>>>>> restarting ITR (ITR1) goes down, then on the restarting ITR1 when
>>>>>>>> it picks up traffic again - then section 4 would probably need to
>>>>>>>> talk about a permanent cache synchronization (?). Unless you want
>>>>>>>> to solve a planned restart only. But for a failure of the ITR1 I
>>>>>>>> don't see how the solution you describe would work
>>>>>>>>
>>>>>>>> o  ITR cache synchronization: upon startup, the ITR synchronizes
>>>>>>>>    its cache with the other ITRs in its synchronization set.  The
>>>>>>>>    ITR is marked as available only after the cache is synchronized.
>>>>>>>>
>>>>>>>> as ITR2 would trigger the cache-miss storm for the traffic after
>>>>>>>> ITR1 failure.
>>>>>>>>
>>>>>>>> Or if you want to solve only the cache-miss storm when ITR1 comes
>>>>>>>> back into the traffic stream then the ITR deflection has the
>>>>>>>> advantage of not requiring any cache-synchronization protocol,
>>>>>>>> IMHO. The rate of Map-Requests could be throttled to turn the
>>>>>>>> storm into a breeze. The method of transporting traffic to ITR2
>>>>>>>> could be one of many - a direct LAN, GRE, LISP.
>>>>>>>>
>>>>>>>>
>>>>>>>> So my question in short: are you planning to add some words about
>>>>>>>> a permanent cache synchronization?
>>>>>>>>
>>>>>>> For now we don't have acceptable techniques to keep caches
>>>>>>> synchronised in a permanent way but I don't think it is a big
>>>>>>> issue as the purpose of the document is to deal with planned
>>>>>>> restart of routers, meaning that we know exactly when the router
>>>>>>> will go down and come back up (it is controlled by the operator).
>>>>>>>
>>>>>>> If you have a solution to continuously synchronise ITR caches, we
>>>>>>> would be very happy to look at it and integrate it in the
>>>>>>> proposed solution.
>>>>>>>
>>>>>>> Thank you,
>>>>>>>
>>>>>>> Damien Saucez
>>>>>>>
>>>>>>>> Thanks & Regards,
>>>>>>>> Marc
>>>>>> _______________________________________________
>>>>>> lisp mailing list
>>>>>> lisp@ietf.org
>>>>>> https://www.ietf.org/mailman/listinfo/lisp