RE: discussion on fast notification work

András Császár <Andras.Csaszar@ericsson.com> Thu, 07 July 2011 10:52 UTC

From: András Császár <Andras.Csaszar@ericsson.com>
To: Alia Atlas <akatlas@gmail.com>, "rtgwg@ietf.org" <rtgwg@ietf.org>
Date: Thu, 07 Jul 2011 12:52:19 +0200
Subject: RE: discussion on fast notification work

Dear All,

As a recap, the basic idea was to explore how one could approximate:
1. near-instantaneous notification of failures to neighbour and remote nodes
2. near-instantaneous update of the FIB

Goal 1 is approximated by a completely dataplane-based fast notification (FN) framework.
Goal 2 is approximated by pre-calculating and pre-downloading backup routes for RELEVANT failures and performing the FIB update from within the linecard.

Since the last IETF, based on the comments we received, we have been working on (and prototyping) a method in which FNs are propagated along the shortest path and each hop performs SHA256 authentication in the dataplane before forwarding the packet.
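
As a rough illustration of the per-hop processing (not the prototype code), the sketch below authenticates an FN, drops duplicates, and only then allows forwarding. Everything specific in it is an assumption made for the example: the use of HMAC-SHA256 with one shared key, the (origin, sequence number) duplicate key, the field layout, and the names sign_fn and FnHop.

```python
import hashlib
import hmac


def sign_fn(key: bytes, origin: str, seq: int, payload: bytes) -> bytes:
    """Return an HMAC-SHA256 tag over the FN fields (hypothetical layout)."""
    msg = origin.encode() + seq.to_bytes(4, "big") + payload
    return hmac.new(key, msg, hashlib.sha256).digest()


class FnHop:
    """One hop: authenticate, drop duplicates, then allow forwarding."""

    def __init__(self, key: bytes):
        self.key = key          # assumed: one key shared by all nodes in the area
        self.seen = set()       # (origin, seq) pairs already accepted

    def process(self, origin: str, seq: int, payload: bytes, tag: bytes) -> bool:
        """Return True if the FN should be forwarded, False if it is dropped."""
        expected = sign_fn(self.key, origin, seq, payload)
        if not hmac.compare_digest(expected, tag):
            return False        # authentication failed
        if (origin, seq) in self.seen:
            return False        # duplicate of an FN already handled
        self.seen.add((origin, seq))
        return True             # authentic and new
```

Checking the tag before the duplicate table means an unauthenticated packet can never add an entry to that table.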

Important highlights proving feasibility:

- In a 1000-node area with a diameter of 20 hops and 500k external routes, the backup FIB is no bigger than 30MB even in a very bad case with very diverse ECMP (10 ECMP alternatives for each destination). Downloading a backup FIB of this size should be no problem.

- A naive serial FIB update procedure after a failure in the above network takes less than 15ms within a dataplane card (assuming 5MT/sec memory performance and 1 memory controller). More intelligent approaches may exist, such as a lazy (on-demand) FIB update.

- In reality, our calculations show that typically only nodes 1 to 3 hops away need to prepare for a failure, i.e. only failures 1 to 3 hops away are RELEVANT. (The above calculation assumes that, for each destination, a node needs to prepare for all failures within the 20-hop diameter.)

- Very important: the FN packet always proceeds AHEAD OF normal data packets, so re-routed data packets typically encounter nodes that have already finished, or almost finished, reconfiguring. (Long links therefore do not cause problems, as FN packets and normal data packets are delayed equally.)

- Pre-calculation complexity is of the same order of magnitude as with Not-Via, and it is done "offline".
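
The figures above can be turned into two quick back-of-envelope numbers. This is only arithmetic on the quoted values, assuming 1MB = 10^6 bytes and 1MT = 10^6 memory transfers. The function names are made up for the example, and how the transfer budget maps onto FIB entries depends on the FIB layout, which is not described here.

```python
def transfers_within(duration_ms: int, transfers_per_sec: int) -> int:
    """Number of memory transfers that fit into duration_ms at the given rate."""
    return transfers_per_sec * duration_ms // 1000


def avg_bytes_per_route(fib_bytes: int, routes: int) -> float:
    """Average backup-FIB bytes per external route (a plain ratio, nothing more)."""
    return fib_bytes / routes


# 15ms at 5MT/sec with a single memory controller
budget = transfers_within(15, 5_000_000)

# 30MB backup FIB spread over 500k external routes
per_route = avg_bytes_per_route(30_000_000, 500_000)

print(budget, per_route)
```

With the quoted values this gives a budget of 75000 memory transfers for the serial update and an average of 60 bytes of backup state per external route.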


The conclusions from our naïve implementation are the following:

- The solution can be implemented on a current platform, and we do not appear to use any operation that would make it less suitable for other platforms, e.g. EZChip NP-4

- An FN packet can be originated in less than 200us (microseconds) after failure detection

- An FN packet can be forwarded at each hop in ca. 180us (this already includes SHA256 verification and duplicate check!)
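
Combining the two timing figures gives a rough estimate of how long an FN takes to reach a node a given number of hops away. The sketch below simply adds the origination time to one forwarding step per hop. The function name and the optional link_delay_us parameter are assumptions for illustration, and the 200us and 180us values are treated as fixed even though the message gives them as "less than" and "ca.".

```python
ORIGINATION_US = 200   # FN origination after failure detection (upper figure)
PER_HOP_US = 180       # per-hop forwarding, incl. SHA256 verification and duplicate check


def fn_latency_us(hops: int, link_delay_us: int = 0) -> int:
    """Estimated time in microseconds until an FN has traversed `hops` hops.

    link_delay_us is an assumed, uniform per-link propagation delay.
    """
    return ORIGINATION_US + hops * (PER_HOP_US + link_delay_us)


for hops in (1, 3, 20):
    print(hops, fn_latency_us(hops))
```

Ignoring link delay, this gives 740us for a node 3 hops away and 3800us (3.8ms) across the full 20-hop diameter.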


András
 

> -----Original Message-----
> From: rtgwg-bounces@ietf.org [mailto:rtgwg-bounces@ietf.org] 
> On Behalf Of Alia Atlas
> Sent: 6 July 2011 22:57
> To: rtgwg@ietf.org
> Subject: discussion on fast notification work
> 
> The last 2 IETFs, we have had discussions about the idea of fast
> notification, as described in
> draft-lu-fast-notification-framework, draft-lu-fn-transport-00, and
> draft-csaszar-ipfrr-fn-00.
> 
> Since then, I have not seen substantial discussion or interest on the
> mailing list.  If you are
> interested in this work, have questions about it, or would like to see
> RTGWG continue to discuss it,
> please send email to this mailing list.  I'd like to see this
> conversation happening here before IETF.
> 
> Thanks,
> Alia
> _______________________________________________
> rtgwg mailing list
> rtgwg@ietf.org
> https://www.ietf.org/mailman/listinfo/rtgwg
>