Re: [Idr] draft-ietf-idr-rs-bfd state distribution

Robert Raszuk <robert@raszuk.net> Tue, 30 May 2017 12:32 UTC

In-Reply-To: <20170530113459.pauvpic623ibecj4@hanna.meerval.net>
References: <20170530113459.pauvpic623ibecj4@hanna.meerval.net>
From: Robert Raszuk <robert@raszuk.net>
Date: Tue, 30 May 2017 14:32:38 +0200
Message-ID: <CA+b+ERnqGOnLKRMiOa8Y2r5b4F3JQHu5MV0xqSmO12y9KHNHAw@mail.gmail.com>
To: Job Snijders <job@instituut.net>
Cc: idr wg <idr@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/idr/uxOUp6KutTRKQMSzOfVgOinqEiY>

Hi Job,

This is going in the right direction!

However, if the change is really about adding a new SAFI to, so to say,
"negotiate" BFD between clients, why not, instead of a new SAFI, simply
define a new attribute (or community, for that matter) and attach it to the
original AFI/SAFI 1/1 & 2/1?

I take it that we are no longer sending only next hops in this SAFI, as this
would still keep the RS aware of client-to-client failures. If so, I really do
not think a new SAFI brings any value, and I would suggest getting rid of it.

Then we will have a nice and clean spec :)

Cheers,
R.


On Tue, May 30, 2017 at 1:34 PM, Job Snijders <job@instituut.net> wrote:

> Hi all,
>
> Perhaps this has been covered in prior discussion, if so, my apologies.
> Some individuals in the working group made every effort to drown out
> discussion of meaningful deployment scenarios.
>
> In reviewing draft-ietf-idr-rs-bfd-02 it occurred to me that there might
> be an optimization to be made. At the last IETF I questioned whether
> the Route Server itself really needs to be aware of reachability between
> Route Server participants, and I still think that is not a necessity.
>
> OLD:
>     This document proposes the use of BFD between the two peering
>     routers to detect a data plane failure, and then uses a newly
>     defined BGP SAFI to signal the state of the data link to the route
>     server(s).
> NEW:
>     This document proposes the use of BFD between the two peering
>     routers to detect a data plane failure. The Route Server facilitates
>     setup and teardown of such BFD sessions through a newly defined BGP
>     SAFI in which it announces a next-hop's capability and willingness
>     to set up a direct BFD session.
>
> OLD:
>     To remedy this, two basic problems need to be solved:
>     1.  Client routers must have a means of verifying connectivity
>         amongst themselves, and
>     2.  Client routers must have a means of communicating the knowledge
>         of the failure back to the route server.
> NEW:
>     To remedy this, one basic problem needs to be solved:
>     1.  Client routers must have a means of verifying connectivity
>         amongst themselves.
>
> etc..
>
> Operation:
> ----------
>
> Scenario: ISP_A and ISP_B are connected to common layer-2 fabric IXP_C,
> and both have a BGP session with Route Server RS_R.
>
> ISP_A and ISP_B both support draft-ietf-idr-rs-bfd-XX, and in the
> OPEN message to RS_R they announce support for idr-rs-bfd through a BGP
> capability. Since RS_R received this capability from both ISP_A and
> ISP_B, it _can_ announce in a newly defined SAFI ISP_A's next-hop to
> ISP_B, and _can_ announce ISP_B's next-hop to ISP_A.
>
> Note: if ISP_A did not announce the capability, ISP_A's nexthop will not
> be announced to ISP_B, and ISP_A will of course not receive messages in
> context of the newly defined SAFI.
>
> If RS_R announces a path to ISP_B for which the next-hop is ISP_A, it
> must also announce ISP_A's next-hop in the newly defined SAFI to
> indicate that ISP_A expressed a willingness and capability to set up a
> BFD session. However, since a Route Server allows for facilitation of
> unidirectional traffic flows, and BFD is a bidirectional construct, RS_R
> must also announce ISP_B's next-hop in the newly defined SAFI to ISP_A.
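
The pairwise rule above can be sketched roughly as follows. This is only an
illustration, not text from the draft: the function name, the
(receiver, next-hop owner) path representation, and the set of "capable"
clients are all hypothetical names chosen for the example.

```python
def bfd_safi_announcements(capable, paths):
    """Return {client: set of peers whose next-hop is announced to it}.

    capable: set of clients that announced the (hypothetical) rs-bfd
             capability in their OPEN message to the Route Server.
    paths:   iterable of (receiver, owner) pairs, meaning the Route Server
             announces to `receiver` a path whose next-hop belongs to `owner`.
    """
    out = {client: set() for client in capable}
    for receiver, owner in paths:
        if receiver == owner:
            continue
        # Only clients that announced the capability take part in the SAFI.
        if receiver in capable and owner in capable:
            out[receiver].add(owner)   # receiver learns owner's next-hop
            out[owner].add(receiver)   # BFD is bidirectional: mirror it
    return out


if __name__ == "__main__":
    capable = {"ISP_A", "ISP_B"}
    paths = [("ISP_B", "ISP_A"), ("ISP_C", "ISP_A")]
    print(bfd_safi_announcements(capable, paths))
```

Here ISP_C never announced the capability, so nothing is sent to it in the
new SAFI, while the unidirectional path ISP_A -> ISP_B still results in an
announcement in both directions.
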
>
> The above 'pairwise' announcement style might violate RFC 4271 section
> 9.1: "The function that calculates the degree of preference for a given
> route SHALL NOT use any of the following as its inputs: the existence of
> other routes, the non-existence of other routes, or the path attributes
> of other routes." But since this is a Route Server, it is perhaps
> permissible to add yet another crime to the Route Server's rap sheet.
>
> Implementers might want to add a degree of dampening for the newly
> defined SAFI to mitigate all too fast setup and teardown of BFD sessions.
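
One possible shape for such dampening is a penalty with exponential decay,
loosely in the spirit of BGP route flap damping. The sketch below is an
assumption for illustration only: the class name, the thresholds, and the
half-life are made up, and nothing here is specified by the draft.

```python
class FlapDamper:
    """Penalty-based damper for one next-hop seen in the new SAFI."""

    def __init__(self, half_life=60.0, suppress=3.0, reuse=1.0):
        self.half_life = half_life    # seconds for the penalty to halve
        self.suppress = suppress      # penalty at which we stop reacting
        self.reuse = reuse            # penalty below which we react again
        self.penalty = 0.0
        self.last = None
        self.suppressed = False

    def _decay(self, now):
        # Exponentially decay the penalty for the time elapsed since `last`.
        if self.last is not None:
            self.penalty *= 0.5 ** ((now - self.last) / self.half_life)
        self.last = now

    def flap(self, now):
        """Record one announce/withdraw transition at time `now` (seconds)."""
        self._decay(now)
        self.penalty += 1.0
        if self.penalty >= self.suppress:
            self.suppressed = True

    def usable(self, now):
        """True if BFD setup/teardown may be triggered for this next-hop."""
        self._decay(now)
        if self.suppressed and self.penalty < self.reuse:
            self.suppressed = False
        return not self.suppressed
```

A client would keep one damper per next-hop, call flap() on every change in
the SAFI, and only set up or tear down the BFD session while usable() is true.
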
>
> Rationale:
> ----------
>
> Should there be a fault of sorts on the IXP where for some reason ISP_A
> and ISP_B can no longer reach each other, but they can reach RS_R, I'd
> argue this is a matter solely between ISP_A and ISP_B.
>
> They both were facilitated by RS_R to set up BFD to each other, they
> have a BFD session, the BFD session goes down because of the incident,
> as a consequence they'll consider each other's next-hop to be
> inadmissible for route selection and proceed to treat routes with each
> other's next-hop as withdrawn. Life is good.
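
The purely local reaction described above could look roughly like this on a
client. The function and the data shapes are hypothetical, and treating a
next-hop without any BFD session as usable is an assumption of the sketch.

```python
def usable_routes(routes, bfd_state):
    """Filter routes whose next-hop has a BFD session in the "down" state.

    routes:    list of (prefix, next_hop) tuples learned from the Route Server.
    bfd_state: dict mapping next_hop -> "up" or "down".
    Next-hops with no BFD session (absent from bfd_state) are left untouched;
    that choice is an assumption made for this example.
    """
    return [
        (prefix, next_hop)
        for prefix, next_hop in routes
        if bfd_state.get(next_hop, "up") != "down"
    ]


if __name__ == "__main__":
    routes = [
        ("192.0.2.0/24", "ISP_B"),
        ("198.51.100.0/24", "ISP_C"),
    ]
    # ISP_B's session went down; ISP_C has no BFD session at all.
    print(usable_routes(routes, {"ISP_B": "down"}))
```

No message to the Route Server is involved: the routes via ISP_B are simply
treated as withdrawn for local route selection.
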
>
> Should ISP_A and ISP_B have a feedback loop to RS_R to inform the RS
> that they no longer can reach each other, what do we expect RS_R to do?
> Calculate new best-paths for each client? Why is this useful? By the
> time any flavor of draft-ietf-idr-rs-bfd-XX is implemented, we might
> anticipate more use of ADD-PATH. But even without ADD-PATH, there is no
> real harm in using a few routes from RS_R when the IXP has gone
> split-brain: the way Route Servers are used on the Internet, they only
> carry partial routing tables anyway.
>
> My main concern is that in real life, when RS_R receives from hundreds
> of clients on one side of the IXP that they cannot reach the other side
> of the IXP, and the hundreds on the other side of the IXP inform RS_R
> that they cannot reach the side I first mentioned, this will create a
> stampede for both Route Server and Route Server Participants.
>
>     1)  All Participants observe that 100s of BFD sessions go down
>     2)  All Participants immediately proceed to deprecate those paths in
>         their own RIBs, sending out withdraws to downstream BGP
>         speakers.
>     3)  The Route Server receives from n*100s of clients that they can't
>         reach various next-hops. These notifications may arrive in a
>         staggered fashion.
>     3a) The Route Server may observe BGP sessions going down with a
>         subset of the participants, since IXP faults like these rarely
>         are clean-cut.
>     4)  The Route Server has to run a per-client best-path-selection
>         process within each RIB
>     4a) Participants see churn in the announcements received in the
>         newly defined SAFI, and will proceed with teardown / setup of
>         BFD sessions.
>     5)  The Route Server has to announce the newly selected best paths
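
To get a feel for the numbers in the steps above, here is a rough
back-of-the-envelope sketch. It assumes a clean split into two sides, a BFD
session between every pair of clients across the split, and one
unreachability report per lost next-hop; these are simplifying assumptions
of the example, not figures from the draft.

```python
def split_brain_load(a, b):
    """Estimate load when an IXP splits into sides of `a` and `b` clients.

    Returns (sessions_down, reports_to_rs, recomputations):
      sessions_down   one BFD session per cross-side pair of clients
      reports_to_rs   each client reports every next-hop it lost
      recomputations  one per-client best-path run at the Route Server
    """
    sessions_down = a * b
    reports_to_rs = 2 * a * b
    recomputations = a + b
    return sessions_down, reports_to_rs, recomputations


if __name__ == "__main__":
    print(split_brain_load(200, 200))
```

With 200 clients on each side this gives 40,000 sessions going down, 80,000
reports arriving at the Route Server, and 400 per-client recomputations.
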
>
> Knowing that at the larger IXPs the Route Servers are already at the
> upper bound of their scaling capabilities, and that many routing engines
> used by IXP participants are somewhat underscaled, I see a lot, perhaps
> too much, of stirring in the convergence soup with the current proposal.
>
> In making the Route Server aware of link-failures _between_ Route Server
> Participants, the totality of all stakeholders depends not only on local
> convergence, but now convergence at the Route Server plays a significant
> role, which in turn can impact the participant's convergence.
>
> Another issue is that under the current proposal, should a Route Server
> participant oscillate within the RS-Reachable SAFI, this oscillation can
> place a significant additional burden on the Route Server, since the
> per-client Loc-RIBs need to be recomputed. This risk does not exist in
> a mode of operation where less state is made known to the Route Server.
>
> Another note, instead of "The RS-Reachable Control Extended Community",
> shouldn't "Enhanced Route Refresh" (RFC 7313) be used?
>
> It appears to me that the only argument for storing client-to-client
> data-link reachability state at the Route Server is to mitigate Path
> Hiding (RFC 7947 section 2.3.1), but that this comes at too high a
> computational cost. Should path hiding really need to be
> mitigated for the duration of a catastrophic failure at the IXP,
> ADD-PATH can be used.
>
> Kind regards,
>
> Job