Node failure detection (was Re: [spring] Draft for Node protection of intermediate nodes in SR Paths)

Greg Mirsky <> Mon, 02 December 2019 14:44 UTC

From: Greg Mirsky <>
Date: Mon, 2 Dec 2019 09:44:34 -0500
Message-ID: <>
Subject: Node failure detection (was Re: [spring] Draft for Node protection of intermediate nodes in SR Paths)
To: Alexander Vainshtein <>
Cc: Robert Raszuk <>, Shraddha Hegde <>, "" <>, "" <>, "" <>
List-Id: "RTG Area: Bidirectional Forwarding Detection DT" <>

Hi Sasha, et al.,
many thanks for the great discussion. Please correct me if my recollection
is not accurate, but at the time of RFC 4090 it was agreed that a trigger
for local protection may in fact be a false negative and, as a result, the
protection switchover may be suboptimal. I understand that as the
realization that there should be a practical and pragmatic balance between
the time to detect a failure and the accuracy of characterizing it. I think
that the draft Shraddha mentioned doesn't need to discuss any of these
issues.
What I'm curious about, since you've suggested the possible use of S-BFD,
is what the detection time could then be. Unlike the asynchronous mode of
BFD, S-BFD, as I understand it, doesn't have a predictable detection time
like the one defined in Section 6.8.4 of RFC 5880. The detection time of
S-BFD, I believe, is characterized by the RTT, and the RTT may vary
significantly from probe to probe. True, a system may gather S-BFD RTT
statistics to set and periodically adjust the detection time to achieve
more accurate failure detection. But, in my view, that seems like
significant added complexity compared to RFC 5880.
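
To make the comparison concrete, here is a minimal sketch of the two
calculations. The first function follows Section 6.8.4 of RFC 5880 for
asynchronous mode: the Detect Mult received from the remote system times
the greater of the local Required Min RX Interval and the last received
Desired Min TX Interval. The second is a purely hypothetical RTT-based
timer (mean plus k standard deviations, with a floor). It is not taken from
the S-BFD specification or any implementation; the function names, the
factor k, and the floor are assumptions chosen only for illustration.

```python
from statistics import mean, pstdev


def async_detection_time_us(remote_detect_mult,
                            local_required_min_rx_us,
                            remote_desired_min_tx_us):
    """Asynchronous-mode detection time per RFC 5880, Section 6.8.4."""
    # Agreed transmit interval of the remote system.
    agreed_tx_us = max(local_required_min_rx_us, remote_desired_min_tx_us)
    return remote_detect_mult * agreed_tx_us


def adaptive_timeout_us(rtt_samples_us, k=4.0, floor_us=1000):
    """Hypothetical RTT-driven timeout: mean + k * stddev, never below floor.

    This is an illustrative assumption, not a standardized S-BFD procedure.
    """
    if not rtt_samples_us:
        raise ValueError("at least one RTT sample is required")
    return max(floor_us, mean(rtt_samples_us) + k * pstdev(rtt_samples_us))
```

The first value is fixed once the session parameters are negotiated; the
second moves with every new batch of samples, which is where the extra
complexity comes from.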


On Sat, Nov 23, 2019 at 9:45 AM Alexander Vainshtein <> wrote:

> Robert,
> On second thought, for the purpose of this draft (i.e. in the scope of
> SR) it is possible to implement your suggestion by running S-BFD sessions
> between R7 (as the initiator) and each other adjacency of R8 (acting as
> reflectors), each over an SR policy with a list of two SIDs:
> - the protected adjacency between R7 and R8
> - the Node SID of the specific "other" adjacency of R8.
> If all these sessions fail, R7 can reliably consider R8 as failed.
> I am not sure this would be much better than multi-hop IP BFD, and it
> looks much more complicated to me.
> What do you think?
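
The "all sessions fail" rule above can be sketched as a small predicate.
This is only an illustration of the decision logic, not of S-BFD itself;
the `SessionState` type, the function name, and the reflector names used
in the usage note are assumptions.

```python
from enum import Enum


class SessionState(Enum):
    UP = "up"
    DOWN = "down"


def node_considered_failed(session_states):
    """Return True only if every session through the protected node is DOWN.

    session_states maps a reflector name to the SessionState of the S-BFD
    session that R7 runs toward it through R8. With no sessions configured
    there is no evidence, so the node is not declared failed.
    """
    if not session_states:
        return False
    return all(s is SessionState.DOWN for s in session_states.values())
```

For example, with sessions toward hypothetical reflectors R3, R4 and R9, a
single session that stays UP is enough to keep R8 from being declared
failed.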
> ------------------------------
> *From:* Alexander Vainshtein <>
> *Sent:* Saturday, November 23, 2019, 13:15
> *To:* Robert Raszuk; Shraddha Hegde
> *Cc:*;;
> *Subject:* Re: [spring] Draft for Node protection of intermediate nodes
> in SR Paths
> Robert,
> Lots of thanks for a prompt response.
> I respectfully disagree with your statement that BFD implementation is
> usually offloaded to the HW of the ingress line card. I do not think this
> can work for MH BFD sessions, because the ingress and egress line cards are
> not known in advance and change with routing changes.
> A good multi-hop BFD implementation should be ready to overcome this.
> There are many ways to achieve that. A naive implementation that runs in SW
> of the control card is also possible, of course. And they would send and
> receive packets
> My 2c.
> ------------------------------
> *From:* Robert Raszuk <>
> *Sent:* Saturday, November 23, 2019, 12:37
> *To:* Alexander Vainshtein; Shraddha Hegde
> *Cc:*;;
> *Subject:* Re: [spring] Draft for Node protection of intermediate nodes
> in SR Paths
> Hi Sasha,
> On the surface your suggestion may look cool - but if you zoom in - I do
> not think it will work in practice.
> See - one of the biggest values of BFD is its offload to the line card's
> hardware. And in most cases it is the ingress line card of the box. So if
> you instruct such hardware to respond to the SID loopback address, you
> still do not gain much in terms of detecting the router's fabric failures,
> remote LC failures, or control plane issues which could soon result in box
> failure. The catalogue of router failures is of course much more colorful.
> If you ask for BFD to be answered by the RP/RE, it no longer has the BFD
> advantage.
> IMHO the best way to detect node failure is actually to send the probes
> *across* the node under test to its peers.
> The way I would think of establishing such m-hop sessions would be fully
> automated with one knob per IGP adj., e.g. "bfd detect-node-failure [max N]",
> where the local BFD subsystem would create N sessions to IGP peers of the
> node we are to protect. The LSDB has those peers, so no new protocol
> extension is needed, perhaps even no new IETF draft is required :). N would
> be the limit of such sessions in case the node under protection has, say,
> 10s of peers. The default could perhaps even be 1.
> Thx,
> Robert.
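
A minimal sketch of how the proposed "bfd detect-node-failure [max N]" knob
might pick its session targets. The LSDB is modelled here as a plain
mapping from node name to its set of IGP neighbors; that representation,
the function name, and the sorted-order tie-break are all assumptions for
illustration, not any vendor's behavior.

```python
def select_probe_targets(lsdb, local_node, protected_node, max_sessions=1):
    """Pick up to max_sessions IGP peers of protected_node to probe across it.

    lsdb: dict mapping node name -> iterable of neighbor node names.
    The local node is excluded, since probing ourselves proves nothing.
    """
    peers = sorted(p for p in lsdb.get(protected_node, ()) if p != local_node)
    return peers[:max_sessions]
```

With R8's neighbors recorded as R3, R4, R7 and R9, R7 would create one
session to R3 by default, and sessions to R3 and R4 with max 2.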
> On Sat, Nov 23, 2019 at 10:00 AM Alexander Vainshtein <
>> wrote:
>> Shraddha, Robert and all,
>> Regarding Robert's question:
>> I wonder if a multi-hop IP BFD session with addresses used as /32 (or /128)
>> prefixes serving as Node SIDs of R8 and R7 respectively could be used as
>> such a trigger by R7? Such a session would not respond to link failures,
>> and I find it problematic to imagine a scenario when it would be kept UP in
>> the case of a real node failure.
>> Of course such a session would have to be slow enough not to react to
>> link failures. But it still could be much faster than IGP convergence IMHO.
>> My 2c,
>> Sasha
>> ------------------------------
>> *From:* spring <> on behalf of Robert Raszuk <
>> *Sent:* Friday, November 22, 2019, 11:22
>> *To:* Shraddha Hegde
>> *Cc:*;
>> *Subject:* Re: [spring] Draft for Node protection of intermediate nodes
>> in SR Paths
>> Hi Shraddha,
>> I have one question about the document.
>> As you know the critical element for the effective protection of any
>> scheme is the failure detection. On that your draft seems to have just one
>> little paragraph:
>>    Note that R7 activates the node-protecting backup path when it
>>    detects that the link to R8 has failed.  R7 does not know that node
>>    R8 has actually failed.  However, the node-protecting backup path is
>>    computed assuming that the failure of the link to R8 implies that R8
>>    has failed.
>> Well IMO this is not enough. Specifically, there can be many types of
>> node failure where the link is still up. Moreover, BFD can even be running
>> just fine across the link when, say, a fabric failure occurs at R8.
>> While this is not solely an issue with this draft, it is our common IETF
>> failure to provide correct means of detecting end-to-end path or fragments
>> of path failures (I am specifically not calling them segment here :).
>> For example, I propose that, to effectively detect R8 failure as a node
>> failure, which is the topic of your proposal, a mechanism is clearly
>> defined that includes bi-dir data plane probes sent between R7-R9, R3-R7,
>> R4-R7, R4-R9, R3-R9.
>> Many thx,
>> Robert.
>> On Fri, Nov 22, 2019 at 4:38 AM Shraddha Hegde <shraddha=
>> <>> wrote:
>>> WG,
>>> This is the draft I pointed out that talks about solutions for providing
>>> node-protection.
>>> It covers Anycast case as well as keeping forwarding plane longer.
>>> Review and comments solicited.
>>> Rgds
>>> Shraddha