Re: Service Redundancy using BFD

Greg Mirsky <gregimirsky@gmail.com> Tue, 28 November 2017 22:59 UTC

From: Greg Mirsky <gregimirsky@gmail.com>
Date: Tue, 28 Nov 2017 14:59:01 -0800
Message-ID: <CA+RyBmXCuPJAfNvjuXT3pWmyAkSJJc1a=kgrhLx2hXHpqMJxKQ@mail.gmail.com>
Subject: Re: Service Redundancy using BFD
To: Sami Boutros <sboutros@vmware.com>
Cc: Ashesh Mishra <mishra.ashesh@outlook.com>, Ankur Dubey <adubey@vmware.com>, "rtg-bfd@ietf.org" <rtg-bfd@ietf.org>, Reshad Rahman <rrahman@cisco.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/rtg-bfd/9Y5JyzL54FvDMMbDr8hlCbsRHDg>

Hi Sami,
you've indicated that one of the set of network functions (NF), A and B in
the figure below, provides L2/L3 services to NF C. My question was how C
addresses the designated forwarder (DF) of the A-B set. If it uses a virtual
address that is associated with the function of the DF, then NF C doesn't
need to know the identity of the DF (similar to VRRP, isn't it?). If NF C
needs to know the identity of the DF, then it must use some means to monitor
the liveness of A and B.
And I have to point to a couple of BFD-related assumptions in the draft:

   - failure of the BFD session between A and B cannot be interpreted by
   the respective BFD peer as failure of A or B, but only as loss of
   continuity between the forwarding engines. Assuming that the failure is
   of a node rather than of a link may lead to duplicate DFs;
   - using multi-hop BFD to detect node failure may produce a false negative
   if failure detection is more aggressive than network convergence, e.g.,
   network convergence is guaranteed within 100 ms while the BFD interval is
   10 ms.
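
To put numbers on the second point, here is a minimal sketch in Python. It
assumes asynchronous-mode BFD, where the detection time is the negotiated
interval multiplied by Detect Mult (RFC 5880). The Detect Mult of 3 and the
function names are assumptions made for illustration; they are not taken
from the draft.

```python
def bfd_detection_time_ms(interval_ms, detect_mult):
    """Asynchronous-mode detection time: negotiated interval x Detect Mult."""
    return interval_ms * detect_mult


def may_declare_false_failure(interval_ms, detect_mult, convergence_ms):
    """True if the session can time out while the network is still
    reconverging, i.e. detection is faster than the convergence bound."""
    return bfd_detection_time_ms(interval_ms, detect_mult) < convergence_ms


# Example from the text: 10 ms interval, 100 ms convergence bound.
# Detect Mult = 3 is an assumed value.
print(bfd_detection_time_ms(10, 3))
print(may_declare_false_failure(10, 3, 100))
# A 50 ms interval with the same multiplier stays above the bound.
print(may_declare_false_failure(50, 3, 100))
```

With these assumed values the session would be declared down after 30 ms,
well before a 100 ms reconvergence completes, so the peer could be reported
as failed while it is still alive.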

Regards,
Greg

On Tue, Nov 28, 2017 at 2:39 PM, Sami Boutros <sboutros@vmware.com> wrote:

> Hi Greg,
>
> A can detect failures of the link to C using any mechanism, not only BFD.
>
> The picture below is for illustration; A and B themselves can be providing
> services (L4 to L7), which could include firewall, NAT, load balancer, etc.
>
> Thanks,
>
> Sami
> From: Greg Mirsky <gregimirsky@gmail.com>
> Date: Tuesday, November 28, 2017 at 2:20 PM
> To: Sami Boutros <sboutros@vmware.com>
> Cc: Ashesh Mishra <mishra.ashesh@outlook.com>, Ankur Dubey <
> adubey@vmware.com>, "rtg-bfd@ietf.org" <rtg-bfd@ietf.org>, Reshad Rahman <
> rrahman@cisco.com>
>
> Subject: Re: Service Redundancy using BFD
>
> Hi Sami,
> would C have BFD sessions to A and B respectively, or would it use an
> anycast address? The more I look at the use case, the more I think of VRRP ;)
>
> Regards,
> Greg
>
> On Tue, Nov 28, 2017 at 2:15 PM, Sami Boutros <sboutros@vmware.com> wrote:
>
>>
>> Hi Ashesh,
>>
>> The topology is more like the following:
>>
>> A <---\
>> |      \
>> BFD     C
>> |      /
>> B <---/
>>
>> A and B are nodes providing L2 and L3 services for C, with A/S redundancy.
>>
>> A can be active and B standby; if A goes down, then B starts providing
>> the services.
>>
>> Thanks,
>>
>> Sami
>> From: Ashesh Mishra <mishra.ashesh@outlook.com>
>> Date: Tuesday, November 28, 2017 at 1:45 PM
>>
>> To: Sami Boutros <sboutros@vmware.com>, Ankur Dubey <adubey@vmware.com>,
>> "rtg-bfd@ietf.org" <rtg-bfd@ietf.org>
>> Cc: Reshad Rahman <rrahman@cisco.com>
>> Subject: Re: Service Redundancy using BFD
>>
>> Okay. That makes sense now.
>>
>>
>>
>> So in a scenario where you have a primary overlay service between A and
>> B, and a backup overlay service between C and D, the BFD sessions in
>> question will be between A and C, and B and D (so that the backup can send
>> diag code to primary)?
>>
>>
>>
>> A <------- primary service -------> B
>> |                                   |
>> BFD                                BFD
>> |                                   |
>> C <------- backup service --------> D
>>
>>
>>
>> --
>>
>> Ashesh
>>
>>
>>
>>
>>
>> *From: *Sami Boutros <sboutros@vmware.com>
>> *Date: *Tuesday, November 28, 2017 at 4:21 PM
>> *To: *Ashesh Mishra <mishra.ashesh@outlook.com>, Ankur Dubey <
>> adubey@vmware.com>, "rtg-bfd@ietf.org" <rtg-bfd@ietf.org>
>> *Cc: *Reshad Rahman <rrahman@cisco.com>
>> *Subject: *Re: Service Redundancy using BFD
>>
>>
>>
>> Hi Ashesh,
>>
>>
>>
>> A service is an overlay service running on a routing node. This could be
>> an L2 or L3 VPN service running on a set of links connected to two or more
>> nodes, where one node is active for a service at a given point in time and
>> one node is standby.
>>
>>
>>
>> Now, BFD is running on the underlay links between the two nodes, active
>> and standby. Once BFD goes down, the standby assumes that the active went
>> down and activates the services that it shares with the active. When the
>> old active comes back up, the standby signals on the BFD session, via this
>> diag code, that it activated the non-preemptive services and that it
>> didn’t fail, so the old active node doesn’t activate those non-preemptive
>> services.
>>
>>
>>
>> Thanks,
>>
>>
>>
>> Sami
>>
>> *From: *Ashesh Mishra <mishra.ashesh@outlook.com>
>> *Date: *Tuesday, November 28, 2017 at 1:14 PM
>> *To: *Sami Boutros <sboutros@vmware.com>, Ankur Dubey <adubey@vmware.com>,
>> "rtg-bfd@ietf.org" <rtg-bfd@ietf.org>
>> *Cc: *Reshad Rahman <rrahman@cisco.com>
>> *Subject: *Re: Service Redundancy using BFD
>>
>>
>>
>> Thanks for the response, Sami. I think our disconnect lies in the
>> definition of a service. From a BFD perspective, I expect the service to be
>> established across two nodes, at the very least, so that BFD can monitor
>> its liveness. Can you elaborate on
>>
>>
>>
>> - What, in the context of this draft, is a service?
>>
>> - How does BFD signal for a service that it is not monitoring the
>> liveness for?
>>
>>
>>
>> Thanks,
>>
>> Ashesh
>>
>>
>>
>> *From: *Sami Boutros <sboutros@vmware.com>
>> *Date: *Tuesday, November 28, 2017 at 1:23 PM
>> *To: *Ashesh Mishra <mishra.ashesh@outlook.com>, Ankur Dubey <
>> adubey@vmware.com>, "rtg-bfd@ietf.org" <rtg-bfd@ietf.org>
>> *Cc: *Reshad Rahman <rrahman@cisco.com>
>> *Subject: *Re: Service Redundancy using BFD
>>
>>
>>
>> Hi Ashesh,
>>
>>
>>
>> Thanks for your comments.
>>
>>
>>
>> For your first comment, the draft applies to both single-hop BFD (what you
>> call interface BFD) and multi-hop BFD. And yes, per service could mean per
>> interface too if this is single-hop BFD; we can clarify that in the
>> draft.
>>
>>
>>
>> For your second comment, I am not sure I understand. The service will be
>> active on only one node. If the service is associated with the whole node,
>> then the BFD session is monitoring the node liveness. And when the service
>> is associated with an interface, the BFD session will monitor the interface
>> connectivity as well. So, a primary service can’t be active at both of the
>> node endpoints hosting the BFD session.
>>
>>
>>
>> Thanks,
>>
>>
>>
>> Sami
>>
>>
>>
>