Re: Service Redundancy using BFD

Greg Mirsky <gregimirsky@gmail.com> Tue, 28 November 2017 22:06 UTC

In-Reply-To: <00F17C92-E43D-4BFB-81B1-534DD221E66F@outlook.com>
References: <3A4A67EC-042C-4F8A-80AB-E7A5F638DE15@vmware.com> <76804F35-63BB-46A0-A74C-9E41B2C213B4@outlook.com> <6FB7BA5C-8ECC-4330-89D0-8FD7306217F5@vmware.com> <00F17C92-E43D-4BFB-81B1-534DD221E66F@outlook.com>
From: Greg Mirsky <gregimirsky@gmail.com>
Date: Tue, 28 Nov 2017 14:06:23 -0800
Message-ID: <CA+RyBmXgLBdE7JTEs2pQHs59t+vVNagLxsKR7riBJc5JceX9Uw@mail.gmail.com>
Subject: Re: Service Redundancy using BFD
To: Ashesh Mishra <mishra.ashesh@outlook.com>
Cc: Sami Boutros <sboutros@vmware.com>, Ankur Dubey <adubey@vmware.com>, "rtg-bfd@ietf.org" <rtg-bfd@ietf.org>, Reshad Rahman <rrahman@cisco.com>
Content-Type: multipart/alternative; boundary="001a113f21ecc8e681055f123a57"
Archived-At: <https://mailarchive.ietf.org/arch/msg/rtg-bfd/-y3yawWozAPPxBqydlnTpleMcgo>

Hi Ashesh,
I believe that the abstract of RFC 5880 is very clear about the goal
of BFD:

   This document describes a protocol intended to detect faults in the
   bidirectional path between two forwarding engines, including
   interfaces, data link(s), and to the extent possible the forwarding
   engines themselves, with potentially very low latency.  It operates
   independently of media, data protocols, and routing protocols.


Applications, e.g. routing protocols, residing on the BFD node may use
notifications of BFD state changes to trigger their own processes. An
implementation may use BFD state changes to draw conclusions about the state
of its remote peer but, I strongly believe, BFD is not intended to verify
anything but path continuity between two nodes and, to some extent, proper
functioning of the forwarding engines at the BFD nodes.

Regards,
Greg
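
Editorial illustration (not part of the original message): the separation described above, where BFD only reports session state changes and client applications decide what to do with them, can be sketched roughly as follows. The state values are those of RFC 5880; the class `BfdSession`, its methods, and the sample client callback are hypothetical names chosen for this sketch and do not correspond to any real implementation.

```python
from enum import IntEnum
from typing import Callable, List


class BfdState(IntEnum):
    # Session state values as defined in RFC 5880.
    ADMIN_DOWN = 0
    DOWN = 1
    INIT = 2
    UP = 3


class BfdSession:
    """Hypothetical session object: it tracks state and notifies clients.

    It knows nothing about what the clients do with a notification.
    """

    def __init__(self) -> None:
        self.state = BfdState.DOWN
        self._listeners: List[Callable[[BfdState, BfdState], None]] = []

    def subscribe(self, callback: Callable[[BfdState, BfdState], None]) -> None:
        # Register a client (e.g. a routing protocol) for state changes.
        self._listeners.append(callback)

    def set_state(self, new_state: BfdState) -> None:
        # Notify clients only on an actual transition.
        if new_state == self.state:
            return
        old_state, self.state = self.state, new_state
        for callback in self._listeners:
            callback(old_state, new_state)


# Example client: records transitions and reacts to loss of the path.
events: List[str] = []


def routing_client(old: BfdState, new: BfdState) -> None:
    events.append(f"{old.name}->{new.name}")
    if new == BfdState.DOWN:
        events.append("routing: withdraw adjacency")


session = BfdSession()
session.subscribe(routing_client)
session.set_state(BfdState.INIT)
session.set_state(BfdState.UP)
session.set_state(BfdState.DOWN)
```

In this sketch any conclusion about the remote peer is drawn in `routing_client`, not in the session object, which only reports what it observed.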

On Tue, Nov 28, 2017 at 1:14 PM, Ashesh Mishra <mishra.ashesh@outlook.com>
wrote:

> Thanks for the response, Sami. I think our disconnect lies in the
> definition of a service. From a BFD perspective, I expect the service to be
> established across two nodes, at the very least, so that BFD can monitor
> its liveness. Can you elaborate on
>
>
>
> -          What, in the context of this draft, is a service?
>
> -          How does BFD signal for a service whose liveness it is not
> monitoring?
>
>
>
> Thanks,
>
> Ashesh
>
>
>
> *From: *Sami Boutros <sboutros@vmware.com>
> *Date: *Tuesday, November 28, 2017 at 1:23 PM
> *To: *Ashesh Mishra <mishra.ashesh@outlook.com>, Ankur Dubey
> <adubey@vmware.com>, "rtg-bfd@ietf.org" <rtg-bfd@ietf.org>
> *Cc: *Reshad Rahman <rrahman@cisco.com>
>
> *Subject: *Re: Service Redundancy using BFD
>
>
>
> Hi Ashesh,
>
>
>
> Thanks for your comments.
>
>
>
> For your first comment: the draft applies to both single-hop BFD (what you
> call interface BFD) and multi-hop BFD. And yes, per-service could also mean
> per-interface when single-hop BFD is used; we can clarify that in the
> draft.
>
>
>
> For your second comment, I am not sure I understand. The service will be
> active on only one node. If the service is associated with the whole node,
> then the BFD session monitors the node's liveness; when the service
> is associated with an interface, the BFD session also monitors the interface
> connectivity. So a primary service can’t be active at both node
> endpoints hosting the BFD session.
>
>
>
> Thanks,
>
>
>
> Sami
>
> *From: *Ashesh Mishra <mishra.ashesh@outlook.com>
> *Date: *Tuesday, November 28, 2017 at 4:04 AM
> *To: *Ankur Dubey <adubey@vmware.com>, "rtg-bfd@ietf.org"
> <rtg-bfd@ietf.org>
> *Cc: *Reshad Rahman <rrahman@cisco.com>, Sami Boutros
> <sboutros@vmware.com>
> *Subject: *Re: Service Redundancy using BFD
>
>
>
> Hi Ankur,
>
>
>
> This is a good proposal to pursue within the BFD-wg.
>
>
>
> Couple of comments:
>
> -          BFD can only signal this diag code for the interface that it
> is monitoring (the IP next hop, MPLS LSP, etc.). You mention per-service
> (which I assume means per-service-per-interface) failover in the draft but
> it may be worthwhile defining behavior on per-*service-type*-per-interface
> as well.
>
> -          There still needs to be a method for the primary and backup
> pairs (two BFD end-points on primary service and two on backup service) to
> communicate to each other (primary-to-primary and backup-to-backup) whether
> the service is active or standby. This is useful in the scenario when the
> primary cannot communicate with backup nodes (it is a failure condition
> after all).
>
>
>
> Again, at 10k ft, I like the idea of signaling active/standby using BFD.
>
>
>
> Cheers,
>
> Ashesh
>
>
>
> *From: *Rtg-bfd <rtg-bfd-bounces@ietf.org> on behalf of Ankur Dubey
> <adubey@vmware.com>
> *Date: *Monday, November 27, 2017 at 9:47 PM
> *To: *"rtg-bfd@ietf.org" <rtg-bfd@ietf.org>
> *Cc: *Reshad Rahman <rrahman@cisco.com>, Sami Boutros
> <sboutros@vmware.com>
> *Subject: *Service Redundancy using BFD
>
>
>
> Hi all,
>
>
>
> Please review and provide comments for the following draft:
>
>
>
> https://datatracker.ietf.org/doc/draft-adubey-bfd-service-redundancy/
>
>
>
>
>
> *Summary of draft:*
>
>
>
> This draft proposes a new BFD diag code with which a node running a BFD
> session with another node can inform that node, after the BFD session
> times out, that it didn’t go down and lived through the failure.
>
>
>
> Such notification is useful for a set of nodes providing Active/Standby
> redundancy. When these nodes are running multiple L2/L3/L4-L7 services in
> non-revertive mode of redundancy, the standby node taking over as active
> for non-revertive services after BFD times out needs to indicate in the BFD
> packet that it outlived the other failed old active node. The new diag code
> will be used for this purpose. When this diag code is set in the BFD
> packets, it will provide an indication to the failed old active node that
> it MUST NOT activate the non-revertive services when it comes up.
>
>
>
> To provide per-service failover, a node activating certain
> non-revertive services needs to indicate that it is Active ONLY for those
> non-revertive services. This can be done by using a unique bitmap in which
> each bit position uniquely identifies a service. This unique bitmap is
> configured on all nodes by a network controller. When there is at least one
> non-revertive service for which a node is not active AND it is active for
> at least one non-revertive service, this node will set the bits identifying
> the active services in the bitmap and send it in the payload of the BFD packet.
>
>
>
>
>
> Thanks,
>
> --Ankur
>
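
Editorial illustration (not part of the original thread): the per-service bitmap rule in the draft summary quoted above can be sketched as below. This is a minimal sketch under stated assumptions. The service names, their bit positions, and the function names are hypothetical, and the summary does not say how the bitmap is laid out in the BFD payload, so only the set/bit logic is shown.

```python
from typing import Iterable, Optional, Set

# Hypothetical service-to-bit mapping. In the summary, a network controller
# configures the same mapping on all nodes.
SERVICE_BITS = {"l2-bridge": 0, "l3-routing": 1, "nat": 2, "load-balancer": 3}


def encode(services: Iterable[str]) -> int:
    """Set one bit per named service."""
    bitmap = 0
    for name in services:
        bitmap |= 1 << SERVICE_BITS[name]
    return bitmap


def decode(bitmap: int) -> Set[str]:
    """Return the names of the services whose bits are set."""
    return {name for name, bit in SERVICE_BITS.items() if bitmap & (1 << bit)}


def bitmap_to_send(active: Iterable[str],
                   non_revertive: Iterable[str]) -> Optional[int]:
    """Apply the rule from the summary.

    A bitmap is sent only when the node is active for at least one
    non-revertive service AND not active for at least one other.
    Returns None when no bitmap is needed.
    """
    active_nr = set(active) & set(non_revertive)
    inactive_nr = set(non_revertive) - set(active)
    if active_nr and inactive_nr:
        return encode(active_nr)
    return None


# A node active for two of the four non-revertive services.
partial = bitmap_to_send({"l3-routing", "nat"}, SERVICE_BITS)
# A node active for all of them sends no bitmap under this rule.
full = bitmap_to_send(set(SERVICE_BITS), SERVICE_BITS)
```

A receiver that shares the same mapping would call `decode` on the received value to learn which services the peer claims to be active for.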