Re: [VRRP] Proposal for Maintenance Event for VRRP [RFC-5798]

"Anurag Kothari (ankothar)" <ankothar@cisco.com> Tue, 26 February 2013 15:13 UTC

From: "Anurag Kothari (ankothar)" <ankothar@cisco.com>
To: "vrrp@ietf.org" <vrrp@ietf.org>
Thread-Topic: [VRRP] Proposal for Maintenance Event for VRRP [RFC-5798]
Thread-Index: Ac4UM9OxSXkTjEjFQomcNOLGEc+dwg==
Date: Tue, 26 Feb 2013 15:13:51 +0000
Message-ID: <A2BB90B33FFDF740876C5AAE307D83DB1C2946D3@xmb-rcd-x05.cisco.com>
Subject: Re: [VRRP] Proposal for Maintenance Event for VRRP [RFC-5798]

Some concerns have been raised regarding the following corner case:
Due to congestion or some other fault in the network, the ADVERTISEMENTs from the Master router do not reach the Backup router(s). The Master_Down_Timer fires and one of the Backup routers transitions to the Master state. At the same time, the priority of both this router and the original Master is set to zero. How would the network behave in this case?

The probability of this happening is certainly very low, as it is a double/multiple-fault situation. Here is my analysis of the situation and a proposed solution.

We will have two Master routers, both sending ADVERTISEMENTs with Priority = 0. Under RFC 5798, a Master that receives an ADVERTISEMENT with Priority = 0 immediately sends an ADVERTISEMENT of its own, so each router keeps triggering the other and the two exchange ADVERTISEMENTs in a loop at a fast rate. This would also cause the Virtual MAC to move continuously between two ports on the switch(es) at the same rate. Both effects can probably cause high CPU utilization on the routers and the LAN switches.
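
The loop described above can be illustrated with a minimal Python sketch. It models only the RFC 5798 Master-state rule "if the Priority in the ADVERTISEMENT is zero, send an ADVERTISEMENT" and ignores timers, packet loss and the switches. The router names "A" and "B", the function names and the message cap are assumptions made for illustration, not part of the proposal.

```python
from collections import deque


def rfc5798_master_replies(received_priority: int) -> bool:
    """Unmodified RFC 5798 rule: a Master answers any priority-0 advert at once."""
    return received_priority == 0


def count_exchanges(max_messages: int) -> int:
    """Count adverts sent between two priority-0 Masters, up to a safety cap."""
    local_priority = {"A": 0, "B": 0}      # hypothetical routers, both at priority 0
    in_flight = deque([("A", "B")])         # A's first advert, on its way to B
    sent = 1
    while in_flight and sent < max_messages:
        sender, receiver = in_flight.popleft()
        if rfc5798_master_replies(local_priority[sender]):
            # The receiver responds immediately, which in turn triggers the sender.
            in_flight.append((receiver, sender))
            sent += 1
    return sent
```

In this model nothing but the cap ends the exchange: `count_exchanges(1000)` stops only because the limit of 1000 messages is reached.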

If an additional Backup router is available, it will become Master on seeing the first ADVERTISEMENT with Priority = 0, and both Masters (with Priority = 0) will transition to Backup on seeing the ADVERTISEMENT with non-zero priority from the new Master.

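For the situation where we do not have an additional Backup router, we can avoid this by making the following modifications to the Master-state handling (the differences from RFC 5798, highlighted in red in the original message, are steps (701) and (702) and the extended comment on step (720)):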

(700) - If an ADVERTISEMENT is received, then:

   (705) -+ If the Priority in the ADVERTISEMENT is zero,

   (701) -+ and

   (702) -+ Local Priority is greater than zero, then:

      (710) -* Send an ADVERTISEMENT

      (715) -* Reset the Adver_Timer to Advertisement_Interval

   (720) -+ else // priority was non-zero or local priority was zero

      (725) -* If the Priority in the ADVERTISEMENT is greater
      than the local Priority,

      (730) -* or

      (735) -* If the Priority in the ADVERTISEMENT is equal to
      the local Priority and the primary IPvX Address of the
      sender is greater than the local primary IPvX Address, then:

         (740) -@ Cancel Adver_Timer

         (745) -@ Set Master_Adver_Interval to Adver Interval
         contained in the ADVERTISEMENT

         (750) -@ Recompute the Skew_Time

         (755) @ Recompute the Master_Down_Interval

         (760) @ Set Master_Down_Timer to Master_Down_Interval

         (765) @ Transition to the {Backup} state

      (770) * else // new Master logic

         (775) @ Discard ADVERTISEMENT

      (780) *endif // new Master detected

   (785) +endif // was priority zero?

(790) -endif // advert recv
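
The modified steps (700)-(790) can be sketched in Python as below. This is only an illustration of the decision logic under stated assumptions: the `Action` enum, the function name `master_on_advert` and its parameters are hypothetical, addresses are compared with the standard `ipaddress` module, and the timer updates of steps (740)-(760) are folded into the single "transition to Backup" outcome.

```python
from enum import Enum
from ipaddress import ip_address


class Action(Enum):
    SEND_ADVERT = "send ADVERTISEMENT and reset Adver_Timer"   # steps (710)-(715)
    TO_BACKUP = "transition to Backup"                         # steps (740)-(765)
    DISCARD = "discard ADVERTISEMENT"                          # step (775)


def master_on_advert(adv_priority: int, adv_src: str,
                     local_priority: int, local_src: str) -> Action:
    """Decide what a Master does with a received ADVERTISEMENT (modified rule)."""
    # (705)/(701)/(702): reply only if the advert is priority 0 AND we are not.
    if adv_priority == 0 and local_priority > 0:
        return Action.SEND_ADVERT
    # (720): priority was non-zero, or local priority was zero.
    higher_priority = adv_priority > local_priority                      # (725)
    wins_tie_break = (adv_priority == local_priority                     # (735)
                      and ip_address(adv_src) > ip_address(local_src))
    if higher_priority or wins_tie_break:
        return Action.TO_BACKUP
    return Action.DISCARD                                                # (775)
```

With two Masters that both have priority 0, the check falls through to the address tie-break: the router with the lower primary address gets `Action.TO_BACKUP`, the other gets `Action.DISCARD`, and neither replies immediately.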


Please let me know if I have missed something.

Thanks
-Anurag