Re: [Lsr] WG Adoption Call for draft-li-lsr-dynamic-flooding-02 + IPR poll.

Tony Li <tony1athome@gmail.com> Wed, 27 February 2019 07:07 UTC

From: Tony Li <tony1athome@gmail.com>
Message-Id: <0B4DF2AC-8EE1-41CA-B357-98325067CA30@gmail.com>
Content-Type: multipart/alternative; boundary="Apple-Mail=_C1876026-EC08-475B-A4FD-C29BF8B32390"
Mime-Version: 1.0 (Mac OS X Mail 12.2 \(3445.102.3\))
Date: Tue, 26 Feb 2019 23:07:06 -0800
In-Reply-To: <5316A0AB3C851246A7CA5758973207D463B59041@sjceml521-mbx.china.huawei.com>
Cc: Peter Psenak <ppsenak@cisco.com>, Christian Hopps <chopps@chopps.org>, "lsr@ietf.org" <lsr@ietf.org>, "lsr-chairs@ietf.org" <lsr-chairs@ietf.org>, "lsr-ads@ietf.org" <lsr-ads@ietf.org>
To: Huaimo Chen <huaimo.chen@huawei.com>
References: <sa6lg2md2ok.fsf@chopps.org> <SN6PR11MB284553735B2351FB584BE792C17F0@SN6PR11MB2845.namprd11.prod.outlook.com> <5316A0AB3C851246A7CA5758973207D463B5858A@sjceml521-mbx.china.huawei.com> <420ed1b5-d849-99cc-bcb0-d159783e4de2@cisco.com> <5316A0AB3C851246A7CA5758973207D463B59041@sjceml521-mbx.china.huawei.com>
X-Mailer: Apple Mail (2.3445.102.3)
Archived-At: <https://mailarchive.ietf.org/arch/msg/lsr/oGKa0Rd9VjtokVT_Lgbbc6h33Bs>
Subject: Re: [Lsr] WG Adoption Call for draft-li-lsr-dynamic-flooding-02 + IPR poll.

Hi Huaimo,

> > > 1) There is no concrete procedure/method for fault tolerance
> > > to multiple failures. When multiple failures happen and split the
> > > flooding topology, the convergence time will be increased
> > > significantly without fault tolerance. The longer the convergence
> > > time, the more the traffic lose.
> > 
> > there is a solution for multiple failures - see section 6.7.11.
> > 
>  
> Section 6.7.11 just briefly mentions that the edges of split parts will determine and repair the split after the split of the flooding topology happens. However, there is not any details or description on how to determine or repair the split. This is not useful for implementers.


I’m sorry that you don’t find it useful. Determining the split is trivial: when you receive an IIH, it carries the system ID of the other system. If that other system is not currently part of the flooding topology, then it is clearly disconnected from the flooding topology. Repairing the split is done by enabling temporary flooding on the new link.
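
As a minimal sketch of that check (the function and parameter names here are hypothetical and do not come from the draft; the flooding topology is assumed to be available as a set of system IDs):

```python
def neighbor_is_disconnected(iih_system_id, flooding_topology_nodes):
    """Return True if the system that sent the IIH is not on the flooding topology.

    iih_system_id: system ID carried in the received IIH (assumed to be a string).
    flooding_topology_nodes: set of system IDs currently on the flooding topology.
    """
    return iih_system_id not in flooding_topology_nodes


def repair_split(iih_system_id, link, flooding_topology_nodes, temp_flooding_links):
    """Enable temporary flooding on `link` when the IIH sender is disconnected.

    temp_flooding_links is a set that is updated in place; the return value
    says whether the link was added.
    """
    if neighbor_is_disconnected(iih_system_id, flooding_topology_nodes):
        temp_flooding_links.add(link)
        return True
    return False
```

This ignores the open question of how many links may be added at once; it only illustrates the detection step.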

There is an issue here that we have not yet resolved: the rate at which new links should be temporarily added to the flooding topology. Some believe that adding any new link is the correct thing to do, as it minimizes the recovery time. Others feel that enabling too many links could cause a flooding collapse, so link addition should be highly constrained. We are still discussing this and invite the WG’s opinions.


> > > 2) The extensions to Hello protocols for enabling “temporary
> > > flooding” over a new link is not needed.
> > 
> > not if you do flooding on every link that comes up. If you want to be smarter, then you need to
> > selectively enable flooding only under specific conditions and that must be done from both sides of
> > the new link.
>  
> There are only a limited number of conditions (or cases).  In each condition/case, it is deterministic whether we need to enable “temporary flooding” for a new link when it is up.  Thus there is no need for any extensions to Hello protocols for enabling “temporary flooding” on a new link.


We know of only two cases: (1) the neighbor is not part of the flooding topology and we feel that we can add more temporary flooding; (2) the neighbor is not part of the flooding topology and we cannot add more temporary flooding.

Obviously, in the case where we want to add temporary flooding, that TLV is needed in the IIH.
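
A rough sketch of that decision, under the assumption that "can we add more temporary flooding" is modeled as a simple cap on the number of temporary links (the cap `max_temp_links` and all names are hypothetical; the actual limit is still under discussion):

```python
def should_request_temporary_flooding(neighbor_on_topology,
                                      active_temp_links,
                                      max_temp_links):
    """Decide whether to include the temporary flooding request TLV in the IIH.

    neighbor_on_topology: True if we believe the neighbor is on the flooding topology.
    active_temp_links: number of links already flooding temporarily.
    max_temp_links: hypothetical local cap; not specified by the draft.
    """
    if neighbor_on_topology:
        # Neighbor is already on the flooding topology: nothing to repair.
        return False
    # Case (1): there is room for more temporary flooding, so request it.
    # Case (2): the cap is reached, so do not request it.
    return active_temp_links < max_temp_links
```

Because each side evaluates this locally, the TLV is what tells the neighbor which of the two cases the sender is in.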

 
> For example, suppose that we have a current flooding topology containing all live nodes in an area, when a new link comes up, we may just have two conditions/cases. One condition/case is that the new link is attached to a new node not on the current flooding topology. In this condition/case, the new link needs to be enabled for “temporary flooding” after it is up.


Agreed, which is why we need the TLV.


> The other condition/case is that the new link is attached to nodes on the current flooding topology. In this condition/case, there is no need to enable “temporary flooding” on the link.


Agreed.

Note that there are some additional corner cases.  Since the two neighbors may not have the exact same information, one may consider the other to be on the flooding topology when in fact it is not.  This might happen in the case of a node reboot. The IIH TLV gives us an explicit way of signaling, rather than simply guessing and sometimes getting it wrong.


> > > 3) The extensions to Hello protocols for requesting/signaling
> > > “temporary flooding” for a connection does not work.
> > 
> > sorry, but if you see a problem, please provide details, saying above is
> > simply unproductive.
>  
> “The nodes … will try to repair the flooding topology locally by enabling temporary flooding towards the nodes that they consider disconnected from the flooding topology ...”
>  
> The above quoted text is from draft-li-lsr-dynamic-flooding-02, where “enabling temporary flooding towards the nodes” is to request/signal “temporary flooding” for a connection to connect partitioned/disconnected flooding topology into one through the extensions to Hello protocols described in draft-li-lsr-dynamic-flooding-02. Right?
>  
> The extensions to Hello protocols for requesting/signaling “temporary flooding” for a connection to connect partitioned/disconnected flooding topology into one does not work since the connection may have two or more hops and a Hello packet may get lost.


All adjacencies are a single hop in both IS-IS and OSPF. Yes, Hello packets may be lost. Fortunately, they are transmitted periodically, so the next transmission will also contain the TLV. If IIHs are getting lost at a significant rate, then the adjacency will not (and should not) come up. Thus, the request for temporary flooding will propagate to the neighbor in all cases that matter.
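
To put a number on the retransmission argument, here is a back-of-the-envelope sketch. It assumes each Hello is lost independently with the same probability, which is a simplification and not something the draft states:

```python
def request_delivery_probability(loss_rate, hellos_sent):
    """Probability that at least one of `hellos_sent` Hellos reaches the neighbor.

    loss_rate: assumed independent per-Hello loss probability (0.0 to 1.0).
    hellos_sent: number of periodic Hellos carrying the TLV.
    """
    if not 0.0 <= loss_rate <= 1.0:
        raise ValueError("loss_rate must be between 0 and 1")
    if hellos_sent < 0:
        raise ValueError("hellos_sent must be non-negative")
    # All Hellos are lost with probability loss_rate ** hellos_sent.
    return 1.0 - loss_rate ** hellos_sent
```

Under that assumption, even a 10% loss rate leaves roughly a 99.9% chance that one of three consecutive Hellos gets through.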


> It is not convenient for a user/operator to configure on an area leader since the leader is dynamically selected. How do you address this?


No configuration is required.  The election algorithm selects the area leader.  The rules are in the draft.  An implementation may have a default priority and a default algorithm setting, so no configuration is mandatory.  If the operator desires a specific node to become area leader, then configuration may be required to adjust the priority.  FWIW, we have this already working in our implementation.  It Just Works.


> After the user/operator does some configurations on the (designated) leader, will the backup leader takes over the configurations after the designated leader is down?


There is no need for a backup leader.  If the area leader is partitioned from the topology, then leader election is repeated, resulting in a new leader.  Again, no configuration is required.
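
A minimal sketch of why re-election suffices. It assumes the rule is "highest priority wins, ties broken by highest system ID" and that system IDs are fixed-format strings that compare correctly as text; the draft is the authority on the actual rules, and the names below are hypothetical:

```python
from typing import Dict, Optional, Set


def elect_area_leader(priorities: Dict[str, int],
                      reachable: Set[str]) -> Optional[str]:
    """Pick the area leader among the reachable candidates.

    priorities: system ID -> advertised area leader priority.
    reachable: system IDs currently reachable in the topology.
    Returns the winning system ID, or None if no candidate is reachable.
    """
    eligible = [(prio, sys_id)
                for sys_id, prio in priorities.items()
                if sys_id in reachable]
    if not eligible:
        return None
    # max() on (priority, system ID) tuples: highest priority first,
    # then highest system ID as the assumed tie-breaker.
    return max(eligible)[1]


# Usage: when the current leader drops out of `reachable`, every node reruns
# the same function on the same data and arrives at the same new leader.
prios = {"0000.0000.0001": 10, "0000.0000.0002": 20, "0000.0000.0003": 20}
leader = elect_area_leader(prios, set(prios))
new_leader = elect_area_leader(prios, set(prios) - {leader})
```

Since the result depends only on advertised priorities and reachability, nothing has to be handed over to a standby node.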

Tony