Re: [Lsr] Open issues with Dynamic Flooding

Robert Raszuk <robert@raszuk.net> Tue, 05 March 2019 19:12 UTC

References: <AAD29CF0-F0CA-4C3C-B73A-78CD2573C446@tony.li> <c1adac3a-cd4b-130e-d225-a5f40bf0ef55@cisco.com> <F3C4B9B2-F101-4E28-8928-9208D5EBAF99@tony.li> <be28dbcf-8382-329a-229f-5b146538fabe@cisco.com> <CA+wi2hPt-UrekyA9LpCWJHo9KyaOR1=eVQD29y54sciv3zh10A@mail.gmail.com>
In-Reply-To: <CA+wi2hPt-UrekyA9LpCWJHo9KyaOR1=eVQD29y54sciv3zh10A@mail.gmail.com>
From: Robert Raszuk <robert@raszuk.net>
Date: Tue, 05 Mar 2019 20:12:41 +0100
Message-ID: <CAOj+MMGPp=DffEw7vS4PH_vDtmYL5y2Xxgx2utNt4R6cxsCiwg@mail.gmail.com>
To: Tony Przygienda <tonysietf@gmail.com>
Cc: Peter Psenak <ppsenak@cisco.com>, lsr@ietf.org, Tony Li <tony.li@tony.li>
Content-Type: multipart/alternative; boundary="000000000000e2081d05835da883"
Archived-At: <https://mailarchive.ietf.org/arch/msg/lsr/lRmfIUE4Km_WaR0zMfLITNIYnac>
Subject: Re: [Lsr] Open issues with Dynamic Flooding
List-Id: Link State Routing Working Group <lsr.ietf.org>

> Slow convergence is obviously not a good thing

Could you please kindly elaborate why?

With tons of ECMP in DCs, or with a number of mechanisms for very fast data
plane repair in the WAN (well beyond FRR), IMHO protocol *fast convergence*
is no longer a necessity. Yet many folks still talk about it as if it were
the only possible rescue ...


On Tue, Mar 5, 2019 at 5:42 PM Tony Przygienda <tonysietf@gmail.com> wrote:

> in practical terms +1 to Peter's take here ... Unless we're talking tons
> of failures simultaneously (which, from the folks I've talked to, are not that common
> but can sometimes happen in DCs BTW due to weird things) smaller scale
> failures with few links would cause potentially diffused "chaining" of
> convergence behavior rather than IGP-style fast healing (and on top of that
> I didn't see a lot of interest in formalizing a rigorous distributed
> algorithm which IMO would be necessary to ensure ultimate convergence when
> only one/subset of links is used). Slow convergence is obviously not a good
> thing unless we assume people will run FRR with its complexity in DC and/or
> no more than one link ever fails, which seems to me bending assumptions to
> whatever solution is available/preferred. To Tony's point though, on large
> scale failures enabling all links would cause heavy flood load, yes, but in
> a sense it's the "initial bootup" case anyway (especially in centralized
> case) since nodes need all topology to make informed correct decisions
> about what the FT should be if they don't rely on whatever the centralized
> instance thinks (which they won't be able to do given the FT from
> centralized instance will indicate lots of links that are "gone" due to
> failure). As to p2p, I suggest agreeing on whether you target the dense mesh (DC)
> case, the sparse mesh (WAN) case, or "every topology imaginable", since that
> drives lots of design trade-offs.
>
> my 2.71828182 cents ;-)
>
> --- tony
>
> On Tue, Mar 5, 2019 at 8:27 AM Peter Psenak <ppsenak@cisco.com> wrote:
>
>> Hi Tony,
>>
>> On 05/03/2019 17:16 , tony.li@tony.li wrote:
>> >
>> > Peter,
>> >
>> >>>    (a) Temporarily add all of the links that would appear to remedy
>> the partition. This has the advantage that it is very likely to heal the
>> partition and will do so in the minimal amount of convergence time.
>> >>
>> >> I prefer (a) because of the faster convergence.
>> >> Adding all links on a single node to the flooding topology is not
>> going to cause issues to flooding IMHO.
>> >
>> >
>> > Could you (or John) please explain your rationale behind that? It seems
>> counter-intuitive.
>>
>> it's limited to the links on a single node. For all practical
>> purposes I don't expect a single node to have thousands of adjacencies, at
>> least not in the DC topologies for which dynamic flooding is
>> primarily being invented.
>>
>> In environments with a large number of adjacencies (e.g.
>> hub-and-spoke) it is likely that we would have to make all these links
>> part of the flooding topology anyway, because the spoke is typically
>> dual attached to two hubs only. And the incremental adjacency bringup is
>> something that an implementation may already support.
>>
>> >
>> >
>> >
>> >> given that the flooding on the LAN in both OSPF and ISIS is done as
>> multicast, there is currently no way to enable flooding, either permanent
>> or temporary, towards a subset of the neighbors on the LAN. So if the
>> flooding is enabled on a LAN it is done towards all routers connected to
>> it.
>> >
>> >
>> > Agreed.
>> >
>> >
>> >> Given that all links between routers are p2p these days, I would vote
>> for simplicity and make the LAN always part of the FT.
>> >
>> >
>> > I’m not on board with this yet.  Our simulations suggest that this is
>> not necessarily optimal.  There are lots of topologies (e.g., parallel
>> LANs) where this blanket approach is suboptimal.
>>
>> the question is how much true LANs are used as transit links in today's
>> networks.
>>
>> thanks,
>> Peter
>>
>> >
>> > Tony
>> >
>> > .
>> >
>>
>> _______________________________________________
>> Lsr mailing list
>> Lsr@ietf.org
>> https://www.ietf.org/mailman/listinfo/lsr
>>