Re: [Lsr] Open issues with Dynamic Flooding

Tony Przygienda <tonysietf@gmail.com> Tue, 05 March 2019 16:42 UTC

References: <AAD29CF0-F0CA-4C3C-B73A-78CD2573C446@tony.li> <c1adac3a-cd4b-130e-d225-a5f40bf0ef55@cisco.com> <F3C4B9B2-F101-4E28-8928-9208D5EBAF99@tony.li> <be28dbcf-8382-329a-229f-5b146538fabe@cisco.com>
In-Reply-To: <be28dbcf-8382-329a-229f-5b146538fabe@cisco.com>
From: Tony Przygienda <tonysietf@gmail.com>
Date: Tue, 05 Mar 2019 08:41:59 -0800
Message-ID: <CA+wi2hPt-UrekyA9LpCWJHo9KyaOR1=eVQD29y54sciv3zh10A@mail.gmail.com>
To: Peter Psenak <ppsenak@cisco.com>
Cc: Tony Li <tony.li@tony.li>, lsr@ietf.org
Archived-At: <https://mailarchive.ietf.org/arch/msg/lsr/kX58GCzIfJciY9UtfXziIEL7gYU>
Subject: Re: [Lsr] Open issues with Dynamic Flooding

In practical terms, +1 to Peter's take here. Unless we're talking about
lots of simultaneous failures (which, from talking to folks, are not that
common but can sometimes happen in DCs due to weird things), smaller-scale
failures involving a few links would potentially cause a diffused
"chaining" of convergence behavior rather than IGP-style fast healing. On
top of that, I didn't see much interest in formalizing a rigorous
distributed algorithm, which IMO would be necessary to ensure ultimate
convergence when only one link or a subset of links is used. Slow
convergence is obviously not a good thing unless we assume people will run
FRR, with its complexity, in the DC and/or that no more than one link ever
fails, which seems to me to be bending assumptions to fit whatever
solution is available/preferred.

To Tony's point though: on large-scale failures, enabling all links would
cause heavy flooding load, yes, but in a sense that is the "initial
bootup" case anyway (especially in the centralized case), since nodes need
the full topology to make informed, correct decisions about what the FT
should be if they don't rely on whatever the centralized instance thinks
(which they won't be able to do, given that the FT from the centralized
instance will include lots of links that are "gone" due to the failure).

As to p2p, I suggest agreeing on whether you target the dense mesh (DC)
case, the sparse mesh (WAN) case, or "every topology imaginable", since
that drives lots of design trade-offs.
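
To make option (a) from the quoted thread concrete, here is a rough Python
sketch of what a single node could do locally: find which neighbors it can
no longer reach over the surviving flooding topology (FT) and temporarily
flood on every live local link towards them. This is only an illustration
under my own assumptions, not anything taken from the draft: the function
names, the representation of links as node pairs, and the small two-spine,
three-leaf example are all hypothetical.

```python
from collections import deque


def norm(links):
    """Normalize (a, b) pairs into undirected links (frozensets)."""
    return {frozenset(link) for link in links}


def reachable(start, links):
    """Return the set of nodes reachable from `start` over undirected links."""
    adj = {}
    for a, b in links:  # assumes no self-loops
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for peer in adj.get(node, ()):
            if peer not in seen:
                seen.add(peer)
                queue.append(peer)
    return seen


def temporary_flooding_links(node, live_links, ft_links):
    """Hypothetical option (a): all live local links whose far end is no
    longer reachable from `node` over the surviving part of the FT."""
    live = norm(live_links)
    ft_alive = norm(ft_links) & live      # FT links that are still up
    inside = reachable(node, ft_alive)    # our side of the partition
    extra = set()
    for link in live:
        if node in link:
            (peer,) = link - {node}
            if peer not in inside:
                extra.add(link)
    return extra


# Hypothetical example: spines S1, S2 and leaves L1..L3, fully meshed.
all_links = [(s, l) for s in ("S1", "S2") for l in ("L1", "L2", "L3")]
ft = [("S1", "L1"), ("S1", "L2"), ("S1", "L3"), ("S2", "L1")]
# Link S1-L3 fails, cutting L3 off from the flooding topology.
live = [link for link in all_links if set(link) != {"S1", "L3"}]

for n in ("S1", "S2", "L3"):
    print(n, sorted(tuple(sorted(l)) for l in temporary_flooding_links(n, live, ft)))
```

In this toy case only the two ends of the S2-L3 link enable temporary
flooding, which matches Peter's point that the extra load is bounded by
the adjacencies of a single node. It says nothing about the
many-simultaneous-failures case discussed above.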

my 2.71828182 cents ;-)

--- tony

On Tue, Mar 5, 2019 at 8:27 AM Peter Psenak <ppsenak@cisco.com> wrote:

> Hi Tony,
>
> On 05/03/2019 17:16 , tony.li@tony.li wrote:
> >
> > Peter,
> >
> >>>    (a) Temporarily add all of the links that would appear to remedy
> >>> the partition. This has the advantage that it is very likely to heal the
> >>> partition and will do so in the minimal amount of convergence time.
> >>
> >> I prefer (a) because of the faster convergence.
> >> Adding all links on a single node to the flooding topology is not going
> >> to cause issues to flooding IMHO.
> >
> >
> > Could you (or John) please explain your rationale behind that? It seems
> > counter-intuitive.
>
> it's limited to the links on a single node. For all practical
> purposes I don't expect a single node to have thousands of adjacencies,
> at least not in the DC topologies for which dynamic flooding is
> primarily being invented.
>
> In environments with a large number of adjacencies (e.g.
> hub-and-spoke) it is likely that we would have to make all these links
> part of the flooding topology anyway, because the spoke is typically
> dual attached to two hubs only. And the incremental adjacency bringup is
> something that an implementation may already support.
>
> >
> >
> >
> >> given that the flooding on the LAN in both OSPF and ISIS is done as
> >> multicast, there is currently no way to enable flooding, either permanent
> >> or temporary, towards a subset of the neighbors on the LAN. So if the
> >> flooding is enabled on a LAN it is done towards all routers connected to
> >> it.
> >
> >
> > Agreed.
> >
> >
> >> Given that all links between routers are p2p these days, I would vote
> >> for simplicity and make the LAN always part of the FT.
> >
> >
> > I’m not on board with this yet.  Our simulations suggest that this is
> > not necessarily optimal.  There are lots of topologies (e.g., parallel
> > LANs) where this blanket approach is suboptimal.
>
> the question is how much are true LANs used as transit links in today's
> networks.
>
> thanks,
> Peter
>
> >
> > Tony
> >
> > .
> >