Re: [Lsr] Dynamic flow control for flooding

Tony Przygienda <tonysietf@gmail.com> Tue, 23 July 2019 21:32 UTC

References: <CAMj-N0LdaNBapVNisWs6cbH6RsHiXd-EMg6vRvO_U+UQsYVvXw@mail.gmail.com> <BYAPR11MB36382C89363202D1B5659614C1C70@BYAPR11MB3638.namprd11.prod.outlook.com> <CA+wi2hM5QqjfyakGYPVwmb4amXKRSy5-_YuQEY5V1CePzv77cw@mail.gmail.com> <BYAPR11MB363844E1525B8B466AD35443C1C70@BYAPR11MB3638.namprd11.prod.outlook.com>
In-Reply-To: <BYAPR11MB363844E1525B8B466AD35443C1C70@BYAPR11MB3638.namprd11.prod.outlook.com>
From: Tony Przygienda <tonysietf@gmail.com>
Date: Tue, 23 Jul 2019 17:32:19 -0400
Message-ID: <CA+wi2hOVkCncvzd-SDpbM4p_jETrz9pKAO8bn9sJVbO6ZWT-vA@mail.gmail.com>
To: "Les Ginsberg (ginsberg)" <ginsberg@cisco.com>
Cc: Tony Li <tony.li@tony.li>, "lsr@ietf.org" <lsr@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/lsr/XfIe8dHWV7hHta7JXP2iTEA1FyU>
Subject: Re: [Lsr] Dynamic flow control for flooding

On Tue, Jul 23, 2019 at 5:24 PM Les Ginsberg (ginsberg) <ginsberg@cisco.com>
wrote:

> Tony –
>
>
>
> As usual, you cover a lot of territory – and even after a couple of
> readings I am not sure I got everything.
>

I was accused of being too flowery in my prose for many years, so I
adopted an acerbic, terse style ;-)

>
> *From:* Tony Przygienda <tonysietf@gmail.com>
> *Sent:* Tuesday, July 23, 2019 1:56 PM
> *To:* Les Ginsberg (ginsberg) <ginsberg@cisco.com>
> *Cc:* Tony Li <tony.li@tony.li>; lsr@ietf.org
> *Subject:* Re: [Lsr] Dynamic flow control for flooding
>
> It is a mistake to equate LSP flooding with a set of independent P2P
> “connections” – each of which can operate at a rate independent of the
> other.
>
> At least my experience much disagrees with that and such a proposal seems
> to steer towards slowest receiver in the whole network problem so I wait
> for others to chime in.
>
> *[Les:] This is NOT AT ALL where I am going.*
>
> *If I have a “large network” and I have a node which consistently cannot
> support the flooding rates necessary to deal with Tony Li’s example (node w
> many neighbors fails) then the network has a problem.*
>
> *Slowing down everyone to meet the flooding speed of the slowest speed is
> not something I would expect a customer to accept. The network will not be
> able to deliver the convergence expected. The node in question needs to be
> identified and steps taken to either fix it or upgrade or replace it or…*
>
>
>
> *The point I am also making is trying to run the network with some links
> flooding fast and some links flooding slow isn’t a solution either.*
>

hmm, then I don't know what you propose in the normal case, except saying
that nothing seems to skin the cat properly when your network is lopsided
enough. On which we agree, I guess ...


>
>
> Then, to clarify on Tony's mail, the "problem" I mentioned anecdotally
> yesterday as behavior I saw on things I did in their time was of course
> when processors were still well under 1GHz and links in Gigs and not 10s
> and 100s of Gigs we have today but yes, the limiting factor was the
> flooding rate (or rather effective processing rate of receiver AFAIR before
> it started to drop the RX queues or was late enough to cause RE-TX on senders)
> in terms of losses/retransmissions necessary that were causing transients
> to the point it looked to me then the cure seemed worse than the disease
> (while the disease was likely a flu then compared to today given we didn't
> have massively dense meshes we steer towards today). The base spec &
> mandated flooding numbers didn't change but what is possible in terms of
> rates when breaking the spec did change of course in terms of CPU/links
> speed albeit most ISIS implementations go back to megahertz processors
> still ;-) And the dinner was great BTW ;-)
>
>
>
> So yes, I do think that anything that will flood @ reasonable rate without
> excessive losses will work well on well-computed
> double-flood-reduced-graph, the question is how to get the "reasonable" in
> place both in terms of numbers as well as mechanism for which we saw tons
> of lively discussions/proposals yesterday, most obvious being of course going
> and manually bumping e'one's implementation to the desired (? ;-) value
> ...  Other consideration is having computation always trying to get more
> than 2 links in minimal cut on the graph of course which should alleviate
> any bottleneck or rather, make the cut less likely. Given quality of
> max-disjoint-node/link graph computation algorithms that should be doable
> by gut feeling. If e.g. the flood rate per link is available the algorithms
> should be doing even better in centralized case.
>
>
>
> *[Les:] Convergence issues and flooding overload as a result of excessive
> redundant flooding is a real issue – but it is a different problem (for
> which we have solutions) and we should not introduce that issue into this
> discussion.*
>

hmm, we are trying to build flood reduction to deal with exactly this
problem, I thought, and we are trying to find a good solution in the design
space between a Hamiltonian path and not reducing any links @ all. On
one hand, the specter of long flooding chains & partitions on single link
failures looms while beckoning with very low CPU load; on the other hand,
we can do nothing @ all while staring down the abyss of excessively large,
densely meshed networks and falling off the cliff of melted flooding ...
So, I'm not sure I introduced anything new, but if I did, ignore my attempt
@ clarification of what I said yesterday ...
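
To put rough numbers on the two ends of that design space, here is a
minimal sketch, not anything from a real implementation. It assumes an
idealized 8-node network and compares a Hamiltonian path with a full mesh
on three counts: links that carry flooding, worst-case flooding hops from
one source, and how many single link failures partition the flooding
topology. All function names are hypothetical.

```python
from collections import deque


def path_topology(n):
    """Hamiltonian path: node i floods only to node i + 1."""
    return {(i, i + 1) for i in range(n - 1)}


def full_mesh(n):
    """No flood reduction: every pair of nodes floods directly."""
    return {(i, j) for i in range(n) for j in range(i + 1, n)}


def flood_hops(n, links, source=0):
    """Breadth-first search: hops an LSP needs to reach each reachable node."""
    adj = {v: [] for v in range(n)}
    for a, b in links:
        adj[a].append(b)
        adj[b].append(a)
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist


def summarize(n, links):
    """Link count, longest flooding chain, and links whose loss partitions."""
    dist = flood_hops(n, links)
    # A link is "partitioning" if removing it leaves some node unreachable.
    fragile = sum(1 for l in links if len(flood_hops(n, links - {l})) < n)
    return {
        "links": len(links),
        "max_hops": max(dist.values()),
        "partitioning_links": fragile,
    }


print(summarize(8, path_topology(8)))
# {'links': 7, 'max_hops': 7, 'partitioning_links': 7}
print(summarize(8, full_mesh(8)))
# {'links': 28, 'max_hops': 1, 'partitioning_links': 0}
```

The path has the fewest links to process but the longest chain, and every
link is a single point of partition; the mesh is the reverse. A reduced
flooding graph would presumably aim somewhere in between.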

--- tony

>