Re: [Lsr] Dynamic flow control for flooding

<stephane.litkowski@orange.com> Wed, 24 July 2019 04:49 UTC

From: stephane.litkowski@orange.com
To: "Les Ginsberg (ginsberg)" <ginsberg@cisco.com>, Tony Li <tony.li@tony.li>, "lsr@ietf.org" <lsr@ietf.org>
Date: Wed, 24 Jul 2019 04:49:53 +0000
Archived-At: <https://mailarchive.ietf.org/arch/msg/lsr/M412WjAwNshffAf5WTmCpsQfork>
Subject: Re: [Lsr] Dynamic flow control for flooding

Hi Les,

I agree that flooding is a global thing and not a local mechanism if we consider that the ultimate goal is to get the LSDB in-sync as fast as we can on all the nodes.

I just want to highlight three things:

-          Link delay (due to the physical length of the transmission link) already affects flooding speed, especially when we need to cross links that have 100 msec of round-trip delay, so flooding speed is already not equal on every link.

-          I put this one in parentheses as it may be controversial ☺ (To converge a path after a topology change, we do not always need all the nodes to have their LSDBs in sync (I mean from a forwarding point of view). This is a tricky topic because it depends heavily on the network topology: on the one hand, flooding only one or two hops away may be enough to converge the path, while on the other hand, it may create microloops in a different network design.)

-          I’m really wondering how much difference there is between the different routers we have in a single area today. Even if some legacy routers are still deployed, they are more powerful than the routers that existed when the ISO spec was written. Are we expecting a difference of hundreds of msec, or of tens, between the latest generation of routers and the legacy ones? In addition, in our case, we try to build a consistent design, which means that we avoid having legacy routers in transit between latest-generation routers; we push the legacy ones to the edge or try to remove them. There may be transient situations where this happens, but that is not a design goal. This is to say that I would not mind having a very fast flooding value on my core and latest-generation edges while keeping a more conservative value for legacy edges. And I do not expect much difference between the two (at least not much more than the link delay that may already exist and affect flooding).

Another point is that I would be really glad to see how much flooding time affects convergence time in real networks, taking into account that the FIB rewrite is usually the biggest contributor (unfortunately, we do not really have instrumentation today to measure flooding). I’m not saying that there is nothing to do; of course the default flooding time we have had for years could be improved, and I fully agree with that. I’m just always interested in seeing some measurement of the potential gain.

Flow control is required in any case: we can always find a situation where the IS-IS process does not get enough CPU time because the CPU is busy with other tasks, so IS-IS cannot process its input PDUs (as an example).


Brgds,

From: Lsr [mailto:lsr-bounces@ietf.org] On Behalf Of Les Ginsberg (ginsberg)
Sent: Tuesday, July 23, 2019 16:30
To: Tony Li; lsr@ietf.org
Subject: Re: [Lsr] Dynamic flow control for flooding

Tony –

Thanx for picking up the discussion.
Thanx also for doing the math to show that bandwidth is not a concern. I think most/all of us knew that – but it is good to put that small question behind us.

I also think we all agree on the goal - which is to flood significantly faster than many implementations do today to handle deployments like the case you mention below.

Beyond this point, I have a different perspective.

As network-wide convergence depends upon fast propagation of LSP changes – which in turn requires consistent flooding rates on all interfaces enabled for flooding – a properly provisioned network MUST be able to sustain a consistent flooding rate or the operation of the network will suffer. We therefore need to view flow control issues as indicative of a problem.

It is a mistake to equate LSP flooding with a set of independent P2P “connections” – each of which can operate at a rate independent of the other.

If we can agree on this, then I believe we will have placed the flow control problem in its proper perspective – in which case it will become easier to agree on the best way to implement flow control.

   Les



From: Lsr <lsr-bounces@ietf.org> On Behalf Of Tony Li
Sent: Tuesday, July 23, 2019 6:34 AM
To: lsr@ietf.org
Subject: [Lsr] Dynamic flow control for flooding


Hi all,

I’d like to continue the discussion that we left off with last night.

The use case that I posited was a situation where we had 1000 LSPs to flood. This is an interesting case that can happen if there was a large network that partitioned and has now healed.  All LSPs from the other side of the partition are going to need to be updated.

Let’s further suppose that the LSPs have an average size of 1KB.  Thus, the entire transfer is around 1MB.

Suppose that we’re doing this on a 400Gb/s link. If we were to transmit the whole batch of LSPs at once, it takes a whopping 20us.  Not milliseconds, microseconds.  2x10^-5s.  Clearly, we are not going to be rate limited by bandwidth.

Note that 20us is an unreasonable lower bound: we cannot reasonably expect a node to absorb 1k PDUs back to back without loss today, in addition to all of its other responsibilities.

At the opposite end of the spectrum, suppose we transmit one PDU every 33ms.  That’s then going to take us 33 seconds to complete. Unreasonably slow.
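
The two bounds above can be checked with a few lines of arithmetic. This is only a sketch: it assumes "1KB" means 1000 bytes and ignores framing overhead, and the function and constant names are illustrative, not taken from any implementation.

```python
# Worked arithmetic for the two bounds discussed above.
# Assumptions: 1 KB = 1000 bytes, no framing or protocol overhead.
LSP_COUNT = 1000            # LSPs to flood after the partition heals
LSP_SIZE_BYTES = 1000       # average LSP size
LINK_RATE_BPS = 400e9       # 400 Gb/s link
PACING_INTERVAL_S = 0.033   # one PDU every 33 ms


def serialization_time(count, size_bytes, rate_bps):
    """Seconds needed to put `count` PDUs of `size_bytes` on the wire."""
    return count * size_bytes * 8 / rate_bps


def paced_time(count, interval_s):
    """Seconds needed when exactly one PDU is sent per interval."""
    return count * interval_s


burst_s = serialization_time(LSP_COUNT, LSP_SIZE_BYTES, LINK_RATE_BPS)
paced_s = paced_time(LSP_COUNT, PACING_INTERVAL_S)

print(f"back-to-back on the link: {burst_s * 1e6:.0f} us")
print(f"one PDU per interval:     {paced_s:.0f} s")
```

The gap between the two results is about six orders of magnitude, which is why the link itself is not the limiting factor.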

How can we then maximize our goodput?  We know that the receiver has a set of buffers and a processing rate that it can support. The processing rate will vary, depending on other loads.

What we would like the transmitter to do is to transmit enough to create a small processing queue on the receiver and then transmit at the receiver’s processing rate.
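
One way to picture this goal is a toy simulation. The sketch below is an assumption-laden illustration, not a proposed mechanism: it models the transmitter as keeping at most `window` unacknowledged PDUs outstanding, the receiver as a bounded queue drained at a fixed rate per tick, and it ignores link delay and acknowledgment delay entirely. All names and numbers are hypothetical.

```python
def simulate(total_pdus, window, rx_rate, rx_buffer):
    """Toy tick-based model of a windowed sender and a bounded receiver.

    Returns (ticks, drops, peak_queue). Link delay is ignored, so every
    unacknowledged PDU is assumed to sit in the receiver queue.
    """
    acked = 0        # PDUs processed and acknowledged by the receiver
    queue = 0        # PDUs waiting in the receiver queue
    drops = 0        # PDUs discarded because the queue was full
    peak_queue = 0
    ticks = 0
    while acked < total_pdus:
        ticks += 1
        # Sender: top up to `window` outstanding PDUs, if any remain.
        burst = max(min(window - queue, total_pdus - acked - queue), 0)
        # Receiver: accept what fits; the rest is lost and resent later.
        accepted = min(burst, rx_buffer - queue)
        drops += burst - accepted
        queue += accepted
        peak_queue = max(peak_queue, queue)
        # Receiver: process up to rx_rate PDUs this tick and acknowledge them.
        done = min(queue, rx_rate)
        queue -= done
        acked += done
    return ticks, drops, peak_queue


# Hypothetical receiver: processes 10 PDUs per tick, buffers at most 50.
for window in (1, 20, 1000):
    ticks, drops, peak = simulate(1000, window, rx_rate=10, rx_buffer=50)
    print(f"window={window:4d} ticks={ticks:4d} drops={drops:6d} peak_queue={peak}")
```

In this model a window of 1 leaves the receiver idle most of the time, a window of 1000 finishes no sooner than a window of 20 but loses PDUs along the way, and the small window keeps a short queue and completes at the receiver's processing rate.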

Can we agree on this goal?

Tony

