Re: [Lsr] Flow Control Discussion for IS-IS Flooding Speed

bruno.decraene@orange.com Wed, 22 April 2020 18:03 UTC

From: bruno.decraene@orange.com
To: Tony Przygienda <tonysietf@gmail.com>
CC: "lsr@ietf.org" <lsr@ietf.org>, "Les Ginsberg (ginsberg)" <ginsberg=40cisco.com@dmarc.ietf.org>
Date: Wed, 22 Apr 2020 18:03:23 +0000
Archived-At: <https://mailarchive.ietf.org/arch/msg/lsr/nROy-gk6wkDpDXv_Rd6FR_VeWeA>
Subject: Re: [Lsr] Flow Control Discussion for IS-IS Flooding Speed

Tony, all,

Thanks Tony for the technical and constructive feedback.
Please see inline [Bruno]

From: Tony Przygienda [mailto:tonysietf@gmail.com]
Sent: Wednesday, April 22, 2020 1:19 AM
To: Les Ginsberg (ginsberg)
Cc: DECRAENE Bruno TGI/OLN; lsr@ietf.org
Subject: Re: [Lsr] Flow Control Discussion for IS-IS Flooding Speed

Neither am I aware of anything like this (i.e. per platform/product flooding rate constants) in any major vendor stack, for whatever that's worth. It's simply unmaintainable, period. All major vendors have extensive product lines over so many changing hardware configurations/setups that it is simply not viable to attempt precise measurements (and even then, a user changing config can throw the stuff off in a millisecond). There may have been, here and there, some "recommended config" per deployment scenario (not something I immediately recall either), but that means a very fixed configuration of things and of how they go into networks, and even then I'm not aware of anyone having had a "precise model of the chain in the box". Yes, probes to measure lots and lots of stuff in the boxes exist, but again, those are chip/linecard/backplane/chassis/routing engine specific and mostly used in complex test/performance scenarios, not to derive some kind of equations that can predict anything much ...
[Bruno] Good points.
Yet, one piece of information that we propose the LSP receiver advertise to the LSP sender is the Receive Window.

- This is a very common parameter and algorithm. Nothing fancy or reinvented. In particular, it's a parameter used by TCP.

- I would argue that TCP implementations also run on a variety of hardware and systems, a much wider range than IS-IS platforms. And those TCP implementations, on all those platforms, manage to advertise this parameter (the TCP window).

- I fail to understand why, when some WG contributors proposed the use of TCP, nobody said that determining and advertising a Receive Window would be an issue, difficult or even impossible. But when we propose to advertise a Receive Window in an IS-IS TLV, this becomes difficult or even impossible for some platforms. Can anyone help me understand the technical difference?
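To make the Receive Window idea concrete, here is a minimal sketch of a sender that never keeps more LSPs unacknowledged than the window advertised by the receiver. It is only an illustration under assumptions: the class and method names are hypothetical, the window is counted in LSPs, and acknowledgements are modelled as a list of LSP IDs carried in a PSNP. It is not taken from either draft.

```python
from collections import deque


class WindowedLspSender:
    """Hypothetical sender keeping at most `receive_window` LSPs unacknowledged."""

    def __init__(self, receive_window: int):
        self.receive_window = receive_window  # value advertised by the receiver (assumed, in LSPs)
        self.pending = deque()                # LSP IDs waiting to be sent
        self.unacked = set()                  # LSP IDs sent but not yet acknowledged

    def enqueue(self, lsp_id: str) -> None:
        self.pending.append(lsp_id)

    def transmit(self) -> list:
        """Send as many pending LSPs as the window allows; return the IDs sent."""
        sent = []
        while self.pending and len(self.unacked) < self.receive_window:
            lsp_id = self.pending.popleft()
            self.unacked.add(lsp_id)
            sent.append(lsp_id)
        return sent

    def on_psnp(self, acked_ids) -> None:
        """A PSNP acknowledges LSPs, which reopens the window."""
        self.unacked.difference_update(acked_ids)


sender = WindowedLspSender(receive_window=3)
for i in range(5):
    sender.enqueue(f"lsp-{i}")

first_burst = sender.transmit()      # limited to 3 LSPs by the window
sender.on_psnp(["lsp-0", "lsp-1"])   # receiver acknowledges two LSPs
second_burst = sender.transmit()     # two more LSPs fit in the window
```

In this sketch the sender needs no knowledge of the receiver's internals: the advertised window and the returning acknowledgements are its only inputs.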

Bruno, if you're so deeply interested in that stuff we can talk 1:1 off-line about implementation work on RIFT towards adaptable flooding rate
[Bruno] Sure. My pleasure. Please propose some time slots offline. Please note that I'm based in Europe (CEST), so a priori during your morning and my evening.
If you can also extend the offer to discuss the implementation work on your employer's IS-IS implementation with regard to adaptable flooding rate, and/or how network operators can configure the CLI parameters of the LSP senders so as to improve the flooding rate without overloading the Juniper receiver (possibly depending on the capability of the receiver, its number of IS-IS neighbors… and/or whatever parameters you feel are relevant), that would be most appreciated. And if you believe that the Juniper LSP receiver can handle any rate from any reasonable number (e.g. 50) of IGP neighbors without (significantly) dropping the received LSPs, that would be even simpler; please advise.

--Bruno
(The algorithm you see in the -02 draft Les put out is a _very rough_ approximation of that, BTW. I joined as co-author after we had some very fruitful discussions, and I consider the draft close to what can be _realistically_ done today in IS-IS. I don't consider further details generic enough to merit wide forum discussions.) And RIFT put a couple of things into packet formats that we can't put into IS-IS (I talked with Les about it) to improve the adaptability of the flooding rate, BTW, plus some interesting, coarse indication from the receiver. Again, this is not a constant that is calculated; it's all adaptive, driven almost completely from the transmitter side and the feedback it gathers.

all my very own 2c

-- tony

On Tue, Apr 21, 2020 at 2:48 PM Les Ginsberg (ginsberg) <ginsberg=40cisco.com@dmarc.ietf.org> wrote:
Bruno –

You have made an assumption that historically vendors have tuned LSP transmission rates to a platform specific value.
That certainly is not true in the case of my employer (happy to hear what other vendors have been doing for the past 20 years).

The default value is based on minimumBroadcastLSPTransmissionInterval specified in ISO10589.
A knob is available to allow tuning (faster or slower) in a given deployment – though in my experience this knob is rarely used.

We already discuss in https://tools.ietf.org/html/draft-ginsberg-lsr-isis-flooding-scale-02#section-2 that this common interpretation isn’t the most appropriate, but historically the concern has been to avoid flooding too fast – not to optimize flooding speed.
Obviously, we are revisiting that approach and saying it needs to change – which is something I think we have consensus on.

I have provided a description in my recent response as to why it is difficult to derive an optimal value on a per platform basis. You may disagree – but please do not claim that we actually have been doing this for years. We haven’t been.

  Les

From: bruno.decraene@orange.com <bruno.decraene@orange.com>
Sent: Monday, April 20, 2020 4:47 AM
To: Les Ginsberg (ginsberg) <ginsberg@cisco.com>
Cc: lsr@ietf.org
Subject: RE: Flow Control Discussion for IS-IS Flooding Speed

Les,

After nearly 2 months, can we expect an answer from your side?

More specifically, to the question below:

[Bruno] _Assuming_ that the parameters are static, the parameters proposed in draft-decraene-lsr-isis-flooding-speed are the same as the ones implemented (configured) on multiple implementations, including the one from your employer.
Now, do you believe that those parameters can be determined?

-  If yes, how do you do it _today_ in your implementation? (This seems to contradict your statement that you have no way to figure out the right value.)

-  If no, why did you implement those parameters and ask network operators to configure them?


Thank you,
--Bruno

From: DECRAENE Bruno TGI/OLN
Sent: Wednesday, February 26, 2020 8:03 PM
To: 'Les Ginsberg (ginsberg)'
Cc: lsr@ietf.org
Subject: RE: Flow Control Discussion for IS-IS Flooding Speed

Les,

Please see inline [Bruno]

From: Lsr [mailto:lsr-bounces@ietf.org] On Behalf Of Les Ginsberg (ginsberg)
Sent: Wednesday, February 19, 2020 3:32 AM
To: lsr@ietf.org
Subject: Re: [Lsr] Flow Control Discussion for IS-IS Flooding Speed

Base protocol operation of the Update process tracks the flooding of
LSPs/interface and guarantees timer-based retransmission on P2P interfaces
until an acknowledgment is received.

Using this base protocol mechanism in combination with exponential backoff of the
retransmission timer provides flow control in the event of temporary overload
of the receiver.

This mechanism works without protocol extensions, is dynamic, operates
independent of the reason for delayed acknowledgment (dropped packets, CPU
overload), and does not require additional signaling during the overloaded
period.
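As a rough illustration of the exponential backoff described above, the sketch below computes the waiting time before each successive retransmission of an unacknowledged LSP. The function name and the numeric values (5 s base, 40 s cap) are assumptions chosen for the example, not values specified in this thread.

```python
def retransmit_schedule(base_interval: float, max_interval: float, attempts: int) -> list:
    """Return the wait before each retransmission, doubling each time up to a cap."""
    intervals = []
    interval = base_interval
    for _ in range(attempts):
        intervals.append(interval)
        # Double the timer after every unanswered transmission, but never exceed the cap.
        interval = min(interval * 2, max_interval)
    return intervals


# Assumed example values: 5 s initial retransmission timer, capped at 40 s.
schedule = retransmit_schedule(base_interval=5.0, max_interval=40.0, attempts=5)
print(schedule)
```

The longer the receiver stays silent, the less retransmission traffic the sender adds, which is where the flow-control effect comes from.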

This is consistent with the recommendations in RFC 4222 (OSPF).

Receiver-based flow control (as proposed in https://datatracker.ietf.org/doc/draft-decraene-lsr-isis-flooding-speed/ )
requires protocol extensions and introduces additional signaling during
periods of high load. The asserted reason for this is to optimize throughput -
but there is no evidence that it will achieve this goal.

Mention has been made of TCP-like flow control mechanisms as a model - which
are indeed receiver based. However, there are significant differences between
TCP sessions and IGP flooding.

TCP consists of a single session between two endpoints. Resources
(primarily buffer space) for this session are typically allocated in the
control plane and current usage is easily measurable.

IGP flooding is point-to-multi-point, resources to support IGP flooding
involve both control plane queues and dataplane queues, both of which are
typically not per interface - nor even dedicated to a particular protocol
instance. What input is required to optimize receiver-based flow control is not fully specified.

https://datatracker.ietf.org/doc/draft-decraene-lsr-isis-flooding-speed/ suggests (Section 5) that the values
to be advertised:

"use a formula based on an off line tests of
   the overall LSPDU processing speed for a particular set of hardware
   and the number of interfaces configured for IS-IS"

implying that the advertised value is intentionally not dynamic. As such,
it could just as easily be configured on the transmit side and not require
additional signaling. As a static value, it would necessarily be somewhat
conservative as it has to account for the worst case under the current
configuration - which means it needs to consider concurrent use of the CPU
and dataplane by all protocols/features which are enabled on a router - not all of whose
use is likely to be synchronized with peak IS-IS flooding load.
[Bruno] _Assuming_ that the parameters are static, those parameters

o   are the same as the ones implemented (configured) on multiple implementations, including the one from your employer. Now, do you believe that those parameters can be determined?

    •  If yes, how do you do it _today_ in your implementation? (This seems to contradict your statement that you have no way to figure out the right value.)

    •  If no, why did you implement those parameters and ask network operators to configure them?

    •  There is also the option to reply: I don't know, but I don't care, as I leave the issue to the network operator.

o   can still provide some form of dynamicity, by using the PSNP as a dynamic acknowledgement.

o   are really dependent on the receiver, not the sender.

    •  The sender will never overload itself.

    •  The receiver has more information. It knows its processing power (low end, high end, 80s, 20s; currently we are stuck with a 20-year-old value that assumes the worst possible receiver, and bad receivers there were, including some with packet processing partly done in the control plane processor), its expected IS-IS load (number of neighbors), its preference for bursty LSP reception (high delay between IS-IS CPU allocation cycles, memory not an issue up to x kilo-octets…), its expected control plane load if IS-IS traffic does not have higher priority than other control plane traffic, and its expected level of QoS prioritization [1].

    •  In [1], look for “Extended SPD Headroom”. E.g. “Since IGP and link stability are more tenuous and more crucial than BGP stability, such packets are now given the highest priority and are given extended SPD headroom with a default of 10 packets. This means that these packets are not dropped if the size of the input hold queue is lower than 185 (input queue default size + spd headroom size + spd extended headroom).”

o   And this is for a distributed architecture, 15 years ago. So what about taking the above number (from the router configuration), applying Tony's proposal (* oversubscription / number of IS neighbors), and advertising this value to your LSP sender?
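The arithmetic suggested above can be sketched as follows. The queue depth of 185 packets comes from the text quoted from [1] and the 50 neighbors from the example earlier in this message; the oversubscription factor of 2.0 and the function name are purely illustrative assumptions, not values proposed by anyone in the thread.

```python
def advertised_receive_window(queue_depth: int, oversubscription: float, neighbors: int) -> int:
    """Per-neighbor window: queue depth * oversubscription / number of IS neighbors."""
    if neighbors < 1:
        raise ValueError("at least one IS-IS neighbor is required")
    # Never advertise less than one LSP, otherwise flooding would stall entirely.
    return max(1, int(queue_depth * oversubscription / neighbors))


# 185: input hold queue figure quoted from [1].
# 2.0: hypothetical oversubscription factor (assumption for the example).
# 50:  number of IS-IS neighbors used as an example in this thread.
window = advertised_receive_window(queue_depth=185, oversubscription=2.0, neighbors=50)
print(window)
```

The oversubscription factor reflects the expectation that not all neighbors flood at their maximum at the same moment; choosing it is a judgment call left to the receiver's implementation or operator.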



[1] https://www.cisco.com/c/en/us/support/docs/routers/12000-series-routers/29920-spd.html


--Bruno


Unless a good case can be made as to why transmit-based flow control is not a good
fit and why receiver-based flow control is demonstrably better, it seems
unnecessary to extend the protocol.

    Les


From: Lsr <lsr-bounces@ietf.org> On Behalf Of Les Ginsberg (ginsberg)
Sent: Tuesday, February 18, 2020 6:25 PM
To: lsr@ietf.org
Subject: [Lsr] Flow Control Discussion for IS-IS Flooding Speed

Two recent drafts advocate for the use of faster LSP flooding speeds in IS-IS:

https://datatracker.ietf.org/doc/draft-decraene-lsr-isis-flooding-speed/
https://datatracker.ietf.org/doc/draft-ginsberg-lsr-isis-flooding-scale/

There is strong agreement on two key points:

1) Modern networks require much faster flooding speeds than are commonly in use today

2) To deploy faster flooding speeds safely some form of flow control is needed

The key point of contention between the two drafts is how flow control should be implemented.

https://datatracker.ietf.org/doc/draft-decraene-lsr-isis-flooding-speed/ advocates receiver-based flow control, where the receiver advertises in hellos the parameters that indicate the rate/burst size it is capable of supporting on the interface. Senders are required to limit the rate of LSP transmission on that interface in accordance with the values advertised by the receiver.
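One common way for a sender to honor an advertised rate and burst size is a token bucket. The sketch below is an assumption-laden illustration of that general technique, not the mechanism specified in the draft: the class name, the units (LSPs per second, burst in LSPs) and the example values are all hypothetical.

```python
class TokenBucket:
    """Hypothetical per-interface limiter driven by receiver-advertised rate and burst."""

    def __init__(self, rate_per_s: float, burst: int):
        self.rate = rate_per_s       # advertised sustained rate, in LSPs per second (assumed unit)
        self.burst = burst           # advertised burst size, in LSPs (assumed unit)
        self.tokens = float(burst)   # start with a full bucket
        self.last = 0.0              # time of the previous call, in seconds

    def allow(self, now: float) -> bool:
        """Return True if one LSP may be sent at time `now`."""
        elapsed = now - self.last
        self.last = now
        # Refill according to elapsed time, never beyond the burst size.
        self.tokens = min(float(self.burst), self.tokens + elapsed * self.rate)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


bucket = TokenBucket(rate_per_s=100.0, burst=3)
burst_result = [bucket.allow(now=0.0) for _ in range(4)]  # fourth LSP exceeds the burst
later_result = bucket.allow(now=0.5)                      # bucket has refilled by then
```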

https://datatracker.ietf.org/doc/draft-ginsberg-lsr-isis-flooding-scale/ advocates transmit-based flow control, where the transmitter monitors the number of unacknowledged LSPs sent on each interface and implements a backoff algorithm that slows the rate of sending LSPs based on the length of the per-interface unacknowledged queue.
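A deliberately simplified sketch of the transmit-side idea follows: the sender looks only at how many LSPs are still unacknowledged on an interface and stretches or shrinks its inter-LSP send interval accordingly. This is not the algorithm from the draft; the threshold, the doubling/halving rule, the bounds and the function name are all assumptions made for illustration.

```python
def next_send_interval_ms(current_ms: int, unacked: int, threshold: int,
                          min_ms: int, max_ms: int) -> int:
    """Back off when too many LSPs are unacknowledged, speed up otherwise."""
    if unacked > threshold:
        # Receiver appears to be falling behind: double the gap between LSPs.
        return min(current_ms * 2, max_ms)
    # Acknowledgements are keeping up: halve the gap, down to the floor.
    return max(current_ms // 2, min_ms)


interval_ms = 16
history = []
# Hypothetical samples of the per-interface unacknowledged queue length over time.
for unacked in [5, 40, 80, 30, 5]:
    interval_ms = next_send_interval_ms(interval_ms, unacked,
                                        threshold=20, min_ms=1, max_ms=1000)
    history.append(interval_ms)
print(history)
```

Note that everything here is derived from state the transmitter already holds, which is why this approach needs no new signaling from the receiver.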

While other differences between the two drafts exist, it is fair to say that if agreement could be reached on the form of flow control, then it is likely other issues could be resolved easily.

This email starts the discussion regarding the flow control issue.




_________________________________________________________________________________________________________________________



Ce message et ses pieces jointes peuvent contenir des informations confidentielles ou privilegiees et ne doivent donc

pas etre diffuses, exploites ou copies sans autorisation. Si vous avez recu ce message par erreur, veuillez le signaler

a l'expediteur et le detruire ainsi que les pieces jointes. Les messages electroniques etant susceptibles d'alteration,

Orange decline toute responsabilite si ce message a ete altere, deforme ou falsifie. Merci.



This message and its attachments may contain confidential or privileged information that may be protected by law;

they should not be distributed, used or copied without authorisation.

If you have received this email in error, please notify the sender and delete this message and its attachments.

As emails may be altered, Orange is not liable for messages that have been modified, changed or falsified.

Thank you.
_______________________________________________
Lsr mailing list
Lsr@ietf.org
https://www.ietf.org/mailman/listinfo/lsr
