Re: [Lsr] Dynamic flow control for flooding

Henk Smit <henk.ietf@xs4all.nl> Wed, 24 July 2019 13:18 UTC

Hello Les,

Les Ginsberg (ginsberg) wrote on 2019-07-24 07:17:

> If you accept that, then it makes sense to look for the simplest way
> to do flow control and that is decidedly not from the RX side. (I
> expect Tony Li to disagree with that 😊 – but I have already
> outlined why it is more complex to do it from the Rx side.)

In your talk on Monday you called the idea in
draft-decraene-lsr-isis-flooding-speed-01 "receiver driven flow
control".
You don't like that. You want "transmit based flow control".
You argued that you can do "transmit based flow control" on the sender
only.
Therefore your algorithm is merely a "local trick".
And "local tricks" don't need RFCs. I agree with that.
But I don't agree that your algorithm is just a "local trick".


In your algorithm, a "sender" sends a number of LSPs to a receiver,
without waiting for acks (PSNPs). Like in any sliding window protocol.
The sending router keeps an eye on the number of unacked LSPs.
And it determines how fast it can send more LSPs based on the current
number of unacked LSPs. Every time the sender receives a PSNP, it
knows the receiver got a number of LSPs, so it can increase its
send-window again, and then send more LSPs.
Correct ?
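
A minimal sketch of the sender side as described above, in Python.
The class and method names are hypothetical, and the window is kept
fixed for simplicity, whereas the real algorithm would presumably adapt
it. This only illustrates the bookkeeping, not any actual
implementation.

```python
from collections import deque


class FloodingSender:
    """Sender-side state for one neighbor (hypothetical names)."""

    def __init__(self, window):
        self.window = window    # max unacked LSPs; assumed fixed in this sketch
        self.unacked = set()    # LSP IDs sent but not yet acked by a PSNP
        self.queue = deque()    # LSP IDs still waiting to be flooded

    def enqueue(self, lsp_id):
        self.queue.append(lsp_id)

    def send_ready(self):
        """Send queued LSPs until the window is full; return what was sent."""
        sent = []
        while self.queue and len(self.unacked) < self.window:
            lsp_id = self.queue.popleft()
            self.unacked.add(lsp_id)
            sent.append(lsp_id)
        return sent

    def on_psnp(self, acked_ids):
        """A PSNP arrived: free window space, then send more LSPs."""
        self.unacked.difference_update(acked_ids)
        return self.send_ready()


sender = FloodingSender(window=10)
for lsp_id in range(25):
    sender.enqueue(lsp_id)

first_burst = sender.send_ready()          # fills the window: LSPs 0..9
after_psnp = sender.on_psnp([0, 1, 2, 3])  # 4 acks make room for LSPs 10..13
```

Note that the sender only makes progress in on_psnp(): nothing more
is sent until the receiver acks something.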

I agree that the core idea of this algorithm makes sense.
After all, it looks a lot like TCP.
I believe the authors of draft-decraene-lsr-isis-flooding-speed were
planning something like that for the next version of their draft.


However, I do not agree with the name "tx driven flow control".
I also do not agree that this algorithm is "a local trick".
Therefore I do think this algorithm needs to be
documented (in an RFC).

In your "tx based flow control", the sender (tx) sends LSPs at a rate
that is derived from the rate at which it receives PSNPs. Therefore
it is the sender of the PSNPs that sets the speed of transmission !
So it is still the receiver (of LSPs) that controls the flow.
The name "tx based flow control" is a little misleading, imho.
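
To put a rough number on that dependency: for any window-limited
sender, the send rate is bounded by the window divided by the time it
takes for an ack to come back. The sketch below assumes a fixed window
and a constant ack delay; the function name and the example values
(a window of 10 LSPs, ack delays of 2 s and 0.1 s) are made up purely
for illustration.

```python
def max_lsp_rate(window, ack_delay_ms):
    """Upper bound on LSPs per second for a window-limited sender.

    window:       max number of unacked LSPs (assumed fixed)
    ack_delay_ms: time between sending an LSP and receiving the PSNP
                  that acks it (assumed constant)
    """
    return window * 1000 / ack_delay_ms


slow_acks = max_lsp_rate(window=10, ack_delay_ms=2000)  # receiver acks after 2 s
fast_acks = max_lsp_rate(window=10, ack_delay_ms=100)   # receiver acks after 0.1 s
```

With the same window on the sender, the achievable rate differs by a
factor of 20, and the only thing that changed is the receiver.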


It is important to realize that the success of your algorithm actually
depends on the behaviour of the receiver. How does it send PSNPs ?
Does it send one PSNP per received LSP ? Or does it pack multiple acks
in one PSNP ? Does it send a PSNP immediately, or does it wait a short
time ? Does it try to fill a PSNP to the max (putting ~90 acks in one
PSNP) ? Does the receiver do something in between ? I don't think
the behaviour is specified exactly anywhere.

I know about an IS-IS implementation from the nineties. When a router
received an LSP, it would a) set the SSN bit (for that LSP/interface),
and b) start the psnp-timer for that interface (if not already running).
The psnp-timer would expire 2 seconds later. The router would then walk
the LSPDB, find all LSPs with the SSN bit set for that interface, and
build a PSNP with acks for all those LSPs. The result would be that:
a) the first PSNP would be sent 2 seconds (+/- jitter) after receiving
the first LSP, and b) the PSNP would include ~66 acks (as a router
receiving at full speed would have received 66 LSPs in 2 seconds).
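
The timer behaviour described above can be sketched roughly as follows.
This is a simplification under stated assumptions: the names are
hypothetical, jitter is ignored, a per-interface set stands in for
walking the LSPDB, and the arrival rate of one LSP every 30 ms is
chosen only to get 66 LSPs inside the 2 second interval.

```python
PSNP_INTERVAL_MS = 2000  # psnp-timer from the description above (jitter ignored)


class ReceiverInterface:
    """Per-interface ack state on the receiving router (hypothetical names)."""

    def __init__(self):
        self.ssn = set()             # LSP IDs with the SSN bit set
        self.timer_expiry_ms = None  # None means the psnp-timer is not running

    def on_lsp(self, lsp_id, now_ms):
        """Set the SSN bit and start the psnp-timer if it is not running."""
        self.ssn.add(lsp_id)
        if self.timer_expiry_ms is None:
            self.timer_expiry_ms = now_ms + PSNP_INTERVAL_MS

    def on_tick(self, now_ms):
        """Return the acks for one PSNP if the timer expired, else None."""
        if self.timer_expiry_ms is None or now_ms < self.timer_expiry_ms:
            return None
        acks = sorted(self.ssn)
        self.ssn.clear()
        self.timer_expiry_ms = None
        return acks


iface = ReceiverInterface()
for lsp_id in range(66):
    iface.on_lsp(lsp_id, now_ms=lsp_id * 30)  # one LSP every 30 ms (assumed)

early = iface.on_tick(now_ms=1999)  # timer still running: no PSNP yet
psnp = iface.on_tick(now_ms=2000)   # timer expired: one PSNP with all acks
```

A sender that waits for this PSNP before opening its window gets no
feedback at all during the first 2 seconds.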

For your "tx based flow control" algorithm to work properly, this has
to change. The receiving router must send PSNPs more quickly and more
aggressively. The result would be that there will be fewer acks in each
PSNP. And thus more PSNPs will be sent.

This makes us realize: in the current situation, if a router receives
1000 LSPs, and sends those LSPs to 64 neighbors, it would receive:
- the 1000 LSPs from an upstream neighbor, plus
- 1000/66 = ~16 PSNPs from each downstream neighbor, so 64 * 16 = 1024
  PSNPs.
This makes a total of ~2000 PDUs received.
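
The same count as a small Python function, so the packing factor can be
varied. The function and parameter names are mine; the model assumes
every downstream neighbor acks every LSP and always packs exactly
acks_per_psnp acks into a PSNP (rounding up for the last one).

```python
import math


def pdus_received(lsps, neighbors, acks_per_psnp):
    """Total PDUs received by a router that floods `lsps` LSPs.

    Counts the LSPs from one upstream neighbor plus the PSNPs that come
    back from each of `neighbors` downstream neighbors.
    """
    psnps_per_neighbor = math.ceil(lsps / acks_per_psnp)
    return lsps + neighbors * psnps_per_neighbor


print(pdus_received(1000, 64, 66))  # 1000 + 64 * 16 = 2024
```

Calling it with a smaller acks_per_psnp shows how the total grows when
the receivers ack more aggressively.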

If routers would send one PSNP per LSP (to have faster flow control),
then the router in this example would receive:
- the 1000 LSPs from an upstream neighbor, plus
- 1000 PSNPs from each downstream neighbor * 64 = 64000 PSNPs.
This makes a total of ~65000 PDUs received.

The total number of PDUs received on this router would go from 2K PDUs
to 65K PDUs.

Remember that the problem we're trying to solve here is to make sure
that routers do not get overrun on the receive side with too many
packets too quickly. It seems an aggressive PSNP-scheme, to achieve
faster flow-control, is actually very counter-productive.

Of course the algorithm can be tweaked. E.g. TCP sends one ack for
every 2 received segments (if I'm not mistaken). If we do that here,
the number of PDUs would go down from 65K to 33K PDUs. What do you
propose ? How do you want the feedback of PSNPs to be quick, while
maintaining an efficient packing of multiple acks per PSNP ?


In any case, the points I'm trying to make here:
*) Your algorithm is not sender-driven, but still receiver-driven.
*) Your algorithm changes/dictates behaviour on both sender and
   receiver.
*) Interaction between a sender and a receiver is what we call a
   protocol.
    If you want to make this work, especially in multi-vendor
    environments, we need to document these algorithms, i.e. in an RFC.

Kind regards,

henk.