Re: [Lsr] Flow Control Discussion for IS-IS Flooding Speed

Tony Przygienda <tonysietf@gmail.com> Thu, 20 February 2020 00:21 UTC

References: <5b430357-56ad-2901-f5a8-c0678a507293@cisco.com> <4FC90EB2-D355-4DC5-8365-E5FBE037954E@gmail.com> <f5b56713-2a4d-1bf7-8362-df4323675c61@cisco.com> <MW3PR11MB4619C54F5C6160491847AA45C1100@MW3PR11MB4619.namprd11.prod.outlook.com> <CA+wi2hMH1PjiaGxdE5Nhb2tjsZtCL7+vjxwE+dk9PWN1fyz7vQ@mail.gmail.com> <MW3PR11MB46194A956A31261459526B43C1100@MW3PR11MB4619.namprd11.prod.outlook.com>
In-Reply-To: <MW3PR11MB46194A956A31261459526B43C1100@MW3PR11MB4619.namprd11.prod.outlook.com>
From: Tony Przygienda <tonysietf@gmail.com>
Date: Wed, 19 Feb 2020 16:20:21 -0800
Message-ID: <CA+wi2hNxRCB4MxTm79ywhZLjv3BBS0djHHKKLaWC=aTLaBjYJg@mail.gmail.com>
To: "Les Ginsberg (ginsberg)" <ginsberg@cisco.com>
Cc: "Peter Psenak (ppsenak)" <ppsenak@cisco.com>, Tony Li <tony1athome@gmail.com>, "lsr@ietf.org" <lsr@ietf.org>, "tony.li@tony.li" <tony.li@tony.li>
Content-Type: multipart/alternative; boundary="0000000000002956cb059ef6e287"
Archived-At: <https://mailarchive.ietf.org/arch/msg/lsr/FyEbTYxo7tZq8cEhJyYMxicFR4E>
Subject: Re: [Lsr] Flow Control Discussion for IS-IS Flooding Speed

My sense of humor to be excused, Les 😗😁

Yes, so here's a suggestion that will build a better-sloped hysteresis that
will allow you to ramp up slower at first so as not to oscillate, and also to
ramp down somewhat more gracefully. It's a sketch; more interesting metrics can
be folded into it later, but that's the flavor, good enough for the draft IMO.
Holding down on a timer is better achieved by 'normalizing' via exponential
decay on every tick, which 'primes' to the max rate over time when nothing is
happening. Queue lengths, except for some low/high watermark that stops/starts
sending, are quite misleading IME, given that queue length is not very
meaningful unless one can measure producer/consumer rates, and those depend
largely on the CPU cycles available & memory congestion and can be very, very
bursty. The same holds to an extent for all the queues involved (IME in a
halfway non-trivial architecture there are @ least 3 of those in and out of
the box).

MaxAllowedEver/sec = maximum rate ever allowed (constant)
MinAllowedEver/sec = minimum rate ever allowed (constant)
MaxAllowedRate/sec = maximum packets (CSNP/PSNP/LSP) allowed out the
interface per sec
CurrentRate/sec = packets sent out this second
Re-TX/sec = retransmissions this second

per every second tick:

if Re-TX {
    // slope down fast on re-tx
    CurrentRate/sec = max(MinAllowedEver, CurrentRate - 30% )
}

if CurrentRate/sec >= MaxAllowedRate/sec {
    // when under load slope up fast
    MaxAllowedRate/sec = min(MaxAllowedEver, MaxAllowedRate + 20%)
} else {
    // slowly normalizes even if no traffic
    MaxAllowedRate/sec += min(MaxAllowedRate, CurrentRate + 5%)
}
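
For anyone who wants to play with the numbers, here is one possible reading of the sketch above as runnable Python. It is an interpretation, not the algorithm itself, and it makes three assumptions that are not in the sketch: the 30% back-off is applied to the rate cap rather than to the measured rate, a tick that saw retransmissions does not also grow the cap, and the idle branch grows the cap by 5% up to MaxAllowedEver. The class and method names are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class FloodRateController:
    """Per-interface transmit rate cap, updated once per one-second tick."""

    max_allowed_ever: float  # hard ceiling in packets/sec (constant)
    min_allowed_ever: float  # hard floor in packets/sec (constant)
    max_allowed_rate: float  # current cap in packets/sec

    def tick(self, current_rate: float, retransmissions: int) -> float:
        """Update and return the cap for the next second.

        current_rate    -- packets (CSNP/PSNP/LSP) sent during the last second
        retransmissions -- retransmissions observed during the last second
        """
        if retransmissions > 0:
            # ASSUMPTION: back off the cap (not the measured rate) by 30%,
            # and skip any growth on a tick that saw retransmissions.
            self.max_allowed_rate = max(
                self.min_allowed_ever, self.max_allowed_rate * 0.70
            )
        elif current_rate >= self.max_allowed_rate:
            # Sender is using the whole cap: under load, slope up by 20%.
            self.max_allowed_rate = min(
                self.max_allowed_ever, self.max_allowed_rate * 1.20
            )
        else:
            # ASSUMPTION: idle or lightly loaded, drift up by 5% per tick so
            # the cap slowly returns towards the ceiling with no traffic.
            self.max_allowed_rate = min(
                self.max_allowed_ever, self.max_allowed_rate * 1.05
            )
        return self.max_allowed_rate
```

Calling tick() once a second with the measured send rate and the retransmission count gives the new cap; with a ceiling of 1000 and a cap of 100, a fully loaded tick moves the cap to about 120 and a following tick with retransmissions pulls it back to about 84.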

What you write about ack'ing only is dangerous and can seriously affect the
protocol. If you're holding newer LSPs than the ones received, ack'ing old
versions is not helpful; one should flood back the newer stuff. Generally it is
better to run proper WFQ per type and not just start sending one type of
packet, since that can hit ugly corner conditions in flooding & starve it
IME if the queues get so long that one can't drain them in a one-second tick.
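
To make the WFQ-per-type point concrete, here is a minimal Python sketch of a fair scheduler across PDU types. It uses a simplified, self-clocked approximation of WFQ in which every PDU counts as one unit of work, so it is not a full WFQ implementation. The class name, the type labels and the weights in the usage note are hypothetical and only there for illustration.

```python
import heapq
from itertools import count


class TypeFairScheduler:
    """Self-clocked fair queuing across PDU types (unit-size packets)."""

    def __init__(self, weights):
        # weights: mapping of PDU type -> relative share (hypothetical values)
        self.weights = dict(weights)
        self.last_finish = {t: 0.0 for t in self.weights}
        self.virtual_time = 0.0
        self.heap = []
        self.seq = count()  # tie-breaker keeps arrival order stable

    def enqueue(self, pdu_type, pdu):
        # A PDU starts after the previous PDU of its type, or "now" in
        # virtual time if that type's queue has been idle.
        start = max(self.virtual_time, self.last_finish[pdu_type])
        finish = start + 1.0 / self.weights[pdu_type]
        self.last_finish[pdu_type] = finish
        heapq.heappush(self.heap, (finish, next(self.seq), pdu_type, pdu))

    def dequeue(self):
        # Send the PDU with the smallest virtual finish time next.
        if not self.heap:
            return None
        finish, _, pdu_type, pdu = heapq.heappop(self.heap)
        self.virtual_time = finish
        return pdu_type, pdu
```

With weights such as {"LSP": 2, "PSNP": 1, "CSNP": 1}, a long backlog of LSPs cannot hold back a PSNP that arrives behind it: the PSNP is scheduled after about two LSPs instead of after the whole backlog.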

hope that's precise & lucid enough without smothering folks with too much
detail ...

--- tony






On Wed, Feb 19, 2020 at 2:01 PM Les Ginsberg (ginsberg) <ginsberg@cisco.com>
wrote:

> Tony –
>
>
>
> If you have a suggestion for Tx back-off algorithm please feel free to
> share.
>
> The proposal in the draft is just a suggestion.
>
> As this is a local matter there is no interoperability issue, but
> certainly documenting a better algorithm is worthwhile.
>
>
>
>    Les (claws in check 😊 )
>
>
>
>
>
> *From:* Tony Przygienda <tonysietf@gmail.com>
> *Sent:* Wednesday, February 19, 2020 11:25 AM
> *To:* Les Ginsberg (ginsberg) <ginsberg@cisco.com>
> *Cc:* Peter Psenak (ppsenak) <ppsenak@cisco.com>; Tony Li <
> tony1athome@gmail.com>; lsr@ietf.org; tony.li@tony.li
> *Subject:* Re: [Lsr] Flow Control Discussion for IS-IS Flooding Speed
>
>
>
> Having worked for the last couple of years on an implementation of flooding
> speeds that converge LSDBs some orders of magnitude above today's speeds
> ;-) here's a bunch of observations
>
>
>
> 1. TX side is easy and useful. My observation having gone quickly over the
> -ginsberg- draft is that you really want a better hysteresis there, it's
> a bit too vertical and you will generate oscillations rather than walk around
> the equilibrium ;-)
>
> 2. Queue per interface is fairly trivial with modern implementation
> techniques and memory sizes if done correctly. Yes, very memory constrained
> platforms are a mildly different game and kind of precondition a different
> discussion.
>
> 3. RX side is possible and somewhat useful but much harder to do well
> depending on flavor. If we're talking about the RX advertising a very
> static value to cap the flooding speed, that's actually a useful knob to
> have IMO/IME. Trying to cleverly communicate a window size to the TXer is
> not only fiendishly difficult and incurs back-propagation delay (not
> negligible @ those rates IME) but can also easily lead to subtle flood
> starvation behaviors and lots of slow starts due to the mixture of control
> loop dynamics and implementation complexity of such a scheme. Giving the
> TXer some hint that backpressure is desired is, however, not a bad thing
> IME and can be derived fairly easily without the need for checking queue
> sizes and so on. It's observable by looking @ some standard stats on what the
> productive incoming rate on the interface is. Anything smarter needs new TLVs
> on packets & then you have a problem of under/oversampling based on hellos
> (too low a frequency) and ACKs (too bursty, too batchy) and flooded back
> LSPs (too unpredictable)
>
>
>
> For more details I can recommend the RIFT draft of course ;-)
>
>
>
> otherwise I'm staying out of this mildly feline spat ;-)
>
>
>
> --- tony
>
>
>
> On Wed, Feb 19, 2020 at 9:59 AM Les Ginsberg (ginsberg) <
> ginsberg@cisco.com> wrote:
>
> Tony -
>
> Peter has done a great job of highlighting that "single queue" is an
> oversimplification - I have nothing to add to that discussion.
>
> I would like to point out another aspect of the Rx based solution.
>
> As you need to send signaling based upon dynamic receiver state and this
> signaling is contained in unreliable PDUs (hellos) and to be useful this
> signaling needs to be sent ASAP - you cannot wait until the next periodic
> hello interval (default 10 seconds) to expire. So you are going to have to
> introduce extra hello traffic at a time when protocol input queues are
> already stressed.
>
> Given hellos are unreliable, the question of how many transmissions of the
> update flow info is enough arises. You could make this more deterministic
> by enhancing the new TLV to include information received from the neighbor
> so that each side would know when the neighbor had received the updated
> info. This then requires additional hellos be sent in both directions -
> which exacerbates the queue issues on both receiver and transmitter.
>
> It is true (of course) that hellos should be treated with higher priority
> than other PDUs, but this does not mean that the additional hellos have no
> impact on the queue space available for LSPs/SNPs.
>
> Also, it seems like you are proposing interface independent logic, so you
> will be adjusting flow information on all interfaces enabled for IS-IS,
> which means that additional hello traffic will occur on all interfaces. At
> scale this is concerning.
>
>    Les
>
>
> > -----Original Message-----
> > From: Peter Psenak <ppsenak@cisco.com>
> > Sent: Wednesday, February 19, 2020 2:49 AM
> > To: Tony Li <tony1athome@gmail.com>
> > Cc: Les Ginsberg (ginsberg) <ginsberg@cisco.com>; tony.li@tony.li;
> > lsr@ietf.org
> > Subject: Re: [Lsr] Flow Control Discussion for IS-IS Flooding Speed
> >
> > Tony,
> >
> > On 19/02/2020 11:37, Tony Li wrote:
> > > Peter,
> > >
> > >> I'm aware of the PD layer and that is not the issue. The problem is
> > >> that there is no common value to report across different PD layers,
> > >> as each architecture may have different number of queues involved,
> > >> etc. Trying to find a common value to report to IGPs across various
> > >> PDs would involve some PD specific logic and that is the part I'm
> > >> referring to and I would like NOT to get into.
> > >
> > >
> > > I’m sorry that scares you.  It would seem like an initial
> > > implementation might be to take the min of the free space of the
> > > queues leading from the interface to the CPU. I grant you that some
> > > additional sophistication may be necessary, but I suspect that this
> > > is not going to become more complicated than polynomial evaluation.
> >
> > I'm not scared of polynomial evaluation, but of the fact that my IGP
> > implementation is dependent on the PD specifics, which are not generally
> > available and need to be custom built for each PD specifically. I always
> > thought a good IGP implementation is PD agnostic.
> >
> > thanks,
> > Peter
> >
> > >
> > > Tony
> > >
> > > _______________________________________________
> > > Lsr mailing list
> > > Lsr@ietf.org
> > > https://www.ietf.org/mailman/listinfo/lsr
> > >
> > >
>
> _______________________________________________
> Lsr mailing list
> Lsr@ietf.org
> https://www.ietf.org/mailman/listinfo/lsr
>
>