Re: [Lsr] Flow Control Discussion for IS-IS Flooding Speed

Tony Przygienda <tonysietf@gmail.com> Wed, 19 February 2020 19:25 UTC

References: <5b430357-56ad-2901-f5a8-c0678a507293@cisco.com> <4FC90EB2-D355-4DC5-8365-E5FBE037954E@gmail.com> <f5b56713-2a4d-1bf7-8362-df4323675c61@cisco.com> <MW3PR11MB4619C54F5C6160491847AA45C1100@MW3PR11MB4619.namprd11.prod.outlook.com>
In-Reply-To: <MW3PR11MB4619C54F5C6160491847AA45C1100@MW3PR11MB4619.namprd11.prod.outlook.com>
From: Tony Przygienda <tonysietf@gmail.com>
Date: Wed, 19 Feb 2020 11:24:54 -0800
Message-ID: <CA+wi2hMH1PjiaGxdE5Nhb2tjsZtCL7+vjxwE+dk9PWN1fyz7vQ@mail.gmail.com>
To: "Les Ginsberg (ginsberg)" <ginsberg@cisco.com>
Cc: "Peter Psenak (ppsenak)" <ppsenak@cisco.com>, Tony Li <tony1athome@gmail.com>, "lsr@ietf.org" <lsr@ietf.org>, "tony.li@tony.li" <tony.li@tony.li>
Content-Type: multipart/alternative; boundary="0000000000007daf05059ef2c150"
Archived-At: <https://mailarchive.ietf.org/arch/msg/lsr/Y-mj4aLUpzz0FQfWSkUbUj0q7Kc>
Subject: Re: [Lsr] Flow Control Discussion for IS-IS Flooding Speed

Having worked for the last couple of years on an implementation of flooding
speeds that converge LSDBs some orders of magnitude above today's speeds ;-)
here's a bunch of observations:

1. TX side is easy and useful. My observation, having gone quickly over the
-ginsberg- draft, is that you really want better hysteresis there; it's a
bit too vertical and you will generate oscillations rather than walk around
the equilibrium ;-)
2. Queue per interface is fairly trivial with modern implementation
techniques and memory sizes if done correctly. Yes, very memory constrained
platforms are a mildly different game and kind of precondition a different
discussion.
3. RX side is possible and somewhat useful but much harder to do well,
depending on flavor. If we're talking about the RX advertising a very
static value to cap the flooding speed, that's actually a useful knob to
have IMO/IME. Trying to cleverly communicate a window size to the TXer is
not only fiendishly difficult and incurs back-propagation delay (not
negligible @ those rates IME), but can also easily lead to subtle flood
starvation behaviors and lots of slow starts due to the mixture of control
loop dynamics and implementation complexity of such a scheme. That said,
giving the TXer some hint that backpressure is desired is not a bad thing
IME and can be derived fairly easily without needing to check queue sizes
and so on. It's observable by looking @ some standard stats on what the
productive incoming rate on the interface is. Anything smarter needs new
TLVs on packets & then you have a problem of under/oversampling based on
hellos (too low a frequency), ACKs (too bursty, too batchy) and flooded-back
LSPs (too unpredictable).
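
As a rough illustration of the hysteresis point in item 1, here is a minimal
Python sketch of a TX-side rate controller with a dead band between two
thresholds. It is not taken from the -ginsberg- draft or from RIFT; the class
name, the "unacked fraction" input, the thresholds and the step sizes are all
made-up assumptions chosen only to show the shape of the idea.

```python
from dataclasses import dataclass


@dataclass
class TxRateController:
    """Hypothetical TX-side LSP rate controller with a hysteresis band.

    All field names and default values are illustrative assumptions.
    """
    rate: float = 100.0       # current send rate, LSPs/second
    min_rate: float = 33.0    # floor so flooding never stalls completely
    max_rate: float = 5000.0  # cap, e.g. a static value advertised by the RX side
    low_water: float = 0.2    # unacked fraction below which we speed up
    high_water: float = 0.6   # unacked fraction above which we slow down
    step_up: float = 1.10     # gentle multiplicative increase
    step_down: float = 0.80   # gentle multiplicative decrease

    def update(self, unacked_fraction: float) -> float:
        """Adjust the rate from the fraction of sent LSPs still unacknowledged."""
        if unacked_fraction > self.high_water:
            self.rate = max(self.min_rate, self.rate * self.step_down)
        elif unacked_fraction < self.low_water:
            self.rate = min(self.max_rate, self.rate * self.step_up)
        # Between the two thresholds the rate is held, so small wobbles
        # around the equilibrium do not flip the controller back and forth.
        return self.rate
```

With a single threshold (low_water == high_water) every small change in the
measured value flips the controller between speeding up and slowing down; the
gap between the two thresholds is what lets it settle.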

For more details I can recommend the RIFT draft of course ;-)

Otherwise I'm staying out of this mildly feline spat ;-)

--- tony

On Wed, Feb 19, 2020 at 9:59 AM Les Ginsberg (ginsberg) <ginsberg@cisco.com>
wrote:

> Tony -
>
> Peter has done a great job of highlighting that "single queue" is an
> oversimplification - I have nothing to add to that discussion.
>
> I would like to point out another aspect of the Rx based solution.
>
> As you need to send signaling based upon dynamic receiver state and this
> signaling is contained in unreliable PDUs (hellos) and to be useful this
> signaling needs to be sent ASAP - you cannot wait until the next periodic
> hello interval (default 10 seconds) to expire. So you are going to have to
> introduce extra hello traffic at a time when protocol input queues are
> already stressed.
>
> Given hellos are unreliable, the question of how many transmissions of the
> update flow info is enough arises. You could make this more deterministic
> by enhancing the new TLV to include information received from the neighbor
> so that each side would know when the neighbor had received the updated
> info. This then requires additional hellos be sent in both directions -
> which exacerbates the queue issues on both receiver and transmitter.
>
> It is true (of course) that hellos should be treated with higher priority
> than other PDUs, but this does not mean that the additional hellos have no
> impact on the queue space available for LSPs/SNPs.
>
> Also, it seems like you are proposing interface independent logic, so you
> will be adjusting flow information on all interfaces enabled for IS-IS,
> which means that additional hello traffic will occur on all interfaces. At
> scale this is concerning.
>
>    Les
>
>
> > -----Original Message-----
> > From: Peter Psenak <ppsenak@cisco.com>
> > Sent: Wednesday, February 19, 2020 2:49 AM
> > To: Tony Li <tony1athome@gmail.com>
> > Cc: Les Ginsberg (ginsberg) <ginsberg@cisco.com>; tony.li@tony.li;
> > lsr@ietf.org
> > Subject: Re: [Lsr] Flow Control Discussion for IS-IS Flooding Speed
> >
> > Tony,
> >
> > On 19/02/2020 11:37, Tony Li wrote:
> > > Peter,
> > >
> > >> I'm aware of the PD layer and that is not the issue. The problem is that
> > >> there is no common value to report across different PD layers, as each
> > >> architecture may have different number of queues involved, etc. Trying to
> > >> find a common value to report to IGPs across various PDs would involve
> > >> some PD specific logic and that is the part I'm referring to and I would like
> > >> NOT to get into.
> > >
> > >
> > > I’m sorry that scares you.  It would seem like an initial implementation
> > > might be to take the min of the free space of the queues leading from the
> > > interface to the CPU. I grant you that some additional sophistication may be
> > > necessary, but I suspect that this is not going to become more complicated
> > > than polynomial evaluation.
> >
> > I'm not scared of polynomial evaluation, but the fact that my IGP
> > implementation is dependent on the PD specifics, which are not generally
> > available and need to be custom built for each PD specifically. I always
> > thought a good IGP implementation is PD agnostic.
> >
> > thanks,
> > Peter
> >
> > >
> > > Tony
> > >
> > > _______________________________________________
> > > Lsr mailing list
> > > Lsr@ietf.org
> > > https://www.ietf.org/mailman/listinfo/lsr
> > >
> > >
>