Re: [Lsr] Dynamic flow control for flooding

tony.li@tony.li Wed, 24 July 2019 19:04 UTC

Sender: Tony Li <tony1athome@gmail.com>
From: tony.li@tony.li
Message-Id: <63EC078F-795D-4A20-9EBC-F87EE28C5EAB@tony.li>
Date: Wed, 24 Jul 2019 12:04:09 -0700
In-Reply-To: <BYAPR11MB36381F5B3EC20BC8BE2217D5C1C60@BYAPR11MB3638.namprd11.prod.outlook.com>
Cc: "lsr@ietf.org" <lsr@ietf.org>
To: Les Ginsberg <ginsberg@cisco.com>
References: <CAMj-N0LdaNBapVNisWs6cbH6RsHiXd-EMg6vRvO_U+UQsYVvXw@mail.gmail.com> <BYAPR11MB36382C89363202D1B5659614C1C70@BYAPR11MB3638.namprd11.prod.outlook.com> <593D6ED8-A568-4B41-8882-3D32A6D0111F@tony.li> <BYAPR11MB36381F5B3EC20BC8BE2217D5C1C60@BYAPR11MB3638.namprd11.prod.outlook.com>
X-Mailer: Apple Mail (2.3445.104.11)
Archived-At: <https://mailarchive.ietf.org/arch/msg/lsr/kSOhRzTyXkGz224GNemlOW-C0uc>
Subject: Re: [Lsr] Dynamic flow control for flooding

Les,


> Optimizing the throughput through a slow receiver is pretty low on my list because the ROI is low.


OK, I disagree. The slow receiver is on the critical path to convergence: only when the slow receiver has absorbed all changes and run SPF do we have convergence.


> First, the rate that you select might be too fast for one neighbor and not for the others.  Real flow control would help address this.
>  
> [Les:] At the cost of convergence. Not a good tradeoff.
> I am arguing that we do want to flood at the same rate on all interfaces used for flooding. When we cannot, flow control does not help with convergence. It may decrease some wasted bandwidth – but as we all agree that bandwidth isn’t a significant limitation this isn’t a great concern.


Rate limiting flooding delays convergence.  Please consider the following topology:


1 —————— 2 —————— 3
|        |        |
|        |        |
4 —————— 5 —————— 6
|        |        |
|        |        |
7 —————— 8 —————— 9


Suppose that we have 1000 LSPs injected at router 1.  Suppose further that router 2 runs at half the rate of router 4.  [How router 1 knows this requires $DEITY and is out of scope for the moment.]

Router 1 now floods at the optimal rate for router 2.  Router 1 uses that same rate to flood to router 4.  Suppose that it takes time T for this to complete.

When does the network converge?

Option 1: All nodes use the same flooding rate.

Router 2 will flood to router 3 concurrently with receiving updates from router 1. Thus, router 3 will receive all updates in time T + delta, where delta is router 2’s processing time. For now, let’s treat delta as close to zero.

Similarly, all routers will use the same rate, so router 4 will flood to router 7 in time T + delta, and so on, with router 9 (four hops from router 1) receiving everything in time T + 3 * delta.

Assuming no nodes SPF during the process, the network converges nearly simultaneously in about time T.

Option 2: We flood a bit faster where we can.

Suppose that router 1 now floods at the full rate to router 4. The full update to router 4 now takes time T/2. Because all of the other nodes in the network are fast, router 4 floods in time T/2 + delta to nodes 5 and 7. Carrying this forward, router 9 gets a full update in time T/2 + 3 * delta. Even router 3 has full updates in T/2 + 3 * delta, by way of routers 4, 5, and 6.

With the exception of node 2, the network has converged in half the time.  Even node 2 converges in time T.
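
To make the arithmetic above concrete, here is a minimal Python sketch of the two options on the 3x3 topology. The model is a simplifying assumption layered on the example, not something taken from a draft: a router finishes once it has absorbed the update at the rate it is sent (`absorb`) and its fastest upstream neighbor has finished plus a relay delay `delta`. The values T = 1000 ms and delta = 10 ms, and the names `finish_times`, `absorb`, `uniform` and `per_neighbor`, are illustrative only.

```python
import heapq

# 3x3 grid from the example; router 1 is where the LSPs are injected.
EDGES = [(1, 2), (2, 3), (4, 5), (5, 6), (7, 8), (8, 9),
         (1, 4), (2, 5), (3, 6), (4, 7), (5, 8), (6, 9)]


def finish_times(absorb, delta, source=1):
    """Return the time (ms) at which each router holds the full update.

    absorb[v] is how long router v needs to take in the whole update at the
    rate its neighbors send to it; delta is a relay's processing delay.
    Assumed model: finish(v) = max(absorb[v], earliest upstream finish + delta).
    """
    neighbors = {}
    for a, b in EDGES:
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)

    done = {}
    heap = [(0, source)]  # the source has everything at time 0
    while heap:
        t, u = heapq.heappop(heap)
        if u in done:
            continue
        done[u] = t
        hop = 0 if u == source else delta  # only relays add processing delay
        for v in neighbors[u]:
            if v not in done:
                heapq.heappush(heap, (max(absorb[v], t + hop), v))
    return done


T, DELTA = 1000, 10  # assumed values, in milliseconds

# Option 1: everyone is sent to at router 2's rate, so each router needs T.
uniform = finish_times({v: T for v in range(1, 10)}, DELTA)

# Option 2: only router 2 is slow; every other router absorbs in T/2.
per_neighbor = finish_times(
    {v: (T if v == 2 else T // 2) for v in range(1, 10)}, DELTA)

print("uniform rate:      ", sorted(uniform.items()))
print("per-neighbor rates:", sorted(per_neighbor.items()))
```

Under these assumptions router 9 finishes at T + 3 * delta (1030 ms) with a uniform rate and at T/2 + 3 * delta (530 ms) with per-neighbor rates, while router 2 finishes at T (1000 ms) in both cases.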

Key points: 

1) Yes, the slow node delays convergence and causes micro-loops as everyone around it SPFs.  The point here (and I think you agree) is that slow nodes need to be upgraded.

2) There is no way for us to know how fast a node can go without some form of flow control, other than to go absurdly slowly.

3) There are many folks who want to converge quickly.  It is mission critical for them.  They will address slow nodes. They will not accept pessimal timing to avoid micro-loops.


> [Les:] I do not see how flow control improves things.


Flow control allows the transmitter to transmit at the optimal rate for the receiver.
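
As a rough illustration of that claim, here is a minimal Python sketch of a window-based sender. It is a hypothetical mechanism for illustration, not the scheme proposed in this thread: the sender keeps at most `window` unacknowledged LSPs outstanding per neighbor and sends more only as acknowledgments arrive. The function name `flood_with_window`, its parameters, and the tick-based timing (link delay ignored) are all assumptions.

```python
def flood_with_window(num_lsps, window, receiver_ticks_per_lsp):
    """Return (ticks until the last LSP is acknowledged, peak LSPs in flight).

    The sender never learns the receiver's speed directly; it only refills
    the window as acknowledgments come back.
    """
    sent = acked = tick = peak = 0
    while acked < num_lsps:
        # Send until the window of unacknowledged LSPs is full.
        sent = min(num_lsps, acked + window)
        peak = max(peak, sent - acked)
        # The receiver finishes one LSP and acknowledges it.
        tick += receiver_ticks_per_lsp
        acked += 1
    return tick, peak


# Same sender logic, two receivers: one twice as slow as the other.
fast_neighbor = flood_with_window(1000, window=10, receiver_ticks_per_lsp=1)
slow_neighbor = flood_with_window(1000, window=10, receiver_ticks_per_lsp=2)
print(fast_neighbor, slow_neighbor)
```

In this sketch the same sender code completes in half the time toward the fast neighbor, and neither receiver ever has more than `window` LSPs queued, without the sender being configured with either rate.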


> Dropping down to the least common denominator CPU speed in the entire network is going to be undoable without an oracle, and absurdly slow even with that.
>  
> [Les:] Never advocated that – please do not put those words in my mouth.


How is that different from what you’ve proposed?  Router 1 can only flood at the rate that it gets PSNPs from router 2.  If it floods at the same rate on all interfaces, that also paces its flooding to router 4.  Following that logic, you somehow want router 4 to run at the same rate, forcing a uniformly slow rate.

Tony