Re: [Rift] Device restart problem

Tony Przygienda <> Thu, 24 October 2019 15:59 UTC

List-Id: Discussion of Routing in Fat Trees <>

hey xu, I see you're getting deeper and deeper into implementation; you
just found the first layer of the onion here BTW ;-)  Seqnr# handling has
been, since time immemorial, one of the trickier parts of IGP
implementation (and not only there: the same problem exists in other
places, but there the information is not persistent, so the problem is not
as pressing).

Multiple mechanisms kick in here:

a) the seqnr# is circular, which is a very important piece of the puzzle.
You cannot generate a "biggest" number that no one can override. The math
is explained in the appendix in lots of detail. BTW, not my invention;
smarter people than me worked this stuff out a long time ago, but there was
never a full, detailed, easy-to-implement write-up AFAIK.
b) yes, the fact that we flood only northbound means that normal "flat
flooding" never gets Leaf111 its old TIE with the higher sequence number
back. Using flat flooding south would of course largely kill the
scalability of the protocol and make it equivalent to OSPF or ISIS or any
other "normal" link-state approach in terms of flooding complexity (well,
flood reduction would still work ;-)
c) However, observe that Table 3 holds the key to the solution. TIDE/South
tells you what you need to do to describe your database to the neighbor
south. The description from Spine111 includes the description of the N-TIEs
of Leaf111, and with that Leaf111 can realize that there is a stale N-TIE
it originated before the reboot and re-issue it with a higher sequence
number (that's where a) comes into play).

As you keep on implementing and testing you'll find another very
interesting, far more complex case that we solved, but I will keep the
suspense going ;-)

The observation on the one week is also correct. Done very purposefully.
Let's say RIFT runs on 0.5M devices (the scale we aim at, given that
multi-homed/overlay-originating servers can run it as well). If you assume
5 TIEs per device, that's 2.5M TIEs at the top of the fabric (large, but
not a scary number compared to what we do with BGP and add-path on a daily
basis in the world's most scalable implementations ;-). If we had something
like a 1-day reorigination, we'd be talking 2.5M/24 = ~104K re-originations
per hour. That gives you a flooding rate into the ToFs of roughly 30
TIEs/sec (assuming perfect flood reduction). All that disregards things
like servers rebooting or container architectures, which will possibly
inject lots of prefixes on moves/boots and so on. So refreshing often is
unnecessary churn. With a 1-week lifetime we're talking about 15K TIE
refreshes per hour (2.5M/168), which is a manageable number given we're
talking 0.5M devices.
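
The back-of-the-envelope numbers above can be reproduced with a few lines
of Python. This is only a sketch: it assumes every TIE is refreshed once
per lifetime and that refreshes are spread evenly over time, and the
function name refresh_rate is made up for the example.

```python
DEVICES = 500_000          # 0.5M devices, as in the estimate above
TIES_PER_DEVICE = 5        # assumed average from the estimate above
TOTAL_TIES = DEVICES * TIES_PER_DEVICE


def refresh_rate(total_ties: int, lifetime_hours: float) -> tuple[float, float]:
    """Return (refreshes per hour, refreshes per second).

    Assumes one refresh per TIE per lifetime, evenly spread over time.
    """
    per_hour = total_ties / lifetime_hours
    return per_hour, per_hour / 3600


day_per_hour, day_per_sec = refresh_rate(TOTAL_TIES, 24)
week_per_hour, week_per_sec = refresh_rate(TOTAL_TIES, 24 * 7)

print(f"total TIEs at the top of the fabric: {TOTAL_TIES}")
print(f"1-day lifetime:  {day_per_hour:.0f}/hour, {day_per_sec:.1f}/sec")
print(f"1-week lifetime: {week_per_hour:.0f}/hour, {week_per_sec:.1f}/sec")
```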

Observe, however, that as a device you can originate with any lifetime you
choose and RIFT will work (and when emptying TIEs you are supposed to
originate with 300 secs only). So the 1 week is basically a protocol
constant that can be changed with a knob.

Let us know when you get the first pieces interop'ed with Bruno's open
source implementation BTW. Things always become much clearer when
implementations are bashed against each other ;-)

--- tony

On Thu, Oct 24, 2019 at 1:52 AM <> wrote:

> Hi, Tony
> There is a device restart problem
> In draft-ietf-rift-rift-08 Figure 2, N-TIE of Leaf111 flooded to ToF21 via
> Spine111. Seq NR may be larger.
> Leaf111 restarts and regenerates N-TIE. The random seq NR may be small, so
> that when Spine111 receives it, it will compare the seq NR and discard the
> new message.
> According to the behavior of Appendix c.3.4 b.3, it is hoped that Spine111
> sends DBTIE to Leaf111 to update seq NR.
> However, according to the flooding range of N-TIE, this message cannot be
> sent out.
> In this way, there will be a large number of invalid N-TIEs in the network
> for a long time (the default expire time of the protocol is 1 week)
> Is this understanding correct? How does rift solve this problem?
> Thank you!
> Benchong