Re: [Rift] Device restart problem

Tony Przygienda <tonysietf@gmail.com> Mon, 28 October 2019 14:27 UTC

Archived-At: <https://mailarchive.ietf.org/arch/msg/rift/AcTWgm6sPpk2y6WCnb6e6Dd-Ctk>

TIDEs are never flooded; each node constructs them independently from its
own database.
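
As a rough illustration of "constructed independently from the database", here is a minimal sketch. The `TieHeader` fields and the `include` predicate are hypothetical simplifications, not the RIFT schema, and the actual per-direction filtering rules of the specification are not reproduced here.

```python
from typing import Callable, Iterable, List, NamedTuple


class TieHeader(NamedTuple):
    """Simplified, hypothetical TIE header (not the RIFT schema)."""
    originator: str
    direction: str  # "N" or "S"
    tie_nr: int
    seq_nr: int


def build_tide(database: Iterable[TieHeader],
               include: Callable[[TieHeader], bool]) -> List[TieHeader]:
    """Describe the local database to one neighbor.

    Nothing received from another node is forwarded here: the result is
    derived purely from the headers held locally. `include` stands in for
    the direction-specific filtering rules, which this sketch leaves out.
    """
    return sorted(h for h in database if include(h))


# Spine111 describing its database southbound to Leaf111 (values made up).
spine_db = [
    TieHeader("Spine111", "N", 1, 4),
    TieHeader("Leaf111", "N", 1, 900),  # stale copy from before a reboot
]
for header in build_tide(spine_db, lambda h: True):
    print(header)
```

The point of the sketch is only that a TIDE is a locally generated summary, so two nodes never need to relay each other's TIDEs.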

Yes, what you describe is somewhat related to the second layer of the
onion, but it is only one aspect of it. When you implement the flooding
procedures carefully and start to test, it will become clear what

C.3.2.2.6.I.i)

is necessary for.

Great that you're getting there. Let us know when you want to hook up or
need any help, though Bruno's descriptions are very detailed and well
debugged, given that quite a lot of people have been using the stuff
already, including, from what I heard, running it on a couple of Linux NOSes.

--- tony


On Fri, Oct 25, 2019 at 12:44 AM <xu.benchong@zte.com.cn> wrote:

>
> Hi, Tony
>
> Thanks for your reply.
>
> When Leaf111 and Spine111 (assuming no other spines) are restarted at the
> same time, Spine111 can’t flood the received TIDE of ToF21 to Leaf111, and
> the seq NR can’t be updated. Is this the second layer of the onion? ;-)
>
> Can we update the lifetime of a TIE via the TIDE, to avoid sending the TIE
> periodically?
>
> We are developing a verification implementation and have not yet connected
> it with Bruno's.
>
>
> Benchong
>
>
> Original Message
> *From:* Tony Przygienda <tonysietf@gmail.com>
> *To:* 徐本崇10065053;
> *Cc:* rift@ietf.org <rift@ietf.org>;
> *Date:* 2019-10-24 23:59
> *Subject:* *Re: Device restart problem*
> Hey Xu, I see you are getting deeper and deeper into the implementation;
> you just found the first layer of the onion here, BTW ;-) The seqnr#
> handling has been, since time immemorial, one of the trickier parts of an
> IGP implementation (and not only there; the same problem exists in other
> places, but there the information is not persistent, so the problem is not
> as pressing).
>
> Multiple mechanisms kick in here:
>
> a) The seqnr# is circular, which is a very important piece of the puzzle:
> you cannot generate a "biggest" number that no one can override. The math
> is explained in the appendix in a lot of detail. BTW, not my invention;
> smarter people than me worked this out a long time ago, but there was
> never a full, detailed, easy-to-implement write-up AFAIK.
> b) Yes, the fact that we flood only northbound prevents Leaf111 from
> "getting" its old TIE with a higher sequence number via normal "flat
> flooding". Using flat flooding southbound would of course largely kill the
> scalability of the protocol and make it equivalent to OSPF, IS-IS, or any
> other "normal" link-state approach in terms of flooding complexity (well,
> flood reduction would still work ;-)
> c) However, observe that Table 3 holds the key to the solution. The
> TIDE/South entry tells you what you need to do to describe your database
> to the southbound neighbor. The description from Spine111 includes the
> N-TIEs of Leaf111, and with that Leaf111 can realize that there is a stale
> N-TIE it originated before the reboot and re-issue it with a higher
> sequence number (that's where a) comes into play).
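
Points a) and c) above can be sketched together as follows. This is an illustration under stated assumptions, not the procedure from the draft's appendix: the 16-bit width, the half-range "newer" rule (in the style of serial number arithmetic), and the names `seq_newer`, `next_seq`, and `on_south_tide` are all hypothetical.

```python
SEQ_BITS = 16            # assumed width, for illustration only
SEQ_MOD = 1 << SEQ_BITS
HALF = SEQ_MOD >> 1


def seq_newer(a: int, b: int) -> bool:
    """True if `a` is newer than `b` on the circular number space."""
    return a != b and ((a - b) % SEQ_MOD) < HALF


def next_seq(a: int) -> int:
    """Successor on the circle; it is always newer than `a`."""
    return (a + 1) % SEQ_MOD


def on_south_tide(my_id, own_ties, tide_headers):
    """React to a database description received from a northbound neighbor.

    own_ties:     dict tie_key -> sequence number currently originated.
    tide_headers: iterable of (originator, tie_key, seq_nr) tuples.
    If the neighbor holds a copy of one of our own TIEs that is newer than
    what we originate now (a stale pre-reboot copy), re-originate just
    above it. TIEs we no longer originate at all are left out of this sketch.
    Returns a dict of the re-originated keys and their new sequence numbers.
    """
    reissued = {}
    for originator, tie_key, seq_nr in tide_headers:
        if originator != my_id or tie_key not in own_ties:
            continue
        if seq_newer(seq_nr, own_ties[tie_key]):
            own_ties[tie_key] = next_seq(seq_nr)
            reissued[tie_key] = own_ties[tie_key]
    return reissued


# Leaf111 rebooted and restarted at seq 7; Spine111 still describes seq 900.
leaf_ties = {"n-tie-1": 7}
print(on_south_tide("Leaf111", leaf_ties, [("Leaf111", "n-tie-1", 900)]))
```

Because the space is circular, `next_seq(x)` is newer than `x` for every `x`, including the largest value, which is why no stale copy can hold an unbeatable number.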
>
> As you keep implementing and testing you'll find another very
> interesting, far more complex case that we solved, but I will keep the
> suspense going ;-)
>
> The observation on the one week is also correct. It was done very
> purposefully. Let's say RIFT runs on 0.5M devices (the scale we aim at,
> given that multi-homed/overlay-originating servers can run it as well). If
> you assume 5 TIEs per device, that's 2.5M TIEs at the top of the fabric
> (large, but not a scary number compared to what we do with BGP and
> add-path on a daily basis in the world's most scalable implementations
> ;-). If we had something like a 1-day reorigination, we'd be talking
> 2.5M/24 = ~100K re-originations per hour. That gives you a flooding rate
> into the ToFs of about 30 TIEs/sec (assuming perfect flood reduction). All
> that disregards things like servers rebooting or container architectures,
> which will possibly inject lots of prefixes on moves/boots and so on. So
> refreshing often is unnecessary churn. With a 1-week lifetime we're
> talking about 15K TIEs refreshed per hour, which is a manageable number
> given we're talking 0.5M devices.
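
The arithmetic in the paragraph above can be checked with a few lines. The sketch assumes refreshes are spread evenly over the lifetime and flood reduction is perfect; note that dividing by 24 corresponds to a lifetime of 24 hours. `refresh_rate` is just a helper name chosen for this example.

```python
DEVICES = 500_000
TIES_PER_DEVICE = 5
TOTAL_TIES = DEVICES * TIES_PER_DEVICE  # 2.5M TIEs at the top of the fabric


def refresh_rate(total_ties: int, lifetime_hours: float):
    """Return (refreshes per hour, refreshes per second), assuming every
    TIE is refreshed once per lifetime and refreshes are evenly spread."""
    per_hour = total_ties / lifetime_hours
    return per_hour, per_hour / 3600


for label, hours in (("1 day", 24), ("1 week", 24 * 7)):
    per_hour, per_sec = refresh_rate(TOTAL_TIES, hours)
    print(f"{label}: {per_hour:,.0f} TIEs/hour, {per_sec:.1f} TIEs/sec")
# 1 day: 104,167 TIEs/hour, 28.9 TIEs/sec
# 1 week: 14,881 TIEs/hour, 4.1 TIEs/sec
```

These match the rounded figures quoted in the text: roughly 100K per hour and 30 per second for 24 hours, and roughly 15K per hour for one week.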
>
> Observe, however, that as a device you can issue TIEs with any lifetime
> you choose and RIFT will work (and when emptying TIEs you are supposed to
> originate with 300 secs only). So the 1 week is basically a protocol
> constant that can be exposed as a knob.
>
> Let us know when you get the first pieces interop'ed with Bruno's open
> source, BTW. Things always become much clearer when implementations are
> bashed against each other ;-)
>
> --- tony
>
>
>
>
> On Thu, Oct 24, 2019 at 1:52 AM <xu.benchong@zte.com.cn> wrote:
>
>> Hi, Tony
>>
>> There is a device restart problem.
>>
>> In Figure 2 of draft-ietf-rift-rift-08, the N-TIE of Leaf111 is flooded to
>> ToF21 via Spine111. Its seq NR may be large.
>>
>> Leaf111 restarts and regenerates the N-TIE. The random seq NR may be
>> small, so that when Spine111 receives it, it compares the seq NRs and
>> discards the new message.
>>
>> According to the behavior in Appendix c.3.4 b.3, one would expect
>> Spine111 to send the DBTIE to Leaf111 to update the seq NR.
>>
>> However, according to the flooding scope of N-TIEs, this message cannot
>> be sent.
>>
>> In this way, there will be a large number of invalid N-TIEs in the
>> network for a long time (the default expiry time of the protocol is 1
>> week).
>>
>> Is this understanding correct? How does RIFT solve this problem?
>>
>>
>> Thank you!
>>
>> Benchong
>>
>>
>>
>>
>>
>