Re: [Roll] Border router failure detection

Konrad Iwanicki <iwanicki@mimuw.edu.pl> Wed, 12 October 2022 20:04 UTC

Dear all,

The RNFD document had expired, so I have reposted it.

In addition, below I summarize the questions and answers about the
protocol from the last interim. Hopefully, this will help us make
progress on the document.

1. What happens when Sentinels (the root's one-hop neighbors that 
monitor its state) don't hear each other? Does the algorithm detect the 
crash of the root?

Yes. The decision is made not only by the Sentinels but by all nodes.
There is no requirement for the Sentinels to form a connected graph. In
other words, if there is any path connecting the sub-DODAGs of the
individual Sentinels, and enough individual Sentinels consider the root
down, the entire network will eventually decide that the root has
crashed.
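
As a rough illustration of this answer, here is a minimal Python sketch.
It is not taken from the draft: the set-union merge, the node names
(S1..S3 for Sentinels, A..C for other nodes), and the 0.5 threshold are
all assumptions made for the example. It only shows that Sentinels with
no direct links between them can still lead every node to the same
decision, as long as some path joins their sub-DODAGs.

```python
# Illustrative sketch only. The set-union merge, the node names, and the
# 0.5 threshold are assumptions made for this example, not the draft's design.

class Node:
    def __init__(self, sentinels):
        self.sentinels = frozenset(sentinels)  # every Sentinel this node knows of
        self.suspecting = set()                # Sentinels believed to consider the root down

    def merge(self, other):
        """Learn a neighbor's view; return True if anything new was learned."""
        before = len(self.suspecting)
        self.suspecting |= other.suspecting
        return len(self.suspecting) != before

    def root_down(self, threshold=0.5):
        """Decide locally: do more than `threshold` of the Sentinels suspect the root?"""
        return len(self.suspecting) > threshold * len(self.sentinels)


def exchange_until_stable(nodes, links):
    """Repeat pairwise exchanges over the links until no view changes."""
    changed = True
    while changed:
        changed = False
        for a, b in links:
            learned_a = nodes[a].merge(nodes[b])
            learned_b = nodes[b].merge(nodes[a])
            changed = changed or learned_a or learned_b


sentinels = ["S1", "S2", "S3"]
nodes = {name: Node(sentinels) for name in ["S1", "S2", "S3", "A", "B", "C"]}

# No Sentinel-to-Sentinel link exists; their sub-DODAGs meet only via A, B, C.
links = [("S1", "A"), ("S2", "B"), ("S3", "C"), ("A", "B"), ("B", "C")]

# Two of the three Sentinels lose contact with the root.
nodes["S1"].suspecting.add("S1")
nodes["S2"].suspecting.add("S2")

exchange_until_stable(nodes, links)
print(all(node.root_down() for node in nodes.values()))
```

In this toy topology even S3, which still hears the root itself, ends up
with the same view as everyone else.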

2. What if most of the direct links to the root fail but the root is in 
fact alive?

If most of the direct links to the root fail, then the Sentinels
monitoring those links will consider the root dead. Since the root takes
part in the communication, it will be aware that the number of such
Sentinels is increasing. It will react by initiating a new DODAG version.
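
The root's reaction could be sketched roughly as follows. This is a
hypothetical illustration, not the draft's procedure: the class and
method names are made up, a plain integer stands in for RPL's
DODAGVersionNumber, and the rule "react on every observed increase" is
an assumption about when exactly the root acts.

```python
# Hypothetical sketch: names and the "react on every increase" rule are
# assumptions for illustration, not taken from the RNFD draft.

class Root:
    def __init__(self):
        self.dodag_version = 0   # plain integer standing in for DODAGVersionNumber
        self.max_suspecting = 0  # highest number of suspecting Sentinels seen so far

    def on_suspecting_count(self, count):
        """Called when the root learns how many Sentinels consider it down."""
        if count > self.max_suspecting:
            self.max_suspecting = count
            self.dodag_version += 1  # initiate a new DODAG version
        return self.dodag_version


root = Root()
for observed in [0, 2, 2, 3]:
    print(observed, root.on_suspecting_count(observed))
```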

3. Is rebuilding the DODAG in such a case desirable?

If the majority of the links to the root that once formed the DODAG are
currently down, then the DODAG should probably look different than it
would with those links up. Rebuilding the DODAG, at least in my opinion,
makes a lot of sense in such a case. Furthermore, the threshold that
defines how large this majority must be is configurable. Depending on
whether one wants to prioritize speeding up root failure detection or
slowing down DODAG rebuilding, different values can be chosen.
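
To make the trade-off concrete, here is a small sketch. The function
name, the fraction-based form of the threshold, and the sample values
are assumptions chosen for illustration; the draft defines the actual
parameter.

```python
# Hypothetical sketch: the threshold is modeled as a fraction of all
# Sentinels, which is an assumption made for this example.

def root_considered_down(suspecting, total, threshold):
    """True if more than `threshold` of the `total` Sentinels suspect the root."""
    return suspecting > threshold * total


# 6 of 10 Sentinels have lost their link to the root.
print(root_considered_down(6, 10, 0.5))  # lower threshold: faster detection
print(root_considered_down(6, 10, 0.8))  # higher threshold: fewer rebuilds
```

With the lower value the same observation already counts as a root
failure; with the higher value the network keeps waiting.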

4. Why can't Sentinels ask the root whether it is dead?

In fact, they do. If a Sentinel observes that an increasing number of
other Sentinels consider the root dead, it may verify this by trying to
contact the root via its direct link.
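
A minimal sketch of such a verification step might look like this. The
class, the `probe_root` callable, and the choice to probe on every
increase are hypothetical; in the protocol the verification is optional
("may"), and how the root is actually contacted is left to the draft.

```python
# Hypothetical sketch: `probe_root` is any callable that returns True if
# the root answered over the direct link; it is not a real API.

class Sentinel:
    def __init__(self, probe_root):
        self.probe_root = probe_root
        self.last_suspecting = 0          # suspecting Sentinels seen last time
        self.considers_root_down = False

    def on_update(self, suspecting_count):
        """Re-check the root only when more Sentinels suspect it than before."""
        if suspecting_count > self.last_suspecting:
            self.considers_root_down = not self.probe_root()
        self.last_suspecting = suspecting_count
        return self.considers_root_down


alive = Sentinel(probe_root=lambda: True)     # root still answers
crashed = Sentinel(probe_root=lambda: False)  # root never answers
print(alive.on_update(2), crashed.on_update(2))
```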

I hope to have covered all the questions.

Best,
--
- Konrad Iwanicki.