Re: [Roll] Border router failure detection

Konrad Iwanicki <> Fri, 18 March 2022 21:10 UTC

Thanks, Michael, for your comments. Please find my replies inline.

On 17.03.2022 16:32, Michael Richardson wrote:
> Konrad Iwanicki <> wrote:
>     > Since the RNFD draft has been adopted by the group, I am expected to
>     > propose a set of work items on which we could iterate to progress on
>     > the draft.
> Please note that it is an anti-pattern (a pathology) of the IETF that we
> think that we have to improve/change the document after adoption.
> It might be that there is little work to do, but we won't really know until
> more people have read it.

I think I am starting to see your point of view.

>     > More specifically, achieving consensus in RNFD is done in such a way
>     > that the root node need not be involved. As long as the network remains
>     > connected, the nodes are able to conclude that the root has crashed,
>     > irrespective of how degenerated the DODAG may be because of the
>     > crash. What Pascal suggested (or at least what I understood) is that
>     > involving the root and using perhaps a different consensus algorithm
>     > may be worth considering.
> The idea being, I think, if the root hasn't crashed, it ought to be able to
> quickly prove that to all parties involved.

This is already present in the algorithm; see the second paragraph of
Section 5.4. Essentially, if the root observes that something may be
wrong, it increases the DODAG Version (and resets its Trickle timer).
In this way, the nodes that (falsely) suspect the root may be down
obtain proof that it is active.
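
To illustrate the idea, here is a minimal sketch in Python. It is not
taken from the draft: the class, the field names, the interval values,
and the helper function are all hypothetical, and the version counter
is simplified to a plain 8-bit wraparound.

```python
from dataclasses import dataclass


@dataclass
class RootState:
    """Hypothetical root-side state; names and values are illustrative only."""
    dodag_version: int = 0         # DODAG Version advertised by the root
    trickle_imin: float = 1.0      # smallest Trickle interval in seconds (assumed)
    trickle_interval: float = 1.0  # current Trickle interval in seconds

    def on_possible_false_suspicion(self) -> None:
        # Simplification: plain 8-bit wraparound. RPL itself uses a
        # lollipop-style sequence counter for the version number.
        self.dodag_version = (self.dodag_version + 1) % 256
        # Trickle reset: fall back to the minimum interval so that the
        # new version is advertised quickly.
        self.trickle_interval = self.trickle_imin


def is_proof_of_activity(known_version: int, advertised_version: int) -> bool:
    # Simplification: ignores counter wraparound. A node that sees a
    # newer version than it knew about learns that the root is alive.
    return advertised_version > known_version


root = RootState(dodag_version=7, trickle_interval=64.0)
root.on_possible_false_suspicion()
```

After the call, the root advertises version 8 at the minimum Trickle
interval, so a node that still remembers version 7 would treat the new
advertisement as evidence that the root is up.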

> The hidden node problem means that some nodes might think that the root has
> gone, when it is just not very reachable to that node.
> A node that can speak to many other rank=2 ("root-children"?) nodes, but
> can't see the root ought to just become a child of those other nodes.

This, too, is what already happens. Unless all nodes reach consensus
that the root is down, RNFD does not influence RPL's decision as to
which parent a node is allowed to choose. Therefore, if the link from
one of the root's children to the root goes bad, that child will select
another parent instead of the root, possibly another child of the root
or even a node deeper in the DODAG.
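
A rough sketch of this separation, again in Python and again with
hypothetical names: parent selection is reduced here to "pick the
usable neighbor with the lowest rank", which is far simpler than a real
RPL objective function. What a node does once consensus is reached is
not covered in this message, so the sketch merely assumes that it stops
selecting a parent.

```python
from typing import Dict, Optional


def select_parent(usable_neighbors: Dict[str, int],
                  locally_suspects_root: bool,
                  consensus_root_down: bool) -> Optional[str]:
    """Return the ID of the chosen parent, or None if there is none.

    usable_neighbors maps a neighbor ID to its advertised rank and
    contains only neighbors whose links are currently good enough.
    """
    # Assumption: after network-wide consensus that the root is down,
    # the node no longer picks a parent in this DODAG.
    if consensus_root_down:
        return None
    # locally_suspects_root is deliberately not consulted: a local
    # suspicion alone does not change the choice of parent.
    if not usable_neighbors:
        return None
    return min(usable_neighbors, key=usable_neighbors.get)


# The link to the root went bad, so the root is absent from the
# candidates and the node falls back to another child of the root.
choice = select_parent({"child_b": 2, "deep_c": 3},
                       locally_suspects_root=True,
                       consensus_root_down=False)
```

Here `choice` is `"child_b"`, even though the node locally suspects the
root; only the consensus flag changes the outcome.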

I may be wrong, but what I understood from Pascal's idea is that we 
could try to get the root more involved in the process of coordinating 
the children acting as sentinels, at least in the periods when it is up.

- Konrad Iwanicki.

>     > (Also, I believe that discussing the work items should first be done
>     > asynchronously/offline. However, if you prefer allocating a slot at
>     > IETF 113, please do let me know.)
> Discussion on the ML is always good and appropriate.
> --
> Michael Richardson <>, Sandelman Software Works
>  -= IPv6 IoT consulting =-