Re: [Roll] Border router failure detection

Konrad Iwanicki <> Wed, 04 March 2020 11:05 UTC


Dear Michael,

On 02.03.2020 20:11, Michael Richardson wrote:
> Konrad Iwanicki <> wrote:
>     > In a nutshell, I would like to propose an extension to RPL that had been
>     > invented to significantly improve handling crashes of border routers. Since I
>     > have little experience writing RFC-like drafts, I would greatly appreciate
>     > any help.
> Use the markdown method, and use someone's template github.

Thanks. But I guess I would also need some guidelines on doing the writing.

>     > What we observed, however, is that RPL does not efficiently handle crashes of
>     > border routers [1][2]. Upon such a failure, tearing down nonexistent upward
>     > routes can take a lot of time (depending on the data-plane traffic) and
>     > generate considerable control traffic, which is problematic in many
>     > applications.
> Rahul and Pascal (and others) have had a lot of conversation about how we
> deal with the various lollipop counters.  So I am interested in what your
> border router does when it boots: how does it announce the new DIOs?

Internal lollipop counters used within RNFD are always relative to the 
DODAG version number counter. Whenever that counter is refreshed, the 
others are initialized anew. Therefore, for RNFD, the easiest solution 
upon a border router reboot is to broadcast a DIO with a new value of 
the DODAG version.
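
To illustrate the point about counters being scoped to the DODAG version, here is a minimal sketch, not the actual RNFD implementation. The class name, the counter name "epoch", and the plain inequality test on versions are assumptions made for illustration; only the lollipop increment rule and the initial value of 240 follow RFC 6550, Section 7.2.

```python
SEQUENCE_WINDOW = 16
LOLLIPOP_INIT = 256 - SEQUENCE_WINDOW  # 240, initial value recommended by RFC 6550


def lollipop_increment(value: int) -> int:
    """Increment an 8-bit lollipop counter (RFC 6550, Section 7.2).

    Values 128..255 form the linear region and 0..127 the circular one;
    both 255 and 127 wrap to 0.
    """
    if value in (127, 255):
        return 0
    return value + 1


class VersionScopedCounters:
    """Hypothetical container for counters that only have meaning
    relative to the current DODAG version."""

    def __init__(self, dodag_version: int, names):
        self.dodag_version = dodag_version
        self.counters = {name: LOLLIPOP_INIT for name in names}

    def bump(self, name: str) -> int:
        self.counters[name] = lollipop_increment(self.counters[name])
        return self.counters[name]

    def on_dodag_version(self, version: int) -> bool:
        """Reinitialize all counters when the DODAG version changes.

        Assumption: any differing version counts as a change; a real
        node would apply the RFC 6550 comparison rules instead.
        """
        if version == self.dodag_version:
            return False
        self.dodag_version = version
        for name in self.counters:
            self.counters[name] = LOLLIPOP_INIT
        return True


state = VersionScopedCounters(dodag_version=240, names=["epoch"])
state.bump("epoch")          # counter advances within version 240
state.on_dodag_version(241)  # new version: counter starts over
```

With this arrangement a rebooted root does not need to remember any of the internal counters, only to announce a DODAG version that the other nodes see as new.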

This, however, is not completely straightforward. We have to either keep 
the counter value in the router's persistent memory, which is not always 
feasible, or learn the last DODAG version value by having the router 
broadcast a DIS and its neighbors reply with DIOs containing the last 
number they are aware of.
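
The two options can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name and its parameters are hypothetical, merging both sources is a choice made here rather than something the approach requires, and versions are compared as plain integers, which ignores lollipop wraparound (a real node would use the comparison rules of RFC 6550, Section 7.2).

```python
from typing import Iterable, Optional


def lollipop_increment(value: int) -> int:
    """Increment an 8-bit lollipop counter; 127 and 255 wrap to 0."""
    return 0 if value in (127, 255) else value + 1


def next_dodag_version(persisted: Optional[int],
                       heard: Iterable[int]) -> Optional[int]:
    """Pick the DODAG version to announce after a reboot.

    persisted: last version read from persistent memory, or None if
               the router has no such storage.
    heard:     versions carried in DIOs that neighbors sent in reply
               to the router's DIS.

    Returns None when nothing is known, leaving the choice of a fresh
    initial value to the caller.
    """
    candidates = list(heard)
    if persisted is not None:
        candidates.append(persisted)
    if not candidates:
        return None
    # Simplification: plain max(), no wraparound-aware comparison.
    return lollipop_increment(max(candidates))


# Router without persistent memory, three neighbors replied:
version = next_dodag_version(persisted=None, heard=[243, 244, 244])
```
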

In either case, it makes sense for the router to delay broadcasting a 
new DODAG version (and hence the first DIO) for a while after the 
reboot. If "for a while" is long enough, this heuristic helps alleviate 
situations in which the router continuously reboots for some reason but 
manages to bump the DODAG version (e.g., when power is restored but then 
lost again shortly after). More elaborate mechanisms can be deployed as well.
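
A minimal sketch of the hold-off heuristic described above, assuming a hypothetical RebootHoldOff helper and a hold-off of 60 seconds (neither is part of RNFD or RPL; timestamps are passed in explicitly so the logic can be followed without a real clock):

```python
from dataclasses import dataclass


@dataclass
class RebootHoldOff:
    """Hypothetical guard that delays the first DIO after a reboot."""
    hold_off_s: float          # how long the router must stay up first
    booted_at: float = 0.0
    announced: bool = False

    def on_boot(self, now: float) -> None:
        # Every reboot restarts the hold-off period.
        self.booted_at = now
        self.announced = False

    def may_announce(self, now: float) -> bool:
        """True once the router has stayed up for the whole hold-off
        period and has not yet announced a new DODAG version."""
        return (not self.announced
                and now - self.booted_at >= self.hold_off_s)

    def mark_announced(self) -> None:
        self.announced = True


guard = RebootHoldOff(hold_off_s=60.0)
guard.on_boot(now=0.0)
early = guard.may_announce(now=10.0)     # still inside the hold-off
guard.on_boot(now=15.0)                  # power lost and restored
flapping = guard.may_announce(now=70.0)  # only 55 s since last boot
stable = guard.may_announce(now=75.0)    # 60 s of uptime reached
```

A router that keeps rebooting faster than the hold-off period never reaches the point of bumping the DODAG version, which is the situation the delay is meant to cover.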

>     > [1] K. Iwanicki: “RNFD: Routing-Layer Detection of DODAG (Root) Node Failures
>     > in Low-Power Wireless Networks,” in IPSN 2016: Proceedings of the 15th
>     > ACM/IEEE International Conference on Information Processing in Sensor
>     > Networks. IEEE. Vienna, Austria. April 2016. pp. 1-12. DOI:
>     > 10.1109/IPSN.2016.7460720
> Unfortunately, it's behind the IEEE paywall.
> I have given up on getting documents from the IEEE.

Try the ACM Digital Library (the paper is in both libraries):

If you fail to download from this source, I can try to get you a preprint.

> I guess you have been working on this for at least five years now.

Well, we wanted to have the problem well studied. And it inspired other 
interesting research as well.

Best regards,
- Konrad Iwanicki.