Re: [Roll] Loop free local DODAG repair solution

Jianlin Guo <guo@merl.com> Tue, 30 October 2012 22:09 UTC

Return-Path: <guo@merl.com>
X-Original-To: roll@ietfa.amsl.com
Delivered-To: roll@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 2F75E21F85B3 for <roll@ietfa.amsl.com>; Tue, 30 Oct 2012 15:09:39 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -2.599
X-Spam-Level:
X-Spam-Status: No, score=-2.599 tagged_above=-999 required=5 tests=[BAYES_00=-2.599]
Received: from mail.ietf.org ([64.170.98.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id VUDa3qSansNK for <roll@ietfa.amsl.com>; Tue, 30 Oct 2012 15:09:38 -0700 (PDT)
Received: from ns1.merl.com (ns1.merl.com [137.203.5.1]) by ietfa.amsl.com (Postfix) with ESMTP id 014E221F85AC for <roll@ietf.org>; Tue, 30 Oct 2012 15:09:37 -0700 (PDT)
Received: from tsumi.merl.com (tsumi.merl.com [137.203.134.9]) by ns1.merl.com (8.13.8/8.12.10) with ESMTP id q9UM9RaW016536; Tue, 30 Oct 2012 18:09:27 -0400
Received: from [127.0.0.1] (dulcian.merl.com [137.203.143.95]) by tsumi.merl.com (8.12.10/8.12.10) with ESMTP id q9UM9GrV011429; Tue, 30 Oct 2012 18:09:27 -0400
Message-ID: <5090500B.9050403@merl.com>
Date: Tue, 30 Oct 2012 18:09:15 -0400
From: Jianlin Guo <guo@merl.com>
User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:16.0) Gecko/20121010 Thunderbird/16.0.1
MIME-Version: 1.0
To: Philip Levis <pal@cs.stanford.edu>
References: <50194329.3040003@merl.com> <501945CC.5040801@merl.com> <5086A598.7030508@merl.com> <23378.1351166893@sandelman.ca> <50894640.1080804@merl.com> <97B69B30E0EF244B940B65EA541E3F2D21564932@DBXPRD0510MB395.eurprd05.prod.outlook.com> <508A8FDA.4040104@merl.com> <97B69B30E0EF244B940B65EA541E3F2D2156744D@DBXPRD0510MB395.eurprd05.prod.outlook.com> <508AAE0D.8030903@merl.com> <97B69B30E0EF244B940B65EA541E3F2D215676A7@DBXPRD0510MB395.eurprd05.prod.outlook.com> <508AE1F9.1080600@merl.com> <F3045258-0B7C-4EE3-9328-DA56C97DDF9E@cs.stanford.edu> <508FF9D3.40805@merl.com> <6B8958C3-3B39-4CA7-B570-BC7D6E8084AE@cs.stanford.edu>
In-Reply-To: <6B8958C3-3B39-4CA7-B570-BC7D6E8084AE@cs.stanford.edu>
Content-Type: text/plain; charset="ISO-8859-1"; format="flowed"
Content-Transfer-Encoding: 8bit
Cc: "<roll@ietf.org>" <roll@ietf.org>
Subject: Re: [Roll] Loop free local DODAG repair solution
X-BeenThere: roll@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: Routing Over Low power and Lossy networks <roll.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/roll>, <mailto:roll-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/roll>
List-Post: <mailto:roll@ietf.org>
List-Help: <mailto:roll-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/roll>, <mailto:roll-request@ietf.org?subject=subscribe>
X-List-Received-Date: Tue, 30 Oct 2012 22:09:39 -0000

It is local in the sense that repair does not propagate far away from 
initiation node. It also depends on the definition of "local', 1 hop 
only or 2 hop still considered as local.

Consider a case where a node N's parent node becomes unreachable. Ranks 
of all reachable neighbors are greater than node N's rank. Using DIS, 
node N has to increase its rank greater than its old rank. With this 
increased rank, node N needs to transmit a DIO. If any of node N's 
children does not receive this DIO (which is typical in lossy network), 
a loop may be created. Even though all children receive this DIO, this 
DIO may result in entire sub-tree to increase their ranks. DRQ/DRP 
repair does not increase rank of any node, therefore it is loop free.

Jianlin
On 10/30/2012 1:03 PM, Philip Levis wrote:
> So it's basically a multihop DIS? How is that then a local repair? In the case of a subtree becoming disconnected due to the top of the subtree having a null parent set, those nodes at the top would issue a DIS with Solicited Information to repair their local routes (next hop). Nodes below them in the DODAG shouldn't have to trigger a local repair.
>
> Put another way, can you describe a concrete case where DIS with Solicited Information can't solve the problem, yet DRQ/DRP can? Or is this purely an efficiency optimization? If the latter, I'd love to see concrete results indicating the savings (from a real LLN, not simulation).
>
> Phil
>
> On Oct 30, 2012, at 9:01 AM, Jianlin Guo wrote:
>
>> Hi Phil,
>>
>> Thank you for your valuable comments. There may be one more major case where a RPL node has a null parent set, that is, parent node of a RPL node becomes unreachable. This is a typical case in lossy networks. For example, a mobile object could block wireless link between two RPL nodes.
>>
>> The key differences between DIS and DRQ/DRP are:
>> 1) A DRQ message can travel multiple hops. # of hops is controlled by a parameter named MH (Maximum Hop) in DRQ message.
>> 2) Upon receiving a DRQ message, a neighbor node responds with DRP message only if neighbor node has non-empty parent set and a rank lower than rank of the DRQ transmitting node. Otherwise, it may forward DRQ message to its parent node.
>> 3) Receiving of a DRP message guarantees the discovery of a new parent node.
>> 4) Value of MH parameter controls local repair. MH can be set to a small value, e.g. 1,2 or 3, so that DRQ message do not travel too far.
>>
>> Jianlin
>> On 10/29/2012 6:09 PM, Philip Levis wrote:
>>> I'd like to take a step back here and discuss MaxRankIncrease and the notion of null parent sets.
>>>
>>> The two major cases where a RPL node has a null parent set are:
>>>    1) The node has no members in its neighbor set
>>>    2) The node has members of its neighbor set, but all of them advertise a Rank of INFINITE_RANK. The major case we're concerned about here is when rule 3 of 8.2.2.4 requires nodes to advertise INFINITE_RANK due to the value of MaxRankIncrease.
>>>
>>> I don't think the local repair scheme will fix case 1. So we're really talking about case 2.
>>>
>>> MaxRankIncrease emerged in specifying RPL as a way to support both ZigBee-style centralized periodic tree reconstruction triggers (MaxRankIncrease == 0) as well as CTP-style completely distributed operation while simultaneously providing a cap to the count-to-infinity problem. So really, the issue here is when you have a network with a small MaxRankIncrease and don't want to reconstruct the entire tree, there are valid parents that could be used, but a node does not have them in its neighbor/parent set.
>>>
>>> Thinking about it this way (am I wrong?), I am a bit confused by the claim that the draft is a local repair. Assuming a node has to follow rule 3 in 8.2.2.4, this is not a repair mechanism, but rather a neighbor discovery mechanism. Could you explain how DRQ/DRP solve a problem that a DIS with a proper Solicited Information option can't solve? This was the original intent of DIS with Solicited Information. Or perhaps I'm not understanding DRQ/DRP correctly?
>>>
>>> Phil
>>>
>>> On Oct 26, 2012, at 12:18 PM, Jianlin Guo wrote:
>>>
>>>> Thank you for the paper. I agree with [1] on that "dismantling of the sub-DAG, rooted at the node doing the rank increase, causes more turmoil in the network than the routing loops themselves".
>>>>
>>>> Now, consider a case in which a node's parent set becomes empty. In this case, RPL provides following set of actions:
>>>>
>>>> 1) Start its own floating DODAG
>>>> 2) Poison the broken path
>>>> 3) Trigger a local repair
>>>>
>>>> Both 1) and 2) actions will increase packet delivery delay time (which may be not acceptable for some applications) and possibly cause packet dropping due to limited buffer size of a LLN node (which may also be not acceptable for some applications). So, trigger a local repair is a practical option. Our local repair mechanism is designed for this purpose and it does not create any loops.
>>>>
>>>> Jianlin
>>>>
>>>> On 10/26/2012 12:10 PM, C Chauvenet wrote:
>>>>> Le 26 oct. 2012 à 17:36, Jianlin Guo a écrit :
>>>>>
>>>>>> We compared performance metrics such as packet delivery rate.
>>>>> Ok.
>>>>>
>>>>> In general do you have a document about your experiments that you would like to share ?
>>>>> I think it could be a good way to defend your mechanism.
>>>>>
>>>>> There are 2 sub questions related to your draft :
>>>>>
>>>>>   - Is there a strong need for an additional mechanism to prevent loops ? (the HbH header option mentioned by phil is already there).
>>>>>   - Is your mechanism the good way to do so (overhead induced, efficiency...)
>>>>>
>>>>> As mentioned by Phil, this subject has been previously discussed inside ROLL few years ago, and did not recommend to add such mechanisms.
>>>>>
>>>>> For instance, [1] concludes that
>>>>>
>>>>> "the turmoil caused
>>>>> by dismantling of the sub-DAGs in order to increase ranks
>>>>> may be much more than what the routing loops themselves
>>>>> will cause. Consequently, the use of such loop avoidance
>>>>> mechanism in the operation of a DAG based routing protocol
>>>>> can not be universally recommended."
>>>>>
>>>>> [1] : http://www.emmanuelbaccelli.org/publications/AINA_2010.pdf
>>>>>
>>>>> Best,
>>>>>
>>>>> Cédric.
>>>>>
>>>>>> On 10/26/2012 11:21 AM, C Chauvenet wrote:
>>>>>>> Hi,
>>>>>>> Thank you for your answer.
>>>>>>> See inline.
>>>>>>>
>>>>>>> Le 26 oct. 2012 à 15:27, Jianlin Guo a écrit :
>>>>>>>
>>>>>>>> On 10/25/2012 12:06 PM, C Chauvenet wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> Le 25 oct. 2012 à 16:01, Jianlin Guo a écrit :
>>>>>>>>>
>>>>>>>>>> Hi Michael,
>>>>>>>>>>
>>>>>>>>>> For your first question, draft-clausen-lln-rpl-experiences-04 pointed out that "it can be observed that with current implementations of RPL, such as the ContikiRPL                                   implementation, loops do occur - and, frequently. During the same experiments described in Section 13, a snapshot of the DODAG was made every ten seconds. In 74.14% of the 4114 snapshots, at least one loop was observed".
>>>>>>>>> Is it something that you observed in your own deployments ?
>>>>>>>>> More specifically : did you find similar results ?
>>>>>>>> We observed the occurrence of loops, unfortunately we did not measure the percentage.
>>>>>>> So how did you evaluate the benefit of the mechanism that you proposed ?
>>>>>>>
>>>>>>> Cédric.
>>>>>>>
>>>>>>>>> Best,
>>>>>>>>>
>>>>>>>>> Cédric.
>>>>>>>>>
>>>>>>>>>> For your second question, further investigation and experiments are needed.
>>>>>>>>>>
>>>>>>>>>> Jianlin
>>>>>>>>>>
>>>>>>>>>> On 10/25/2012 8:08 AM, Michael Richardson wrote:
>>>>>>>>>>> Jianlin Guo <guo@merl.com>
>>>>>>>>>>>   wrote:
>>>>>>>>>>>      JG> Dear ROLL WG members,
>>>>>>>>>>>
>>>>>>>>>>>      JG> As we all know that loop is an open issue in RPL. Experiment shows that loop
>>>>>>>>>>>      JG> occurs quite often. We have proposed a loop free local DODAG
>>>>>>>>>>>
>>>>>>>>>>> Can you quantify "quite often"?
>>>>>>>>>>> Do you have any metrics for how often loops occur, and what the cost is
>>>>>>>>>>> of their repair?
>>>>>>>>>>>
>>>>>>>>>>> I think that the WG would be very very very interested in additional -experiences
>>>>>>>>>>> draft, or pointers to papers explaining same, that gives a repeateable
>>>>>>>>>>> experiment in which loops are observed.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>> _______________________________________________
>>>>>>>>>> Roll mailing list
>>>>>>>>>> Roll@ietf.org
>>>>>>>>>> https://www.ietf.org/mailman/listinfo/roll
>>>> _______________________________________________
>>>> Roll mailing list
>>>> Roll@ietf.org
>>>> https://www.ietf.org/mailman/listinfo/roll
>>