Re: [Roll] Way forward for draft-clausen-lln-rpl-experiences

Mukul Goyal <mukul@uwm.edu> Mon, 21 May 2012 15:45 UTC

Date: Mon, 21 May 2012 10:45:49 -0500
From: Mukul Goyal <mukul@uwm.edu>
To: Philip Levis <pal@cs.stanford.edu>
Cc: roll WG <roll@ietf.org>, Michael Richardson <mcr@sandelman.ca>
Subject: Re: [Roll] Way forward for draft-clausen-lln-rpl-experiences

Here is my review of the first 9 sections of draft-clausen-lln-rpl-experiences-03. For the later sections, I have nothing substantial to add to what Philip has already said in the message below.

One answer to many of the questions raised in draft-clausen is: use P2P-RPL. P2P-RPL need not always be used, because the problems it solves are not significant in every scenario or application domain. But, where it makes sense, P2P-RPL can certainly be used. P2P-RPL is currently aiming for Experimental status. As more experience is gained with its operation, it will go for Standards Track status.

Thanks
Mukul

Section 4: Requirement Of DODAG Root
--------------------------------------

"RPL Routers provisioned with resources to act as DODAG Roots, and
   administratively configured to act as such, represent a single point
   of failure in the network.  As the memory requirements for the DODAG
   Root and for other RPL Routers are substantially different, unless
   all RPL Routers are provisioned with resources (memory, energy, ...)
   to act as DODAG Roots, effectively if the designated DODAG Root
   fails, the network fails and RPL is unable to operate.  Even if
   electing another RPL Router as temporary DODAG root (e.g., for
   forming a "Floating" DODAG) for providing internal connectivity
   between RPL Routers, this router may not have the necessary resources
   to satisfy this role as (temporary) DODAG Root.
"

If a deployment uses "RPL + P2P-RPL", the criticism listed above is not valid. There is no single point of failure. If the global DAG(s) cannot work because of the root's failure, the nodes can still discover routes to each other using P2P-RPL. Nodes do not need extra memory/CPU to run P2P-RPL.

"Another possible LLN scenario is that only internal point-to-point
   connectivity is sought, and no RPL Router has a more "central" role
   than any other - a self-organizing LLN.  Requiring special
   provisioning of a specific "super-node" as DODAG Root is both
   unnecessary and undesirable."

Such an LLN can just use P2P-RPL. No need for special provisioning for any node.  

Section 5:  RPL Data Traffic Flows
--------------------------------------

" RPL makes a-priori assumptions of data traffic types, and explicitly
   defines three such [I-D.ietf-roll-terminology] traffic types: sensor-
   to-root data traffic (multipoint-to-point) is predominant, root-to-
   sensor data traffic (point-to-multipoint) is rare and sensor-to-
   sensor (point-to-point) data traffic is extremely rare.  While not
   specifically called out thus in [RFC6550], the resulting protocol
   design, however, reflects these assumptions in that the mechanism
   constructing multipoint-to-point routes is efficient in terms of
   control traffic generated and state required, point-to-multipoint
   route construction much less so - and point-to-point routes subject
   to potentially significant route stretch (routes going through the
   DODAG Root in non-storing mode) and over-the-wire overhead from using
   source routing (from the DODAG Root to the destination) (see
   Section 7) - or, in case of storing mode, considerable memory
   requirements in all LLN routers inside the network (see Section 7).
"

P2P-RPL resolves any route stretch issues. Note that if the source and destination are far apart, the route stretch with core RPL is small (i.e., the along-DAG routes won't be much worse than direct routes), although the constraint to traverse the root may still cause congestion near the root. If the source and destination are nearby, the route stretch with the "along global DAG" route could be significant, and that is where using P2P-RPL makes the most sense.
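
To make the route stretch argument concrete, here is a minimal sketch in Python. The topology, the node names and the extra E-F link are all hypothetical, and the via-root path models non-storing mode only; it is an illustration, not a model of any real deployment.

```python
from collections import deque

# Hypothetical DODAG: maps each node to its parent (root "R" has no entry).
PARENT = {"A": "R", "B": "R", "C": "A", "D": "B", "E": "C", "F": "D"}

# Hypothetical radio links that exist but are not part of the DODAG.
EXTRA_LINKS = [("E", "F")]


def hops_to_root(node):
    """Count the hops from a node up to the DODAG root."""
    hops = 0
    while node in PARENT:
        node = PARENT[node]
        hops += 1
    return hops


def via_root_hops(src, dst):
    """Non-storing mode: up to the root, then source-routed down."""
    return hops_to_root(src) + hops_to_root(dst)


def shortest_hops(src, dst):
    """Breadth-first search over all links (DODAG links plus extra links)."""
    neighbors = {}
    for a, b in list(PARENT.items()) + EXTRA_LINKS:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    dist = {src: 0}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return dist[node]
        for nxt in neighbors.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return None


for src, dst in (("E", "F"), ("A", "B")):
    dag, direct = via_root_hops(src, dst), shortest_hops(src, dst)
    print(f"{src}->{dst}: via root {dag} hops, direct {direct} hops, "
          f"stretch {dag / direct:.1f}")
```

In this toy topology the nearby pair E and F pays a large stretch when routed through the root, while A and B, whose shortest path crosses the root anyway, pay none.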

Also, P2P-RPL supports discovery of both hop-by-hop routes and source routes. A source could choose which one to use for a particular destination. 

"The data traffic characteristics, assumed by RPL, do not represent a
   universal distribution of traffic types in LLNs:

   o  There are scenarios where sensor-to-sensor traffic is a more
      common occurrence, documented, e.g., in [RFC5867] ("Building
      Automation Routing Requirements in Low Power and Lossy Networks").

"

And in such cases, it makes sense to use P2P-RPL with or without core RPL.

"For the former, all sensor-to-sensor routes include the DODAG Root,
   possibly causing congestions on the communication medium near the
   DODAG Root, and draining energy from the intermediate RPL Routers on
   an unnecessarily long route.  If sensor-to-sensor traffic is common,
   RPL Routers near the DODAG Root will be particularly solicited as
   relays, especially in non-storing mode.
"

Use P2P-RPL.


Section 6:  Fragmentation Of RPL Control Messages And Data Packet
----------------------------------------------------------------
" While 79 octets
   may seem to be sufficient to carry RPL control messages, consider the
   following: RPL control messages are carried in ICMPv6, and the
   mandatory ICMPv6 header consumes 4 octets.  The DIO base another 24
   octets.  If link metrics are used, that consumes at least another 8
   octets - and this is using a hop count metric; other metrics may
   require more.  The DODAG Configuration Object consumes up to a
   further 16 octets, for a total of 52 octets.  Adding a Prefix
   Information Object for address configuration consumes another 32
   octets, for a total of 84 octets - thus exceeding the 79 octets
   available for L3 data payload and causing link-layer fragmentation of
   such a DIO. "

It is certainly possible for a DIO to exceed 79 bytes. But it is not the case that a DIO would always fragment. The DODAG Configuration Option is an "option", and so is the Prefix Information Option. Not every DIO needs to carry these options (or the others defined in Section 6.7 of the RPL RFC).

That being said, possible fragmentation of DIOs is a real issue. When designing a packet format, one always has to strike a balance between the need to save space and the need to be modular, two conflicting goals. Pascal called the P2P Route Discovery Option (P2P-RDO, defined in P2P-RPL) a "garbage can option" because it includes all the information P2P-RPL route discovery needs (besides what is carried in the configuration option, the metric containers and the DIO base object). We designed P2P-RDO with the view that saving space is more critical. The various options defined in the RPL RFC were designed giving more importance to modularity. I don't think this is necessarily a bad thing, as draft-clausen alleges. While many RPL deployments will use IEEE 802.15.4 as the link layer in the near future, this may not always be the case. The correct solution to the problem of fragmentation in 802.15.4 LLNs is to devise a compression mechanism at the 6lowpan level: something similar to draft-bormann-6lowpan-ghc-04 or the RPL-specific compression scheme we proposed (draft-goyal-roll-rpl-compression, but now working at the 6lowpan layer).
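
The arithmetic can be checked with a short sketch. The octet counts below are the ones quoted from the draft above; the function and constant names are made up for illustration, and the metric and configuration option sizes are the draft's "at least" and "up to" figures rather than exact values.

```python
# Octet counts as quoted from draft-clausen-lln-rpl-experiences-03, Section 6.
L3_PAYLOAD_BUDGET = 79    # octets available for L3 data payload
ICMPV6_HEADER = 4
DIO_BASE = 24
HOP_COUNT_METRIC = 8      # "at least" 8 octets per the draft
DODAG_CONFIG_OPTION = 16  # "up to" 16 octets per the draft
PREFIX_INFO_OPTION = 32


def dio_size(with_metric=True, with_config=False, with_prefix_info=False):
    """Return the DIO size in octets for the chosen optional parts."""
    size = ICMPV6_HEADER + DIO_BASE
    if with_metric:
        size += HOP_COUNT_METRIC
    if with_config:
        size += DODAG_CONFIG_OPTION
    if with_prefix_info:
        size += PREFIX_INFO_OPTION
    return size


def fragments(size, budget=L3_PAYLOAD_BUDGET):
    """True if a message of this size exceeds the single-frame budget."""
    return size > budget


for config in (False, True):
    for pio in (False, True):
        size = dio_size(with_config=config, with_prefix_info=pio)
        print(f"config={config!s:5} prefix_info={pio!s:5} "
              f"size={size:2d} fragments={fragments(size)}")
```

With these figures, only the DIO that carries both options (84 octets) exceeds the 79-octet budget; the other three combinations fit in a single frame.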

"As a point of reference, the ContikiRPL [rpl-contiki]
   implementation includes both the DODAG Configuration option and the
   Prefix Information option in all DIO messages.  Any other options,
   e.g., Route Information options indicating prefixes reachable through
   the DODAG Root, increase the overhead and thus the probability of
   fragmentation.
"

OK, so a particular implementation always includes config and PI options in all DIOs. That sounds like a problem this particular implementation has created for itself.

"Given the minimal packet size of LLNs, the routing protocol must
   impose low (or no) overhead on data packets, hopefully independently
   of the number of hops [RFC4919].  However, source-routing not only
   causes increased overhead in the IP header, but also leads to a
   variable available payload for data (depending on how long the source
   route is)."

Again, this is not a simple one-sided issue. Yes, source routes eat precious bytes in the data packets. But, in many constrained environments (e.g. in some home automation deployments), only source routing is possible because devices are hard pressed for memory and cannot maintain any routing state.

"In point-to-point communication and when non-storing mode
   is used for downward traffic, the source of a data packet will be
   unaware of how many octets will be available for payload (without
   incurring L2.5 fragmentation) when the DODAG Root relays the data
   packet and add the source routing header.  Thus, the source may
   choose an inefficient size for the data payload: if the data payload
   is large, it may exceed the link-layer MTU at the DODAG Root after
   adding the source-routing header; on the other hand, if the data
   payload is low, the network resources are not used efficiently, which
   introduces more overhead and more frame transmissions.

"

This is a good point to make. Note that this problem is not present when the source (rather than the DAG root) itself includes the source route in the packet. So, this problem goes away when using P2P-RPL. 
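
A rough sketch of why the problem disappears when the source inserts the route itself: the source then knows the header cost before it sizes the payload. Only the 79-octet budget comes from the draft text quoted in Section 6; every other size below is an assumed, illustrative value, not taken from any specification.

```python
LINK_BUDGET = 79            # octets for L3 data, figure quoted from the draft
ASSUMED_OTHER_HEADERS = 10  # assumption: compressed IPv6/transport headers
ASSUMED_SRH_FIXED = 8       # assumption: fixed part of the routing header
ASSUMED_SRH_PER_HOP = 2     # assumption: octets per compressed address


def srh_octets(intermediate_hops):
    """Size of the source routing header under the assumptions above."""
    if intermediate_hops == 0:
        return 0
    return ASSUMED_SRH_FIXED + ASSUMED_SRH_PER_HOP * intermediate_hops


def max_payload(intermediate_hops):
    """Largest payload that still fits in one frame for a given route."""
    return LINK_BUDGET - ASSUMED_OTHER_HEADERS - srh_octets(intermediate_hops)


def fits(payload, intermediate_hops):
    """True if the payload fits once the source route has been added."""
    return payload <= max_payload(intermediate_hops)


# A source unaware of the route sizes its payload as if there were no header.
blind_payload = max_payload(0)
print("blind payload:", blind_payload, "fits after root adds a 4-hop route:",
      fits(blind_payload, 4))

# A source that inserts the route itself can size the payload exactly.
print("route-aware payload for 4 hops:", max_payload(4))
```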

Section 7: The DAO Mechanism: Downward and Point-to-Point Routes
---------------------------------------------------------------

"RPL specifies two distinct and incompatible "modes of operation" for
   downward traffic: storing mode, where each RPL Router is assumed to
   maintain routes to all destinations in its sub-DODAG, i.e., RPL
   Routers that are "deeper down" in the DODAG, and non-storing mode,
   where only the DODAG Root stores routes to destinations inside the
   LLN, and where the DODAG Root employs strict source routing in order
   to route data traffic to the destination RPL Router.
"

The above description of storing mode is incorrect. It is not the case that, in storing mode, each RPL Router maintains routes to all destinations in its sub-DODAG. The child decides which of its parents receive a DAO. Further, for each advertised destination, the origin of the advertisement decides (via the path control bits) how many separate routes can exist for this destination.

Section 7.1:

"In case a destination is
   unreachable, all the DODAG Root may do is require all destinations to
   re-issue their DAOs, by way of issuing a DIO with an increased DODAG
   version number, possibly provoking a broadcast-storm-like situation.
"

This problem is easily fixable with a simple DAO solicitation mechanism: the root includes a solicitation in its DIO, routers in the DAG propagate the solicitation further in their DIOs, and when the desired destination receives the solicitation, it sends a DAO to the root. Pascal has often talked about writing up this fix.
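
The solicitation idea can be sketched as a toy simulation. The mechanism has not been written up, so everything below (the topology, the function names, the rule that the target does not propagate further) is an assumption made for illustration. The comparison counts DAOs only and ignores the DIOs that a version increase would also trigger.

```python
from collections import deque

# Hypothetical DAG: maps each router to its children.
CHILDREN = {
    "R": ["A", "B"],
    "A": ["C", "D"],
    "B": ["E"],
    "C": [],
    "D": [],
    "E": [],
}


def solicit_dao(target, root="R"):
    """Propagate a solicitation down the DAG.

    Returns (number of DIOs carrying the solicitation, nodes sending a DAO).
    """
    dios_sent = 0
    dao_senders = []
    queue = deque([root])
    while queue:
        node = queue.popleft()
        if node == target:
            dao_senders.append(node)  # only the solicited node answers
            continue
        if CHILDREN[node]:
            dios_sent += 1            # one DIO reaches all children
            queue.extend(CHILDREN[node])
    return dios_sent, dao_senders


def version_increase(root="R"):
    """DODAG version increase: every non-root node re-issues its DAO."""
    return [node for node in CHILDREN if node != root]


dios, daos = solicit_dao("D")
print(f"solicitation: {dios} DIOs, DAOs from {daos}")
print(f"version increase: DAOs from {version_increase()}")
```

In this toy DAG, the solicitation costs three DIOs and one DAO, whereas the version increase makes all five non-root nodes re-issue their DAOs.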

"A final point on the DAO mechanism: RPL supports point-to-point
   traffic only by way of relaying through the DODAG.  In networks where
   point-to-point traffic is no rare occurrence, this causes unduly long
   routes (with possibly increased energy consumption, increased
   probability of packet losses) as well as possibly congestion around
   the DODAG Root.
"

Use P2P-RPL in such scenarios.

Section 8: Address aggregation and summarization
------------------------------------------------------
I don't agree with the assertion that address aggregation will break down completely if a node's DAO parent is not the same as the one whose advertised prefix the node used to configure its address. I think DAO parent selection at lower levels won't have much impact on address aggregation at nodes at upper levels. Sure, address aggregation may not be possible at the node's parent or grandparent. But nodes still higher up will be less and less affected.

Also, there might be a factual inaccuracy in the text below.
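
A small sketch of this argument, using Python's standard ipaddress module. The prefix plan, router names and node address are hypothetical; the point is only that a router needs a specific route for the node when its own aggregate does not already cover the node's address.

```python
import ipaddress

# Hypothetical hierarchical prefix assignment (each prefix nests in its parent's).
PREFIX = {
    "root": ipaddress.ip_network("2001:db8::/56"),
    "G": ipaddress.ip_network("2001:db8::/58"),        # child of root
    "P1": ipaddress.ip_network("2001:db8::/60"),       # child of G
    "P2": ipaddress.ip_network("2001:db8:0:10::/60"),  # child of G
}

# The node configured its address from P1's prefix...
NODE_ADDRESS = ipaddress.ip_address("2001:db8:0:1::1")

# ...but chose P2 as its DAO parent, so its DAO travels P2 -> G -> root.
DAO_PATH = ["P2", "G", "root"]


def needs_specific_route(router, address):
    """True if the router's own aggregate does not cover the address."""
    return address not in PREFIX[router]


for router in DAO_PATH:
    print(f"{router:4} {str(PREFIX[router]):20} "
          f"specific route needed: {needs_specific_route(router, NODE_ADDRESS)}")
```

In this example only P2 has to hold a specific route for the node; G and the root still cover the address with their aggregates.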

" Any aggregated routes require the use of a prefix shorter than /64,
   and subsequent hierarchical assignment of prefixes down to a /64 (as
   any RPL Router itself provides a /64 subnet to any hosts connected to
   the router).

   Moreover, if the 6lowpan adaption layer [RFC4944] is used in the LLN,
   route aggregation is not possible since the same /64 is applied
   across the entire network."

I don't think an RPL router always has to advertise a 64-bit prefix to its hosts. Further, with RFC 6282, I am not sure that only a /64 prefix can be used for autoconfiguration of 802.15.4 interfaces. So, the assertion "Any aggregated routes require the use of a prefix shorter than /64" may not be true.

Section 9: Link assumed bidirectional
-------------------------------------------------

"Unidirectional links are no rare occurrence, such as is known from
   wireless multi-hop networks.  Preliminary results from a test-bed of
   AMI (Automated Metering Infrastructure) devices using 950MHz radio
   interfaces, and with a total of 22 links, show that 36% of these
   links are unidirectional."
 
Could the authors provide a reference to published literature showing that a significant fraction of links are unidirectional? A reference to an unpublished or unavailable study is not acceptable. Also, it seems that "950MHz" is a typo.

----- Original Message -----
From: "Philip Levis" <pal@cs.stanford.edu>
To: "Thomas Heide Clausen" <IETF@ThomasClausen.org>
Cc: "roll WG" <roll@ietf.org>, "Michael Richardson" <mcr@sandelman.ca>
Sent: Thursday, May 17, 2012 12:02:32 AM
Subject: Re: [Roll] Way forward for draft-clausen-lln-rpl-experiences


On May 16, 2012, at 10:11 AM, Thomas Heide Clausen wrote:

> 
> 
> For example: 
> 
> 	o	the state required for storing/non-storing mode does *not* depend on a 
> 		specific implementation and deployment, but is an artifact from the RPL design.
> 
> 	o	the message-size of RPL is not dependent on a specific implementation and deployment,
> 		but is an artifact from the RPL design.
> 
> 	o	the use of links "upwards" based on receipt of traffic "downwards" (i.e. the unidirectional
> 		link issue) is not dependent on a specific implementation and deployment,
> 		but is an artifact from the RPL design.
> 
> 	o	the unknown-by-source MTU issues for data-traffic when using non-storing mode 
> 		is not dependent on a specific implementation and deployment,
> 		but is an artifact from the RPL design.
> 
> I could go on, but then it'd be easier to just paste the whole I-D into this email.


Thomas,

I agree that some points in the ID are valid statements about the protocol itself, such as message sizes and the issues caused by floating DODAGs. Those, to me, seem like reasonable points to make.

However, some of the points are hypotheses about performance (14), a bit naive about how wireless networks behave (13) or somewhat snippy gripes (11, 12). In my opinion, a document which focused on factual statements of the protocol design not really open to interpretation and cut hypothetical or subjective statements might be ready for the working group to consider.

As just a start, I object to 10, 11, 12, 13, and 14 being considered inherent to the protocol.

10 assumes that a node only uses DIO reception to allow a parent; the specification is pretty clear that you should check the parent is usable (section 1.1). You're taking a bad implementation decision and assuming there isn't another way to do things. 

For 11, there are implementations of RPL smaller than 50kB; they do not implement every feature, but that was kind of the point of the protocol, that it could be implemented on a sliding scale of implementation complexity. The TinyOS implementation, for example, is, I believe, ~20kB, less than half the size. You don't report what architecture the 50kB is for, clearly it would be more for a 32-bit than a 16-bit architecture. 

For 12, "implementations may exhibit a bad performance if not carefully implemented."  I think it is safe to say this is true for almost ANY protocol. A specification is not intended to be a complete statement of efficient implementation, otherwise you give little latitude to future improvements and good engineering.

For 13, this assumes that a wireless network has a stable topology which the protocol can converge to. Wireless networks are often NOT stable: one cannot expect a protocol to converge on a dynamic graph.

14 is similarly confused about what a wireless network looks like. How can the state of a distributed system based on a dynamic topology be "consistent?" I think this is a fundamental misunderstanding of how the network works.

That being said, I think point 6 is well taken and should be considered, maybe others. Maybe the constructive way to take this document, if you don't want to take the step of specifying solutions, is at least casting it as a roadmap for ways in which you think the WG should improve RPL?

Phil
