[Roll] RPL Next Steps

Tim Winter <wintert@acm.org> Tue, 11 August 2009 22:59 UTC
Message-ID: <4A81F79A.10104@acm.org>
Date: Tue, 11 Aug 2009 18:58:34 -0400
From: Tim Winter <wintert@acm.org>
To: ROLL WG <roll@ietf.org>

WG,

Please find below some additional feedback from the design team on the
questions that have been raised so far for RPL.  Not everything has been
covered, but the questions have served to point out some areas of the draft
where we need to continue to focus our efforts.

We (DT) propose, in the -01 version of the draft, to clarify as much
as possible the outstanding questions and concerns in the current
specification of RPL, with emphasis on the existing (-00) mechanisms.  The
intent is to provide a solid, unambiguous, implementable, foundation of the
existing core mechanisms in -01.  On this foundation we can then continue
to build and expand other necessary mechanisms in later revisions, such as
are being discussed for P2P routing.  There is no doubt that the P2P issues
need to be further discussed and addressed, but the thought is that by
clearing up the existing mechanisms we may be able to make better progress
in moving forward beyond the existing mechanisms in later revisions.

WG, how does this sound as a strategy to make progress on RPL?



Miscellaneous:
- Based on WG feedback, instead of `depth' we will refer to `rank'
- Instead of `siblings' we will refer to `nodes of the same rank'
- We will update the text to reflect that
        1) Grounded nodes do not always provide a default route
        2) Destination prefix suboption of DIO includes lifetime


---------------------------------------------------------------------------
        From: Alexandru Petrescu <alexandru.petrescu@gmail.com>

        I would like to get clarification on the IPv6 addressing architecture
        and subnet structure considered in RPL.

        -is it the AUTOCONF addressing model?

DT: It is important that RPL work with whatever addressing model we utilize
in the future and that it operate in contexts where the addressing model is
already constrained.  Thus, RPL treats the address as opaque.  If you like,
you can view it as flat, but really it is uninterpreted at this stage.

With this in mind, it is clear that RPL will benefit from the ability to do
route summarization along the DAG, particularly in order to improve
scalability.  Future work on the draft may evolve this mechanism, and there
is work to do on the ROLL architectural framework that may offer additional
clarifications.


        -is it the 6LoWPAN route over or mesh-under?
DT: RPL intends to operate purely at L3, and it is the intention of RPL to
be L2 agnostic, so it "routes over".  If it runs over a link layer that
does its own switching or meshing underneath, e.g. switched Ethernet, an
802.11 mesh, or an 802.15.4 mesh, it should work.  It should also work on,
e.g., Ethernet, 802.11, and 802.15.4 where no "meshing" is done at layer 2.

        -how many physical interfaces does a node have?
DT: This is out of scope for RPL.

        -if a node has several interfaces then does it have only one physical
          interface?
DT: This may be a typical case, but again out of scope for RPL.  A
subnetwork / adaptation layer should handle this.

        -what is the default route behaviour?  Does each node have default
          route?
DT: A default route MAY be offered by a grounded DAG, in which case it will
be installed.  In some cases (application specific) the default route may
not be offered by a grounded DAG, in which case there is no default route
and a node will not be able to use a default route to forward packets.  In
such cases it is presumed that an application, choosing not to provision a
default route in a DAG root, will take care to use only prefixes that can
be routed within the DAG.

        -are there IPv6 prefix subnets /64?
DT: There could be, but it is not required.  In cases where RPL allows
prefix specification, the prefix length may be specified from 0-128.
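
As an illustration only (not text from the draft), the two answers above can
be sketched together: prefixes of any length from /0 to /128 are accepted,
and a node with no default route simply has no match for destinations
outside the advertised prefixes.  The table contents and the function names
parse_prefix and longest_match are assumptions made for this sketch.

```python
import ipaddress

def parse_prefix(text):
    """Parse an IPv6 prefix; any length from /0 to /128 is valid."""
    return ipaddress.IPv6Network(text)

def longest_match(routes, destination):
    """Return the most specific route covering destination, or None."""
    addr = ipaddress.IPv6Address(destination)
    matches = [r for r in routes if addr in r]
    return max(matches, key=lambda r: r.prefixlen, default=None)

# Hypothetical table for a node in a DAG whose root offers a default route.
with_default = [parse_prefix(p) for p in
                ("::/0", "2001:db8:1::/64", "2001:db8:1::7/128")]
# The same table when the grounded DAG offers no default route.
without_default = with_default[1:]

print(longest_match(with_default, "2001:db8:1::7"))     # the /128 entry
print(longest_match(with_default, "2001:db8:2::1"))     # falls back to ::/0
print(longest_match(without_default, "2001:db8:2::1"))  # None: not routable
```

In the last case the application is expected to use only prefixes that can
be routed within the DAG, as described above.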

        Here is a DAG excerpt from 01 draft:

                           (A)
                            |\
                            | `-----.
                            |        \
                           (B)        \
                             \        |
                              `-----. |
                                     \|
                                     (C)

        Does that mean the IPv6 addressing architecture is this:


                           (A)
                            |\
            2001:db8:1::/64 | `-----.
                            |        \  2001:db8:2::/64
                           (B)        \
                             \        |
                              `-----. |
                 2001:db8:3::/64     \|
                                     (C)



---------------------------------------------------------------------------
        From: "Don Sturek" <d.sturek@att.net>

        Here are some points that I would like to see the design team
        clarify with respect to the current RPL draft:
        1)  P2P support
                a.    I believe the P2P feature uses Destination Address
                        advertisements from the device wishing to receive
                        P2P traffic:
                        i.  What are the requirements for the receiving
                                device advertising via Destination Address
                                Advertisements assuming it is not the
                                DAG-root (eg, What is the rate of
                                Destination Address advertisements, How
                                does the device (or does it/can it) notify
                                devices of lower rank/depth that it has
                                detached from the DAG).   If there is no
                                mechanism for the receiving device to
                                notify devices of lower rank/depth if it
                                detaches, how is maintenance of the
                                Destination Address advertisement caching
                                within the DAG performed?
DT: Destination advertisements as specified in -00 may be triggered by any
node in the DAG that observes a change in the DAG.  The request is made by
setting the `D' bit and propagating a new DIO.  This change may have been
observed directly, e.g. a node changes its DAG Parents and preemptively
triggers the DA mechanism to update the inward nodes, or equivalently by
observing a change to the PathDigest.

A DAO of lifetime 0 (no-DAO) is used to indicate that the reachability is
lost (5.4.1.1).  If the node has a chance to leave gracefully, it could
send a no-DAO about itself.  The most common case, though, is timing out a
child.

                        ii.   What are the requirements for devices with
                                lower rank/depth in the same DAG when
                                receiving a Destination Address
                                advertisement (eg, does it store the
                                Destination Address advertisements of its
                                children, Does it further re-propagate the
                                Destination Address advertisements to
                                devices of lower rank/depth, How long
                                should devices of lower rank/depth store
                                these destination address advertisements)

DT:
When a node is capable of storing DAs (i.e. it has sufficient storage for
the DA state), then it will do so, along with the depth, and after some
time will re-emit the best DA from the options it has seen, perhaps
performing route aggregation/summarization if it is able.

When a node is not capable of storing DAs (i.e. it has no storage for the
DA state), then it will update the reverse routing stack in the DA and
immediately pass it on.
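
A minimal sketch of the two behaviours just described, under stated
assumptions: the DA and Node types are hypothetical, "best" is taken to mean
the lowest depth seen for a prefix, and updating the reverse routing stack
is modelled as appending the node's own address.  None of these details are
fixed by the -00 text.

```python
from dataclasses import dataclass, field

@dataclass
class DA:
    prefix: str
    depth: int
    reverse_route: list = field(default_factory=list)

class Node:
    def __init__(self, address, can_store):
        self.address = address
        self.can_store = can_store
        self.stored = {}  # prefix -> best DA seen so far

    def receive(self, da):
        """Return the DA to pass on immediately, or None if it was stored."""
        if self.can_store:
            best = self.stored.get(da.prefix)
            # Assumption: a lower depth is the "better" option.
            if best is None or da.depth < best.depth:
                self.stored[da.prefix] = da
            return None
        # Non-storing node: record ourselves in the reverse route, forward.
        return DA(da.prefix, da.depth, da.reverse_route + [self.address])

    def emit_best(self):
        """Called after some delay on a storing node: re-emit the best DAs."""
        return list(self.stored.values())

relay = Node("B", can_store=False)
print(relay.receive(DA("2001:db8:3::/64", 3)).reverse_route)
```

Route aggregation/summarization by a storing node is left out of the sketch.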

There is a related concern, brought up in IETF-75, about the effect of
`fan-out' when coupled with the DA mechanism in a DAG, i.e. as DAs are
propagated among (potentially) multiple sets of DAG parents, what steps
need to be taken to keep the multiplicative effect under control.

For nodes that store DAs, the DAs will be stored until a no-DAO is
received or the DA lifetime expires.

No-DAO (lifetime 0) would be propagated immediately and unreliably.  This
is another instance where a loop is easily created and thus requires
detection.  This is a case where a loop detection approach would be
enough.
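
The storage rule above (keep a DA until a no-DAO arrives or its lifetime
expires) might look roughly like the following.  This is a sketch, not the
draft's state machine: the DATable class, its method names, and the use of
plain numbers for time are assumptions, and loop detection for propagated
no-DAOs is not modelled.

```python
class DATable:
    """Stored DA state: prefix -> (next_hop, expiry_time)."""

    def __init__(self):
        self.entries = {}

    def update(self, prefix, next_hop, lifetime, now):
        """Apply a DAO.  Returns True for a no-DAO, which the caller
        would propagate onward immediately."""
        if lifetime == 0:
            self.entries.pop(prefix, None)  # reachability is lost
            return True
        self.entries[prefix] = (next_hop, now + lifetime)
        return False

    def expire(self, now):
        """Drop every entry whose lifetime has run out."""
        self.entries = {p: e for p, e in self.entries.items() if e[1] > now}

    def lookup(self, prefix, now):
        self.expire(now)
        entry = self.entries.get(prefix)
        return entry[0] if entry else None

table = DATable()
table.update("2001:db8:3::/64", "C", lifetime=60, now=0)
print(table.lookup("2001:db8:3::/64", now=30))  # still stored
print(table.lookup("2001:db8:3::/64", now=60))  # lifetime expired
```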

In general, it is clear that there is work to do in the RPL draft to more
clearly state the operation of the DA mechanism, and to remove some
ambiguity in the description of the core mechanism.

                b.    Routing of P2P traffic
                        i.     I assume that the DAG-root holds the address
                                of all Destination Address Advertisements
                                it has received from devices in its DAG.
                                Is this true?
DT: As mentioned on the list, there could be cases where only the MP2P flows
are of interest (no P2MP traffic), in which case the DA mechanism may not
even be invoked and there is no need to require (MUST) the DAG-root to hold
the addresses.

If P2MP traffic is supported, and the DA mechanism is used, then the
DAG-root MUST store the state learned from the DAs it receives.  This
should be further clarified in the draft.

Note that if route summarization is in play, e.g. some sort of hierarchical
addressing structure is supported, then the DAG root may not receive a DA
for every device in its DAG; the addressing structure may allow DAs to be
absorbed and aggregated as they move up the DAG.  This mechanism needs
further exploration, as described in the answer to a related question
above.

                        ii.    I assume that other devices in the DAG of higher
                                rank/depth from the DAG-root but still
                                lower in rank than the final destination
                                route packets down the DAG.  It seems
                                possible for the destination to receive
                                multiple copies (is this correct) and more
                                importantly it also seems possible for an
                                Unreachable notification to be generated by
                                some device in the DAG when in fact the
                                packet reached its destination through
                                another path (did I read the proposal
                                wrong?)
DT: Yes, duplicates may be received, but this is beyond the scope of RPL,
and any traffic running over an IP service should be prepared to handle
(or ignore) possible duplicates.  Likewise, in some
implementations, it may be possible for ICMP unreachable to be issued when
in fact the packet has slipped through.

                c.    Sibling routing seems to be not supported (though I
                                saw discussion of this on the reflector).
                                It seems problematic to have Siblings
                                processing Destination Address
                                advertisements within the same DAG.  We
                                should leave this out of the clarification
                                if my reading is correct and only add it
                                back as part of support for optimized P2P
                                if that is where the optimization leads.
DT: To clarify, `sibling routing' in this question is the same as
`next-door neighbor' or `1-hop' neighbor that has been discussed on the
list, and in fact entails all neighbors (not just siblings as in
nodes-of-the-same-rank).  It is the intention that this type of routing is
supported by RPL.  The proposal on the table is that nodes will make use of
multicast DAOs in order to inform their neighbors of their owned prefixes.
Text will be added to the -01 draft to specify this mechanism.


        I also have some questions on the freezing of DAGs and ungrounded
        DAGs but I think that is a separate issue and I will put out a note
        later in the week on this.
DT: Please feel free to ask at any time; it will help to produce a better
-01 draft.

---------------------------------------------------------------------------
        From: Jerald.P.Martocci@jci.com

        I would like the following questions answered on subsequent
        revisions of the draft:

        a) What is the role of 'siblings'?  They are in the current RPL
        document, yet seem to be demoted from parental links.  As I read
        the draft, one cannot use a sibling link unless the parental links
        are exhausted.  The nodes don't seem to explicitly define sibling
        links.
DT: We will clarify this in the draft -01.  Siblings are neighboring nodes
of the same rank in the DAG.  The idea is that forwarding via siblings may
allow a packet to make progress when the parents are not available, i.e. the
sibling may have better luck making forward progress.  It is certainly
better than forwarding to a node of greater (outward) rank, and better than
dropping the packet.  So siblings are useful when making the list of
successors; first try the parents, second try siblings, last give up.
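
The ordering just described (parents first, then siblings, then give up)
can be sketched as follows.  The function names, the neighbor table layout,
and the send callback are hypothetical; nodes of greater (outward) rank are
never considered.

```python
def successors(my_rank, parents, neighbor_ranks):
    """Ordered next-hop candidates: preferred parents, then siblings.

    parents        -- parent names, already in preference order
    neighbor_ranks -- dict mapping neighbor name to its rank
    """
    siblings = sorted(name for name, rank in neighbor_ranks.items()
                      if rank == my_rank and name not in parents)
    return list(parents) + siblings

def forward(packet, my_rank, parents, neighbor_ranks, send):
    """Try each successor in order; send(hop, packet) returns True on
    success.  Returns the hop used, or None when the packet is dropped."""
    for hop in successors(my_rank, parents, neighbor_ranks):
        if send(hop, packet):
            return hop
    return None  # last: give up

ranks = {"P1": 1, "P2": 1, "S1": 2, "C1": 3}
parents_down = lambda hop, pkt: hop not in ("P1", "P2")
print(forward(b"data", 2, ["P1", "P2"], ranks, parents_down))  # S1
```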

        b) What is the plan on communication to devices within direct radio
        range?  As I read the draft unless a node is made a 'parent', or
        possibly a 'sibling' it cannot be communicated with.  However,
        Anand mentions in his memo that at the Stockholm meeting there was
        discussion on directly connected devices.  Unfortunately, I
        couldn't attend the meeting.
DT: The intent is to support this and we intend to elaborate the mechanism
in the next revision.  Please see the answer to `c' from Don
above.

        c) What are the approximate timer values for RPL?  The document
        gives no indication as to these timers, hence I can't calculate how
        long a floating DAG may be dislodged from its network.  In a
        real-time facility management control system, all nodes are
        monitored for falling off-line.  If the convergence time is too
        long (more than 1 minute), the devices will be flagged off-line and
        reported to the customer.
DT: The intent is that RPL may be parameterized first in a set of
applicability statements in each application domain and finally by the
implementors/administrators of the installed LLNs.  With this in mind, it
is clear that the document needs to better extract and clarify the role of
each timer, constraints on its values, and interactions with RPL mechanisms
in order that informed decisions can be made.  In some cases it may be
appropriate to derive relationships between the timers (e.g. timeout X
should be 3 times timeout Y), and in some cases it may be desirable to
have timers be adaptive.  We propose to clarify the timers in the -01 draft
so that there is a better basis to have these discussions.  A related task
is, e.g., specifying the proper operation of the suppression mechanism, as
has been discussed on the list.
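
To make the idea of derived timer relationships concrete, here is a small
sketch in which one timeout is defined as a multiple of another and checked
against an application budget such as the one-minute limit in the question.
The TimerProfile type, its field names, and the example values are
assumptions; the draft does not define them.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TimerProfile:
    advert_interval_s: float     # hypothetical "timeout Y"
    timeout_multiplier: int = 3  # "timeout X should be 3 times timeout Y"

    @property
    def neighbor_timeout_s(self):
        """Derived "timeout X"; never configured independently."""
        return self.timeout_multiplier * self.advert_interval_s

    def within_budget(self, budget_s):
        """True if a silent neighbor is detected within budget_s seconds."""
        return self.neighbor_timeout_s <= budget_s

building = TimerProfile(advert_interval_s=10)
print(building.neighbor_timeout_s, building.within_budget(60))
```

An applicability statement could then fix the interval per application
domain, with the dependent timeout following from it.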

        d) What is the plan for security?  RPL doesn't currently weigh in
        on the topic.  Are security policies optional or mandatory?  Must
        the policy be consistent with the rest of the enterprise's IT
        security on other parts of the network?  Will security require
        nodes to keep state info as Rene suggests?
DT: There is no doubt that security is important and challenging for LLNs,
and it is important that security be designed in and not bolted on.  We
await WG progress in the security framework, and the guidance of ROLL's
advisor Rene Struik and will incorporate the necessary mechanisms into RPL.

Of particular interest here is to identify any potential places where the
unique issues of low power and lossy networks impact routing security.  For
example, have any of the measures that have been taken to operate with a
small routing footprint or at low protocol overhead increased
vulnerabilities?  (It is conceivable that by recording very little
information, being very parsimonious about what is communicated, and coping
with transient inconsistencies and uncertainty, these networks are actually
less vulnerable.) This may be distinct from the family of security issues for
the end-hosts that may have interfaces to such networks.  Of course, we
need also to attend to the relatively common case where a node is both host
and router.

        e) What requirements from the 4 requirement drafts are being
        considered?  When the DT first was engaged, it said it would
        publish the list of requirements as an appendix and note if the
        current draft supported which requirement.  This hasn't occurred
        and may be adding to the email angst.  As a requirements author, I
        would rather know up front that some of my MUST requirements are
        not being considered.
DT: All requirements are to be considered, as summarized in
http://www.ietf.org/mail-archive/web/roll/current/msg01291.html (but
superseded as the application drafts are updated to become RFCs).  The
intent is not to re-iterate the requirements in the appendix, but rather to
justify any requirements that are not met in the final specification.  WG
members must help to ensure that the requirements are suitably met by the
proposed specification.

        f) What is the expected frame overhead size based on all the above
        criteria?  6LoWPAN requires subheader overhead for mesh, fragmentation,
        UDP/TCP.  We need to add in the encryption and authentication overhead;
        and now maybe source routing.  I realize ROLL is trying to be
        agnostic to L2, however in practice 802.15.4 is the only game in
        town.  It only carries a 128 byte packet size.  The DT/WG should at
        least give an accounting of the expected frame overhead so we can
        determine what L2s are feasible for the protocol.
DT: As small as possible ;)  ROLL is agnostic to L2, and there are most
certainly folks in the WG involved with L2 other than 802.15.4.  We do
recognize that typical LLN solutions will be using very constrained link
technologies.  Let us continue to clarify the RPL specification, with the
goal in mind to provide the simplest, smallest footprint core mechanism,
and then see that any overhead is efficiently allocated.  Then we should
have justification of what an L2 may need to provide to support RPL,
whether or not it is appropriate, and/or what sort of
adaptation/subnetwork layer is required for RPL operation.

As a related point, there is still work to do with regards to header
compression/address compression, ...

        g) What routing data must persist a warmstart/coldstart?  The
        overhead to establish a DAG is significant.  We should consider
        what data might be able to transcend a network reboot to minimize
        communication startup bottlenecks.
DT: It is not *required* that routing data persist a warmstart/coldstart,
e.g. RPL should not be fundamentally broken by a failure to do so, but as
you note it may certainly help to make things more efficient.  We propose
to clarify what data an implementation may consider persisting in the next
version of the draft.

        As Pascal has noted in previous emails, some of these issues may be
        considered out of bounds for the protocol and should be an
        implementation decision.  That may indeed be true, but RPL should
        state its case explicitly.  In the Building Market, it is
        multi-disciplined (HVAC, Fire, Lighting) and multi-vendor.  Unless
        some higher authority prescribes operation, the chances of LLN
        node-to-node interoperability are moot.

               -------------------------------------------------------

        RPL currently makes no mention of sleepy devices.  These will be
        very commonplace in WSNs.  I would like to amend my list below and
        add sleepy device management.  RPL needs to state what happens to
        packets routed to sleepy devices while they are asleep.  Are they
        dropped?  Will a proxy manage the packet until the device awakens?
        Will the last router return an error to the source?
DT: This should be handled with an adaptation layer, and the underlying L2,
although L3 may need to be aware of such devices, e.g. to provision proper
DA timeouts for `long sleepers', etc.  We propose to follow the lead of
6LoWPAN here and related L2 efforts such as 802.15.4e for the case of
LoWPANs.  PLC L2 layers and emerging WLAN, e.g., low power 802.11 have
related but distinct behaviors.


---------------------------------------------------------------------------
        From: "Mathilde Durvy (mdurvy)" <mdurvy@cisco.com>

        I would personally want to see more discussion on 3 topics where
        I'm still not fully convinced:

        [SIBLINGS] The role of siblings: this adds quite some extra
        complexity to the routing but what does it really bring in
        practice?  Should siblings be part of the foundation layer? My
        opinion is no.
DT: For the moment we think yes; see the answer to Jerry's question `a'
above.  But what does the rest of the WG think?

        [METRICS] If the choice of possible metrics is specified in the
        metric draft, why not impose the right conditions on these metrics
        so that we can base the DAG construction and the loop avoidance on
        a single metric rather than have a separate hop count or depth?
DT: We will clarify this further in the draft.  The intent is that the
depth (i.e. `rank') is to serve equivalently to this single metric you
described: its value is determined by the mechanism defined in the OCP, it
is to be derived in a way that is related to position in the DAG with
certain properties useful for loop avoidance, and its loop avoidance
properties are to be universal such that neighboring nodes may understand
even in the case where they don't understand the OCP (e.g. `esperanto').
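
One way to picture the universal property: whatever the OCP does to compute
rank, a neighbor can apply a plain numeric comparison without understanding
the OCP.  The sketch below assumes that rank grows outward from the DAG
root and that parents must have strictly lower rank; the function names and
the fixed step used by rank_from_parents are hypothetical stand-ins for an
OCP.

```python
def eligible_parents(my_rank, neighbor_ranks):
    """Neighbors that may act as DAG parents: strictly lower rank only."""
    return sorted(name for name, rank in neighbor_ranks.items()
                  if rank < my_rank)

def rank_from_parents(parent_ranks, step=1):
    """Stand-in for an OCP: place the node strictly below all parents."""
    return max(parent_ranks) + step

ranks = {"A": 0, "B": 1, "D": 2}
my_rank = rank_from_parents([ranks["A"], ranks["B"]])
print(my_rank, eligible_parents(my_rank, ranks))
```

Because every parent link goes to a strictly lower rank, following parent
links can never return to the starting node.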

        [TIMERS] Interaction between timers: I'm thinking in particular of
        the interaction of the RA timers and the DAG hop timers. In the
        loop avoidance example given by Jonathan during the IETF meeting
        can we guarantee the right sequencing of RA / DAG Hop timer firing?
DT: To be clarified, as per Jerry's question `c' above.  In addition, note
that we may not always be able to guarantee the sequencing of the RA / DAG
Hop timers as in the example.  In the best case scenario the timer values
should allow for the coordinated freezing of the sub-DAG and subsequent DAG
Merge.  But in some cases, e.g. due to comm loss, the freezing might not be
coordinated and the DAG Merge might be choppy/fragmented.  RPL should
still be able to operate, by detecting the resulting inconsistencies and
repairing them.

---------------------------------------------------------------------------
        From: "Julien Abeille (jabeille)" <jabeille@cisco.com>

        a few more items:
        [Packet format] Clarify the rationale to use RAs/NAs. More precisely:
        - are RAs sent for router discovery/prefix discovery the same as
                those used by ROLL (same timer).
                -- If yes, does it impact the way router discovery/prefix
                        discovery work (performance, do they really have
                        the same timing constraints?)

DT: The DIO is specified as an RA option, and just like any other option,
it doesn't have to be sent in every RA. Sending the DIO does not have to be
on the same timer. However, it may be advantageous to include a prefix
information option along with a DIO so that a node can also autoconfigure
its address in addition to configuring a default route with a single
message transmission (reduced energy, channel utilization, etc). As a
result, it may mean that some options specified in RFC4861 may be sent more
frequently than if RPL were not using RA messages as a transport. We don't
think there is any issue with sending options more frequently than expected;
if you think there is, please raise it.

                -- if yes, what is the approach with regards to update of
                        RFC4861 (6lowpan-nd is not in scope in my opinion
                        as it is L2 dependent, while ROLL is not)?
                -- if not, why not using a specific packet?

        - NA:
                -- Are NAs sent for ROLL used for anything else? If not why
                        not use a specific packet to carry DAOs?

                -- NAs are already pretty overloaded (DAD, NUD, Address
                        resolution), using a different packet may bring NA
                        processing complexity down.
DT:  There was some discussion at IETF-75 regarding trying to generalize
the ND binding.  For now we may concentrate on the form and function of
DIOs/DAOs and how they interact in the RPL protocol, leaving the RA and
NA in place for the moment.  Once the core mechanisms are refined we may
then revisit/refine what is used to carry the advertisements.  A clear
preference would be to not cause modification to established IPv6 ND
behaviour.




- END -