[6tisch] Benjamin Kaduk's Discuss on draft-ietf-6tisch-msf-12: (with DISCUSS and COMMENT)

Benjamin Kaduk via Datatracker <noreply@ietf.org> Wed, 11 March 2020 17:55 UTC

Archived-At: <https://mailarchive.ietf.org/arch/msg/6tisch/k1MBKZWtpHICjeeRDIqA2_qbZ4A>

Benjamin Kaduk has entered the following ballot position for
draft-ietf-6tisch-msf-12: Discuss

When responding, please keep the subject line intact and reply to all
email addresses included in the To and CC lines. (Feel free to cut this
introductory paragraph, however.)

Please refer to https://www.ietf.org/iesg/statement/discuss-criteria.html
for more information about IESG DISCUSS and COMMENT positions.

The document, along with other ballot positions, can be found here:


I'm concerned that the scheduling function for autonomous cells can
cause an infinite loop in the case of hash collision -- Section 3
specifies that AutoTxCell always takes precedence over AutoRxCell, but
if those two cells collide, the corresponding cells on the peer in
question will also collide.  If both peers try to send at the same time
and the hashes collide, they will both attempt to transmit indefinitely
and never be received.
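
The scenario can be sketched as follows. This is an illustrative model
only: toy_hash, the node identifiers, and the slotframe length are
assumptions chosen to force a collision; they are not the SAX function
or values from the draft.

```python
def toy_hash(eui64: int, slotframe_len: int) -> int:
    # Hypothetical stand-in for the draft's hash: EUI64 -> slot offset.
    return 1 + eui64 % (slotframe_len - 1)


def action_in_slot(node: int, peer: int, has_frame: bool,
                   slotframe_len: int, slot: int) -> str:
    """Return what `node` does in `slot`: 'tx', 'rx', or 'idle'."""
    rx_slot = toy_hash(node, slotframe_len)  # AutoRxCell: hash of own EUI64
    tx_slot = toy_hash(peer, slotframe_len)  # AutoTxCell: hash of peer's EUI64
    if slot == tx_slot and has_frame:
        return "tx"  # AutoTxCell takes precedence over AutoRxCell
    if slot == rx_slot:
        return "rx"
    return "idle"


SLOTFRAME_LEN = 101
NODE_A, NODE_B = 5, 105  # chosen so that both identifiers hash to one slot

slot = toy_hash(NODE_A, SLOTFRAME_LEN)
a_action = action_in_slot(NODE_A, NODE_B, True, SLOTFRAME_LEN, slot)
b_action = action_in_slot(NODE_B, NODE_A, True, SLOTFRAME_LEN, slot)
print(a_action, b_action)  # both peers transmit, nobody listens
```

In this model the situation resolves only if one peer has nothing to
send; as long as both have a frame queued, it repeats every slotframe.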

There seems to be some "passing the buck" going on with respect to
rate-limiting unauthenticated (join) traffic:
draft-ietf-6tisch-minimal-security (Section 6.1.1) says that the SF
"SHOULD NOT allocate additional cells as a result of traffic with code
point AF43"; this document is implementing a SF, and yet we try to avoid
the issue, saying that "[t]he at IPv6 layer SHOULD ensure that this join
traffic is rate-limited before it is passed to 6top sublayer where MSF
can observe it".  I think we need a clear and consistent story about
where this rate-limiting is supposed to happen.


I support Roman's Discuss -- we need more information for this to be a
useful reference; even what seem to be the official DASFAA 1997
proceedings (https://dblp.org/db/conf/dasfaa/dasfaa97) do not have an
associated document.

Basing various scheduling aspects on (a hash of) the EUI64 ties
functionality to a persistent identifier for a device.  How significant
a disruption would be incurred if a device periodically changes its
presented EUI64 for anonymization purposes?

There seems to be a general pattern of "if you don't have a
6P-negotiated Tx cell, install an AutoTxCell to send your one message
and then remove it after sending"; I wonder if it would be easier on the
reader to consolidate this as a general principle and not repeat the
details every time it occurs.

Requirements Language

"NOT RECOMMENDED" is not in the RFC2119 boilerplate (but is a BCP 14 keyword).

Section 1

   the 6 steps described in Section 4.  The end state of the join
   process is that the node is synchronized to the network, has mutually
   authenticated to the network, has identified a routing parent, and

nit(?): I guess maybe "mutually authenticated with" is more correct for
the bidirectional operation.

   It does so for 3 reasons: to match the link-layer resources to the
   traffic, to handle changing parent, to handle a schedule collision.

nit: end the list with "or" (or "and"?).

   MSF works closely with RPL, specifically the routing parent defined
   in [RFC6550].  This specification only describes how MSF works with
   one routing parent, which is phrased as "selected parent".  The

nit: I suggest '''one routing parent; this parent is referred to as the
"selected parent"'''.

   activity of MSF towards to single routing parent is called as a "MSF

nit: "towards the"

   *  We added sections on the interface to the minimal 6TiSCH
      configuration (Section 2), the use of the SIGNAL command
      (Section 6), the MSF constants (Section 14), the MSF statistics
      (Section 15).

nit: end the list with "and".

Section 2

   In a TSCH network, time is sliced up into time slots.  The time slots
   are grouped as one of more slotframes which repeat over time.  The

nit(?): should this be "one or more"?

   channel) is indicated as a cell of TSCH schedule.  MSF is one of the
   policies defining how to manage the TSCH schedule.

nit: if there is only one such policy active at a given time for a given
network, I suggest "MSF is a policy for managing the TSCH schedule".
(If multiple policies are active simultaneously, no change is needed.)

   MSF uses the minimal cell for broadcast frames such as Enhanced
   Beacons (EBs) [IEEE802154] and broadcast DODAG Information Objects
   (DIOs) [RFC6550].  Cells scheduled by MSF are meant to be used only
   for unicast frames.

If this paragraph was moved before the previous paragraph, then EB and
DIO would be defined before their first usage.

   bandwidth of minimal cell.  One of the algorithm met the rule is the
   Trickle timer defined in [RFC6206] which is applied on DIO messages
   [RFC6550].  However, any such algorithm of limiting the broadcast

nit(?): "One of the algorithms that fulfills this requirement"?

   MSF RECOMMENDS the use of 3 slotframes.  MSF schedules autonomous
   cells at Slotframe 1 (Section 3) and 6P negotiated cells at Slotframe
   2 (Section 5) , while Slotframe 0 is used for the bootstrap traffic
   as defined in the Minimal 6TiSCH Configuration.  It is RECOMMENDED to
   use the same slotframe length for Slotframe 0, 1 and 2.  Thus it is

Perhaps this is just a question of writing style, but if an
implementation is free to use an alternative SF or a variant of MSF,
could we not say that "MSF uses 3 slotframes", "MSF uses the same
slotframe length for", etc.?

Section 3

Is there any risk of unwanted correlation between slot and channel
offsets when using the same hash function and input for both?

   hash function.  Other optional parameters defined in SAX determine
   the performance of SAX hash function.  Those parameters could be
   broadcasted in EB frame or pre-configured.  For interoperability
   purposes, an example how the hash function is implemented is detailed
   in Appendix B.

Given the lack of usable reference for [SAX-DASFAA], I assume that the
content in Appendix B is going to be used as a specification, not just
an example.

   *  The AutoRxCell MUST always remain scheduled after synchronized.

nit: s/synchronized/synchronization/

   AutoRxCell.  In case of conflicting with a negotiated cell,
   autonomous cells take precedence over negotiated cell, which is
   stated in [IEEE802154].  However, when the Slotframe 0, 1 and 2 use
   the same length value, it is possible for negotiated cell to avoid
   the collision with AutoRxCell.

Presumably this factors in to the recommendation to have the three
listed slotframes use the same length, but mentioning it explicitly
(whether here or where the recommendation is made) might be nice.

Section 4

   network.  Alternative behaviors may involved, for example, when
   alternative security solution is used for the network.  Section 4.1

nit: singular/plural mismatch "behaviors"/"solution is used"

Section 4.1

   A node implementing MSF SHOULD implement the Minimal Security
   Framework for 6TiSCH [I-D.ietf-6tisch-minimal-security].  As a

Didn't this get renamed to CoJP?

Section 4.2

I wonder a little whether there is a better description than "available
frequencies", but I don't have one to offer.

Section 4.3

   While the exact behavior is implementation-specific, it is
   RECOMMENDED that after having received the first EB, a node keeps
   listen for at most MAX_EB_DELAY seconds until it has received EBs
   from NUM_NEIGHBOURS_TO_WAIT distinct neighbors, which is defined in

nit(?): this phrasing implies that only NUM_NEIGHBOURS_TO_WAIT is
defined in RFC 8180, but MAX_EB_DELAY is also defined there.

not-nit: this phrasing is ambiguous as to whether one of MAX_EB_DELAY
and NUM_NEIGHBOURS_TO_WAIT is sufficient to move to the next step or
whether both are required.
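
The two readings can be written out as stop conditions. This is only a
sketch: the function names and the numeric values used in the example
are hypothetical, not taken from the draft or from RFC 8180.

```python
def done_if_either(elapsed_s: float, neighbors_heard: int,
                   max_eb_delay: float, num_neighbours_to_wait: int) -> bool:
    # Reading 1: stop when the timer expires OR enough neighbors were heard.
    return (elapsed_s >= max_eb_delay
            or neighbors_heard >= num_neighbours_to_wait)


def done_if_both(elapsed_s: float, neighbors_heard: int,
                 max_eb_delay: float, num_neighbours_to_wait: int) -> bool:
    # Reading 2: stop only when the timer expired AND enough were heard.
    return (elapsed_s >= max_eb_delay
            and neighbors_heard >= num_neighbours_to_wait)


# Hypothetical values: 10 s elapsed, 2 neighbors heard, limits 180 s and 2.
print(done_if_either(10, 2, 180, 2), done_if_both(10, 2, 180, 2))
```

The readings diverge exactly when one limit has been reached and the
other has not, which is the case the text should settle.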

Section 4.4

   After selected a JP, a node generates a Join Request and installs an
   AutoTxCell to the JP.  The Join Request is then sent by the pledge to
   its JP over the AutoTxCell.  The AutoTxCell is removed by the pledge

editorial: I'd suggest s/its JP/its selected JP/

   Response is sent out.  The pledge receives the Join Response from its
   AutoRxCell, thereby learns the keying material used in the network,
   as well as other configurations, and becomes a "joined node".

nit: maybe "other configuration values" or "other configuration

Section 4.6

   Once it has selected a routing parent, the joined node MUST generate
   a 6P ADD Request and install an AutoTxCell to that parent.  The 6P
   ADD Request is sent out through the AutoTxCell with the following

   *  CellOptions: set to TX=1,RX=0,SHARED=0
   *  NumCells: set to 1
   *  CellList: at least 5 cells, chosen according to Section 8

Is this listing describing the contents of the ADD request or the
AutoTxCell used to send it?  (I presume the former, in which case I
suggest using "containing" or similar in preference to "with".)

Section 5.1

   The goal of MSF is to manage the communication schedule in the 6TiSCH
   schedule in a distributed manner.  For a node, this translates into
   monitoring the current usage of the cells it has to the selected

Is this goal strictly limited to traffic "to the selected parent" vs.
all traffic?

   *  If the node determines that the number of link-layer frames it is
      attempting to exchange with the selected parent per unit of time
      is larger than the capacity offered by the TSCH negotiated cells
      it has scheduled with it, the node issues a 6P ADD command to that
      parent to add cells to the TSCH schedule.
   *  If the traffic is lower than the capacity, the node issues a 6P
      DELETE command to that parent to delete cells from the TSCH

As written, this would potentially lead to oscillation when demand is
basically at capacity, due to the quantization of capacity.  Perhaps
some provisioning for hysteresis is appropriate?
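
One way to provide hysteresis is a dead band between two thresholds, as
in the sketch below. The function name and the 75%/25% values are
illustrative assumptions, not a statement of what the draft specifies.

```python
def adapt(num_cells_used: int, num_cells_elapsed: int,
          high: float = 0.75, low: float = 0.25) -> str:
    """Decide on a 6P action from cell usage, with a dead band."""
    if num_cells_elapsed == 0:
        return "NONE"  # nothing measured yet
    usage = num_cells_used / num_cells_elapsed
    if usage > high:
        return "ADD"     # clearly above capacity: add a cell
    if usage < low:
        return "DELETE"  # clearly below capacity: delete a cell
    return "NONE"        # inside the dead band: leave the schedule alone


for used in (90, 50, 10):
    print(used, adapt(used, 100))
```

With a single threshold, demand hovering at capacity would alternate
between ADD and DELETE; the gap between `low` and `high` absorbs that.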

   The cell option of cells listed in CellList in 6P Request frame
   SHOULD be either Tx=1 only or Rx=1 only.  Both NumCellsElapsed and
   NumCellsUsed counters can be used to both type of negotiated cells.

Would this be more clear as "(Tx=1,Rx=0) or (Tx=0,Rx=1)"?

   *  NumCellsElapsed is incremented by exactly 1 when the current cell
      is AutoRxCell.

This holds for all peers/parents we're keeping counters for, so the
AutoRxCell can get "double counted"?

   In case that a node booted or disappeared from the network, the cell
   reserved at the selected parent may be kept in the schedule forever.
   A clean-up mechanism MUST be provided to resolve this issue.  The
   clean-up mechanism is implementation-specific.  It could either be a
   periodic polling to the neighbors the nodes have negotiated cells
   with, or monitoring the activities on those cells.  The goal is to
   confirm those negotiated cells are not used anymore by the associated
   neighbors and remove them from the schedule.

I'm not sure that "monitoring the activities on those cells" is safe
with the current level of specification; if a node negotiates a 6P
transmit cell to a parent and uses it only sparingly, with the parent
eventually reclaiming it due to inactivity, I don't see a mechanism by
which the node will reliably discover the negotiated cell to be
nonfunctional and fall back to (e.g.) the corresponding AutoTxCell.  It
may be most prudent to just not mention that as an example (a "periodic
polling" procedure does not seem to have the same potential for
information skew).

Section 5.3

   schedule is executed and the node sends frames to that parent.  When
   NumTx reaches MAX_NUMTX, both NumTx and NumTxAck MUST be divided by
   2.  For example, when MAX_NUMTX is set to 256, from NumTx=255 and
   NumTxAck=127, the counters become NumTx=128 and NumTxAck=64 if one
   frame is sent to the parent with an Acknowledgment received.  This
   operation does not change the value of the PDR, but allows the
   counters to keep incrementing.  The value of MAX_NUMTX is

Does MAX_NUMTX need to be a power of two (to avoid errors when the
division occurs)?
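
A small sketch of the counter update makes the rounding question
concrete. It assumes integer (floor) division, which the quoted text
does not state; record_tx is a hypothetical helper name.

```python
def record_tx(num_tx: int, num_tx_ack: int, acked: bool,
              max_numtx: int) -> tuple[int, int]:
    """Count one transmission and halve both counters at the limit."""
    num_tx += 1
    if acked:
        num_tx_ack += 1
    if num_tx >= max_numtx:
        num_tx //= 2      # assumption: floor division
        num_tx_ack //= 2
    return num_tx, num_tx_ack


# The example from the quoted text: 255/127 plus one acknowledged frame.
print(record_tx(255, 127, True, 256))  # (128, 64)

# An odd NumTxAck at the moment of halving loses one unit.
before = (256, 127)
after = record_tx(255, 126, True, 256)
print(before[1] / before[0], after[1] / after[0])
```

Under this assumption the PDR is preserved exactly only when both
counters are even at the moment of halving. NumTxAck can be odd for any
MAX_NUMTX, so a power-of-two limit removes the rounding error for NumTx
but not for NumTxAck.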

   4.  For any other cell, it compares its PDR against that of the cell
       with the highest PDR.  If the difference is larger than
       RELOCATE_PDRTHRES, it triggers the relocation of that cell using
       a 6P RELOCATE command.

The recommended RELOCATE_PDRTHRES is given as "50 %".  Is this
"difference" computed as a subtraction (so that if the highest PDR is
less than 50%, no cells can ever be relocated) or as a ratio (a PDR that
is half the maximum PDR or smaller will trigger relocation)?
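
The two interpretations give different answers for the same cells, as
the sketch below shows. Both function names are hypothetical, and PDR
values are expressed as fractions between 0 and 1.

```python
def relocate_by_difference(pdr: float, best_pdr: float,
                           threshold: float = 0.5) -> bool:
    # Subtraction reading: relocate if the gap exceeds the threshold.
    return best_pdr - pdr > threshold


def relocate_by_ratio(pdr: float, best_pdr: float,
                      threshold: float = 0.5) -> bool:
    # Ratio reading: relocate if the PDR is at most threshold * best PDR.
    return pdr <= best_pdr * threshold


# Best cell at 40% PDR, candidate cell at 10% PDR.
print(relocate_by_difference(0.1, 0.4))  # False
print(relocate_by_ratio(0.1, 0.4))       # True
```

Under the subtraction reading, a node whose best cell has a PDR below
the threshold can never relocate anything, however bad its other cells
are.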

Section 7

Maybe reference Section 17.1 where the allocation will occur?

Section 8

   *  The slotOffset of a cell in the CellList SHOULD be randomly and
      uniformly chosen among all the slotOffset values that satisfy the
      restrictions above.
   *  The channelOffset of a cell in the CellList SHOULD be randomly and
      uniformly chosen in [0..numFrequencies], where numFrequencies
      represents the number of frequencies a node can communicate on.

Do these random selections need to be independent from each other?  (I
note that the selection for the autonomous cells are not.)
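
For reference, independent selection could look like the sketch below.
The helper name and parameters are hypothetical, and the channel offset
is drawn from 0 to numFrequencies - 1, which is an assumption about how
the quoted range is meant to be read.

```python
import random


def pick_cells(candidate_slots: list[int], num_frequencies: int,
               count: int, rng=random) -> list[tuple[int, int]]:
    """Pick `count` cells with distinct, uniformly chosen slot offsets.

    Each channel offset is drawn independently of the slot offset.
    """
    slots = rng.sample(candidate_slots, count)
    return [(slot, rng.randrange(num_frequencies)) for slot in slots]


# Hypothetical example: slots 1..100 allowed, 16 frequencies, 5 cells.
cells = pick_cells(list(range(1, 101)), 16, 5, random.Random(0))
print(cells)
```

Here the channel offset carries no information about the slot offset,
unlike a scheme that derives both from one hash of the same input.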

Section 9

Is there a reference for these three parameters (MAXBE, MAXRETRIES,
SLOTFRAME_LENGTH)?  SLOTFRAME_LENGTH seems new in this document and is
listed in the table in Section 14, but the other two are not listed

Section 14

Why is MAX_NUMTX not listed in the table?

Can we really give a recommended NUM_CH_OFFSET value, since this is in
effect dependent on the number of channels available?

KA_PERIOD is defined but not used elsewhere in the document.

What are the considerations in using a power of 10 vs. a power of 2 as

Section 16

   MSF defines a series of "rules" for the node to follow.  It triggers
   several actions, that are carried out by the protocols defined in the
   following specifications: the Minimal IPv6 over the TSCH Mode of IEEE
   802.15.4e (6TiSCH) Configuration [RFC8180], the 6TiSCH Operation

I'd suggest a brief note that the security considerations of those
protocols continue to apply (even though it ought to be obvious);
reading them could help a reader understand the behavior of this
document as well.

   Sublayer Protocol (6P) [RFC8480], and the Minimal Security Framework
   for 6TiSCH [I-D.ietf-6tisch-minimal-security].  In particular, MSF

[CoJP again]

   prevent it from receiving the join response.  This situation should
   be detected through the absence of a particular node from the network
   and handled by the network administrator through out-of-band means,
   e.g. by moving the node outside the radio range of the attacker.

"the radio range of the attacker" is not exactly a fixed constant ...
attackers are not in general bound by legal limits and can increase Tx
power subject only to their equipment and budget.

   MSF adapts to traffics containing packets from IP layer.  It is
   possible that the IP packet has a non-zero DSCP (Diffserv Code Point
   [RFC2597]) value in its IPv6 header.  The decision whether to hand

RFC 2597 is talking more about specifically assured forwarding PHB groups
than "DSCP codepoint"s per se.

Section 18.1

RFC 6206 seems to only be used as an example (Trickle), and could
probably be informative.

RFC 8505 might also not need to be normative.

Appendix B

   In MSF, the T is replaced by the length slotframe 1.  String s is

nit: "length of"

   2.  sum the value of L_shift(h,l_bit), R_shift(h,r_bit) and ci

Is this addition performed in "infinite precision" integer arithmetic or
limited to the output width of h, e.g., by modular division?  (It's not
clear to me whether this is the role T plays or not.)

   8.  assign the result of Step 5 to h

The value from step 5 *is* h, so taken literally this says "assign h to
h" and is not needed.