Re: [6tisch] Benjamin Kaduk's Discuss on draft-ietf-6tisch-msf-12: (with DISCUSS and COMMENT)

Benjamin Kaduk <kaduk@mit.edu> Tue, 31 March 2020 23:57 UTC

Date: Tue, 31 Mar 2020 16:57:34 -0700
From: Benjamin Kaduk <kaduk@mit.edu>
To: Tengfei Chang <tengfei.chang@gmail.com>
Cc: The IESG <iesg@ietf.org>, draft-ietf-6tisch-msf@ietf.org, 6tisch <6tisch@ietf.org>, 6tisch-chairs@ietf.org, "Pascal Thubert (pthubert)" <pthubert@cisco.com>
Message-ID: <20200331235734.GU50174@kduck.mit.edu>
References: <158394932747.1671.4699004253009791924@ietfa.amsl.com> <CAAdgstSMOf7wDSfbWMv5tEzpx1=otQZX_TZ+Xevm77f-1ZztNw@mail.gmail.com> <20200324192510.GE50174@kduck.mit.edu> <CAAdgstTzBTwncFgaEao62X2s4J610_spy4oFsTYNkB1KUKhHww@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Disposition: inline
In-Reply-To: <CAAdgstTzBTwncFgaEao62X2s4J610_spy4oFsTYNkB1KUKhHww@mail.gmail.com>
User-Agent: Mutt/1.12.1 (2019-06-15)
Archived-At: <https://mailarchive.ietf.org/arch/msg/6tisch/tVDpxa32e_Sv0JPdKnN7Oy7c9Zk>
Subject: Re: [6tisch] Benjamin Kaduk's Discuss on draft-ietf-6tisch-msf-12: (with DISCUSS and COMMENT)

Hi Tengfei,

Also inline.

On Mon, Mar 30, 2020 at 01:40:24PM +0200, Tengfei Chang wrote:
> Hi Ben,
> 
> I replied inline:
> 
> On Tue, Mar 24, 2020 at 8:27 PM Benjamin Kaduk <kaduk@mit.edu> wrote:
> 
> > Hi Tengfei,
> >
> > Also inline.
> >
> > On Tue, Mar 24, 2020 at 12:22:02PM +0100, Tengfei Chang wrote:
> > >    Hi Benjamin,
> > >    I replied inline starting with '>'
> > >    Thanks so much those detailed comments!
> > >    On Wed, Mar 11, 2020 at 6:55 PM Benjamin Kaduk via Datatracker
> > >    <noreply@ietf.org> wrote:
> > >
> > >      Benjamin Kaduk has entered the following ballot position for
> > >      draft-ietf-6tisch-msf-12: Discuss
> > >
> > >      When responding, please keep the subject line intact and reply to
> > all
> > >      email addresses included in the To and CC lines. (Feel free to cut
> > this
> > >      introductory paragraph, however.)
> > >
> > >      Please refer to
> > >      https://www.ietf.org/iesg/statement/discuss-criteria.html
> > >      for more information about IESG DISCUSS and COMMENT positions.
> > >
> > >      The document, along with other ballot positions, can be found here:
> > >      https://datatracker.ietf.org/doc/draft-ietf-6tisch-msf/
> > >
> > >
> > ----------------------------------------------------------------------
> > >      DISCUSS:
> > >
> > ----------------------------------------------------------------------
> > >
> > >      I'm concerned that the scheduling function for autonomous cells can
> > >      cause an infinite loop in the case of hash collision -- Section 3
> > >      specifies that AutoTxCell always takes precedence over AutoRxCell,
> > but
> > >      if those two cells collide, the corresponding cells on the peer in
> > >      question will also collide.  If both peers try to send at the same
> > time
> > >      and the hashes collide, they will both attempt to transmit
> > indefinitely
> > >      and never be received.
> > >
> > >
> > >    >. Notice that the AutoTxCell  is a shared cell, where the back-off
> > >    mechanism is applied.
> > >    > In case there is a collision on that cell, a back-off with different
> > >    exponent will be used on each side.
> > >    > The cell will be used AutoTxCell on each side at different timing.
> >
> > Ah, it seems I was misinterpreting "take precedence over" to apply to the
> > entire local scheduling, not merely the case when independent tx and rx
> > scheduling land on the same cell.  Thanks for clarifying here; is there
> > anything useful to say in the document about how even if there is a
> > collision in the assigned slot there's still a Tx backoff, so the cell is
> > usable for Rx some of the time?
> >
> 
> * RESPONSE: * We could add the following sentence right after the hashing
> collision statement:
> 
> Note that the AutoTxCell is a shared cell, to which the back-off mechanism
> applies.  When the AutoTxCell and AutoRxCell collide, the AutoTxCell takes
> precedence if there is a packet to transmit.
> During a back-off period, the AutoRxCell is used.

That sounds good, thanks.
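
To make sure we read the proposed sentence the same way, here is a minimal
sketch of the per-slot decision when the AutoTxCell and AutoRxCell land on
the same cell.  The function name and the "backoff_remaining" counter are
my own illustration and are not taken from the draft.

```python
def action_on_colliding_cell(has_packet: bool, backoff_remaining: int) -> str:
    """Decide what to do in a slot where AutoTxCell and AutoRxCell coincide.

    backoff_remaining is a hypothetical count of shared-cell opportunities
    the node must still skip because of the back-off mechanism.
    """
    # AutoTxCell takes precedence only if there is something to send
    # and the node is not currently backing off.
    if has_packet and backoff_remaining == 0:
        return "TX"
    # Otherwise (nothing to send, or in a back-off period) listen.
    return "RX"
```

Since the two peers are expected to draw different back-off values, each
one falls into the "RX" branch part of the time, which is what keeps the
cell usable for reception.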

> 
> > >      There seems to be some "passing the buck" going on with respect to
> > >      rate-limiting unauthenticated (join) traffic:
> > >      draft-ietf-6tisch-minimal-security (Section 6.1.1) says that the SF
> > >      "SHOULD NOT allocate additional cells as a result of traffic with
> > code
> > >      point AF43"; this document is implementing a SF, and yet we try to
> > avoid
> > >      the issue, saying that "[t]he at IPv6 layer SHOULD ensure that this
> > join
> > >      traffic is rate-limited before it is passed to 6top sublayer where
> > MSF
> > >      can observe it".  I think we need a clear and consistent story about
> > >      where this rate-limiting is supposed to happen.
> > >
> > >    > Thanks for the comments! This has been discussed in some  previous
> > >    revision of MSF.
> > >    > It is not "passing the buck" but a decision based on the scheduling
> > >    function and security context.
> > >    > In the point of avoiding layer violation, the upper layer
> > information
> > >    suppose NOT see-able for linker layer where 6P and MSF are.
> >
> > If we assume strict layering so that IP information is not visible to the
> > link layer where the scheduling function lives, then isn't that a flaw in
> > draft-ietf-6tisch-minimal-security to say that the scheduling function
> > should do [something relying on IP-layer information]?
> >
> > >    > But regarding to security, it seems it is not avoidable.
> > >    > IMO, the scheduling function is aiming to provide algorithm to
> > >    add/remove cell according to traffic.
> > >    > The traffic could contains unauthenticated  join request from both
> > >    normal devices and malicious devices.
> > >    > The function does NOT have enough information to differentiate them.
> > >    > We are assuming some other entity out side of MSF needs to resolve
> > this
> > >    issue.
> >
> > Nonetheless, we're currently not fulfilling a requirement that a SF should
> > meet.  If that requirement is unattainable, the requirement should be
> > modified or removed; if not, we should attain the requirement.
> >
> > >    >> If assuming the security info in the Ipv6 header is passed to MSF,
> > we
> > >    could abandon rate-limiting approach and simply jumping over a slot
> > if the
> > >    AF43 packet is sent on that slot.
> > >    > Hence the adapting traffic never happens to traffic marked as AF43.
> > >
> > >
> > ----------------------------------------------------------------------
> > >      COMMENT:
> > >
> > ----------------------------------------------------------------------
> > >
> > >      I support Roman's Discuss -- we need more information for this to be a
> > >      useful reference; even what seem to be the official DASFAA 1997
> > >      proceedings (https://dblp.org/db/conf/dasfaa/dasfaa97) do not have an
> > >      associated document.
> > >
> > >      Basing various scheduling aspects on (a hash of) the EUI64 ties
> > >      functionality to a persistent identifier for a device.  How
> > significant
> > >      a disruption would be incurred if a device periodically changes its
> > >      presented EUI64 for anonymization purposes?
> > >
> > >    > I assume you are saying a malicious device?
> > >    > There is no doubt this will influence the performance of joining
> > process
> > >    for normal devices.
> > >    > But normal devices still have a chance to join.
> > >    > the join proxy won't be affect as well since the cell will be
> > removed
> > >    right after the packet is sent out.
> >
> > I was thinking a non-malicious device, just one that (for example) changes
> > its physical location frequently, and wants to change its EUI64 when it
> > does so, to avoid that location being tracked and correlated over time.
> > That said, your answer still seems to answer my question, and since normal
> > devices will still have a chance to join, it seems like we probably do not
> > need to add text to discuss this situation.
> >
> 
> *RESPONSE:* Great!
> 
> >
> > >      There seems to be a general pattern of "if you don't have a
> > >      6P-negotiated Tx cell, install an AutoTxCell to send your one
> > message
> > >      and then remove it after sending"; I wonder if it would be easier
> > on the
> > >      reader to consolidate this as a general principle and not repeat the
> > >      details every time it occurs.
> > >
> > >    >  Yes, this is the feature of autonomous cell. Not sure if it would
> > >    easier to understand state just one time.
> > >    > There is little different for each adding/removing, e.g which node
> > to do
> > >    so, parent/JP?
> > >    > I personally feel it's clear to repeat this every time,  with
> > various
> > >    type of node, so highlighting the difference.
> >
> > Okay.  Thank you for considering the idea.
> >
> > >      Requirements Language
> > >
> > >      "NOT RECOMMENDED" is not in the RFC2119 boilerplate (but is a BCP 14
> > >      keyword).
> > >
> > >    > Thanks for pointing out. It will be removed in next revision.
> > >    > We also updated the RFC to RFC8174 instead of RFC2119.
> >
> > Oops, I think my comment was unclear.
> > RFC 8174 has a paragraph in it that you should copy/paste into your
> > document to replace this one.  ("NOT RECOMMENDED" is included in that
> > paragraph in RFC 8174.)
> >
> > Also, you should cite both RFC 2119 and RFC 8174, not just RFC 8174 -- BCP
> > 14 comprises both of them together.
> >
> > * RESPONSE: * Thanks for clarifying! Will use the paragraph from RFC 8174.
> 
> 
> > >      Section 1
> > >
> > >         the 6 steps described in Section 4.  The end state of the join
> > >         process is that the node is synchronized to the network, has
> > mutually
> > >         authenticated to the network, has identified a routing parent,
> > and
> > >
> > >      nit(?): I guess maybe "mutually authenticated with" is more correct
> > for
> > >      the bidirectional operation.
> > >
> > >    > will update in next revision.
> > >
> > >         It does so for 3 reasons: to match the link-layer resources to
> > the
> > >         traffic, to handle changing parent, to handle a schedule
> > collision.
> > >
> > >      nit: end the list with "or" (or "and"?).
> > >
> > >    > will update in next revision.
> > >
> > >         MSF works closely with RPL, specifically the routing parent
> > defined
> > >         in [RFC6550].  This specification only describes how MSF works
> > with
> > >         one routing parent, which is phrased as "selected parent".  The
> > >
> > >      nit: I suggest '''one routing parent; this parent is referred to as
> > the
> > >      "selected parent"'''.
> > >
> > >    > will update in next revision.
> > >
> > >         activity of MSF towards to single routing parent is called as a
> > "MSF
> > >
> > >      nit: "towards the"
> > >
> > >    > will update in next revision.
> > >
> > >         *  We added sections on the interface to the minimal 6TiSCH
> > >            configuration (Section 2), the use of the SIGNAL command
> > >            (Section 6), the MSF constants (Section 14), the MSF
> > statistics
> > >            (Section 15).
> > >
> > >      nit: end the list with "and".
> > >
> > >    > will update in next revision.
> > >
> > >      Section 2
> > >
> > >         In a TSCH network, time is sliced up into time slots.  The time
> > slots
> > >         are grouped as one of more slotframes which repeat over time.
> > The
> > >
> > >      nit(?): should this be "one or more"?
> > >
> > >    > it should be 'one or multiple slotframes". Will update in next
> > revision
> > >
> > >         channel) is indicated as a cell of TSCH schedule.  MSF is one of
> > the
> > >         policies defining how to manage the TSCH schedule.
> > >
> > >      nit: if there is only one such policy active at a given time for a
> > given
> > >      network, I suggest "MSF is a policy for managing the TSCH schedule".
> > >      (If multiple policies are active simultaneously, no change is
> > needed.)
> > >
> > >    > As indicated in RFC8480: A node MAY implement multiple SFs  and run
> > them
> > >    at the same time.
> > >    > so MSF is one of the policies defining how to manage the TSCH
> > schedule.
> >
> > Thank you for the reference, and sorry for missing it.
> >
> > >         MSF uses the minimal cell for broadcast frames such as Enhanced
> > >         Beacons (EBs) [IEEE802154] and broadcast DODAG Information
> > Objects
> > >         (DIOs) [RFC6550].  Cells scheduled by MSF are meant to be used
> > only
> > >         for unicast frames.
> > >
> > >      If this paragraph was moved before the previous paragraph, then EB
> > and
> > >      DIO would be defined before their first usage.
> > >
> > >    > Maybe I understand it wrong. Do you mean you prefer to move this
> > >    paragraph before the previous one?
> > >    > The EB and DIO are defined in the references, not sure we still need
> > >    define them in MSF.
> >
> > That is my preference, but I defer to your preference where it differs from
> > mine.
> >
> > >         bandwidth of minimal cell.  One of the algorithm met the rule is
> > the
> > >         Trickle timer defined in [RFC6206] which is applied on DIO
> > messages
> > >         [RFC6550].  However, any such algorithm of limiting the broadcast
> > >
> > >      nit(?): "One of the algorithms that fulfills this requirement"?
> > >
> > >    > will update accordingly.
> > >
> > >         MSF RECOMMENDS the use of 3 slotframes.  MSF schedules autonomous
> > >         cells at Slotframe 1 (Section 3) and 6P negotiated cells at
> > Slotframe
> > >         2 (Section 5) , while Slotframe 0 is used for the bootstrap
> > traffic
> > >         as defined in the Minimal 6TiSCH Configuration.  It is
> > RECOMMENDED to
> > >         use the same slotframe length for Slotframe 0, 1 and 2.  Thus it
> > is
> > >
> > >      Perhaps this is just a question of writing style, but if an
> > >      implementation is free to use an alternative SF or a variant of MSF,
> > >      could we not say that "MSF uses 3 slotframes", "MSF uses the same
> > >      slotframe length for", etc.?
> > >
> > >    > updated to "3 slotframes are used in MSF. " , "The same slotframe
> > length
> > >    for Slotframe 0, 1 and 2 is RECOMMENDED".
> > >
> > >      Section 3
> > >
> > >      Is there any risk of unwanted correlation between slot and channel
> > >      offsets when using the same hash function and input for both
> > >      calculations?
> > >
> > >         hash function.  Other optional parameters defined in SAX
> > determine
> > >         the performance of SAX hash function.  Those parameters could be
> > >         broadcasted in EB frame or pre-configured.  For interoperability
> > >         purposes, an example how the hash function is implemented is
> > detailed
> > >         in Appendix B.
> > >
> > >      Given the lack of usable reference for [SAX-DASFAA], I assume that
> > the
> > >      content in Appendix B is going to be used as a specification, not
> > just
> > >      an example.
> > >
> > >    > the new reference for SAX is updated in the new revision.
> > >
> > >         *  The AutoRxCell MUST always remain scheduled after
> > synchronized.
> > >
> > >      nit: s/synchronized/synchronization/
> > >
> > >         AutoRxCell.  In case of conflicting with a negotiated cell,
> > >         autonomous cells take precedence over negotiated cell, which is
> > >         stated in [IEEE802154].  However, when the Slotframe 0, 1 and 2
> > use
> > >         the same length value, it is possible for negotiated cell to
> > avoid
> > >         the collision with AutoRxCell.
> > >
> > >      Presumably this factors in to the recommendation to have the three
> > >      listed slotframes use the same length, but mentioning it explicitly
> > >      (whether here or where the recommendation is made) might be nice.
> > >
> > >    > it is mentioned before as:  The same slotframe length for Slotframe
> > 0, 1
> > >    and 2 is RECOMMENDED.
> >
> > I agree that it is mentioned before.  My point is that we have the
> > recommendation to use the same slotframe length (Section 2) in a different
> > place from discussion about why having the same slotframe length is
> > beneficial (here), so the reader has to remember and make the connection.
> > If we mention both the recommendation and the reason for the recommendation
> > in the same place, the reader has to do less work.
> >
> 
> *RESPONSE:* Agreed. Will add following sentence in the text.
> 
> However, when the Slotframe 0, 1 and 2 use the same length value, it is
> possible for negotiated cell to avoid the collision with AutoRxCell.
> *Hence, the same slotframe length for Slotframe 0, 1 and 2 is RECOMMENDED.*

Thanks!

> 
> >
> > >      Section 4
> > >
> > >         network.  Alternative behaviors may involved, for example, when
> > >         alternative security solution is used for the network.  Section
> > 4.1
> > >
> > >      nit: singular/plural mismatch "behaviors"/"solution is used"
> > >
> > >    > will be fixed in next revision.
> > >
> > >      Section 4.1
> > >
> > >         A node implementing MSF SHOULD implement the Minimal Security
> > >         Framework for 6TiSCH [I-D.ietf-6tisch-minimal-security].  As a
> > >
> > >      Didn't this get renamed to CoJP?
> > >
> > >    > Thanks for pointing it out! Will update in next revision.
> > >
> > >      Section 4.2
> > >
> > >      I a little bit wonder if there is a better description than
> > "available
> > >      frequencies" but don't have one to offer.
> > >
> > >    > The frequency to be selected is randomly picked. There is no one
> > that is
> > >    preferred comparing to others.
> >
> > I was not sure if this was "available" in the sense of "my hardware radio
> > has a list of frequencies that it can tune to", "the channels that my
> > network cycles amongst", or " the channels not already scheduled at this
> > time".
> >
> 
> *RESPONSE: * The second is what the sentence tries to convey. Is the
> following sentence clear to you, then?
> 
> *When switched on, the pledge randomly chooses a frequency from the
> channels that the network cycles amongst, and starts listening for EBs on
> that frequency.*

Yes, thank you.

> 
> > >      Section 4.3
> > >
> > >         While the exact behavior is implementation-specific, it is
> > >         RECOMMENDED that after having received the first EB, a node keeps
> > >         listen for at most MAX_EB_DELAY seconds until it has received EBs
> > >         from NUM_NEIGHBOURS_TO_WAIT distinct neighbors, which is defined
> > in
> > >         [RFC8180].
> > >
> > >      nit(?): this phrasing implies that only NUM_NEIGHBOURS_TO_WAIT is
> > >      defined in RFC 8180, but MAX_EB_DELAY is also defined there.
> > >
> > >    > The "which" here indicates the whole behavior.
> > >    > It will be rephrased  as "This behavior is defined in [RFC8180]".
> > >
> > >      not-nit: this phrasing is ambiguous as to whether one of
> > MAX_EB_DELAY
> > >      and NUM_NEIGHBOURS_TO_WAIT is sufficient to move to the next step or
> > >      whether both are required.
> > >
> > >    > The two are actually explaining two situations:
> > >    > 1 .keep listening, when EBs from NUM_NEIGHBOURS_TO_WAIT are
> > received, it
> > >    stops listening and synchronize to one of the neighbors  .
> > >    > 2. if after  MAX_EB_DELAY timeout,  EBs are received from number of
> > >    neighbors <  NUM_NEIGHBOURS_TO_WAIT, it stops listening as well and
> > >    synchronize to the neighbor or one of neighbors.
> >
> > Okay.  I would suggest to s/at most MAX_EB_DELAY seconds until it has
> > received/at most MAX_EB_DELAY seconds or until it has received/, then.
> >
> 
> *RESPONSE: * I agree to use "or" here. This sentence is the original
> sentence from RFC8180.
> Can I update the sentence by using "or" in this draft?

There is no rule preventing you from using "or" in this draft, though now
that I see the corresponding part of RFC 8180 I feel less strongly about
making this change.
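
For reference, the "or" reading of the stop condition (either limit alone
is enough to stop listening and move on to synchronization) can be sketched
as below.  This is only an illustration; the function and parameter names
are mine, and the values passed in are placeholders rather than the
constants' actual defaults.

```python
def should_stop_listening(elapsed_s: float, neighbors_heard: int,
                          max_eb_delay_s: float,
                          num_neighbours_to_wait: int) -> bool:
    """Return True once the node should stop listening for EBs.

    elapsed_s is the time since the first EB was received;
    neighbors_heard is the number of distinct neighbors heard so far.
    """
    timed_out = elapsed_s >= max_eb_delay_s
    heard_enough = neighbors_heard >= num_neighbours_to_wait
    # Either condition on its own is sufficient.
    return timed_out or heard_enough
```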

> >
> > Also, I see that the -14 has changed this from RECOMMENDED to MAY; my naive
> > expectation would be that it is still RECOMMENDED, but I don't remember if
> > another reviewer's comment prompted this change.
> >
> 
> *RESPONSE:*  it is mentioned by one of the reviewers saying it's not
> consistent with RFC8180.
>   This behavior comes from  RFC8180, which uses MAY here.

Ah, thank you.

> >
> > >      Section 4.4
> > >
> > >         After selected a JP, a node generates a Join Request and
> > installs an
> > >         AutoTxCell to the JP.  The Join Request is then sent by the
> > pledge to
> > >         its JP over the AutoTxCell.  The AutoTxCell is removed by the
> > pledge
> > >
> > >      editorial: I'd suggest s/its JP/its selected JP/
> > >
> > >    > Will be updated in next revision.
> > >
> > >         Response is sent out.  The pledge receives the Join Response
> > from its
> > >         AutoRxCell, thereby learns the keying material used in the
> > network,
> > >         as well as other configurations, and becomes a "joined node".
> > >
> > >      nit: maybe "other configuration values" or "other configuration
> > >      settings"?
> > >
> > >    > Will be updated in next revision.
> > >
> > >      Section 4.6
> > >
> > >         Once it has selected a routing parent, the joined node MUST
> > generate
> > >         a 6P ADD Request and install an AutoTxCell to that parent.  The
> > 6P
> > >         ADD Request is sent out through the AutoTxCell with the following
> > >         fields:
> > >
> > >         *  CellOptions: set to TX=1,RX=0,SHARED=0
> > >         *  NumCells: set to 1
> > >         *  CellList: at least 5 cells, chosen according to Section 8
> > >
> > >      Is this listing describing the contents of the ADD request or the
> > >      AutoTxCell used to send it?  (I presume the former, in which case I
> > >      suggest to use "containing" or similar in preference to "with".)
> > >
> > >    > yes, it is the former. Will update in the next revision.
> > >
> > >      Section 5.1
> > >
> > >         The goal of MSF is to manage the communication schedule in the
> > 6TiSCH
> > >         schedule in a distributed manner.  For a node, this translates
> > into
> > >         monitoring the current usage of the cells it has to the selected
> > >         parent:
> > >
> > >      Is this goal strictly limited to traffic "to the selected parent"
> > vs.
> > >      all traffic?
> > >
> > >    > Theoretically MSF does not limit to traffic to the selected parent
> > but
> > >    any neighbors.
> > >    > However, all the experiment result with MSF we have made to verify
> > it is
> > >    to the selected parent only.
> > >    > Hence, We state here "the selected parent" only.
> >
> > I think the stated scope of applicability of the specification is not
> > limited to just the experiments that have been performed so far, so there
> > does not seem much justification for saying that "this translates into
> > monitoring [...] to the selected parent".
> >
> 
> *RESPONSE: *will update the text as following.
> 
> For a node, this translates into monitoring the current usage of the cells
> it has to *one of its neighbors, in most cases the selected parent.*

Thanks!

> >
> > >         *  If the node determines that the number of link-layer frames
> > it is
> > >            attempting to exchange with the selected parent per unit of
> > time
> > >            is larger than the capacity offered by the TSCH negotiated
> > cells
> > >            it has scheduled with it, the node issues a 6P ADD command to
> > that
> > >            parent to add cells to the TSCH schedule.
> > >         *  If the traffic is lower than the capacity, the node issues a
> > 6P
> > >            DELETE command to that parent to delete cells from the TSCH
> > >            schedule.
> > >
> > >      As written, this would potentially lead to oscillation when demand
> > is
> > >      basically at capacity, due to the quantization of capacity.  Perhaps
> > >      some provisioning for hysteresis is appropriate?
> > >
> > >    > Yes, if referring to the MSF cell usage algorithm in the following,
> > more
> > >    cell are scheduled than what needed.
> > >    > Here is to explain the basic concept of this scheduling function.
> > >
> > >         The cell option of cells listed in CellList in 6P Request frame
> > >         SHOULD be either Tx=1 only or Rx=1 only.  Both NumCellsElapsed
> > and
> > >         NumCellsUsed counters can be used to both type of negotiated
> > cells.
> > >
> > >      Would this be more clear as "(Tx=1,Rx=0) or (Tx=0,Rx=1)"?
> > >
> > >    > Yes it's more clear. Will update in next revision
> > >
> > >         *  NumCellsElapsed is incremented by exactly 1 when the current
> > cell
> > >            is AutoRxCell.
> > >
> > >      This holds for all peers/parents we're keeping counters for, so the
> > >      AutoRxCell can get "double counted"?
> > >
> > >    > one pair of counters is associated to one neighbor.
> > >    > If there is multiple parents, then there are two NumCellsElapsed
> > >    counters, one for each of the parents.
> >
> > I agree.  It seems that when an AutoRxCell occurs, the NumCellsElapsed
> > counter will increment in all of the counters (i.e., for each parent).
> > This is in some sense "double counting" that cell.  I'm not sure whether
> > this has a negative effect on the usefulness of the statistics, especially
> > in the (unlikely) case when there are a large number of parents.
> >
> 
> *RESPONSE:* Probably just more memory usage? I don't know of any
> negative effect either :-)
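
To be concrete about what I mean by "double counting", here is a small
sketch assuming one pair of counters per neighbor, as described above.  The
class and function names are hypothetical and not from the draft.

```python
from dataclasses import dataclass


@dataclass
class CellCounters:
    """Hypothetical per-neighbor counter pair."""
    num_cells_elapsed: int = 0
    num_cells_used: int = 0


def on_auto_rx_cell(counters_by_neighbor: dict) -> None:
    """Account for one occurrence of the AutoRxCell.

    The AutoRxCell is not tied to a single neighbor, so the same slot is
    counted as elapsed in every neighbor's counter pair.
    """
    for counters in counters_by_neighbor.values():
        counters.num_cells_elapsed += 1


parents = {"parent_a": CellCounters(), "parent_b": CellCounters()}
on_auto_rx_cell(parents)
```

After a single AutoRxCell, both parents show one elapsed cell, so the sum
over all neighbors grows with the number of parents even though only one
slot has passed.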
> 
> >
> > >         In case that a node booted or disappeared from the network, the
> > cell
> > >         reserved at the selected parent may be kept in the schedule
> > forever.
> > >         A clean-up mechanism MUST be provided to resolve this issue.  The
> > >         clean-up mechanism is implementation-specific.  It could either
> > be a
> > >         periodic polling to the neighbors the nodes have negotiated cells
> > >         with, or monitoring the activities on those cells.  The goal is
> > to
> > >         confirm those negotiated cells are not used anymore by the
> > associated
> > >         neighbors and remove them from the schedule.
> > >
> > >      I'm not sure that "monitoring the activities on those cells" is safe
> > >      with the current level of specification; if a node negotiates a 6P
> > >      transmit cell to a parent and uses it only sparingly, with the
> > parent
> > >      eventually reclaiming it due to inactivity, I don't see a mechanism
> > by
> > >      which the node will reliably discover the negotiated cell to be
> > >      nonfunctional and fall back to (e.g.) the corresponding
> > AutoTxCell.  It
> > >      may be most prudent to just not mention that as an example (a
> > "periodic
> > >      polling" procedure does not seem to have the same potential for
> > >      information skew)
> > >
> > >    > Thanks for the comment! I will just remove that sentence from this
> > >    paragraph.
> > >
> > >      Section 5.3
> > >
> > >         schedule is executed and the node sends frames to that parent.
> > When
> > >         NumTx reaches MAX_NUMTX, both NumTx and NumTxAck MUST be divided
> > by
> > >         2.  For example, when MAX_NUMTX is set to 256, from NumTx=255 and
> > >         NumTxAck=127, the counters become NumTx=128 and NumTxAck=64 if
> > one
> > >         frame is sent to the parent with an Acknowledgment received.
> > This
> > >         operation does not change the value of the PDR, but allows the
> > >         counters to keep incrementing.  The value of MAX_NUMTX is
> > >         implementation-specific.
> > >
> > >      Does MAX_NUMTX need to be a power of two (to avoid errors when the
> > >      division occurs)?
> > >
> > >    > Agree, it's better to be a power of two. Will state in the text.
> > >
> > >         4.  For any other cell, it compares its PDR against that of the
> > cell
> > >             with the highest PDR.  If the difference is larger than
> > >             RELOCATE_PDRTHRES, it triggers the relocation of that cell
> > using
> > >             a 6P RELOCATE command.
> > >
> > >      The recommended RELOCATE_PDRTHRES is given as "50 %".  Is this
> > >      "difference" performed as a subtraction (so that if the highest PDR
> > is
> > >      less than 50%, no cells can ever be relocated) or a ratio (a PDR
> > that's
> > >      half the maximum PDR or smaller will trigger relocation)?
> > >
> > >    > This is "difference" performed as a subtraction.
> > >    > Yes it's sure if highest PDR is less than 50%, no cell can be
> > >    relocated.
> > >    > But it can't tell those cells are link quality bad or because of
> > >    collision.
> > >    > If all cell PDR is so low, highly chance the routing will be
> > affected
> > >    and switch to another neighbor.
> > >    > In experiments,  we never encounter highest PDR less 50% all time.
> >
> > I strongly suggest changing the wording to be clear that it is the
> > "subtraction" interpretation that's desired.  Perhaps "If the difference
> > (PDR_highest - PDR_thiscell) is larger than RELOCATE_PDRTHRES"?
> >
> 
> RESPONSE: it will be rephrased as following:
> 
> If *the subtraction difference between the PDR of the cell and the highest
> PDR* is larger than RELOCATE_PDRTHRES, it triggers the relocation of that
> cell using a 6P RELOCATE command.

I think the RFC Editor will make some further suggestions, and I am happy to
leave the final wording to them.  This formulation makes the meaning clear.
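
For the record, the Section 5.3 behavior as I now understand it (counter
halving at MAX_NUMTX, and the "subtraction" reading of the threshold) can
be sketched as follows.  MAX_NUMTX=256 and the 50% threshold are the values
quoted earlier in this thread; the function names are my own.

```python
MAX_NUMTX = 256          # example value from the text; implementation-specific
RELOCATE_PDRTHRES = 0.5  # the "50 %" recommended value, as a fraction


def record_tx(num_tx: int, num_tx_ack: int, acked: bool) -> tuple:
    """Count one transmission, halving both counters at MAX_NUMTX."""
    num_tx += 1
    if acked:
        num_tx_ack += 1
    if num_tx >= MAX_NUMTX:
        # Halving both counters keeps their ratio (the PDR) unchanged.
        num_tx //= 2
        num_tx_ack //= 2
    return num_tx, num_tx_ack


def needs_relocation(pdr_cell: float, pdr_highest: float) -> bool:
    """Subtraction interpretation: (PDR_highest - PDR_thiscell) > threshold."""
    return (pdr_highest - pdr_cell) > RELOCATE_PDRTHRES
```

With these definitions, record_tx(255, 127, True) gives (128, 64), matching
the example in the draft.  It also makes visible the consequence discussed
above: if the highest PDR is 50% or less, the difference can never exceed
the threshold, so no cell is relocated.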

> >
> > >      Section 7
> > >
> > >      Maybe reference Section 17.1 where the allocation will occur?
> > >
> > >    > Will add this in next revision.
> > >
> > >      Section 8
> > >
> > >         *  The slotOffset of a cell in the CellList SHOULD be randomly
> > and
> > >            uniformly chosen among all the slotOffset values that satisfy
> > the
> > >            restrictions above.
> > >         *  The channelOffset of a cell in the CellList SHOULD be
> > randomly and
> > >            uniformly chosen in [0..numFrequencies], where numFrequencies
> > >            represents the number of frequencies a node can communicate
> > on.
> > >
> > >      Do these random selections need to be independent from each other?
> > (I
> > >      note that the selection for the autonomous cells are not.)
> > >
> > >    > The channelOffset values are selected randomly and independently.
> > >    > For slotOffset, once a value has been picked it cannot be picked
> > >    again in a later selection.
> > >    > This is indicated in the text already as "chosen among all the
> > >    slotOffset values that satisfy the
> > >          restrictions above"
> >
> > I was trying to get at a different point, I think: is there expected to be
> > correlation between the actual slotOffset and channelOffset values for a
> > given cell, as opposed to having them be completely independent selections?
> > In the case of the autonomous cells, since we use the same hash function
> > and input to the hash function for selecting both values, there is a
> > correlation between the two values.  Such a correlation might in theory
> > result in occasional scenarios that are very problematic,
> > whereas if the channel and slot offsets are chosen independently, such
> > "very problematic" scenarios are expected to be much less common (based on
> > the obvious/naive mathematical model).
> >
> 
> *RESPONSE:* I agree there is potential correlation between the
> slotOffset and the channelOffset, as both use the same hashing function.
> However, the (slotOffset, channelOffset) pair is generally considered as
> one item, called a cell.
> As long as there is no correlation between cells, there should be no
> problems.

Okay, thanks for clarifying.
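
For reference, the "independent selections" model I had in mind for the
CellList looks roughly like the sketch below.  The function names and the
sample parameters are hypothetical, and it assumes channel offsets run
from 0 to numFrequencies-1, which is my reading rather than the draft's
literal "[0..numFrequencies]".

```python
import random

def pick_cell(allowed_slot_offsets, num_frequencies, rng=random):
    """Draw slotOffset and channelOffset as two independent uniform choices."""
    slot_offset = rng.choice(sorted(allowed_slot_offsets))
    channel_offset = rng.randrange(num_frequencies)  # assumed range: 0..N-1
    return slot_offset, channel_offset

def build_cell_list(num_cells, allowed_slot_offsets, num_frequencies, rng=random):
    """Build a CellList; a slotOffset is not reused once it has been picked."""
    remaining = set(allowed_slot_offsets)
    cells = []
    for _ in range(min(num_cells, len(remaining))):
        cell = pick_cell(remaining, num_frequencies, rng)
        remaining.discard(cell[0])  # this slotOffset no longer satisfies the restrictions
        cells.append(cell)
    return cells

# Hypothetical example: 5 candidate cells, slot offsets 1..100, 16 frequencies.
rng = random.Random(0)
cell_list = build_cell_list(5, range(1, 101), 16, rng)
```

Here the channelOffset never depends on the slotOffset, in contrast with
the autonomous cells, where both values are derived from the same hash
input.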

> >
> > >
> > >      Section 9
> > >
> > >      Is there a reference for these three parameters (MAXBE, MAXRETRIES,
> > >      SLOTFRAME_LENGTH)?  SLOTFRAME_LENGTH seems new in this document and
> > is
> > >      listed in the table in Section 14, but the other two are not listed
> > >      there.
> > >
> > >    > MAXBE and MAXRETRIES are defined in the IEEE 802.15.4 standard.
> > >    > Their values vary between network deployments, according to size
> > >    and density.
> > >    > Hence we didn't give a recommended value in this draft.
> >
> > Ah, I see now.  It might be helpful to note somewhere that MAXBE and
> > MAXRETRIES are defined by 802.15.4, though I expect most readers of this
> > document to already be at least somewhat familiar with 802.15.4.
> >
> 
> *RESPONSE:  *will be mentioned in the draft.
> 
> >
> > >      Section 14
> > >
> > >      Why is MAX_NUMTX not listed in the table?
> >
> > Should MAX_NUMTX be listed in the table?
> >
> 
> *RESPONSE: *it's listed in version 14.
> 
> >
> > >      Can we really give a recommended NUM_CH_OFFSET value, since this is
> > in
> > >      effect dependent on the number of channels available?
> > >
> > >    > We give a recommended value as this is a parameter used in the SAX
> > >    hashing algorithm.
> > >    >  This doesn't prevent implementers from using other values.
> > >
> > >      KA_PERIOD is defined but not used elsewhere in the document.
> > >
> > >    > This is a legacy of MSF draft, which we forgot to remove. Will
> > update in
> > >    next revision
> > >
> > >      What are the considerations in using a power of 10 vs. a power of 2
> > as
> > >      MAX_NUM_CELLS?
> > >
> > >    > We pick power of 10 simply because it's easy for reader to
> > understand.
> > >    Nothing specific.
> > >    > There is no restriction to use power of 2, such as 128.
> > >
> > >      Section 16
> > >
> > >         MSF defines a series of "rules" for the node to follow.  It
> > triggers
> > >         several actions, that are carried out by the protocols defined
> > in the
> > >         following specifications: the Minimal IPv6 over the TSCH Mode of
> > IEEE
> > >         802.15.4e (6TiSCH) Configuration [RFC8180], the 6TiSCH Operation
> > >
> > >      I'd suggest a brief note that the security considerations of those
> > >      protocols continue to apply (even though it ought to be obvious);
> > >      reading them could help a reader understand the behavior of this
> > >      document as well.
> > >
> > >         Sublayer Protocol (6P) [RFC8480], and the Minimal Security
> > Framework
> > >         for 6TiSCH [I-D.ietf-6tisch-minimal-security].  In particular,
> > MSF
> > >
> > >      [CoJP again]
> > >
> > >         prevent it from receiving the join response.  This situation
> > should
> > >         be detected through the absence of a particular node from the
> > network
> > >         and handled by the network administrator through out-of-band
> > means,
> > >         e.g. by moving the node outside the radio range of the attacker.
> > >
> > >      "the radio range of the attacker" is not exactly a fixed constant
> > ...
> > >      attackers are not in general bound by legal limits and can increase
> > Tx
> > >      power subject only to their equipment and budget.
> > >
> > >    > Yes, I agree. For action, I will simply remove the example.
> > >
> > >         MSF adapts to traffics containing packets from IP layer.  It is
> > >         possible that the IP packet has a non-zero DSCP (Diffserv Code
> > Point
> > >         [RFC2597]) value in its IPv6 header.  The decision whether to
> > hand
> > >
> > >      RFC 2597 is talking more about specifically assured forwarding PHB
> > >      groups
> > >      than "DSCP codepoint"s per se.
> > >
> > >    > Yes, RFC2474 is the one that defines the DSCP codepoint. Will update
> > >    the reference.
> >
> > This text was also changed to fix a pluralization nit, but over-corrected.
> > Please s/containing packet/containing packets/.
> >
> 
> *RESPONSE*: Will be updated in next revision.
> 
> >
> > >      Section 18.1
> > >
> > >      RFC 6206 seems to only be used as an example (Trickle), and could
> > >      probably be informative.
> > >
> > >      RFC 8505 might also not need to be normative.
> > >
> > >    > They will be moved to informative reference section
> > >
> > >      Appendix B
> > >
> > >         In MSF, the T is replaced by the length slotframe 1.  String s is
> > >
> > >      nit: "length of"
> > >
> > >         2.  sum the value of L_shift(h,l_bit), R_shift(h,r_bit) and ci
> > >
> > >      Is this addition performed in "infinite precision" integer
> > arithmetic or
> > >      limited to the output width of h, e.g., by modular division?  (It's
> > not
> > >      clear to me whether this is the role T plays or not.)
> > >
> > >    > As far as I know, this kind of sum is used by most of the classic
> > >    string hashing functions.
> > >    > The deeper reason for using a sum here is a mathematical question,
> > >    on which I am not an expert :-(
> > >    > The modulo by T makes sure the result falls into the range of the
> > >    slotframe (to pick slotOffset) or of the available frequencies (to
> > >    pick channelOffset).
> >
> > It sounds like this sum is performed modulo T as well?  (I am genuinely not
> > sure.)  I'm also not sure whether it's worth mentioning that fact; perhaps
> > just leaving the text as-is is best.
> 
> 
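
To illustrate why the width of the sum may not matter in practice, here is
a minimal shift-add-xor sketch in Python.  It is an assumption-laden
reading, not the draft's normative algorithm: it assumes the result is
reduced modulo T in every round, and the default values for h0, l_bit and
r_bit are illustrative only and not taken from the text quoted here.

```python
def sax_hash(s: bytes, T: int, h0: int = 0, l_bit: int = 0, r_bit: int = 1) -> int:
    """Shift-add-xor style hash of byte string s, reduced into [0, T)."""
    h = h0
    for ci in s:
        # Step 2 as quoted: sum of L_shift(h, l_bit), R_shift(h, r_bit) and ci.
        # Python integers do not wrap, so this is a plain integer addition.
        total = (h << l_bit) + (h >> r_bit) + ci
        # Assumed: XOR with h, then reduce modulo T before the next byte.
        h = (h ^ total) % T
    return h

# Hypothetical inputs: an 8-byte identifier and a slotframe length of 101.
example_id = bytes([0x00, 0x12, 0x4B, 0x00, 0x06, 0x0D, 0x9B, 0x1F])
example_offset = sax_hash(example_id, 101)
```

Under that assumption h stays below T between rounds, so the sum is small
and the result is the same whether the addition is done in "infinite
precision" or in any reasonable fixed width.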
> > >         8.  assign the result of Step 5 to h
> > >
> > >      The value from step 5 *is* h, so taken literally this says "assign
> > h to
> > >      h" and is not needed.
> > >
> > >    >  Yes, this step is removed in next revision.
> > >    Thanks so much for your comments. Will prepare revision 13 to resolve
> > >    them!
> >
> > Thank you for the updates!
> >
> > I will await further clarification about whether changes to
> > draft-ietf-6tisch-minimal-security are required in order for this document
> > to realistically be able to meet the requirements from that document.
> >
> > -Ben
> >
> 
> *RESPONSE*: thanks again for your feedback on the draft!

And thanks for the updates!

-Ben