[6tisch] draft-ietf-6tisch-6top-protocol-09 comments

Lotte Steenbrink <lotte.steenbrink@fu-berlin.de> Wed, 28 February 2018 12:53 UTC

From: Lotte Steenbrink <lotte.steenbrink@fu-berlin.de>
Message-Id: <CA7DEE60-BFF5-4B30-8937-6967B978B8D9@fu-berlin.de>
Date: Wed, 28 Feb 2018 13:56:42 +0100
Cc: wangqin@ies.ustb.edu.cn, xvilajosana@uoc.edu, thomas.watteyne@analog.com
To: 6tisch@ietf.org
Archived-At: <https://mailarchive.ietf.org/arch/msg/6tisch/qs4mJI5I0hGcPLTPkVOj60mSjdc>
Subject: [6tisch] draft-ietf-6tisch-6top-protocol-09 comments

Dear 6TiSCH Working Group,

Somehow I missed the WGLC announcement for the 6top protocol draft. I'm not sure whether I'm too late with my review and questions, but in case I'm not, I'd like to share what I've got so far.

As for the context of my email: I'm currently implementing the 6top Protocol as part of my master's thesis. It's not a full implementation, just the parts that I currently need: 3-step transactions are missing and DELETE Requests are still a work in progress, for example. I'm new-ish to the ideas of 6TiSCH and TSCH in general, so my comments come from an outsider's point of view.

While implementing 6P, I've stumbled across some parts of the document where I'm not quite sure if there's an inconsistency or if I'm just missing something. In any case, I think it might be helpful to clear these parts up (in the draft or on the WG Mailing List) before publishing 6P as Proposed Standard. (Any feedback to my questions would be greatly appreciated, and all statements proposing a change come with an implicit "I'd be happy to write/suggest text for that", of course.)

Overall, I've found the draft to be nicely written and easy to read, but lacking clear instructions in places. The idea of how 6P works is quick to grasp, also thanks to the illustrations in Fig. 4 and 5. However, when it comes to implementing the protocol, I've found myself skipping all over the document to gather information on what's what. The message format and handling in particular feel incomplete: not all message types are illustrated or documented in full, and one often has to infer what to check and send when.
In the following, I will detail what exactly was unclear to me.

6P ADD Response where NumCells == 0
---------------------------------------------------------
Section 3.3.1. says:

"[...] The returned list can contain NumCells elements (succeeded) or between 0 and NumCells elements (partially succeeded). In the case that none of the cells could be allocated node B MUST send a 6P Response with return code set to NOALLOC, indicating that cells could not be allocated in the schedule, for example because they are already used or reserved. The returned list in this case MUST contain 0 elements."

If the returned list contains 0 elements, it satisfies the requirements for both a SUCCESS and a NOALLOC response. Should I send
a) both a SUCCESS as well as a NOALLOC response
b) a NOALLOC response
c) a SUCCESS response
in this case?

I'd assume the answer is b), but since the wording around partially succeeded allocations explicitly mentions 0 cells as an option, that assumption may very well be wrong. Depending on what the correct answer is, I'd propose to state it more explicitly in the draft.
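
To make the ambiguity concrete, here is a minimal Python sketch of interpretation b). The function name and the string return codes are placeholders chosen for illustration, not identifiers from the draft:

```python
def add_response_code(num_requested, allocated_cells):
    """Pick the return code for a 6P ADD Response under interpretation b).

    Assumption (not confirmed by the draft): an empty result always maps
    to NOALLOC, and any non-empty result (full or partial) maps to SUCCESS.
    """
    if len(allocated_cells) > num_requested:
        raise ValueError("cannot allocate more cells than requested")
    if len(allocated_cells) == 0:
        return "NOALLOC"
    return "SUCCESS"
```

Under this reading, a "partially succeeded" list can never be empty, so the two cases no longer overlap.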

Also, NOALLOC doesn't appear in Figure 36: 6P Return Codes, is this on purpose?

Response Format Specification and Illustrations
---------------------------------------------------------------------
I would've found it very helpful to have Figures & subsections describing the format of SUCCESS, RESET and NOALLOC responses (and how to handle them) just like the requests are illustrated in fig. 9, 11, 13 (and 14).

As an example, I would've assumed that the NOALLOC response looks like this:

 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Version| T | R |     Code      |     SFID      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

(Since the cellList is always empty anyway, there's no need to transmit it explicitly.) However, the text says "The returned list in this case MUST contain 0 elements", which seems to imply that a zero-length list is still part of the message. The text also says that a message with code RC_RESET marks a "critical error, reset", but what exactly should be reset? The transaction state? How is that any different from a NOALLOC message then? Or are NOALLOC and RESET messages the same thing by a different name, with NOALLOC just a leftover from previous renamings?
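
For illustration, the 3-byte layout in the figure above could be packed and parsed as in the following Python sketch. The field widths (4/2/2/8/8 bits) are read off the figure. Two things are assumptions on my part: that the leftmost field occupies the most significant bits of the first byte (an IEEE 802.15.4 implementation may number bits the other way round), and the numeric values used in the usage note, which are placeholders rather than code points from the draft.

```python
def pack_header(version, msg_type, reserved, code, sfid):
    """Pack Version(4) | T(2) | R(2) | Code(8) | SFID(8) into 3 bytes.

    Assumption: the leftmost field in the figure is the most significant.
    """
    fields = (("version", version, 4), ("msg_type", msg_type, 2),
              ("reserved", reserved, 2), ("code", code, 8), ("sfid", sfid, 8))
    for name, value, bits in fields:
        if not 0 <= value < (1 << bits):
            raise ValueError(f"{name} does not fit in {bits} bits")
    first = (version << 4) | (msg_type << 2) | reserved
    return bytes([first, code, sfid])


def unpack_header(data):
    """Inverse of pack_header; returns (version, msg_type, reserved, code, sfid)."""
    if len(data) < 3:
        raise ValueError("header needs at least 3 bytes")
    first, code, sfid = data[0], data[1], data[2]
    return (first >> 4, (first >> 2) & 0x3, first & 0x3, code, sfid)
```

With placeholder values, unpack_header(pack_header(0, 1, 0, 5, 2)) gives back (0, 1, 0, 5, 2).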

On a side note, fig. 14 is missing the descriptions for the T and Code fields (i.e. to which values they should be set) that the other figures have, and while it's possible to infer these values from the draft, I personally think it'd be handy to have them near the figure as well. :)

Erroneous CellOptions in ADD request
-------------------------------------------------------
What happens if a node receives an ADD request where more than one link type (RX, TX or SHARED) is set in the cellOptions field? I'd assume it would send an error response, but of which type? RC_CELLLIST?
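
The check I have in mind is sketched below in Python. The bit positions for TX, RX and SHARED are assumptions made for the example, and whether a combination of link types is actually an error is exactly the open question, so this only shows how such a request could be detected:

```python
# Assumed bit positions within the CellOptions bitmap (illustrative only).
TX, RX, SHARED = 0x01, 0x02, 0x04


def link_types_set(cell_options):
    """Return the names of all link-type bits set in cell_options."""
    names = (("TX", TX), ("RX", RX), ("SHARED", SHARED))
    return [name for name, bit in names if cell_options & bit]


def has_single_link_type(cell_options):
    """True if exactly one of TX, RX, SHARED is set."""
    return len(link_types_set(cell_options)) == 1
```

A receiver could call has_single_link_type() before handing the request to the SF and answer with whichever error code the WG settles on.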

Handling CandidateCellList.size() < numCells in Relocate responses
-------------------------------------------------------------------------------------------
Section 3.3.3 states that in a 6P RELOCATE request, "NumCandidate MUST be larger or equal to NumCells", but doesn't specify how Node B should react if it receives a RELOCATE request that violates this requirement. Does it respond with an RC_CELLLIST error message? If so, I'd propose to change the first sentences of the paragraph starting with "Upon receiving the request" to something along the lines of:

"Upon receiving the request, Node B checks if the length of candidateCellList is larger than or equal to NumCells. Node B's SF verifies that all the cells in the Relocation CellList are indeed scheduled with node A and are of the link type specified in the CellOptions field. If any of these checks fail, node B MUST send a 6P Response to node A with return code RC_CELLLIST. [...]"

6P CLEAR Response format
----------------------------------------
1. If the value of SeqNum doesn't matter for CLEAR messages, why do they have a SeqNum field nonetheless?
2. Section 3.2.2 says "The Code field contains a 6P Return Code when the 6P message is of Type RESPONSE or CONFIRMATION." However, the 6P Return Codes listed in Section 6.2.4. don't list an RC_CLEAR code. Should an RC_CLEAR code be added to section 6.2.4. or should CLEAR response messages in fact have CMD_CLEAR set in their code field? (For the record, the former seems more intuitive to me personally)
Or am I misunderstanding something entirely?

Handling RC_CELLLIST Response messages
----------------------------------------------------------------
The draft states that a faulty RELOCATE message should be responded to with a 6P response with RC_CELLLIST set. However, the draft does not specify how to handle an RC_CELLLIST response. I'm assuming that it means the same as RC_RESET or RC_NOALLOC: abort transaction, don't add/delete anything. Is this correct? In any case, I think this should be made clear in the draft.

Handling RC_RESET Response messages
------------------------------------------------------------
Section 3.4.3 says:
"If a node receives a 6P Request from a given neighbor before having sent the 6P Response to the previous 6P Request from that neighbor, it MUST send back a 6P Response with a return code of RC_RESET (as per Figure 36). A node receiving RC_RESET code MUST abort the transaction and consider it never happened."

I'm assuming that the node sending the RC_RESET response discards all data on this transaction after it has sent the Response, is this correct? If so, I'd propose to state this explicitly.
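
A small Python sketch of the behaviour I'm assuming on the responder side. The class and method names are invented for illustration, and discarding the pending transaction when RC_RESET is sent is my reading, not something the draft states:

```python
class Responder:
    """Tracks at most one unanswered 6P Request per neighbor."""

    def __init__(self):
        self.pending = {}  # neighbor -> request still awaiting our Response

    def on_request(self, neighbor, request):
        """Return "RC_RESET" for an overlapping request, else None."""
        if neighbor in self.pending:
            # Assumed: the sender of RC_RESET also forgets the old transaction.
            del self.pending[neighbor]
            return "RC_RESET"
        self.pending[neighbor] = request
        return None

    def on_response_sent(self, neighbor):
        """Call once the Response to the pending Request has been sent."""
        self.pending.pop(neighbor, None)
```

If the draft intends the responder to keep the first transaction alive instead, the `del` line is the part that would change.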

Timeout management
-------------------------------
Section 3.4.4. explains how a timeout works and when it occurs, but not what is supposed to happen when it occurs. From section 3.1.1 <https://tools.ietf.org/html/draft-ietf-6tisch-6top-protocol-09#section-3.1.1>, I gathered that open transactions are aborted on timeout. It might seem trivial, but I think it'd be handy to explicitly mention this in section 3.4.4.

SeqNum maintenance
--------------------------------
Am I correct in assuming that the SeqNum is maintained as a shared SeqNum on a per-link basis (as opposed to A and B each maintaining an internal SeqNum and keeping track of the other's SeqNum as well)?
i.e. if Node A and Node B share a link which was created by an ADD request from A to B, they commit to maintaining (i.e. increasing with every transaction) the sequence number that A included in its initial ADD request. This sequence number is valid only for the link between Nodes A and B. If Node A also has a link to a Node C, their (shared) Sequence Number might be completely different.
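
In code, the model I'm assuming looks roughly like this Python sketch: one counter per neighbor, nothing global. The class is hypothetical, and wrap-around of the SeqNum field is deliberately left out because I'm not sure how the draft wants it handled:

```python
class SeqNumTable:
    """One shared SeqNum per neighbor (i.e. per link), starting at 0."""

    def __init__(self):
        self._seqnum = {}  # neighbor -> current SeqNum for that link

    def current(self, neighbor):
        """SeqNum for this link; 0 if we have never talked to the neighbor."""
        return self._seqnum.get(neighbor, 0)

    def advance(self, neighbor):
        """Increment after a completed transaction (wrap-around omitted)."""
        self._seqnum[neighbor] = self.current(neighbor) + 1
        return self._seqnum[neighbor]
```

On node A, the entries for B and C evolve independently, which is the "completely different" case described above.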

Handling a Request with SeqNum == 0
----------------------------------------------------------------------------
I couldn't find any language explicitly specifying what to do when a SeqNum of 0 is received (except for Fig. 30).
I'm assuming it is the following:

- assume the node has reset: cancel any half-open transactions, let SF decide how to handle the situation
- if 3-step transactions are used and there's an active transaction: send a 6P Response with return code RC_SEQNUM and SeqNum = 0
- if there's no active transaction: we might just be hearing from this new node for the first time because it was just freshly booted & added to the network (and thus its SeqNum is 0)

is this correct? If so, would it make sense to state something like this in section 3.4.6?
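
The three bullets above can be written down as a tiny decision function. This Python sketch only mirrors my assumptions; the action strings are placeholders and none of this is taken from the draft:

```python
def actions_for_zero_seqnum(has_active_transaction, uses_three_step):
    """List the assumed reactions to a Request carrying SeqNum == 0."""
    # Assumed: SeqNum 0 always means "the sender may have reset".
    actions = ["cancel_half_open_transactions", "notify_sf"]
    if has_active_transaction and uses_three_step:
        actions.append("send_response(RC_SEQNUM, seqnum=0)")
    elif not has_active_transaction:
        # Could simply be a freshly booted neighbor we have never heard from.
        actions.append("treat_as_possibly_new_neighbor")
    return actions
```

The case of an active 2-step transaction falls through without an extra action here, which is one of the gaps I'd like the text to close.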

SeqNum == 0 (again)
------------------------------
What happens when
1. Node A sends a Request to Node B
2. Node A reboots (i.e. the sequence number is reset to 0)
3. Node A sends another Request to B

Does Node B recognize that A has reset and cancel the ongoing transaction (as well as the whole "stale" link)? Does it trigger the inconsistency handling of the SF? Could this never occur because A should wait with sending any Request for $timeout time so that all half-open transactions can expire?

Cell Relocation
----------------------
Section 3.3.3. says that NumCells MUST be >= 1. What happens if (for example) the Relocation CellList contains 5 cells, but NumCells == 1? Shouldn't it be that NumCells MUST be == length(Relocation CellList)?
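
Combined with the NumCandidate check from the RELOCATE section above, the validation I have in mind would look roughly like this Python sketch. The equality check and the use of RC_CELLLIST for every failure are my proposals, not something the draft states, and the function, parameters and schedule structure are made up for illustration:

```python
def check_relocate_request(num_cells, relocation_cells, candidate_cells,
                           schedule_with_a, cell_options):
    """Return "RC_CELLLIST" if a check fails, otherwise None.

    schedule_with_a is a hypothetical dict mapping each
    (slotOffset, channelOffset) cell scheduled with node A to its options.
    """
    if num_cells < 1:
        return "RC_CELLLIST"
    # Proposed: NumCells must equal the length of the Relocation CellList.
    if num_cells != len(relocation_cells):
        return "RC_CELLLIST"
    # From the draft: NumCandidate must be larger or equal to NumCells.
    if len(candidate_cells) < num_cells:
        return "RC_CELLLIST"
    # Every cell to relocate must be scheduled with A with matching options.
    for cell in relocation_cells:
        if schedule_with_a.get(cell) != cell_options:
            return "RC_CELLLIST"
    return None
```

With this check, the example of 5 cells in the Relocation CellList and NumCells == 1 would be rejected up front.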

Regarding the Bitbucket links in Appendix C
----------------------------------------------------------------
They don't seem to work for me, is this on purpose?

With best regards,
Lotte Steenbrink
