Re: [homenet] Alia Atlas' Discuss on draft-ietf-homenet-dncp-09: (with DISCUSS and COMMENT)

Steven Barth <cyrus@openwrt.org> Wed, 14 October 2015 09:32 UTC

From: Steven Barth <cyrus@openwrt.org>
To: Alia Atlas <akatlas@gmail.com>
Date: Wed, 14 Oct 2015 11:32:07 +0200
Archived-At: <http://mailarchive.ietf.org/arch/msg/homenet/ShoAuOYP3IKLLj4o5ZgQqj4qD5I>
Cc: homenet-chairs@ietf.org, Mark Townsley <mark@townsley.net>, draft-ietf-homenet-dncp.shepherd@ietf.org, The IESG <iesg@ietf.org>, draft-ietf-homenet-dncp@ietf.org, HOMENET <homenet@ietf.org>
Subject: Re: [homenet] Alia Atlas' Discuss on draft-ietf-homenet-dncp-09: (with DISCUSS and COMMENT)

Hello Alia,


Thanks for your detailed replies and suggestions.
We have just released a new version, which should address the issues you have raised.


Please find some more detailed comments inline.


Cheers,

Steven



> I understand that but there were still a number of gaps.  For instance, I see nothing
> that describes how to compute the network state hash.  I would expect a section like:

I refactored the hash-tree section and created two subsections, "Calculating network state
and node data hashes" and "Updating network state and node data hashes", to make it more
obvious. I also added a reference to the profile section regarding the hash function and
truncation, as you suggested.
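
For illustration, the calculation I have in mind is roughly the following (a Python-style
sketch of my own, not text from the draft; the actual hash function, truncation length,
and input encoding are whatever the DNCP profile specifies):

    import hashlib
    import struct

    HASH_LEN = 8  # assumed profile-specific truncation length, in bytes

    def node_data_hash(node_tlvs: bytes) -> bytes:
        # Hash over the node's published TLV data, truncated per the profile.
        return hashlib.sha256(node_tlvs).digest()[:HASH_LEN]

    def network_state_hash(nodes: dict) -> bytes:
        # nodes maps node identifier (bytes) -> (update sequence number, node data hash)
        h = hashlib.sha256()
        for node_id in sorted(nodes):             # ascending order of node identifier
            seq, data_hash = nodes[node_id]
            h.update(struct.pack("!I", seq))      # update sequence number
            h.update(data_hash)                   # node data hash
        return h.digest()[:HASH_LEN]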


> The draft has "Topology graph the undirected graph of DNCP nodes produced by 
> retaining only bidirectional peer relationships between nodes."
> 
> In Section 4.6, I think that you are describing how to create the topology graph,
> rather than "traversing it".   

Updated this section as well, changing "traversing" to "updating". Added some more
clarifications and an xref to the hash-tree section.



> 
> "Sec 4.5.1  Initiating Neighbor Relationships
> 
> A DNCP profile MAY specify a multicast address on which each DNCP node MUST
> listen to learn about new neighbors.  If multicast discovery is specified in the DNCP
> profile, then when a node starts, it SHOULD send a Node Endpoint TLV to the multicast
> address using ??UDP??.
> 
> In addition to or instead of multicast discovery, a node MAY be configured with a 
> set of DNCP neighbors and the associated interfaces to use.  When a node needs to initiate
> a neighbor relationship, that node SHOULD send a Node Endpoint TLV to the unicast address
> of the desired neighbor.  Note that security considerations may influence whether the relationship
> is established."
> 
> "Sec 4.5.2 Destroying a Neighbor Relationship
> 
> A node can destroy an established neighbor relationship, regardless of whether that node initiated
> or responded to create the neighbor relationship.  A node MUST send a Node State TLV that does
> not include the Node Endpoint TLV with the neighbor's Node Identifier and the associated link's Endpoint
> Identifiers.  A node MAY <somehow remove/expire its Node State TLV information> as part of
> terminating DNCP.

I updated the section on peers and renamed it to "Discovering, Adding and Removing Peers".
Based on your suggestions, it should now clarify the points that were not obvious.



> 
>  
> 
>     >> c) It looks like the TLVs are sent one after another in a datagram or
>     >> stream across a socket.  The closest I see to any detail
>     >> is "TLVs are sent across the transport as is".
>     >
>     > Section “4.2 Data Transport” says
>     >    TLVs are sent across the transport as is, and they SHOULD be sent
>     >    together where, e.g., MTU considerations do not recommend sending
>     >    them in multiple batches.  TLVs in general are handled individually
>     >    and statelessly, with one exception …
>     >
>     > Could you please be more specific on what is unclear?
>     >
>     > The section states that TLV handling is stateless except for the one exception
>     > which is explained, so otherwise  it does not really matter in what order
>     > they are sent or how you split them up onto datagrams (if that is your
>     > transport).
> 
> 
> At a minimum, you should specify "in network order" for how the TLVs are sent.

Do you mean that numbers are sent in network byte order? That is already defined
in the section on TLVs. I created a cross-reference there.
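
To make sure we mean the same thing, here is a rough sketch of how a TLV would be
serialized under that rule (my own Python illustration; the authoritative layout and
padding rules are those given in the TLV section of the draft):

    import struct

    def encode_tlv(tlv_type: int, value: bytes) -> bytes:
        # Type and length are 16-bit fields in network byte order ("!" = big-endian).
        header = struct.pack("!HH", tlv_type, len(value))
        padding = b"\x00" * (-len(value) % 4)   # pad the value up to a 32-bit boundary
        return header + value + padding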


> It would be good to specify whether TLVs need to be sent in any particular order.

That was one of the intentions of the stateless-parsing sentence; however, I added
a parenthetical explanation mentioning ordering.

> It'd be nice for the receiver to know whether there's a way of seeing that the last
> TLV in the message has been received so it's easier to avoid doing work multiple
> times.

There are no special resending requirements. Reliability is provided either by
a reliable transport or by Trickle. Clarified that.


> When does a node decide to use unicast transport?  When does the node decide
> to use multicast transport?  There are some indications about what is sent, but
> not in a very normative way.

It should be quite normative already. The only TLV that is regularly sent over multicast,
or that is sent actively (i.e., not as a reaction to another packet), is the Trickle-driven
Network State TLV (accompanied by a Node Endpoint TLV). The section on "Trickle-Driven Status
Updates" has MUST statements on which transport (multicast or unicast) to use.

All other TLVs are sent reactively over unicast. I added another clarification.
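
If it helps, the intended decision boils down to something like this (a loose sketch
with names of my own, not normative text):

    def choose_transport(tlv_name: str, solicited: bool, has_multicast: bool) -> str:
        # Only the unsolicited, Trickle-driven status update (Node Endpoint +
        # Network State TLVs) uses multicast, and only where the profile defines
        # a multicast address; everything else is a unicast reply.
        if not solicited and tlv_name in ("Node Endpoint", "Network State"):
            return "multicast" if has_multicast else "unicast"
        return "unicast"

    assert choose_transport("Network State", solicited=False, has_multicast=True) == "multicast"
    assert choose_transport("Node State", solicited=True, has_multicast=True) == "unicast"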



> 
> When and how does a node take into consideration MTU?  I know this is tied to 
> the profile, but it needs to be discussed here.  Can a TLV be broken across
> multiple packets?

Added that fragmentation and reassembly are handled by the transport.


> 
> "The Trickle algorithm is used to provide reliability across unreliable unicast
> or multicast transports.   When the DNCP profile is using reliable unicast,
> then the Trickle algorithm is not needed.


This is more or less stated in the previous paragraph on data transport; however,
I added an introductory sentence to the Trickle paragraph with an xref.


> 
> As mentioned in the terminology, a Trickle Instance is associated with a
> particular endpoint for multicast transport or (peer, endpoint) for unreliable
> unicast transport.  A Trickle Instance will indicate when an update should be
> sent.  To do this, the Trickle Instance must be aware of when the Local Network
> State Hash changes.  When the Local Network State Hash changes, the
> Trickle state for all Trickle instances is considered inconsistent or reset.
> 
> ....
> 
> When a reliable unicast transport is used, the local node MUST send a status
> update consisting of ... whenever the Local Network State Hash changes.

Clarified.
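
To spell out where that reset hooks in, a minimal Trickle (RFC 6206) instance looks
roughly like this (illustrative Python of my own; parameter values come from the DNCP
profile):

    import random

    class TrickleInstance:
        def __init__(self, i_min: float, i_max: float, k: int):
            self.i_min, self.i_max, self.k = i_min, i_max, k
            self.i = i_min
            self._start_interval()

        def _start_interval(self):
            self.c = 0                                    # consistent messages heard
            self.t = random.uniform(self.i / 2, self.i)   # transmission point in [I/2, I)

        def hear_consistent(self):
            self.c += 1

        def should_transmit(self) -> bool:
            return self.c < self.k                        # suppressed if enough peers already sent it

        def interval_expired(self):
            self.i = min(self.i * 2, self.i_max)
            self._start_interval()

        def reset(self):
            # Called whenever the locally calculated network state hash changes.
            self.i = self.i_min
            self._start_interval()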

> 
> "
> 
> What's missing is the context of what the Trickle Instance is for, when it's
> actually used, and so on.

Added some more explanations.


> Yes, you're right.  Please have the draft explain why that magic value was picked
> and what concern doing the time check is resolving.  I assume that it's that value
> because of concerns about the time value wrapping, so any older origination time
> is considered aged out?

Hopefully clarified.


> The draft says "Trickle-driven status updates (Section 4.3) provide a mechanism for
>   handling of new peer detection on an endpoint, as well as state
>    change notifications.  Another mechanism may be needed to get rid of
>    old, no longer valid peers if the transport or lower layers do not
>    provide one."
> 
> What I think you may mean is:
> 
> "When status updates with the Network Status TLV are sent via a multicast
> transport, this can provide a mechanism for new peers on that particular link to
> discover the sending node.  The status updates, regardless of the transport
> method used, provide state change notifications. 
> 
> The mechanism to remove old, no longer valid peers, is described in Section ....
> In some DNCP profiles, the loss of the transport layer connectivity is also an
> indication that the associated peer should be removed.
> "

Clarified a bit differently.

> 
> What I'm looking for is how the timestamp is used.  Please add a forward 
> reference to 6.1.5 or reorganize. 

Done and clarified.


> All of the network state hashes are locally calculated!  The local node is the one running this stuff.
> Perhaps some of them are based off of local data (i.e., the
> local node's view of the network) and some of them are based off of received data (i.e. a received Network
> State TLV).   Or maybe you never compute the Received Network State Hash because you always use the value found in the Network State TLV???

I guess the new explicit subsection on calculating the network state hash should clear this up.
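
In other words, the only "received" hash is the one carried inside a Network State TLV;
everything else is computed locally from the hash tree. Roughly (my names, not the draft's):

    def on_network_state_tlv(local_hash: bytes, received_hash: bytes) -> str:
        # A match feeds Trickle's consistency counter; a mismatch is what
        # eventually triggers the Request Network State / Node State exchange.
        if received_hash == local_hash:
            return "consistent"
        return "request network state"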



>     >> 13) In Sec 6.2: "normally form peer relationships with all peers."  Where
>     >> is forming a peer
>     >> relationship defined?  Is this tied solely to Trickle instances?  What
>     >> about with reliable unicast
>     >> where presumably Trickle isn't used between peers as the Overview states
>     >> "If used in pure unicast mode with unreliable transport, Trickle is also
>     >> used between peers"?
>     >
>     > Please see replies to c) and 11).
> 
> 
> Neither of those actually better describe what a peer relationship entails.

Added an xref to the peer section and an introductory paragraph about peer relations there.


>  
> 
>     >> 4) There are multiple references to different unintroduced TLVs.  There
>     >> is also the idea that
>     >> each DNCP protocol can have its own TLVs.  It'd be very useful in the
>     >> Introduction to simply state
>     >> what TLVs are required by DNCP and conceptually what they are for.
>     >
>     > Section 7 is entirely devoted to that purpose. I do not think that it should
>     > be repeated in the general introduction.
>     >
>     > If there are particular forward references that are missing, please let us know.
> 
> 
> Section 7 is defining them and the rules around them.  I'm looking for something like
> 
> "As defined in Section 7, DNCP uses request TLVs and data TLVs.  The request TLVs
> are Request Network State TLV and Request Node State TLV.  These are used for 
> synchronizing the database between two neighbors.  The Data TLVs are : Node Endpoint,
> Network State, Node State with optional sub-TLVs, Peer, and Keep-Alive.  DNCP profiles
> may add additional TLVs or add new sub-TLVs to the Node State."
> 
> You don't want to lose or confuse your readers.

I added a paragraph to the overview section.
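
For reference, the set your suggested paragraph enumerates could be summarized like this
(the numeric values below are placeholders of mine; the real code points are in the IANA
section of the draft):

    from enum import IntEnum

    class DncpTlvType(IntEnum):
        REQUEST_NETWORK_STATE = 1   # request TLV: ask a peer for its Network State TLV
        REQUEST_NODE_STATE    = 2   # request TLV: ask for a specific node's Node State TLV
        NODE_ENDPOINT         = 3   # data TLV: sender's node and endpoint identifiers
        NETWORK_STATE         = 4   # data TLV: carries the network state hash
        NODE_STATE            = 5   # data TLV: per-node state, optionally with node data
        PEER                  = 6   # data TLV: published for each peer relationship
        KEEP_ALIVE_INTERVAL   = 7   # data TLV: non-default keep-alive interval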

>  
> 
>     >> 5) In Sec 4.3, it says "The Trickle state for all Trickle instances is
>     >> considered
>     >>    inconsistent and reset if and only if the locally calculated network
>     >>    state hash changes."  but I have no idea yet what a Trickle instance
>     >> is, when
>     >>    a Trickle instance is created or associated with a node?  an
>     >> interface? or something else?
>     >
>     > Please see Section “5. Data Model”
>     >
>     >    “For each remote (peer, endpoint) pair detected on a local endpoint, a
>     >    DNCP node has:”
>     >    [...]
>     >    * Trickle instance: the particular peer's Trickle instance with
>     >       parameters I, T, and c (only on an endpoint in Unicast mode, when
>     >       using an unreliable unicast transport) .
>     >
>     > we will add an xref there and add “trickle instance” to the terminology.
> 
> 
> The definition does help.  It's still not clear from the text what things are handled
> because it's Trickle and what are handled b/c a status update is sent or so on.

The Trickle instance is now explained at the beginning of the Trickle paragraph.
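
The per-peer state the Data Model text describes then looks roughly like this (field names
are mine; the Trickle instance only exists for unreliable unicast endpoints):

    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class PeerState:
        node_identifier: bytes          # remote node's node identifier
        endpoint_identifier: int        # remote endpoint the peer was detected on
        local_endpoint: int             # local endpoint it was detected over
        trickle: Optional["TrickleInstance"] = None  # only in unicast mode over an unreliable transport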

> 
>     >> 6) I think it would be more useful to describe generally at least what is
>     >> in a DNCP profile earlier in the
>     >> document.  This may be bringing Section 9 forward or it may be listing it
>     >> earlier.  I'm seeing numerous
>     >> forward references.
>     >
>     > There would be numerous forward references in that case too, as it refers
>     > to the protocol behavior and individual TLVs. (There are currently 2 forward
>     > references in the terminology and 2 in the text; we probably cannot move it
>     > before the terminology, and right after the terminology it would introduce
>     > more forward references than it would eliminate.)
> 
> 
> Try adding a pointer to the appendix C to show there's an example. 

There are now multiple pointers to the appendix (in Applicability, Terminology, etc.).

> 
>     >> 7) Start of Sec 4.6 talks about the topology graph - but there's been
>     >> absolutely no introduction of
>     >> what the topology graph is or how it was created, updated, etc.

There is now an introductory paragraph.


>     >
>     > See reply to a).
>     >
>     >
>     >> 10) In Sec 5:  This is a helpful section.  It tells me that a Trickle
>     >> instance is per interface.  I don't see anything
>     >> like a topology graph in it.  It clarifies origination time slightly.  It
>     >> talks about "For each remote (peer, endpoint)
>     >> pair detected on a local endpoint" but I still have no ideas how that
>     >> detection is done.  It implies some policy
>     >> restrictions "Set of addresses: the DNCP nodes from which connections are
>     >> accepted." but has no details on whether
>     >> that is created via multicast messaging or via local configuration or
>     >> something else.
>     >
>     > See previous answers referring to peer detection.


There is a paragraph on discovery now.