Re: [manet] AD Review of draft-ietf-manet-olsrv2-multipath-11

Jiazi Yi <jiazi@jiaziyi.com> Wed, 19 April 2017 15:43 UTC

From: Jiazi Yi <jiazi@jiaziyi.com>
Message-Id: <6BE8ABAF-6672-4994-AFD3-1FA0BA9B865E@jiaziyi.com>
Date: Wed, 19 Apr 2017 17:42:55 +0200
In-Reply-To: <1F0A54FE-B275-4203-AC7B-347101217BEF@cisco.com>
Cc: "draft-ietf-manet-olsrv2-multipath@ietf.org" <draft-ietf-manet-olsrv2-multipath@ietf.org>, "manet-chairs@ietf.org" <manet-chairs@ietf.org>, Stan Ratliff <sratliff@idirect.net>, "manet@ietf.org" <manet@ietf.org>
To: "Alvaro Retana (aretana)" <aretana@cisco.com>
References: <1F0A54FE-B275-4203-AC7B-347101217BEF@cisco.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/manet/JaYRYekIVyuhxKG0-FpUOnM4l5k>
Subject: Re: [manet] AD Review of draft-ietf-manet-olsrv2-multipath-11
Precedence: list

Dear Alvaro, 

First, we appreciate your detailed and valuable review, which has certainly helped us improve the document a lot. Thanks a ton!

Based on your comments, we updated the draft and submitted a new revision: 
            https://tools.ietf.org/html/draft-ietf-manet-olsrv2-multipath-12
            https://www.ietf.org/rfcdiff?url2=draft-ietf-manet-olsrv2-multipath-12

To make the modifications easier to trace, please find our replies to your comments inline below. 
We are looking forward to your further comments and instructions. 

> On 3 Apr 2017, at 23:22, Alvaro Retana (aretana) <aretana@cisco.com> wrote:
> 
> Dear authors:
>  
> I just finished reading this document.   In general, I think it is relatively straight forward, but it needs several clarifications (see below).  I will start the IETF Last Call when my Major comments have been addressed in a new version.
>  
> As other people (specially the INT Directorate) review this document, more specific language related to the use of RFC6554 may be needed.  I’m inclined to go forward with the document as is (pending the comments below) because it is Experimental.  We can make further updates if needed.
>  
> Thanks!
>  
> Alvaro.
>  
>  
> Major:
>  
> M1. Section 3. (Applicability Statement): “It is interoperable with OLSRv2 implementations that do not have this extension.”  I don’t think that statement is true.  For example, how does a non-MP-OLSRv2 router handle a packet with the RFC6554 header in it?  IOW, strict source routing requires all the routers in the path to support the extension.  I think more clarity and specificity is needed.

This is considered in the draft: routers that support source routing identify themselves by carrying a SOURCE_ROUTE TLV in their routing messages. For the path calculation, the draft states:

   For IPv6 networks, as strict source routing is used, only the routers
   that exist in SR-OLSRv2 Router Set are considered in the path
   calculation, i.e., only the source-routing supported routers can
   exist in the path.

so the non-MP-OLSRv2 routers won’t be considered in the first place. We further clarified the applicability statement with the following text: 

It is interoperable with OLSRv2 implementations that do not have this extension: as the MP-OLSRv2 uses source routing, in IPv4 networks the interoperability is achieved by using loose source routing header; in IPv6 networks, it is achieved by eliminating routers that do not support IPv6 strict source routing. 
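To illustrate the IPv6 case informally: the sketch below runs an ordinary single-path Dijkstra (not the multi-path algorithm of the draft) while skipping every router that is not in the set of source-routing-capable routers. The function name `shortest_sr_path`, the `links` dictionary layout, and the requirement that the destination also be in the set are assumptions made for this example only.

```python
import heapq

def shortest_sr_path(links, source, destination, sr_routers):
    """Plain Dijkstra restricted to source-routing-capable routers.

    links: dict mapping router -> {neighbor: link cost} (hypothetical layout).
    sr_routers: routers that advertised a SOURCE_ROUTE TLV.
    Returns the list of routers on the path, or None if no path exists.
    """
    # Assumption: every router on the path except the local source
    # must be source-routing capable.
    allowed = set(sr_routers) | {source}
    if destination not in allowed:
        return None

    dist = {source: 0}
    prev = {}
    heap = [(0, source)]
    visited = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in visited:
            continue
        visited.add(node)
        if node == destination:
            break
        for neighbor, cost in links.get(node, {}).items():
            if neighbor not in allowed:
                continue  # routers without source-routing support are eliminated
            candidate = d + cost
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                prev[neighbor] = node
                heapq.heappush(heap, (candidate, neighbor))

    if destination not in dist:
        return None
    path = [destination]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return path
```

With a topology where the cheapest route goes through a router that does not support source routing, the function falls back to a costlier path made only of capable routers, or returns `None` if there is none.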

>   
> M2. MAX_SRC_HOPS .  Both Section 5.1. (Router Parameters) and Section 9. (Configuration Parameters) say that “For IPv6 networks, it MUST be set to 0, i.e., no constraint on maximum number of hops.”  What about potential MTU/fragmentation issues?  RFC6554 (in Section 4.1. (Generating Source Routing Headers)) does talk about a maximum path length.

We added some text considering the MTU issue:

It is RECOMMENDED to use MTU sizes considering the source routing header to avoid fragmentation. Depending on the size of the routing domain, the MTU should be at least 1280 + 40 (for outer IP header) + 16 * diameter of the network in number of hops (for source routing header). If the links in the network have different MTU sizes, by using technologies like Path MTU Discovery, the routers are able to be aware of the MTU along the path. The size of the datagram plus the size of IP headers (including the source routing header) SHOULD NOT exceed the minimum MTU along the path.
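As a rough illustration of the arithmetic in that text (not part of the draft), the following sketch computes the recommended minimum MTU for a given network diameter and checks a datagram against the smallest MTU on a path. The constant and function names are hypothetical; the numbers 1280, 40 and 16 are taken from the paragraph above.

```python
BASE_MTU = 1280          # base size used in the recommendation above
OUTER_IP_HEADER = 40     # outer IP header, in bytes
BYTES_PER_HOP = 16       # one address per hop in the source routing header

def recommended_min_mtu(diameter_hops: int) -> int:
    """Minimum MTU suggested for a routing domain of the given diameter."""
    if diameter_hops < 0:
        raise ValueError("diameter must be non-negative")
    return BASE_MTU + OUTER_IP_HEADER + BYTES_PER_HOP * diameter_hops

def fits_on_path(datagram_size: int, ip_headers_size: int, link_mtus) -> bool:
    """True if datagram plus IP headers (including the source routing
    header) does not exceed the minimum MTU along the path."""
    return datagram_size + ip_headers_size <= min(link_mtus)
```

For example, a network with a diameter of 10 hops gives 1280 + 40 + 160 = 1480 bytes.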

>  
> M3. Section 6.1.1. (SOURCE_ROUTE TLV) says that each HELLO/TC message “MUST have exactly one SOURCE_ROUTE TLV”.  I read this as meaning two things: (1) every HELLO/TC must have a SOURCE_ROUTE TLV, and (2) there must be at most one SOURCE_ROUTE TLV.  What happens if either of these conditions are not met?   RFC7181 describes conditions under which a message should be considered invalid for processing (16.3, for example) – given the language used in this section, I would expect something similar in this document (to enhance what is already in RFC7181).

If a HELLO/TC message doesn’t have a SOURCE_ROUTE TLV, it is the same as a message generated by normal OLSRv2. Its originator is then not treated as a source-route-capable router, which means it is less likely to be chosen as an MPR, and it cannot be used in IPv6 strict source routing. 

We added the following discard condition:

In addition to the reasons specified in [RFC7181] for discarding a HELLO message or a TC message on reception, a HELLO or TC message received MUST be discarded if it has more than one Message TLV with Type = SOURCE_ROUTE. 
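The three cases (no TLV, exactly one, more than one) can be sketched as follows. This is only an illustration of the rule above: a message is modelled as a plain list of Message TLV type names, and `classify_message` and its return strings are hypothetical, not taken from the draft or any implementation.

```python
SOURCE_ROUTE = "SOURCE_ROUTE"

def classify_message(message_tlv_types):
    """Classify a received HELLO/TC message by its SOURCE_ROUTE TLV count.

    message_tlv_types: iterable of Message TLV type names (simplified model).
    """
    count = sum(1 for tlv_type in message_tlv_types if tlv_type == SOURCE_ROUTE)
    if count > 1:
        # More than one SOURCE_ROUTE Message TLV: the message MUST be discarded.
        return "discard"
    if count == 1:
        # The originator advertises source-route support.
        return "source-route-capable"
    # No SOURCE_ROUTE TLV: processed like a message from base OLSRv2.
    return "plain-olsrv2"
```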

>  
> M3.1. [minor] Section 6.1.1. (SOURCE_ROUTE TLV). “MP-OLSRv2 Routing Process, or an OLSRv2 Routing Process that supports source-route forwarding.”  Specially for IPv6, can you identify a sub-set of this specification that would be implemented by “an OLSRv2 Routing Process that supports source-route forwarding”?  I’m wondering whether that distinction even needs to be made or if you can just assume that a manet router that supports RFC6554 also supports this document (it does, if it originates the new TLV).

I recall we had this discussion on the manet mailing list (with Justin?). The purpose is to give the protocol better flexibility and deployability: I think it is easier to convince an operator to support a standards-track source routing document than to deploy an experimental function. Furthermore, MP-OLSRv2 has a reactive part in its path calculation, which might be a bit tricky on some platforms, while RFC6554 is quite standard. 

>  
> M3.2. This section also says that “an OLSRv2 Routing Process MAY have one SOURCE_ROUTE TLV, if the OLSRv2 Routing Process supports source-route forwarding, and is willing to join the source route generated by other MP-OLSRv2 Routing Processes.”  If the 2 conditions are met, doesn’t the text become “MUST have exactly one SOURCE_ROUTE TLV”?   Just pointing out that the differentiation may not be needed…and that the language is not consistent (see above).

Agreed. Changed to “MUST have exactly one SOURCE_ROUTE TLV”. 

>  
> M3.3. “The existence of SOURCE_ROUTE TLV MUST be consistent for a specific OLSRv2 Routing Process”  Simplify by adding discard conditions. (See M3 above)

Resolved; see the reply to M3 above. 

> M4. Section 6.2.2. (Source Routing Header in IPv6) says that the routing header from RFC6554 is used with IPv6 Routing Type 254 (experimental).  Why?   I’m asking because RFC4727 says that the experimental values “MUST NOT be shipped as defaults in implementations” – not only do implementations exist, but the purpose of this Experimental document is to gather experience from real deployments.  Short of asking for a new Routing Type, why isn’t the one already assigned to RFC6554 used?

Ah, I wasn’t aware of that text in RFC4727. I had asked one of the 6man co-chairs, Ole, about this, and he suggested using the experimental code point (https://mailarchive.ietf.org/arch/msg/manet/YcpyzC4T3VUVEsEN_YgyizN4pBo). 
We now use the routing type assigned to RFC6554. 

>  
>  
> M5. Section 8.1. (HELLO and TC Message Generation).  “The TC message generation based on SR_TC_INTERVAL does not replace the ordinary TC message generation specified in [RFC7181]…The TC generation based on SR_TC_INTERVAL serves for those routers to advertise SOURCE_ROUTE TLV so that the other routers can be aware of the source-route enabled routers so as to be used as destinations of multipath routing.  The SR_TC_INTERVAL is set to a longer value than TC_INTERVAL.”  This text says that there are two intervals at which an MP-OLSRv2 router sends TCs, but that only at SR_TC_INTERVAL will the SOURCE_ROUTE TLV be included, is that correct?  If so, then how does a receiver reconcile that with the requirement in 6.1.1 (and in this section) for “Every HELLO or TC message generated by a MP-OLSRv2 Routing Process MUST have exactly one SOURCE_ROUTE TLV.”??  IOW, it looks like that requirement can’t be met unless all the TCs (including the ones generated every TC_INTERVAL) include the SOURCE_ROUTING TLV. 

Yes, all the TCs must include the SOURCE_ROUTE TLV. 
We made the text more explicit. 

>  
>  
> M6. Section 8.3. (MPR Selection): “MP-OLSR routers SHOULD be preferred as routing MPRs.”  When would it be ok for MP-OLSR routers not to be preferred?  IOW, why is “MUST” not used?

This was already discussed in https://www.ietf.org/mail-archive/web/manet/current/msg19414.html

>  
>  
> M7. If the Multi-path Routing Set is maintained reactively, what should be done with the datagram that triggered the calculation (while it is being done)?  I’m assuming that it should be buffered somewhere and then forwarded, right? 

Yes

>  I’m also assuming that any other datagrams to the same destination that are received while the calculation is happening should also be buffered, true? 
>  Is there a maximum time that this buffering should be maintained?  Is there a maximum expected size? 

In our current implementations for field tests and simulations, no buffer is used because the time spent calculating the paths is negligible (either the paths are returned almost instantly, or there is no path at all). In more complex scenarios with high data rates and a large number of nodes, a buffer would be needed. 

>  Maybe these are questions that you may want to answer through the experiment.  Also, buffering some datagrams may open an attack vector – worth mentioning in the Security Considerations section. 

Yes, very good points. We added some text in the “Security Considerations” section: 

If the multiple paths are calculated reactively, the datagrams SHOULD be buffered while the paths are being calculated. Because the path calculation is local and no control message is exchanged, the buffering time should be trivial. However, depending on the CPU power and memory of the router, a maximum buffer size SHOULD be set to avoid occupying too much memory of the router. When the buffer is full, the ancient datagrams are dropped. A possible attack that a malicious application could launch is that, it initiates large amount of datagrams to all the other routers in the network, thus triggering path calculation to all the other routers and during which, the datagrams are buffered. This might flush other legitimate datagrams. But the impact of the attack is transient: once the path calculation is finished, the datagrams are forwarded and the buffer goes back to empty.
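A minimal sketch of such a bounded buffer, assuming the limit is expressed as a number of datagrams (a byte limit would work similarly). The class name `PendingBuffer` and its methods are hypothetical and only illustrate the "drop the oldest when full" behaviour described above.

```python
from collections import deque

class PendingBuffer:
    """Holds datagrams while paths are being calculated.

    When the buffer is full, the oldest datagram is dropped to make room.
    """

    def __init__(self, max_datagrams: int):
        if max_datagrams < 1:
            raise ValueError("max_datagrams must be at least 1")
        # A deque with maxlen silently discards from the left on overflow.
        self._queue = deque(maxlen=max_datagrams)
        self.dropped = 0

    def enqueue(self, datagram) -> None:
        if len(self._queue) == self._queue.maxlen:
            self.dropped += 1  # the oldest entry is about to be evicted
        self._queue.append(datagram)

    def flush(self):
        """Return all buffered datagrams in arrival order and empty the buffer."""
        datagrams = list(self._queue)
        self._queue.clear()
        return datagrams
```

Once the path calculation finishes, `flush()` hands back the surviving datagrams for forwarding and the buffer returns to empty, which matches the transient nature of the attack described in the text.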

>  
> M7.1. [minor] Section 4. (Protocol Overview and Functioning): “The reactive operation is local in the router and no message transmission delay is introduced.”  I think you may be referring to control plane messages, but it sounds as if there was no delay in the propagation of the data plane messages – which by definition need to somehow be held while the reactive calculation is completed.  Please clarify.

Yes. We now put “no additional routing control messages exchange is required”. 

>  
>  
> M8. Section 8.5.2. (Multi-path Dijkstra Algorithm) says that “Using different multi-path algorithms will not impact the interoperability.”  Using different algorithms in a network may not impact interoperability (as the packets on the wire will be understood by all), but it can lead to routing loops.  Please clarify.

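Because the path is decided only by the originator and carried in the source routing header, using different multi-path algorithms in different routers won’t be a problem: intermediate routers forward along the path given in the header rather than running their own multi-path calculation. 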

>  
> M8.1. Section 8.5.2. (Multi-path Dijkstra Algorithm) describes the MP algorithm as two incremental functions at “step i” of the original Dijkstra algorithm.  While the algorithm may be well known, please add a reference so that there is no confusion as to what “step i” is. 

Done. 

>  
>  
> M9. The Security Considerations of RFC6554 say that the extension is to be used only inside an RPL domain and that “RPL routers MUST drop datagrams entering or exiting a RPL routing domain that contain an SRH in the IPv6 Extension headers.”  This document does mention that “the source routing header is used only in the current routing domain”, but it doesn’t explicitly talk about what should be done with datagrams that contain the source routing headers as they enter/leave the domain.  Please include some text similar to the one in RFC6554.  Pointing to the Security Considerations of RFC6554 would also be a good idea (to capture ICMP attacks, and whatever else is mentioned there).

Done. 

>  
>  
> M10. IANA Considerations.  Sections 12.1. (Expert Review: Evaluation Guidelines) and 12.3. (Routing Type) are not needed as this document is not specifying any new space that requires DE review (the assignment should be done under DE review, but that is specified elsewhere), and the Routing Type is already assigned by IANA.

Removed. 

>  
>  
>  
> Minor:
>  
> P1. Please include a reference for “erasure coding”.

Done.

>  
> P2. Loose vs Strict Source Routing.  It caught my attention that the IPv4/IPv6 source routing modes are not the same – why is that?  I note that RFC791 does support strict source routing.  RFC6554 doesn’t support loose source routing, but other extensions do (draft-ietf-6man-segment-routing-header, for example, can support both loose and strict modes).  Given that this document is Experimental, I don’t think it is a show stopper, but consistency would be nice.  There’s also the opportunity to experiment further.

In IPv4, loose source routing is used because the length of the header is limited. 
It is a very good point that it would be interesting to experiment with the loose mode in IPv6, as you indicated. We added some text on experiments based on draft-ietf-6man-segment-routing-header: 

	• Use of IPv6 loose source routing. In the current specification, only strict source routing is used for IPv6 based on [RFC6554]. In [I‑D.ietf‑6man‑segment‑routing‑header], the use of loose source routing is also proposed in IPv6. In scenarios where the length of the source routing header is critical, the loose source routing can be considered.

>  
> P3. You mention round-robin and weighted scheduling for path selection (in 1.1).  What about flow-based (keep all the packets in a flow on a single path)?  I’m wondering about this because disjoint paths may result in significant performance characteristic differences, and path stability (as much as possible) may be important in some cases.

In fact, flow-based scheduling is a different approach from the one that we proposed. Based on our tests and simulations, using disjoint paths simultaneously can improve the overall performance. Of course, we also noticed the issues that you mentioned, and added the following text as part of the experiments: 

	• The impacts of the delay variation due to multi-path routing. [RFC2991] brings out some concerns of multi-path routing, especially variable latencies. Although current experiment results show that multi-path routing can reduce the jitter in dynamic scenarios, some transport protocols or applications may be sensitive to the datagram re-ordering.
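To make the contrast between the two approaches concrete, here is a small sketch of per-datagram round-robin scheduling next to flow-based path selection. Both functions and the use of a CRC32 hash on a flow identifier are assumptions for illustration only; the draft does not specify them.

```python
import itertools
import zlib

def round_robin_scheduler(paths):
    """Return a function that yields the next path for each datagram,
    cycling through all paths in turn."""
    cycle = itertools.cycle(paths)
    return lambda: next(cycle)

def flow_based_path(paths, flow_id: str):
    """Pin every datagram of a flow to one path by hashing the flow id.
    CRC32 is used because it is stable across runs (unlike built-in hash())."""
    return paths[zlib.crc32(flow_id.encode()) % len(paths)]
```

Round-robin spreads consecutive datagrams of one flow across all paths, which is where re-ordering can appear; the flow-based variant keeps one flow on one path at the cost of not using the paths simultaneously for that flow.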

>  
> P4. Section 1.1 mentions the potential use of “Different algorithms to obtain multiple paths”.  The rtgwg WG has done a lot of work on finding alternate paths (from LFA all the way to MRT) in the context of fast reroute – these alternate paths can obviously be used for traffic forwarding under normal conditions (not just for repair).  I’m wondering why those other mechanisms were not used here?  Were they even considered?  [IOW, it’s always nice to use other IETF-developed technology.]

As mentioned in the previous reply, the purpose is not to discover alternate paths for fast reroute, but to use parallel paths to improve reliability and to benefit from multiple description coding. 

>  
> P5. Please define SR-OLSRv2 in the Terminology section.

Done. 

> P6. The document doesn’t explain how SR_HOLD_TIME_MULTIPLIER should be used.  Later I found (in 8.2) that “source route hold time multiplier” is defined as “the value of a Message TLV with Type = SOURCE_ROUTE“, which is exactly what SR_HOLD_TIME_MULTIPLIER is.  Suggestion: to avoid confusion and be consistent, eliminate the definition of "source route hold time multiplier" and simply use SR_HOLD_TIME_MULTIPLIER.

Yes, good catch. Fixed. 

>  
> P7. Section 8.2. (HELLO and TC Message Processing) defines "validity time".  I don’t think this is necessary as it has the same meaning as in RFC7181, right?  If so, then a reference to that (in the SR_time formula) should be enough.  Otherwise, you at least need a reference to RFC5497.  I’m making this comment because it is easier/cleaner to use definitions by reference, than risk not using the exact same text and have the specifications potentially diverge over time.

Agree. We put a reference to RFC7181. 

>  
> P8. The definition of the Multi-path Routing Tuple in 8.5.1. (Requirements of Multi-path Calculation) is not (exactly) the same as in 7.2. (Multi-path Routing Set).  Suggestion: put a reference to 7.1 in 8.5.1.

Done. 

>  
> P9. References:
> - RFC6982 has been obsoleted by RFC7942.
> - s/RFC2460/draft-ietf-6man-rfc2460bis

Fixed. 

>  
>  
>  
> Nits:
>  
> N1. s/original OLSRv2/base OLSRv2
>  
> N2. In Section 5.1, please include a forward reference to Section 9…and in Section 6.1.1…and anywhere else the parameters are defined.
>  
> N3. s/existed/existing
>  
> N4. s/MP-OLSR/MP-OLSRv2
>  
> N5. In 8.4 s/they are/there are
>  
> N6. s/the the/the

All fixed. 

Again, thank you very much for the review. Any further comments on our reply or the draft are appreciated. 

regards

Jiazi
