RE: Composite Link Requirements as WG document

Dimitri,

Both of your questions are related to the motivation of the draft.  From
a carrier/operator's point of view, we have thought long and hard about
this, and we detailed our motivations in the previous drafts, although
not in this one, because we thought they were well understood after
being presented numerous times in several WGs and meetings.
I have copied and pasted what we had in the last draft below.  I think
it addresses your questions in detail.

Appendix A.  Problem Statements

   Two applications are described here that encounter problems when
   multiple parallel links are deployed between two routers in today's
   IP/MPLS networks.

A.1.  Incomplete/Inefficient Utilization

   An MPLS-TE network is deployed to carry traffic on RSVP-TE LSPs, i.e.
   traffic engineered flows.  When traffic volume exceeds the capacity
   of a single physical link, multiple physical links are deployed
   between two routers as a single backbone trunk.  How to assign LSP
   traffic over multiple links and maintain this backbone trunk as a
   higher capacity and higher availability trunk than a single physical
   link becomes an extremely difficult task for carriers today.  Three
   methods that are available today are described here.

   1.  A hashing method is a common practice for traffic distribution
       over multiple paths.  Equal Cost Multi-Path (ECMP) for IP
       services and IEEE-defined Link Aggregation Group (LAG) for
       Ethernet traffic are two of the widely deployed hashing based
       technologies.  However, two common occurrences in carrier
       networks often prevent hashing from being used efficiently.
       First, for MPLS networks carrying mostly Virtual Private Network
       (VPN) traffic, the incoming traffic is usually encrypted, so
       that hashing depth is severely limited.  Second, the traffic in
       an MPLS-TE network typically contains a certain number of
       traffic flows that have vast differences in their bandwidth
       requirements.  Furthermore, the links may be of different
       speeds.  In those cases hashing can cause some links to be
       congested while others are only partially filled, because
       hashing can only distinguish the flows but not the flow rates.
       A TE-based solution is better suited to these cases.  The IETF
       has always had two technology tracks for traffic distribution:
       TE-based and non-TE-based.  A TE-based solution provides a
       natural complement to non-TE-based hashing methods.

   2.  Assigning individual LSPs to each link through constrained
       routing.  A planning tool can track the utilization of each link
       and assignment of LSPs to the links.  The lag time between
       measurement and action may be large relative to the fluctuations
       in LSP utilization, and the availability and performance of the
       planning tool are also important operational factors.
       Furthermore, the flexibility in response to failure scenarios is
       limited as summarized in the following.  To gain high
       availability, FRR [RFC4090] is used to create a bypass tunnel on
       a link to protect traffic on another link or to create a detour
       LSP to protect another LSP.  If BW is reserved for the bypass
       tunnels or the detour LSPs, the network will need to reserve a
       large amount of capacity for failure recovery, which reduces the
       capacity to carry other traffic.  If BW is not reserved for the
       bypass tunnels and the detour LSPs, the planning tool cannot
       assign LSPs properly to avoid congestion during a link failure
       when there are more than two parallel links.  This is because,
       during the link failure, the impacted traffic is simply put on
       bypass tunnels or detour LSPs that do not have enough reserved
       bandwidth to carry the extra traffic during the failure recovery
       phase.  An alternative is to use prioritized queuing so that
       premium traffic sees good performance during a congestion
       interval.  The LSP sources will receive an RSVP-TE path error
       and, after a configurable period of time, will reoptimize their
       LSPs, at which point all traffic will again experience good
       performance.  However, this approach will not work well when
       support is needed for both RSVP-TE signaled and LDP signaled
       traffic.

   3.  Facility protection, also called 1:1 protection.  Dedicate one
       link to protect another link.  Only assign traffic to one link in
       the normal condition.  When the working link fails, switch
       traffic to the protection link.  This requires 50% capacity for
       failure recovery.  This works when there are only two links.
       Under the multiple parallel link condition, this causes
       inefficient use of network capacity because there is no
       protection capacity sharing.  In addition, due to traffic
       burstiness, having one link fully loaded and another link idle
       increases transport latency and packet loss, which lowers the
       link performance quality for transport.
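
   The capacity cost of dedicated protection in item 3 can be put in
   numbers.  The following is only a rough sketch: it assumes
   equal-capacity links, a single link failure at a time, and a
   hypothetical "shared" arrangement in which the surviving links
   together absorb the traffic of the failed one.  The function names
   are illustrative and do not come from the draft.

```python
from fractions import Fraction


def usable_fraction_dedicated(num_links: int) -> Fraction:
    """Fraction of capacity carrying traffic under 1:1 protection.

    Links are paired: one working link and one idle protection link.
    """
    if num_links < 2 or num_links % 2:
        raise ValueError("1:1 pairing needs an even number of links (>= 2)")
    return Fraction(num_links // 2, num_links)


def usable_fraction_shared(num_links: int) -> Fraction:
    """Fraction usable if protection capacity were shared (assumption).

    Only enough headroom for one failed link is held back across the set.
    """
    if num_links < 2:
        raise ValueError("need at least two links")
    return Fraction(num_links - 1, num_links)


for n in (2, 4, 8):
    print(f"{n} links: dedicated {usable_fraction_dedicated(n)}, "
          f"shared {usable_fraction_shared(n)}")
```

   With two links both approaches leave half of the capacity usable,
   which matches the observation that 1:1 protection works for two
   links.  With more links the dedicated scheme stays at one half,
   while the assumed shared scheme would leave (N-1)/N usable.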

   None of these methods satisfies carrier requirements, because of
   either poor link utilization or poor performance.  This forces
   carriers to deploy a single higher capacity link.  However, a
   higher capacity link can be expensive compared with parallel low
   capacity links of equivalent aggregate capacity; a high capacity
   link cannot be deployed in some circumstances due to physical
   impairments; or the highest capacity link may not be large enough
   for some carriers.

   An LDP network can encounter the same issue as an MPLS-TE enabled
   network when multiple parallel links are deployed as a backbone
   trunk.  An LDP network can have large variance in flow rates where,
   for example, the small flows may be carrying stock tickers at a few
   kbps per flow while the large flows can be near 10 Gbps per flow
   carrying machine to machine and server to server traffic from
   individual customers.  Those large traffic flows often cannot be
   broken into micro flows.  Therefore, hashing would not work well for
   the networks carrying such flows.  Without per-flow TE information,
   this type of network has even more difficulty using multiple
   parallel links and keep high link utilization.
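
   To illustrate why hashing is blind to flow rates, here is a small
   sketch.  The flow mix (two 9 Gbps flows and 1000 flows of 8 kbps),
   the two-link trunk, the CRC32 hash and the greedy "largest flow
   first" placement are all assumptions chosen for illustration.  The
   greedy placement is only a stand-in for a rate-aware approach, not a
   method described in the draft.

```python
import zlib

LINK_COUNT = 2  # hypothetical trunk of two parallel links


def make_flows():
    """Return {flow_id: rate_bps}: a few very large flows plus many tiny ones."""
    flows = {f"large-{i}": 9_000_000_000 for i in range(2)}
    flows.update({f"ticker-{i}": 8_000 for i in range(1000)})
    return flows


def place_by_hash(flows, link_count):
    """Pick a link from the flow identifier only; the rate is never consulted."""
    loads = [0] * link_count
    for flow_id, rate in flows.items():
        link = zlib.crc32(flow_id.encode()) % link_count
        loads[link] += rate
    return loads


def place_by_rate(flows, link_count):
    """Greedy rate-aware placement: largest flow first, onto the least-loaded link."""
    loads = [0] * link_count
    for _flow_id, rate in sorted(flows.items(), key=lambda kv: kv[1], reverse=True):
        link = loads.index(min(loads))
        loads[link] += rate
    return loads


flows = make_flows()
hash_loads = place_by_hash(flows, LINK_COUNT)
rate_loads = place_by_rate(flows, LINK_COUNT)
print("hash-based loads (bps):", hash_loads)
print("rate-aware loads (bps):", rate_loads)
```

   Whether the two large flows collide under hashing depends entirely
   on their identifiers.  The rate-aware placement always separates
   them, because it looks at the rates.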

A.2.  Inefficiency/Inflexibility of Logical Interface Bandwidth
      Allocation

   Using logically-separate routing instances in some implementations
   further complicates the situation.  Dedicating separate physical
   backbone links, or, in the case of sharing of a single common link,
   dedicating a portion of the link, to each routing instance is not
   efficient.  Assume that each routing instance must have at least the
   capacity of a single link, and also must have equal access to unused
   capacity.  Then, for example, if there are 2 routing instances and 3
   parallel links and half of each link bandwidth is assigned to a
   routing instance, neither routing instance can support an LSP with
   bandwidth greater than half the link bandwidth.  The same problem is
   also present in the case of the sharing of a single common link using
   the dedicated logical interface and link bandwidth method.  An
   alternative in dealing with multiple parallel links is to assign a
   logical interface and bandwidth on each of the parallel physical
   links to each routing instance, which improves efficiency as compared
   to dedicating physical links to each routing instance.
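
   The 2-instance, 3-link example above can be worked through in a few
   lines.  This is a sketch under stated assumptions: bandwidth is
   expressed in units of one physical link, an LSP must fit on a single
   link, and the "shared view" models the idealized case in which an
   instance could use whatever is free on an otherwise idle link.  The
   names are illustrative only.

```python
from fractions import Fraction


def fits_on_some_link(lsp_bw, free_per_link):
    """An LSP is carried on one link, so one link must have enough free bandwidth."""
    return any(free >= lsp_bw for free in free_per_link)


NUM_LINKS = 3
NUM_INSTANCES = 2
LINK_BW = Fraction(1)  # one unit = bandwidth of one physical link

# Static partition: each routing instance gets half of every link.
static_share = [LINK_BW / NUM_INSTANCES] * NUM_LINKS

# Hypothetical shared view: all links idle, full link visible to an instance.
shared_view = [LINK_BW] * NUM_LINKS

lsp_bw = Fraction(3, 4)  # an LSP needing three quarters of a link

print("total per instance (static):", sum(static_share))                     # 3/2
print("fits with static partition:", fits_on_some_link(lsp_bw, static_share))  # False
print("fits with shared view:", fits_on_some_link(lsp_bw, shared_view))        # True
```

   Each instance owns 1.5 links' worth of bandwidth in total, yet
   cannot place a single LSP larger than half a link.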

   Note that the traffic flows and LSPs from these different routing
   instances effectively operate in a Ships-in-the-Night mode, where
   they are unaware of each other.  Inflexibility results if there are
   multiple sets of LSPs (e.g., from different routing instances)
   sharing one link or a set of parallel links, and at least one set of
   LSPs can preempt others.  In this case, more efficient sharing of
   the link set between the routing instances is highly desirable.

A.3.  Additional Functions Required Beyond Link Bundling

   A link bundle [RFC4201] is a collection of TE links.  It is a logical
   construct that represents a way to group/map the information about
   certain physical resources that interconnect routers.  The purpose of
   a link bundle is to improve routing scalability by reducing the
   amount of information that has to be handled by OSPF/IS-IS.  Each
   physical link in the link bundle is an IGP link in OSPF/IS-IS if
   numbered links are used.  A link bundle only has significance in the
   router control plane.  The mapping of an LSP to a component link in a
   bundle is determined at LSP setup time, and this mapping does not
   change due to newly configured LSP/LDP traffic on the same component
   link.  A link bundle only applies to RSVP-TE signaled traffic, which
   implies no mix of RSVP-TE and LDP traffic on the same interface.  CTG
   can handle both RSVP-TE and LDP signaled traffic.

   Link bundling does not support groups of links with different
   characteristics (e.g., bandwidth, latency).  Furthermore, link
   bundling only supports RSVP-TE signaled LSPs and not LDP signaled
   LSPs.

Ning So
Lead Engineer
Enterprise Data Network and Traffic Planning
972-729-7905

-----Original Message-----
From: rtgwg-bounces@ietf.org [mailto:rtgwg-bounces@ietf.org] On Behalf
Of PAPADIMITRIOU Dimitri
Sent: Sunday, December 06, 2009 4:11 AM
To: John G. Scudder; rtgwg@ietf.org
Cc: alia.atlas@bt.com; ZININ Alex
Subject: RE: Composite Link Requirements as WG document

Hi,

Two comments on this document:

"Unlike a link bundle [RFC4201], the component links in a 
composite link can have different properties such as cost 
or capacity."

Component link "capacity" heterogeneity is "allowed" in 4201.  I can
understand the potential limitation of identical TE metrics (for each
component); unfortunately, limited explanation is given to sustain why
it should be addressed.

"This document describes a framework for managing aggregated 
traffic over a composite link."
[...]
"To achieve the better component link utilization and avoid 
component link congestion, the document describes some new
aspects on the traffic flow assignment to component links."

The document is specifically dealing with TE control but does not
explain why "measuring" component link capacity is assumed impossible
(cf. 4201 Section 4)?

Bottom-line: not clear if this document is intended to improve
applicability of bundling TE control in IP/MPLS or if the current
bundling approach is not fulfilling expected functionality (and this
document would outline the why/what). The issue is that the document
describes "modeling" but does not provide an answer to this question. 

Thanks,
-dimitri. 

> -----Original Message-----
> From: rtgwg-bounces@ietf.org [mailto:rtgwg-bounces@ietf.org] 
> On Behalf Of John G. Scudder
> Sent: Tuesday, November 10, 2009 10:08 AM
> To: rtgwg@ietf.org
> Cc: alia.atlas@bt.com; ZININ Alex
> Subject: Composite Link Requirements as WG document
> 
> Folks,
> 
> At today's meeting we received a request to adopt
> draft-so-yong-mpls-ctg-requirement-00 as a working group document.
> There was reasonably strong support in the room for doing so.  Please
> respond to the mailing list with your discussion, support or
> opposition (please do this even if you did so in person).  The
> deadline for comments is November 30.
> 
> Note that accepting the document simply means that the working group  
> would begin working on requirements.  It does not imply blanket  
> acceptance of the document as it now stands.
> 
> Thanks,
> 
> --John
> _______________________________________________
> rtgwg mailing list
> rtgwg@ietf.org
> https://www.ietf.org/mailman/listinfo/rtgwg
> 