suggestion additions to cl-framework

Curtis Villamizar <curtis@occnc.com> Tue, 12 July 2011 18:12 UTC

To: rtgwg@ietf.org
Subject: suggestion additions to cl-framework
From: Curtis Villamizar <curtis@occnc.com>
Date: Tue, 12 Jul 2011 14:12:41 -0400

The CL-framework document is draft-so-yong-rtgwg-cl-framework-04.

The remainder of this email is suggested additions to the CL
framework.  It looks a lot like an internet-draft, but it isn't.  It
was easier to validate the xml2rfc by making this self contained.

There has been a discussion of this among the co-authors of
draft-so-yong-rtgwg-cl-framework-04 with Ning and Lucy agreeing to
merge this with the existing work (three if you count me) and the rest
(three) not weighing in yet.  Lucy has provided general comments and
wants to read through the proposed additions again and provide
further, more detailed comments.

The discussion at the upcoming IETF meeting should still focus on
draft-so-yong-rtgwg-cl-framework-04 first and foremost.  This email is
just to let the WG get a general idea of what we are planning to
merge in.  I'm entirely OK if the WG or WG chairs decide that it is
premature to discuss these additions at all at the meeting.

Any merge would have to occur after the IETF meeting, though a merged
draft is likely to have been exchanged among the co-authors prior to
the meeting.

The screwup on this is entirely my fault for taking way too long to
get around to writing this.

Curtis





RTGWG                                                 C. Villamizar, Ed.
Internet-Draft                                      Infinera Corporation
Intended status: Informational                             July 10, 2011
Expires: January 11, 2012


                   Composite Link Framework Additions
               draft-villamizar-cl-framework-additions-XX

Abstract

   This document provides some suggested additions to the existing
   Composite Link Framework document.  This is not a real internet-draft
   in that it is not submitted as it missed the deadline for IETF-81
   (though it looks convincing enough).  It exists for discussion
   purposes only.  It is hoped that ideas herein will be incorporated
   into the Composite Link Framework internet-draft after IETF-81.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on January 11, 2012.

Copyright Notice

   Copyright (c) 2011 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as



Villamizar              Expires January 11, 2012                [Page 1]

Internet-Draft           CL Framework Additions                July 2011


   described in the Simplified BSD License.


Table of Contents

   1.  Introduction . . . . . . . . . . . . . . . . . . . . . . . . .  3
     1.1.  Architecture Summary . . . . . . . . . . . . . . . . . . .  3
     1.2.  Requirements Language  . . . . . . . . . . . . . . . . . .  3
     1.3.  Definitions  . . . . . . . . . . . . . . . . . . . . . . .  4
   2.  Architecture Tradeoffs . . . . . . . . . . . . . . . . . . . .  4
     2.1.  Scalability Motivations  . . . . . . . . . . . . . . . . .  4
     2.2.  Reducing Routing Information and Exchange  . . . . . . . .  4
     2.3.  Reducing Signaling Load  . . . . . . . . . . . . . . . . .  5
     2.4.  Reducing Forwarding State  . . . . . . . . . . . . . . . .  6
     2.5.  Avoiding Route Oscillation . . . . . . . . . . . . . . . .  6
   3.  New Challenges . . . . . . . . . . . . . . . . . . . . . . . .  7
     3.1.  Control Plane Challenges . . . . . . . . . . . . . . . . .  7
       3.1.1.  Delay and Jitter Sensitive Routing . . . . . . . . . .  7
       3.1.2.  Local Control of Traffic Distribution  . . . . . . . .  8
       3.1.3.  Path Symmetry Requirements . . . . . . . . . . . . . .  8
       3.1.4.  Requirements for Contained LSP . . . . . . . . . . . .  9
       3.1.5.  Retaining Backwards Compatibility  . . . . . . . . . .  9
     3.2.  Data Plane Challenges  . . . . . . . . . . . . . . . . . . 10
       3.2.1.  Very Large LSP . . . . . . . . . . . . . . . . . . . . 10
       3.2.2.  Very Large Microflows  . . . . . . . . . . . . . . . . 11
       3.2.3.  Traffic Ordering Constraints . . . . . . . . . . . . . 11
   4.  Existing Mechanisms  . . . . . . . . . . . . . . . . . . . . . 11
     4.1.  Link Bundling  . . . . . . . . . . . . . . . . . . . . . . 11
   5.  Mechanisms Proposed in Other Documents . . . . . . . . . . . . 12
     5.1.  Loss and Delay Measurement . . . . . . . . . . . . . . . . 13
     5.2.  Link Bundle Extensions . . . . . . . . . . . . . . . . . . 13
     5.3.  Fat PW and Entropy Labels  . . . . . . . . . . . . . . . . 13
     5.4.  Multipath Extensions . . . . . . . . . . . . . . . . . . . 14
   6.  Required Protocol Extensions and Mechanisms  . . . . . . . . . 14
   7.  Security Considerations  . . . . . . . . . . . . . . . . . . . 16
   8.  References . . . . . . . . . . . . . . . . . . . . . . . . . . 16
     8.1.  Normative References . . . . . . . . . . . . . . . . . . . 16
     8.2.  Informative References . . . . . . . . . . . . . . . . . . 16
   Author's Address . . . . . . . . . . . . . . . . . . . . . . . . . 18


1.  Introduction

   This document provides additional detail intended to be merged with
   the composite link framework document.  The focus is on existing
   protocol mechanisms and extensions that will be required to
   accommodate functionality that is called for in the composite links
   requirements document but unsupported or inadequately supported by
   existing protocols [I-D.ietf-rtgwg-cl-requirement].

1.1.  Architecture Summary

   Networks aggregate information, both in the control plane and in the
   data plane, as a means to achieve scalability.  A tradeoff exists
   between the needs of scalability and the needs to identify differing
   path and link characteristics and differing requirements among flows
   contained within further aggregated traffic flows.  These tradeoffs
   are discussed in detail in Section 2.

   Some aspects of Composite Link requirements present challenges for
   which multiple solutions may exist.  In Section 3 various challenges
   and potential approaches are discussed.

   A subset of the functionality called for in
   [I-D.ietf-rtgwg-cl-requirement] is available through MPLS Link
   Bundling [RFC4201].  Link bundling and other existing standards
   applicable to Composite Link are covered in Section 4.

   The most straightforward means of supporting Composite Link
   requirements is to extend MPLS and in particular to extend link
   bundling.  Extensions which have already been proposed in other
   documents which are applicable to Composite Link are discussed in
   Section 5.

   The goals of most new protocol work within the IETF are to reuse
   existing protocol encapsulations and mechanisms where they meet
   requirements, and to extend existing mechanisms such that additional
   complexity is minimized while meeting requirements and such that
   backwards compatibility is preserved to the extent it is practical to
   do so.  These goals are considered in proposing a framework for
   further protocol extensions and mechanisms in Section 6.

1.2.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

1.3.  Definitions

   [ ... to be completed ... ]


2.  Architecture Tradeoffs

   Scalability and stability are critical considerations in protocol
   design where protocols may be used in large networks.  Composite Link
   is applicable to large networks, and therefore scalability must be a
   major consideration.  Some of the requirements of Composite Link
   require additional information to be carried in situations where
   component links differ in some significant way.

2.1.  Scalability Motivations

   In the interest of scalability, information is aggregated in
   situations where information about a large amount of network capacity
   or a large amount of network demand is adequate to meet
   requirements.  Routing information is aggregated to reduce the amount
   of information exchange related to routing and to simplify route
   computation.  Reducing the amount of information allows the exchange
   of information during a large routing change to be accomplished more
   quickly, and simplifying route computation improves convergence time
   after very significant network faults which cannot be handled by
   preprovisioned or precomputed protection mechanisms.

   Neglecting scaling issues can result in performance issues, such as
   slow convergence.  Neglecting scaling in some cases can result in
   networks which perform so poorly as to become unstable.

2.2.  Reducing Routing Information and Exchange

   Link bundling at the very least provides a means of aggregating
   control plane information.  Even where the all-ones component link
   supported by link bundling is not used, the amount of control
   information is reduced by a factor equal to the average number of
   component links in a bundle.

   Fully deaggregating link bundle information would negate this
   benefit.  If there is a need to deaggregate, such as to distinguish
   between groups of links within specified ranges of delay, then no
   more deaggregation than is necessary should be done.

   For example, in supporting the requirement for heterogeneous
   component links, it makes little sense to fully deaggregate link
   bundles when adding support for groups of component links with common
   attributes within a link bundle can maintain most of the benefit of
   aggregation while adequately supporting the requirement to support
   heterogeneous component links.

   Routing information exchange is also reduced by making sensible
   choices regarding the amount of change to link parameters that
   require link readvertisement.  For example, if delay measurements
   include queuing delay, then a much more coarse granularity of delay
   measurement would be called for than if the delay does not include
   queuing and is dominated by geographic delay (speed of light delay).
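
   As an illustration only, the readvertisement decision can be sketched
   as a threshold test.  The function name, the policy of comparing the
   drift against one granularity step, and the two granularity values
   below are assumptions made for this sketch and are not taken from any
   existing specification.

```python
def should_readvertise(advertised_ms: float, measured_ms: float,
                       granularity_ms: float) -> bool:
    """Return True when the measured delay has drifted from the advertised
    value by at least one granularity step (hypothetical policy)."""
    return abs(measured_ms - advertised_ms) >= granularity_ms


# Hypothetical granularities: fine when only propagation delay is measured,
# coarse when the measurement also includes queuing delay.
PROPAGATION_ONLY_MS = 0.5
WITH_QUEUING_MS = 5.0

advertised, measured = 10.0, 10.8
fine_result = should_readvertise(advertised, measured, PROPAGATION_ONLY_MS)
coarse_result = should_readvertise(advertised, measured, WITH_QUEUING_MS)
print(fine_result, coarse_result)
```

   With the coarser step, the same 0.8 ms drift does not trigger a
   readvertisement, which is the intended reduction in routing
   information exchange.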

2.3.  Reducing Signaling Load

   Aggregating traffic into very large hierarchical LSP in the core very
   substantially reduces the number of LSP that need to be signaled and
   the number of path computations any given LSR will be required to
   perform when a major network fault occurs.

   In the extreme, applying MPLS to a very large network without
   hierarchy could exceed the 20 bit label space.  For example, a
   network with 4,000 nodes, with 2,000 on either side of a cutset,
   would have 4,000,000 LSP crossing the cutset.  Even in a degree four
   cutset, an uneven distribution of LSP across the cutset, or the loss
   of one link would result in a need to exceed the size of the label
   space.  Among provider networks, 4,000 access nodes is not at all
   large.
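
   The arithmetic behind this example can be checked with a short
   sketch.  It assumes one LSP per node pair in the direction being
   counted and a perfectly even spread of LSP over the cutset links; the
   variable and function names are illustrative.

```python
# Back-of-the-envelope check of the label space example.
LABEL_BITS = 20
label_space = 2 ** LABEL_BITS                 # 1,048,576 label values

nodes_per_side = 2_000                        # 4,000 nodes split by the cutset
lsp_across_cutset = nodes_per_side * nodes_per_side


def lsp_per_link(total_lsp: int, links: int) -> float:
    """LSP carried per cutset link, assuming a perfectly even spread."""
    return total_lsp / links


even_four = lsp_per_link(lsp_across_cutset, 4)    # degree four cutset
after_loss = lsp_per_link(lsp_across_cutset, 3)   # one of four links lost

print(label_space, lsp_across_cutset, even_four, after_loss)
print("fits with four links:", even_four <= label_space)
print("fits after one loss: ", after_loss <= label_space)
```

   Even the ideal four-way split leaves less than five percent of the
   label space unused on each link, so any imbalance or the loss of a
   link exceeds it.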

   In less extreme cases, having each node terminate hundreds of LSP to
   achieve a full mesh creates a very large computational load.  The
   time complexity of one CSPF computation is order(N log N), where L is
   proportional to N, and N and L are the number of nodes and number of
   links, respectively.  If each node must perform order(N) computations
   when a fault occurs, then the computational load increases as
   order(N^2 log N) as the number of nodes increases.  In practice at
   the time of writing, this imposes a limit of a few hundred nodes in a
   full mesh of MPLS LSP before the computational load is sufficient to
   result in unacceptable convergence times.
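
   A rough sketch of how this load grows is given below.  It uses the
   order(N log N) estimate directly as a relative cost with no constant
   factors, so only the ratios between network sizes are meaningful; the
   function names are illustrative.

```python
import math


def cspf_cost(n: int) -> float:
    """Relative cost of one CSPF computation, order(N log N)."""
    return n * math.log2(n)


def full_mesh_cost(n: int) -> float:
    """Relative cost at one node that performs order(N) CSPF computations
    after a fault, giving order(N^2 log N)."""
    return n * cspf_cost(n)


baseline = full_mesh_cost(100)
for n in (100, 200, 400, 800):
    print(n, round(full_mesh_cost(n) / baseline, 1))
```

   Each doubling of the number of nodes increases the per-node load by
   somewhat more than a factor of four.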

   When the number of nodes grows too large, the solution is to use the
   MPLS PSC hierarchy [RFC4206].  A core within the hierarchy can divide
   the topology into M regions of on average N/M nodes.  Within a region
   the computational load is reduced by more than M^2.  Within the core,
   the computational load generally becomes quite small since M is
   usually a fairly small number (a few tens of regions) and each region
   is typically attached to the core in only two or three places.
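
   The "more than M^2" reduction within a region follows from the
   order(N^2 log N) estimate given earlier.  The derivation below
   assumes regions of exactly N/M nodes, with M > 1 and N/M > 1:

```latex
\frac{N^{2}\log N}{\left(N/M\right)^{2}\log\left(N/M\right)}
  = M^{2}\cdot\frac{\log N}{\log N - \log M}
  > M^{2}
```

   For example, with N = 1,000 and M = 10 the ratio is 100 * (3/2) =
   150, compared with M^2 = 100.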

   Using hierarchy improves scaling but has two consequences.  First,
   hierarchy effectively forces the use of platform label space.  When a
   containing LSP is rerouted, the labels assigned to the contained LSP
   cannot be changed but may arrive on a different interface.  Second,
   hierarchy results in much larger LSP.  These LSP today are larger
   than any single component link and therefore force the use of the
   all-ones component in link bundles.

2.4.  Reducing Forwarding State

   MPLS hierarchy has the benefit of reducing the amount of forwarding
   state.  Using the example from the previous section, the worst case
   generally occurs at borders with the core.

   For example, consider a network with approximately 1,000 nodes
   divided into 10 regions.  At the edges, each node requires 1,000 LSP
   to other edge nodes.  The edge nodes also require 100 intra-region
   LSP.  Within the core, if the core has only 3 attachments to each
   region, the core LSR have fewer than 100 intra-core LSP.  At the
   border cutset between the core and a given region, in this example
   there are 100 edge nodes with inter-region LSP crossing that cutset,
   destined to 900 other edge nodes.  That yields forwarding state for
   on the order of 90,000 LSP at the border cutset.  These same routers
   need only reroute well under 200 LSP when a multiple fault occurs, as
   long as only links are affected and a border LSR does not go down.
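
   The counts in this example can be reproduced with the sketch below.
   It assumes equal-sized regions, one direction of LSP per node pair,
   and (for the core figure only) that each attachment is a distinct
   core LSR with a full mesh among them; those assumptions and all
   names are illustrative.

```python
total_nodes = 1_000
regions = 10
attachments_per_region = 3

nodes_per_region = total_nodes // regions          # 100 edge nodes per region
remote_nodes = total_nodes - nodes_per_region      # 900 nodes in other regions

# Inter-region LSP crossing the border cutset of one region.
border_lsp = nodes_per_region * remote_nodes       # 90,000

# Assumed full mesh among core attachment LSR.
core_lsr = regions * attachments_per_region        # 30
intra_core_lsp_per_lsr = core_lsr - 1              # 29, fewer than 100

print(border_lsp, core_lsr, intra_core_lsp_per_lsr)
```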

   In the core, the forwarding state is greatly reduced.  If inter-
   region LSP have different characteristics, it makes sense to make use
   of aggregates with different characteristics.  Rather than exchange
   information about every inter-region LSP within the intra-core LSP it
   makes more sense to use multiple intra-core LSP between pairs of core
   nodes, each aggregating sets of inter-region LSP with common
   characteristics or common requirements.

2.5.  Avoiding Route Oscillation

   Networks can become unstable when a feedback loop exists such that
   moving traffic to a link causes a metric such as delay to increase,
   which then causes traffic to move elsewhere.  For example, the
   original ARPAnet routing used a delay based cost metric and proved
   prone to route oscillations [DBP].

   Delay may be used as a constraint in routing for high priority
   traffic, where the movement of traffic cannot impact the delay.  The
   safest way to measure delay is to make measurements based on traffic
   which is prioritized such that it is queued ahead of the traffic
   which will be affected.  This is a reasonable measure of delay for
   high priority traffic for which constraints have been set which allow
   this type of traffic to consume only a fraction of link capacities
   with the remaining capacity available to lower priority traffic.

   Any measurement of jitter (delay variation) that is used in route
   decisions is likely to cause oscillation.  Jitter is caused by
   queuing effects and cannot be measured using a very high priority
   measurement traffic flow.

   It may be possible to find links with constrained queuing delay or
   jitter using a theoretical maximum or a probability based bound on
   queuing delay or jitter at a given priority based on the types and
   amounts of traffic accepted and combining that theoretical limit with
   a measured delay at very high priority.


3.  New Challenges

   New technical challenges are posed by [I-D.ietf-rtgwg-cl-requirement]
   in both the control plane and data plane.

   Among the more difficult challenges are maintaining new requirements
   such as the requirements related to delay or jitter for contained
   LSP.

   Backwards compatibility poses numerous challenges.  Advertising
   groups of component links with similar characteristics is in itself
   not difficult, but doing so in a highly backwards compatible manner
   poses problems.

   The combination of ingress control over LSP placement and retaining
   an ability to move traffic as demands dictate can pose challenges and
   such requirements can even be conflicting.

3.1.  Control Plane Challenges

   Some of the control plane requirements are particularly challenging
   when considering handling flows within aggregated flows and the
   requirements to minimize impact on scalability.  Potentially
   conflicting are requirements for jitter and requirements for
   stability.  Potentially conflicting are the requirements for ingress
   control of a large number of parameters, and the requirements for
   local control needed to achieve traffic balance across a composite
   link.  These challenges and potential solutions are discussed in the
   following sections.

3.1.1.  Delay and Jitter Sensitive Routing

   Delay and jitter sensitive routing are called for in
   [I-D.ietf-rtgwg-cl-requirement] in requirements FR#2, FR#7, FR#8,
   FR#9, FR#15, FR#16, FR#17, FR#18.  Requirement FR#17 is particularly
   problematic, calling for constraints on jitter.

   A tradeoff exists between scaling benefits of aggregating
   information, and potential benefits of using a finer granularity in
   delay reporting.  To maintain the scaling benefit, measured link
   delay for any given composite link SHOULD be aggregated into a small
   number of delay ranges.  IGP-TE extensions MUST be provided which
   advertise the available capacities for each of the selected ranges.
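
   A minimal sketch of this aggregation follows.  The delay range
   boundaries, the units, and the function names are assumptions chosen
   for illustration; no encoding for the IGP-TE extension is implied.

```python
from collections import defaultdict

# Hypothetical upper bounds of the delay ranges, in milliseconds, ascending.
DELAY_RANGE_UPPER_MS = (5.0, 20.0, 50.0)


def delay_range_index(delay_ms: float) -> int:
    """Index of the first range whose upper bound covers delay_ms.
    Delays above the last bound fall into a final catch-all range."""
    for i, upper in enumerate(DELAY_RANGE_UPPER_MS):
        if delay_ms <= upper:
            return i
    return len(DELAY_RANGE_UPPER_MS)


def available_capacity_by_range(components):
    """Sum available capacity per delay range.
    components: iterable of (measured_delay_ms, available_gbps) tuples."""
    totals = defaultdict(float)
    for delay_ms, available_gbps in components:
        totals[delay_range_index(delay_ms)] += available_gbps
    return dict(totals)


# Four component links of one composite link (illustrative values).
components = [(3.2, 40.0), (4.1, 10.0), (18.0, 100.0), (75.0, 100.0)]
advertised = available_capacity_by_range(components)
print(advertised)
```

   Four component links collapse into three advertised values, one per
   occupied delay range, rather than one advertisement per component.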

   For path selection of delay sensitive LSP, the ingress SHOULD bias
   link metrics based on available capacity and select a low cost path
   which meets LSP total path delay criteria.  To communicate the
   requirements of an LSP, the ERO MUST be extended to indicate the per
   link constraints.  To communicate the type of resource used, the RRO
   SHOULD be extended to carry an identification of the group that is
   used to carry the LSP at each link bundle hop.
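
   The ingress behavior described above can be sketched on a toy
   topology.  This is not a proposed algorithm: it enumerates every
   loop-free path, which does not scale, and the bias formula, the
   dictionary keys, and the function names are all assumptions made for
   this illustration.

```python
def simple_paths(graph, src, dst, seen=()):
    """Yield every loop-free path from src to dst as a list of nodes."""
    seen = seen + (src,)
    if src == dst:
        yield list(seen)
        return
    for nxt in graph.get(src, {}):
        if nxt not in seen:
            yield from simple_paths(graph, nxt, dst, seen)


def biased_cost(link, demand):
    """Hypothetical bias: scale the metric up as the link fills.
    Returns None when the link cannot carry the demand."""
    headroom = link["available"] - demand
    if headroom < 0:
        return None
    utilization_after = 1.0 - headroom / link["capacity"]
    return link["metric"] * (1.0 + utilization_after)


def select_path(graph, src, dst, demand, max_delay):
    """Lowest biased-cost path whose total delay is within max_delay."""
    best = None
    for path in simple_paths(graph, src, dst):
        links = [graph[a][b] for a, b in zip(path, path[1:])]
        costs = [biased_cost(link, demand) for link in links]
        if any(cost is None for cost in costs):
            continue
        if sum(link["delay"] for link in links) > max_delay:
            continue
        total = sum(costs)
        if best is None or total < best[0]:
            best = (total, path)
    return best[1] if best else None


def link(metric, delay, capacity, available):
    return {"metric": metric, "delay": delay,
            "capacity": capacity, "available": available}


graph = {
    "A": {"B": link(10, 5, 100, 80), "C": link(5, 20, 100, 90)},
    "B": {"D": link(10, 5, 100, 80)},
    "C": {"D": link(5, 20, 100, 90)},
    "D": {},
}
print(select_path(graph, "A", "D", demand=10, max_delay=15))
print(select_path(graph, "A", "D", demand=10, max_delay=50))
```

   With a tight delay bound only the low delay path through B
   qualifies; with a loose bound the lower cost path through C is
   chosen.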

3.1.2.  Local Control of Traffic Distribution

   Many requirements in [I-D.ietf-rtgwg-cl-requirement] suggest that a
   node immediately adjacent to a component link should have a high
   degree of control over how traffic is distributed, as long as network
   performance objectives are met.  Particularly relevant are FR#18 and
   FR#19.

   The requirements to allow local control are potentially in conflict
   with requirement FR#21 which gives full control of component link
   selection to the LSP ingress.  While supporting this capability is
   mandatory, use of this feature is optional per LSP.

3.1.3.  Path Symmetry Requirements

   Requirement FR#21 in [I-D.ietf-rtgwg-cl-requirement] includes a
   provision to bind both directions of a bidirectional LSP to the same
   component.  This is easily achieved if the LSP is directly signaled
   across a composite link.  This is not as easily achieved if a set of
   LSP with this requirement are signaled over a large hierarchical LSP
   which is in turn carried over a composite link.  The basis for load
   distribution in such a case is the label stack.  The labels in
   either direction are completely independent.

   This could be accommodated if the ingress, egress, and all midpoints
   of the hierarchical LSP make use of an entropy label in the
   distribution, and use only that entropy label.  A solution for this
   problem may add complexity with very little benefit.  There is little
   or no true benefit of using symmetrical paths rather than component
   links of identical characteristics.

3.1.4.  Requirements for Contained LSP

   [I-D.ietf-rtgwg-cl-requirement] calls for new LSP constraints.  These
   constraints include frequency of load balancing rearrangement, delay
   and jitter, packet ordering constraints, and path symmetry.

   When LSP are contained within hierarchical LSP, there is no signaling
   available at midpoint LSR which identifies the contained LSP, let
   alone provides the set of requirements unique to each contained LSP.
   Defining extensions to provide this information would severely
   impact scalability and defeat the purpose of aggregating control
   information and forwarding information into hierarchical LSP.  For
   the same scalability reasons, not aggregating at all is not a viable
   option.

   As pointed out in Section 3.1.3, the benefits of supporting symmetric
   paths among LSP contained within hierarchical LSP may not be
   sufficient to justify the complexity of supporting this capability.

   For other LSP requirements, the most scalable solution is to provide
   multiple hierarchical LSP, each aggregating LSP with common
   requirements, and stating those same requirements for the
   hierarchical LSP.  This is a network design technique rather than a
   protocol extension.  This technique can accommodate delay and jitter
   requirements, frequency of load balancing rearrangement, and packet
   ordering constraints.  Section 5.4 provides additional mechanisms for
   addressing packet ordering constraints.

3.1.5.  Retaining Backwards Compatibility

   Backwards compatibility and support for incremental deployment
   require considering the impact of legacy LSR in the role of LSP
   ingress, and considering the impact of legacy LSR advertising
   ordinary links, Ethernet LAG as ordinary links, and link bundles.

   Legacy LSR in the role of LSP ingress cannot signal requirements
   which are not supported by their control plane software.  The
   addition of new capabilities has no impact on these LSR.
   These LSR however, being unaware of extensions, may try to make use
   of scarce resources which support specific requirements such as low
   delay.  To a limited extent it may be possible to avoid this issue
   using existing mechanisms such as link administrative attributes and
   attribute affinities [RFC3209].

   Legacy LSR advertising ordinary links will not advertise attributes
   needed by some LSP.  For example, there is no way to determine the
   delay or jitter characteristics of such a link.  Legacy LSR
   advertising Ethernet LAG pose additional problems.  There is no way
   to determine that packet ordering constraints would be violated for
   LSP with strict packet ordering constraints, or that frequency of
   load balancing rearrangement constraints might be violated.

   Legacy LSR advertising link bundles have no way to advertise the
   configured default behavior of the link bundle.  Some link bundles
   may be configured to place each LSP on a single component link and
   therefore may not be able to accommodate an LSP which requires
   bandwidth in excess of the size of a component link.  Some link
   bundles may be configured to spread all LSP over the all-ones
   component.  For LSR using the all-ones component link, there is no
   documented procedure for correctly setting the "Maximum LSP
   Bandwidth".  There is currently no way to indicate the largest
   microflow that could be supported by a link bundle using the all-ones
   component link.

   Having received the RRO, it is possible for an ingress to look for
   the all-ones component to identify such link bundles after having
   signaled at least one LSP.  Whether any LSR collects this information
   on legacy LSR and makes use of it to set defaults is an
   implementation choice.

3.2.  Data Plane Challenges

   In order to maintain scalability, data plane forwarding retains state
   associated with the top label only.  Data plane forwarding makes use
   of the top label to select a composite link, or a group of components
   within a composite link, or, for an LSP associated with a specific
   component, to select a specific component link.  For those LSP for
   which the top label selects only the composite link or a group of
   components within a composite link, the load balancing may make use
   of the entire label stack and in some cases may make use of
   information in the payload, though no state on specific contained LSP
   is retained.
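
   A stateless distribution over the label stack can be sketched as
   follows.  The use of SHA-256, the seed parameter, and the function
   name are assumptions for illustration only; forwarding hardware
   would typically use a much cheaper hash function.

```python
import hashlib


def select_component(label_stack, num_components, seed=0):
    """Pick a component index from a hash over the entire label stack.
    No per-LSP state is kept: the same stack always maps to the same
    component for a given seed and number of components."""
    key = seed.to_bytes(4, "big") + b"".join(
        label.to_bytes(4, "big") for label in label_stack
    )
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % num_components


# One containing LSP (top label 100) carrying many contained LSP.
top_label = 100
used = {select_component([top_label, inner], 4) for inner in range(16, 216)}
print(sorted(used))
```

   Packets of one contained LSP always hash to the same component, which
   preserves their order, while different contained LSP under the same
   top label spread across the components.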

   Load balancing makes use of techniques which allow large sets of
   flows to be moved to rearrange traffic.  These large sets of flows
   may be at a finer granularity than contained LSP.  Requirements to
   limit frequency of load balancing rearrangement can be adhered to by
   constraining the frequency at which these large sets of flows are
   moved.

3.2.1.  Very Large LSP

   Very large LSP may exceed the capacity of any single component of a
   composite link.  In some cases contained LSP may exceed the capacity
   of any single component.  These LSP require the use of the equivalent
   of the all-ones component of a link bundle.

3.2.2.  Very Large Microflows

   Within a very large LSP there may be very large microflows, or very
   large flows which cannot be further subdivided for other reasons.
   Flows which cannot be subdivided must be no larger than the capacity
   of any single component.

   Current signaling provides no way to specify the largest microflow
   that can be supported on a given link bundle in routing
   advertisements.  Extensions which address this are discussed in
   Section 5.4.  Absent extensions of this type, traffic containing
   microflows that are too large for a given composite link may be
   present.  There is no data plane solution for this problem that would
   not require reordering traffic at the composite link egress.

   Some techniques are susceptible to statistical collisions where an
   algorithm to distribute traffic is unable to disambiguate traffic
   among two or more very large microflows where their sum is in excess
   of the capacity of any single component.  Hash based algorithms which
   use too small a hash space are particularly susceptible and require a
   change in hash seed in the event that this were to occur.  A change
   in hash seed is highly disruptive, causing traffic reordering among
   all traffic flows over which the hash function is applied.
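
   The disruption caused by a seed change can be estimated with a small
   simulation.  The hash construction (SHA-256 over a seed and a flow
   identifier), the flow identifiers, and the function names are
   assumptions for this sketch, not a description of any
   implementation.

```python
import hashlib


def component_for(flow_id: int, seed: int, num_components: int) -> int:
    """Map a flow to a component using a seeded hash (illustrative)."""
    key = seed.to_bytes(4, "big") + flow_id.to_bytes(8, "big")
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % num_components


def fraction_moved(flows, old_seed, new_seed, num_components):
    """Fraction of flows that land on a different component after the
    seed changes."""
    moved = sum(
        component_for(f, old_seed, num_components)
        != component_for(f, new_seed, num_components)
        for f in flows
    )
    return moved / len(flows)


flows = range(10_000)
moved = fraction_moved(flows, old_seed=1, new_seed=2, num_components=4)
print(moved)
```

   With k components, roughly (k - 1) / k of the flows are expected to
   change component, so the reordering affects most of the traffic
   rather than only the colliding microflows.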

3.2.3.  Traffic Ordering Constraints

   Some LSP have strict traffic ordering constraints.  Most notable
   among these are MPLS-TP LSP.  In the absence of aggregation into
   hierarchical LSP, those LSP with strict traffic ordering constraints
   can be placed on individual component links if there is a means of
   identifying which LSP have such a constraint.  If LSP with strict
   traffic ordering constraints are aggregated in hierarchical LSP, the
   hierarchical LSP capacity may exceed the capacity of any single
   component link.  In such a case the load balancing for the containing
   LSP may be constrained to look only at the top label and the first
   contained label.  This and related issues are discussed further in
   Section 5.4.


4.  Existing Mechanisms

   In MPLS the one mechanism which supports explicit signaling of
   multiple parallel links is Link Bundling [RFC4201].

4.1.  Link Bundling

   Link bundling supports advertisement of a set of homogenous links as
   a single route advertisement.  Link bundling supports placement of an



Villamizar              Expires January 11, 2012               [Page 11]

Internet-Draft           CL Framework Additions                July 2011


   LSP on any single component link, or supports placement of an LSP on
   the all-ones component link.  Not all link bundling implementations
   support the all-ones component link and there is no way to tell which
   support this feature and which do not.  Based on [RFC4201] it is
   unclear how to advertise a link bundle for which the all-ones
   component link is available and used by default.  Common practice is
   to violate the specification and set the Maximum LSP Bandwidth to the
   Available Bandwidth.
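
   The difference between the two advertisements can be sketched in
   Python as follows.  The sketch assumes the [RFC4201] rule that a
   bundle's Maximum LSP Bandwidth is the maximum over its component
   links while its unreserved bandwidth is the sum.  The function
   name, the dictionary keys, and the `all_ones_default` flag are
   hypothetical, and priorities are ignored for brevity.

```python
from typing import Dict, List

def bundle_advertisement(component_unreserved: List[int],
                         all_ones_default: bool = False) -> Dict[str, int]:
    """Derive bundle-level bandwidth values from per-component unreserved bandwidth."""
    unreserved = sum(component_unreserved)
    if all_ones_default:
        # Common practice described above: advertise the whole bundle
        # as usable by a single LSP.
        max_lsp = unreserved
    else:
        # An LSP must fit on one component link.
        max_lsp = max(component_unreserved)
    return {"unreserved_bw": unreserved, "max_lsp_bw": max_lsp}

# Hypothetical bundle of three components, bandwidth in Gb/s.
per_component = bundle_advertisement([10, 10, 4])
all_ones = bundle_advertisement([10, 10, 4], all_ones_default=True)
```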

   [RFC6107] extends the procedures for hierarchical LSP but also
   extends link bundles.  An LSP can be explicitly signaled to indicate
   that it is an LSP to be used as a component of a link bundle.

   While link bundling can be the basis for composite links, a
   significant number of small extensions need to be added.

   1.  To support link bundles of heterogeneous links, a means of
       advertising the capacity available within a group of homogeneous
       links needs to be provided.

   2.  Attributes need to be defined to support the following parameters
       for the link bundle or for a group of homogeneous links.

       A.  delay range

       B.  jitter (delay variation) range

       C.  group metric

       D.  all-ones component capable

       E.  capable of dynamically balancing load

       F.  largest supportable microflow

       G.  abilities to support strict packet ordering requirements
           within contained LSP
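
   One way to picture the attributes listed above is as a record
   attached to each bundle or group.  The Python sketch below is an
   illustration only: the field names, units, and the `acceptable`
   check are assumptions and do not correspond to any defined
   encoding.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class GroupAttributes:
    delay_range_us: Tuple[int, int]      # A. (min, max) delay
    jitter_range_us: Tuple[int, int]     # B. (min, max) delay variation
    group_metric: int                    # C. metric for the group
    all_ones_capable: bool               # D. all-ones component supported
    dynamic_balancing: bool              # E. load is rebalanced dynamically
    largest_microflow_bps: int           # F. largest supportable microflow
    strict_ordering_capable: bool        # G. can carry ordering-sensitive LSP

def acceptable(group: GroupAttributes, max_delay_us: int,
               needs_ordering: bool, largest_flow_bps: int) -> bool:
    """Check a group against the constraints of a candidate LSP."""
    if group.delay_range_us[1] > max_delay_us:
        return False
    if needs_ordering and not group.strict_ordering_capable:
        return False
    return largest_flow_bps <= group.largest_microflow_bps

# Hypothetical group: 100-250 us delay, 1 Gb/s largest microflow.
group = GroupAttributes((100, 250), (0, 20), 10, True, True,
                        1_000_000_000, False)
```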


5.  Mechanisms Proposed in Other Documents

   A number of documents which at the time of writing are works in
   progress address parts of the requirements of Composite Link, or
   assist in making some of the goals achievable.

5.1.  Loss and Delay Measurement

   Procedures for measuring loss and delay are provided in
   [I-D.ietf-mpls-loss-delay].  These are OAM based measurements.  This
   work could be the basis of delay measurements and delay variation
   measurement used for metrics called for in
   [I-D.ietf-rtgwg-cl-requirement].

5.2.  Link Bundle Extensions

   A set of link bundling extensions is defined in
   [I-D.ietf-mpls-explicit-resource-control-bundle].  This document
   provides extensions to the ERO and RRO to explicitly control the
   labels and resources within a bundle used by an LSP.

   The extensions in this document could be further extended to support
   indicating a group of component links in the ERO or RRO, where the
   group is given an interface identification like the bundle itself.
   The extensions could also be further extended to support
   specification of the all-ones component link in the ERO or RRO.

   This document does not provide a means to advertise the link bundle
   components.

5.3.  Fat PW and Entropy Labels

   Two documents provide a means to add entropy for the purpose of
   improving load balance.  MPLS encapsulation can bury information that
   is needed to identify microflows.  These two documents allow a
   pseudowire ingress and LSP ingress respectively to add a label solely
   for the purpose of providing a finer granularity of microflow groups.

   [I-D.ietf-pwe3-fat-pw] allows pseudowires which carry a large volume
   of traffic, and within which microflows can be identified, to be load
   balanced across multiple members of an Ethernet LAG or an MPLS link
   bundle.  This is accomplished by adding a flow label below the
   pseudowire label in the MPLS label stack.  For this to be effective
   the link bundle load balance must make use of the label stack up to
   and including this flow label.
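
   The ingress behavior can be sketched in Python as follows.  The
   function names, the choice of a 5-tuple as the microflow
   identifier, and the use of CRC-32 are assumptions for illustration.
   The sketch also assumes that label values 0 through 15 are reserved
   and must not be used as a flow label.

```python
import zlib
from typing import List, Tuple

RESERVED_MAX = 15          # assumed reserved label range 0..15
LABEL_SPACE = 1 << 20      # MPLS labels are 20 bits wide

FiveTuple = Tuple[str, str, int, int, int]

def flow_label(src: str, dst: str, proto: int, sport: int, dport: int) -> int:
    """Derive a flow label from the fields that identify a microflow."""
    key = f"{src}|{dst}|{proto}|{sport}|{dport}".encode()
    span = LABEL_SPACE - (RESERVED_MAX + 1)
    return RESERVED_MAX + 1 + zlib.crc32(key) % span

def push_flow_label(lsp_label: int, pw_label: int,
                    five_tuple: FiveTuple) -> List[int]:
    """Build the label stack with the flow label below the pseudowire label."""
    return [lsp_label, pw_label, flow_label(*five_tuple)]

# Hypothetical microflow and label values.
microflow = ("192.0.2.1", "198.51.100.7", 6, 49152, 443)
stack = push_flow_label(1000, 2000, microflow)
```

   A load balance that hashes only the first two labels of this stack
   sees the same values for every microflow in the pseudowire, which
   is why the hash must reach the flow label.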

   [I-D.kompella-mpls-entropy-label] provides a means for an LER to put
   an additional label known as an entropy label on the MPLS label
   stack.  As defined, only the LER can add the entropy label and this
   label must be at the bottom of stack.

   If this restriction on entropy labels were to be relaxed, then core
   LSR could add entropy labels based on deep packet inspection and
   place the entropy label just below the label being acted on.  This
   would be helpful in situations where the label stack depth to which
   load distribution can operate is limited by implementation or is
   limited for other reasons such as carrying both MPLS-TP and MPLS with
   entropy labels within the same hierarchical LSP.

5.4.  Multipath Extensions

   The multipath extensions drafts address one aspect of Composite Link.
   These drafts deal with the issue of accommodating LSP which have
   strict packet ordering constraints in a network containing multipath.
   MPLS-TP has become the one important instance of LSP with strict
   packet ordering constraints and has driven this work.

   [I-D.villamizar-mpls-tp-multipath] outlines requirements and gives a
   number of options for dealing with the apparent incompatibility of
   MPLS-TP and multipath.  A preferred option is described.

   [I-D.villamizar-mpls-tp-multipath-te-extn] provides protocol
   extensions needed to implement the preferred option described in
   [I-D.villamizar-mpls-tp-multipath].

   Other issues pertaining to multipath are also addressed.  Means to
   advertise the largest microflow supportable are defined.  Means to
   indicate the largest expected microflow within an LSP are defined.
   Issues related to hierarchy are addressed.


6.  Required Protocol Extensions and Mechanisms

   The primary areas where additional protocol mechanisms are required
   include the following.

   1.  An extension to link bundling is needed to specify a group of
       components with common attributes.  This can be a TLV defined
       within the link bundle that carries the same encapsulations as
       the link bundle.  Two interface indices would be needed for each
       group.

       A.  An index is needed that if included in an ERO would indicate
           the need to place the LSP on any one component within the
           group.

       B.  A second index is needed that if included in an ERO would
           indicate the need to balance flows within the LSP across all
           components of the group.  This is equivalent to the "all-
           ones" component for the entire bundle.

   2.  A parameter is needed in the IGP-TE advertisement of delay and
       delay variation for links, link bundles, and forwarding
       adjacencies.  Whatever mechanism is described must take
       precautions that ensure that route oscillations cannot occur.

   3.  If a group is allowed to support all of the parameters of a link
       bundle, then a group TE metric would be accommodated.
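
   The two interface indices called for in item 1 might be resolved at
   a node as in the following Python sketch.  The class, the field
   names, the first-fit choice of component, and the index values are
   all hypothetical and are shown only to make the distinction
   between the two indices concrete.

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class Group:
    any_one_index: int            # ERO index meaning "any single component"
    all_index: int                # ERO index meaning "spread over all components"
    free_bw: Dict[str, float]     # component name -> unreserved bandwidth

def resolve_ero_index(group: Group, ero_index: int, lsp_bw: float) -> List[str]:
    """Return the components an LSP may use, or an empty list if it cannot fit."""
    if ero_index == group.any_one_index:
        # The whole LSP must fit on one component (first fit shown here).
        for name, free in group.free_bw.items():
            if free >= lsp_bw:
                return [name]
        return []
    if ero_index == group.all_index:
        # Flows within the LSP are balanced across every component.
        if sum(group.free_bw.values()) >= lsp_bw:
            return list(group.free_bw)
        return []
    raise ValueError("index does not identify this group")

# Hypothetical group with two components, bandwidth in Gb/s.
group = Group(any_one_index=101, all_index=102,
              free_bw={"c1": 4.0, "c2": 10.0})
```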

   [ ... to be completed ... ]

   Note to co-authors: The following topics in the requirements document
   are not addressed.  Since they are explicitly mentioned in the
   requirements document some mention of how they are supported is
   needed, even if to say nothing needs to be done.  If we conclude any
   particular topic is irrelevant, maybe the topic should be removed
   from the requirement document.  At that point we could add the
   management requirements that have come up and were missed.

   1.   L3VPN RFC 4364, RFC 4797, L2VPN RFC 4664, VPWS, VPLS RFC 4761,
        RFC 4762, and VPMS as described in the VPMS Framework
        (draft-ietf-l2vpn-vpms-frmwk-requirements).

   2.   IP and LDP.  This may be a matter of measuring, filtering the
        measurement, and deducting from the available bandwidth.

   3.   Migration may not be adequately covered in the backwards
        compatibility section.  Comments on this?

   4.   Do we need more on load sharing oscillation?

   5.   Lower layer to upper layer communication (FR#7, FR#20) is
        addressed in mpls-tp-multipath where layers are MPLS, but not
        elsewhere.

   6.   IGP-TE extensions are not defined for delay and jitter, and
        frequency of load balancing rearrangement (FR#13, FR#15-FR#17).
        Constraints are not defined in RSVP-TE, but could be modeled
        after administrative attribute affinities in RFC3209 and
        elsewhere.

   7.   RSVP-TE preemption and soft-preemption need to be called out as
        solutions for FR#10.

   8.   FR#11 explicitly calls for adaptive multipath.  This is assumed
        in the text so far but should be explicitly stated.  More on
        hash methods might be needed pointing out that adaptive
        multipath as described in cl-requirements appendix does the job.

   9.   The behavior of hash methods needs to be described in terms of
        FR#12 (minimally disruptive).  Reseeding the hash violates
        FR#12.  Using modulo operations if a link comes or goes violates
        FR#12 (as pointed out in RFC2991 and RFC2992).

   10.  Extending LDP is called for in DR#2.

   11.  DR#5 is not addressed (composite link spans multiple network
        topologies).  May need to discuss this.

   12.  We may need a performance section to address DR#6, DR#7, though
        we do already have scalability discussion.  The performance
        section would have to say "no worse than before, except where
        there was no alternative but to make it very slightly worse"
        (in a bit more detail than that).
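
   The point made in item 9 about modulo operations can be illustrated
   with a minimal Python sketch.  It compares a modulo-N selection
   with a highest random weight (HRW) selection, one of the
   alternatives discussed in RFC2991, when one of four links is
   removed.  The function names, link names, flow identifiers, and
   the use of SHA-256 are assumptions for illustration only.

```python
import hashlib
from typing import Callable, List

def _h(*parts) -> int:
    """Hash the given parts to a 64-bit integer."""
    data = "|".join(str(p) for p in parts).encode()
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")

def modulo_n(flow: str, links: List[str]) -> str:
    """Select a link by taking the flow hash modulo the number of links."""
    return links[_h(flow) % len(links)]

def hrw(flow: str, links: List[str]) -> str:
    """Select the link with the highest per-(flow, link) hash value."""
    return max(links, key=lambda link: _h(flow, link))

def disruption(select: Callable[[str, List[str]], str], flows: List[str],
               before: List[str], after: List[str]) -> float:
    """Fraction of flows whose link changes between two link sets."""
    moved = sum(1 for f in flows if select(f, before) != select(f, after))
    return moved / len(flows)

flows = [f"flow-{i}" for i in range(5000)]
before = ["a", "b", "c", "d"]
after = ["a", "b", "c"]           # link "d" has gone down

modulo_moved = disruption(modulo_n, flows, before, after)
hrw_moved = disruption(hrw, flows, before, after)
```

   With modulo-N most flows change links when the link count drops
   from four to three, while with HRW only the flows that were on the
   removed link move.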


7.  Security Considerations

   [ ... to be completed ... ]

   The security section provides job security for the Security Area
   Directors.

   The security considerations for MPLS/GMPLS and for MPLS-TP are
   documented in [RFC5920] and [I-D.ietf-mpls-tp-security-framework].


8.  References

8.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

8.2.  Informative References

   [DBP]      Bertsekas, D., "Dynamic Behavior of Shortest Path Routing
              Algorithms for Communication Networks", IEEE Trans. Auto.
              Control 1982.

   [I-D.ietf-mpls-explicit-resource-control-bundle]
              Zamfir, A., Ali, Z., and P. Dimitri, "Component Link
              Recording and Resource Control for TE Links",
              draft-ietf-mpls-explicit-resource-control-bundle-10 (work
              in progress), April 2011.

   [I-D.ietf-mpls-loss-delay]
              Frost, D. and S. Bryant, "Packet Loss and Delay
              Measurement for MPLS Networks",
              draft-ietf-mpls-loss-delay-03 (work in progress),
              June 2011.

   [I-D.ietf-mpls-tp-security-framework]
              Fang, L., Niven-Jenkins, B., and S. Mansfield, "MPLS-TP
              Security Framework",
              draft-ietf-mpls-tp-security-framework-01 (work in
              progress), May 2011.

   [I-D.ietf-pwe3-fat-pw]
              Bryant, S., Filsfils, C., Drafz, U., Kompella, V., Regan,
              J., and S. Amante, "Flow Aware Transport of Pseudowires
              over an MPLS Packet Switched Network",
              draft-ietf-pwe3-fat-pw-07 (work in progress), July 2011.

   [I-D.ietf-rtgwg-cl-requirement]
              Villamizar, C., McDysan, D., Ning, S., Malis, A., and L.
              Yong, "Requirements for MPLS Over a Composite Link",
              draft-ietf-rtgwg-cl-requirement-04 (work in progress),
              March 2011.

   [I-D.kompella-mpls-entropy-label]
              Kompella, K., Drake, J., Amante, S., Henderickx, W., and
              L. Yong, "The Use of Entropy Labels in MPLS Forwarding",
              draft-kompella-mpls-entropy-label-02 (work in progress),
              March 2011.

   [I-D.villamizar-mpls-tp-multipath]
              Villamizar, C., "Use of Multipath with MPLS-TP and MPLS",
              draft-villamizar-mpls-tp-multipath-01 (work in progress),
              March 2011.

   [I-D.villamizar-mpls-tp-multipath-te-extn]
              Villamizar, C., "Multipath Extensions for MPLS Traffic
              Engineering",
              draft-villamizar-mpls-tp-multipath-te-extn-00 (work in
              progress), July 2011.

   [RFC3209]  Awduche, D., Berger, L., Gan, D., Li, T., Srinivasan, V.,
              and G. Swallow, "RSVP-TE: Extensions to RSVP for LSP
              Tunnels", RFC 3209, December 2001.

   [RFC4201]  Kompella, K., Rekhter, Y., and L. Berger, "Link Bundling
              in MPLS Traffic Engineering (TE)", RFC 4201, October 2005.

   [RFC5920]  Fang, L., "Security Framework for MPLS and GMPLS
              Networks", RFC 5920, July 2010.

   [RFC6107]  Shiomoto, K. and A. Farrel, "Procedures for Dynamically
              Signaled Hierarchical Label Switched Paths", RFC 6107,
              February 2011.


Author's Address

   Curtis Villamizar (editor)
   Infinera Corporation
   169 W. Java Drive
   Sunnyvale, CA  94089

   Email: curtis@occnc.com