Re: Proposed Resolution of Comments (Was: Composite Link Requirements)

Curtis Villamizar <curtis@occnc.com> Wed, 03 March 2010 08:08 UTC

To: "Mcdysan, David E" <dave.mcdysan@verizon.com>
From: Curtis Villamizar <curtis@occnc.com>
Subject: Re: Proposed Resolution of Comments (Was: Composite Link Requirements)
Cc: rtgwg@ietf.org

In message <793F49BA1FC821409F99F10862A0E4DB05F96E94@FHDP1LUMXCV14.us.one.verizon.com>
"Mcdysan, David E" writes:
>  
> Hi Curtis,
>  
> The co-authors of this draft reviewed your comments and decided to
> respond with three separate messages, to separate the threads as follows
> so that all of the issues you raise can be resolved efficiently.
>  
> 	#1. Composite Link Trademark Issue (Was: Composite Link Requirements)
> 	#2. Acknowledgement of Prior Work (Was: Composite Link Requirements)
> 	#3. Proposed Resolution of Comments (Was: Composite Link Requirements)
>  
> This is thread #3.
>  
> Dave   

Hi Dave

Replies are inline.

> > -----Original Message-----
> > From: rtgwg-bounces@ietf.org [mailto:rtgwg-bounces@ietf.org] 
> > On Behalf Of Curtis Villamizar
> > Sent: Saturday, February 27, 2010 4:00 AM
> > To: rtgwg@ietf.org
> > Subject: Composite Link Requirements
> > 
> > 
> > Hi there good people of RTGWG,
> > 
> > This is in regards to the goals that are embodied in the 
> > RTGWG acceptance of a draft to deal with requirements for 
> > composite link, currently named draft-ietf-rtgwg-cl-requirement-00.txt
> > 
> > 
> > I'm bringing up two issues in this email.  One is prior 
> > composite link work and the other is prior methods of 
> > handling composite link, which should be acknowledged.  
>  
> This is the basis for thread#3.
>  
> > After 
> > that I just have some comments and questions on the draft.
>  
> Text Snipped
>  
> > 
> > Comments on the draft:
> > 
> > The following statements may be inaccurate:
> > 
> >    The Link Bundle concept is somewhat limited because of the
> >    requirement that all component links must have identical
> >    capabilities, and because it applies only to TE links.
> > 
> >      This may be inaccurate.  I don't think there is a requirement
> >      that a link bundle use identical links.
> > 
> >      In any case, both Avici composite links and many LAG
> >      implementations allow a mix of member speeds and neither was
> >      applicable to TE links only.
>  
> Not clear what specific text change is being proposed. Does the
> following address your comment?
>  
> EXISTING 3.1 TEXT 
>    o  Advertisement of each component link into the IGP. Although this
>       would address the problem, it has a scaling impact on IGP routing,
>       and was an important motivation for the specification of link
>       bundling [RFC4201]. However, there are two gaps in link bundling:
>  
>          1.  It only supports RSVP-TE, not LDP.
>  
>          2.  It does not support a set of component links with different
>              characteristics (e.g., different bandwidth and/or latency).
>  
>       For example, in practice carriers commonly use link bandwidth and
>       link latency to set link TE metrics for RSVP-TE.  For RSVP-TE,
>       limiting the component links to same TE metric has the practical
>       effect of dis-allowing component links with different link
>       bandwidth and latencies.
>  
> PROPOSED TEXT
>  
>    o  Advertisement of each component link into the IGP. Although this
>       would address the problem, it has a scaling impact on IGP routing,
>       and was an important motivation for the specification of link
>       bundling [RFC4201]. However, there are two gaps in link bundling:
>  
>          1.  It only supports RSVP-TE, not LDP.
>  
>    	   2.  It only supports advertisement of a single TE metric for 
>              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 
> 	a set of component links, 
>  
> 	However, the component links in this set may have different
>       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 
> 	characteristics (e.g., different bandwidth and/or latency).
>       For example, in practice carriers commonly use link bandwidth and
>       link latency to set link TE metrics for RSVP-TE.  For RSVP-TE,
>       limiting the component links to same TE metric has the practical
>       effect of dis-allowing component links with different link
>       bandwidth and latencies.


That is not the text.  The exact text is:

   The Link Bundle concept is somewhat limited because of the
   requirement that all component links must have identical
   capabilities, and because it applies only to TE links.  This document
   sets out a more generic set of requirements for grouping together a
   set of parallel data links that may have different characteristics,
   and for advertising and operating them as a single TE or non-TE link
   called a Composite Link.

There is no requirement "that all component links must have identical
capabilities" imposed by link bundling.

The introduction of new TLVs to carry metrics that allow multiple
combinations of delay and other parameters mixed in with admin color,
priority, etc is completely orthogonal to the use of link bundling.

What I'm asking is that you drop the link bundle bashing here (not
that I am at all a fan of link bundling) and just stick with the
requirement.

    This document proposes that link metrics allow for grouping
    together a set of parallel data links that may have different
    characteristics, and for advertising and operating them as a
    single link.

Note that if link bundling wanted to advertise more than one delay, the
definition of the metric should allow it to make use of such a metric.

> > The following should be replaced:
> > 
> >    Traffic Flow: A set of packets that with common identifier
> >    characteristics that the composite link is able to use to aggregate
> >    traffic into Connections.  Identifiers can be an MPLS label stack
> >    or any combination of IP addresses and protocol types for routing,
> >    signaling and management packets.
> > 
> > Diffserv already defined a microflow to be the same thing.  
> > We should not invent new terms to mean the same thing as 
> > existing terms.  We can just point out that labels can also 
> > be used to identify a microflow.
>  
> A clear definition is certainly the goal. Can you provide a specific
> Diffserv reference for review by the Working Group. 

It's a pain to find some of the relevant references, but nothing grep
can't handle.

  rfc2474.txt:

   Microflow: a single instance of an application-to-application flow of
   packets which is identified by source address, destination address,
   protocol id, and source port, destination port (where applicable).

  rfc2475.txt:

   Microflow                 a single instance of an application-to-
                             application flow of packets which is
                             identified by source address, source port,
                             destination address, destination port and
                             protocol id.

   Traffic stream            an administratively significant set of one
                             or more microflows which traverse a path
                             segment.  A traffic stream may consist of
                             the set of active microflows which are
                             selected by a particular classifier.

An MPLS microflow is properly identified by its entire label stack and,
if the payload is IP, by the IP source and destination addresses.  If
the payload is not IP, as indicated by a PW CW, then only the label
stack is used.

You can also reference the "Avoiding ECMP" section in rfc4385.txt,
which explains the reason to avoid certain values in the first nibble
of payload, and rfc4928, which provides the following terms:

   IP ECMP     A forwarding behavior in which the selection of the
               next-hop between equal cost routes is based on the
               header(s) of an IP packet

   Label ECMP  A forwarding behavior in which the selection of the
               next-hop between equal cost routes is based on the label
               stack of an MPLS packet

Note that with label ECMP the top label may be the same but diversity
elsewhere in the label stack allows microflows (or set of microflows)
to be identified.
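
As a rough illustration of the identification rule described above, the
following Python sketch builds a microflow key from the full label stack
and, when the first nibble of the payload looks like IPv4 or IPv6, the
IP source and destination addresses.  The function names
(ip_addresses, microflow_key, pick_component) and the use of SHA-256 as
the hash are assumptions made for illustration only; real forwarding
hardware uses its own hash functions and may look at more fields.

```python
import hashlib
import ipaddress
from typing import Optional, Sequence, Tuple


def ip_addresses(payload: bytes) -> Optional[Tuple[str, str]]:
    """Return (src, dst) if the payload looks like IPv4 or IPv6, else None.

    The guess is based on the first nibble after the label stack:
    4 is treated as IPv4, 6 as IPv6.  A PW control word starts with a
    0 nibble, so it is never mistaken for IP here.
    """
    if not payload:
        return None
    first_nibble = payload[0] >> 4
    if first_nibble == 4 and len(payload) >= 20:
        src, dst = payload[12:16], payload[16:20]   # IPv4 header offsets
    elif first_nibble == 6 and len(payload) >= 40:
        src, dst = payload[8:24], payload[24:40]    # IPv6 header offsets
    else:
        return None
    return str(ipaddress.ip_address(src)), str(ipaddress.ip_address(dst))


def microflow_key(label_stack: Sequence[int], payload: bytes) -> tuple:
    """Entire label stack, plus IP src/dst only when the payload is IP."""
    key = tuple(label_stack)
    addrs = ip_addresses(payload)
    if addrs is not None:
        key += addrs
    return key


def pick_component(key: tuple, n_components: int) -> int:
    """Map a microflow key onto one of n_components member links."""
    digest = hashlib.sha256(repr(key).encode()).digest()
    return int.from_bytes(digest[:4], "big") % n_components
```

Two packets with the same top label but different inner labels produce
different keys, which is the diversity that label ECMP relies on.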

> > This statement is definitely inaccurate for a number of reasons:
> > 
> >    o  ECMP/Hashing/LAG: IP traffic composed of a large number of flows
> >       with bandwidth that is small with respect to the individual link
> >       capacity can be handled relatively well using ECMP/LAG
> >       approaches.  However, ?many/some of? these approaches do not make use of MPLS
>                               ^^^^^^^^^^^^^^
> >       control plane information nor traffic volume
> >       information. Distribution techniques applied only within the
> >       data plane can result in less than ideal load balancing across
> >       component links of a composite link.
> > 
> >   Avici used feedback from the egress port to the ingress port on
> >   traffic volume and queue occupancy to influence the distribution of
> >   the hash.  There is nothing in the definition of ECMP to prohibit
> >   this and the OMP technique explicitly called for doing so and
> >   proposed protocol extension to be able to go beyond just a decision
> >   within a single NE as Avici did.
>  
> Does above proposed change address (part of your comment)? If so, should
> this state many or some? Other parts of your comment seem to be solution
> oriented and should certainly be considered to meet issues raised in the
> requirements or included in the proposed framework draft.

How about this change:

   o  ECMP/Hashing/LAG: IP traffic composed of a large number of flows
      with bandwidth that is small with respect to the individual link
      capacity can be handled relatively well using ECMP/LAG approaches.
-     However, these approaches do not make use of MPLS control plane
-     information nor traffic volume information.
+     While nothing precludes using traffic volume information, and
+     some implementations have done so, in practice few if any
+     implementations today make use of MPLS control plane information
+     or traffic volume information.  Implementations commonly use the
+     entire MPLS label stack for non-IP MPLS traffic.
      Distribution
      techniques applied only within the data plane can result in less
      than ideal load balancing across component links of a composite
      link.

It makes it more clear that ECMP/Hashing/LAG can use traffic volume
information, has done so at least in the past, and possibly will in
the future, and also that it commonly uses the entire label stack.
There is precedent for use of the whole label stack and traffic
volume information (at least Avici).
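
A minimal sketch of what "using traffic volume information to influence
the distribution of the hash" could look like follows.  It is not
Avici's algorithm and not OMP; the bucket table, the per-bucket byte
counters, and the function names are all hypothetical.  The idea is only
that the hash selects a bucket, buckets are mapped to member links, and
measured load is used to move a bucket from the busiest member to the
least busy one.

```python
from typing import List


def member_load(bucket_to_member: List[int],
                bucket_bytes: List[int],
                n_members: int) -> List[int]:
    """Sum the measured bytes of every bucket mapped to each member link."""
    load = [0] * n_members
    for bucket, member in enumerate(bucket_to_member):
        load[member] += bucket_bytes[bucket]
    return load


def rebalance_once(bucket_to_member: List[int],
                   bucket_bytes: List[int],
                   n_members: int) -> bool:
    """Move one bucket from the most loaded member to the least loaded.

    Only buckets carrying less than the load gap are candidates, so a
    move always lowers the load of the busiest member without making
    the other member busier than the busiest one was.
    Returns True if a bucket was moved.
    """
    load = member_load(bucket_to_member, bucket_bytes, n_members)
    hot = max(range(n_members), key=load.__getitem__)
    cold = min(range(n_members), key=load.__getitem__)
    gap = load[hot] - load[cold]
    candidates = [b for b, m in enumerate(bucket_to_member)
                  if m == hot and 0 < bucket_bytes[b] < gap]
    if not candidates:
        return False
    # Prefer the bucket whose volume is closest to half the gap.
    best = min(candidates, key=lambda b: abs(gap - 2 * bucket_bytes[b]))
    bucket_to_member[best] = cold
    return True


# Example: eight hash buckets over two member links, badly skewed.
table = [0, 0, 0, 0, 1, 1, 1, 1]
measured = [50, 40, 30, 20, 5, 5, 5, 5]
while rebalance_once(table, measured, 2):
    pass
print(member_load(table, measured, 2))
```

Moving a bucket can briefly reorder packets of the flows hashed into it,
so an implementation would want to limit how often it does this.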

> > This is inaccurate:
> > 
> >   o 2.  It does not support a set of component links with different
> >         characteristics (e.g., different bandwidth and/or latency).
> > 
> >       For example, in practice carriers commonly use link bandwidth
> >       and link latency to set link TE metrics for RSVP-TE.  For
> >       RSVP-TE, limiting the component links to same TE metric has the
> >       practical effect of dis-allowing component links with different
> >       link bandwidth and latencies.
> > 
> > There is no formal meaning to the link metric in ISIS or OSPF.
>  
> See proposed rewording above.

See my comment above.  Introduction of a new metric should be
orthogonal to CL and applicable to link bundling if done right.

> > Under inverse-mux: the real problem with inverse-mux is the 
> > amount of bandwidth that needs to be multiplexed greatly 
> > exceeds the fastest single packet processing element and 
> > therefore doesn't work.  
>  
> This is true (we may want to state this as the bandwidth of any
> "connection" as defined in the draft instead of "packet processing
> element"). We recommend adding this to the Inverse Multiplexing bullet
> in the motivation section.

No, they are different.  In principle a monster IPSEC GW or other
appliance could stuff a huge amount of traffic into the network which
appears as a single microflow.

The limitation is that at any hop in the network, there must be a
packet processing element big enough to swallow the whole flow and
spit it out striped over more than one component.

In practical terms, today there are 100Gb/s packet processor
elements.  There might be 200G or even 500G PP chips in the future,
but by that time there will already be a lot of Tb/s pipes and maybe
even Tb/s LSPs.

Network growth is outpacing Moore's Law.

> > The latency argument is not really valid.
>  
> Please elaborate.

Using latency as a metric is a rough convention, and the thinking
usually runs along the lines that the longer the fiber run, the more
dollars it will cost, so use delay rather than hop count for the
metric.  In practice metric settings are used as a form of TE rather
than an attempt to get a minimum delay on traffic.

If a metric is invented that specifically means delay, it can be
invented with the semantics needed to handle link bundle, LAG, or CL.

> > I think that the ability of an LSR to measure latency on an 
> > LSP and report a latency figure or route based on lowest 
> > latency is almost orthogonal to the problem of composite 
> > link.  If latency and bandwidth at each holding priority is 
> > advertised, then we have a cross product of advertisements.  
> > For example, you can have 1 Gb/s at 10msec at pri#1, but if 
> > you can live with 12msec you can have 2Gb/s, or at 14 msec 
> > 3Gb/s, but at pri#2 you only get ... and so on for 8 priorities.
> > Is this what we're aiming for?
>  
> Advertising each link would solve the problem (see second bullet in
> motivation section), but creates more advertisements in the IGP and may
> not work well with LDP.

So what exactly is the requirement, stated only as a requirement void
of any additional bashing about what there is today?
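
To put a number on the cross product raised in the quoted example, here
is a small Python sketch.  It uses only the figures given there (three
latency bounds of 10, 12 and 14 msec, and 8 priorities); the bandwidth
values at the other priorities were not given, so only the number of
advertised entries is counted.  The variable names are made up for
illustration.

```python
from itertools import product

HOLDING_PRIORITIES = range(8)       # "8 priorities" in the example
LATENCY_BOUNDS_MS = (10, 12, 14)    # the three latency figures in the example

# One (priority, latency bound) pair per advertised bandwidth value.
entries = list(product(HOLDING_PRIORITIES, LATENCY_BOUNDS_MS))

per_priority_only = len(HOLDING_PRIORITIES)   # one bandwidth value per priority
with_latency_bounds = len(entries)            # one value per priority and bound

print(per_priority_only, with_latency_bounds)  # prints: 8 24
```

Each additional latency bound adds another 8 bandwidth values per
advertised link.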

> > The table at the beginning of section 4 is meaningless.
>  
> Do you believe that the text above the table is sufficient to define
> these terms, which are used in the outline? Amongst the co-authors
> agreeing on the meaning of these terms used to structure the outline
> based upon prior comments was viewed as useful. Do other members of the
> rtgwg want to delete this table, refine the definitions, or propose
> specific changes to the requirements outline? 

I don't find it at all clear what the table was supposed to represent.
What does a Y mean at the intersection of "With TE Info" and "IGP"?

There is no apparent (to me at least) meaning to the "Y" or "N"
assigned at the intersection of row and column headings.

> > In this section:
> > 
> >   4.1.1.1. Traffic Flow and Connection Mapping
> > 
> >    The solution SHALL support operator assignment of traffic flows to
> >    specific connections.
> > 
> >    The solution SHALL support operator assignment of connections to
> >    specific component links.
> > 
> > How is this supposed to work for signaled LSP where the 
> > component links are not identified in control signaling?  Is 
> > this scalable from a configuration standpoint or only 
> > applicable to statically configured MPLS cross connect?
>  
> A future solution could identify component links. The wg direction for
> this draft was to focus on requirements. These comments may be
> applicable to the framework draft or a specific proposed solution.

Isn't that the "unbundled" version of link bundling, or put another
way, the common practice on a set of parallel links where LAG is not
used and before link bundling existed?

If so, there is nothing new about that "future solution".

As to the existing requirement it appears then that this requirement
implies that there cannot be a control plane.  Is that the case?

> >    In order to prevent packet loss, the solution must employ make-
> >    before-break when a change in the mapping of a connection to a
> >    component link mapping change has to occur.
> > 
> > Only the ingress of an LSP can initiate make-before-break and 
> > the ingress doesn't know about the component links.  In 
> > RFC3209, make-before-break involves a new LSP using the same 
> > tunnel-id.
> > Are you using a different meaning for make-before-break?
> > 
> Good point. This is more like MPLS FRR at an intermediate point where
> the bypass tunnel is first established before switching. Will clarify
> this point and provide appropriate reference(s).

FRR is only temporary and requires a resignal from the ingress to make
it permanent.  The "hint" to the ingress is that protection-inuse is
set in the RESV (and the IGP says the link is down).  There is nothing
to prevent that behaviour with FRR using another member except that
the set of SRLG might notice if it was another wave on the same fiber.

> > Regarding this statement:
> > 
> >    The solution SHALL support management plane controlled parameters
> >    that define at least a minimum bandwidth, maximum bandwidth,
> >    preemption priority, and holding priority for each connection
> >    without TE information (i.e., LDP signaled LSP that does not
> >    contain the same information as an RSVP-TE signaled LSP).
> > 
> > Could you explain how preemption would work for LDP?  Do you 
> > plan to withdraw the FEC?  If so, for how long?  Forever?  
> > If not forever would the traffic periodically come back, get 
> > remeasured and withdrawn again?
>  
> Good questions, but they seem solution oriented. The wg direction for
> this draft was to focus on requirements. These comments may be
> applicable to the framework draft or a specific proposed solution. Would
> you propose a rewording, clarification of the requirement, and/or
> additional requirements? 

They are feasibility oriented questions.  If there is no feasible
solution, there shouldn't be a requirement.  I propose that the
requirement be removed until there is evidence that a feasible
solution exists.

> > In 4.2.2.1 what does this mean?
> > 
> >    o  Bandwidth of the highest and lowest speed
>  
> We discussed this and propose the following replacement:
>  
> o Maximum and minimum acceptable bandwidth of the LSP

This is sneaking in a requirement with no semantics.  How does the
"Maximum and minimum acceptable bandwidth of the LSP" relate to the
reserved BW at a given priority level and what is it used for?

A requirement document should document what is to be accomplished.  By
including what parameters are to be added you are clearly in the realm
of how, and you haven't fully defined what the purpose of the
parameters is.

I think all of section 4.2 does not belong in a requirement document,
at least in its current form.

> > Overall I find many of the stated requirements to be unclear. 
> >  Perhaps some discussion and improvements to the wording will 
> > bring clarity.
>  
> Achieving clarity is certainly our shared wg goal. Specific questions
> and comments like those Curtis has provided would be most appreciated by
> the co-authors.

It's getting late, so I'll start another thread.

> > Or maybe I'm just dense.

You didn't respond to that.  :-)

> > Curtis