Re: [Gen-art] Review: draft-ietf-pcn-sm-edge-behaviour-08

"Joel M. Halpern" <> Tue, 13 March 2012 02:50 UTC

Date: Mon, 12 Mar 2012 22:50:12 -0400
From: "Joel M. Halpern" <>
To: Russ Housley <>
Cc:, Steven Blake <>, Michael Menth <>,, Bob Briscoe <>, Tom Taylor <>, David Harrington <>
Subject: Re: [Gen-art] Review: draft-ietf-pcn-sm-edge-behaviour-08

I may have missed something in my final review, but I thought this was 
properly resolved in later email exchanges, and the text in the document
cleaned up to be very clear and specific about what was needed (you need 
tunnels to be able to do the classification, and it says so.)


On 3/12/2012 2:28 PM, Russ Housley wrote:
> I am very confused about the state of this.  My skimming of the thread seems to indicate at least one unresolved issue.
> Russ
> On Jan 2, 2012, at 1:04 PM, Joel M. Halpern wrote:
>> The clarification on U is very helpful.  I look forward to comments from others on the routing based behavior / ECMP text removal / replacement question.
>> On 1/2/2012 12:58 PM, Michael Menth wrote:
>>> Hi Joel, hi Tom,
>>> Am 02.01.2012 18:18, schrieb Joel M. Halpern:
>>>> Michael, I am not sure what to make of your recommended text about ECMP.
>>>> ECMP is used by almost all operators. It is generally considered a
>>>> necessary tool in the tool-kit.
>>>> More significantly, at least for the egress understanding of the
>>>> ingress, it is not even the single operator's ECMP, but other
>>>> operators' selections of paths that produce the issue. So even in the
>>>> unlikely event that this operator does not use ECMP, it still is not
>>>> sufficient.
>>> Then I better leave the ECMP issue for others to answer.
>>> The definition of U can be better corrected as follows (improved
>>> rewording of my previous email):
>>> U represents the average ratio of PCN-supportable-rate to
>>> PCN-admissible-rate over all the links of the PCN-domain.
>>> ->
>>> U is a domain-wide constant which implicitly defines the
>>> PCN-supportable-rate by U*PCN-admissible-rate on all links of the PCN
>>> domain.
>>> Best wishes,
>>> Michael
>>>> Yours,
>>>> Joel
>>>> On 1/2/2012 11:54 AM, Michael Menth wrote:
>>>>> Hi Tom, hi Joel,
>>>>> I wish you a happy new year!
>>>>> Here are my comments to address Joel's concerns:
>>>>> ====================================================================
>>>>> The issue with ECMP: I'd add a comment that CL and SM should not be used
>>>>> in the presence of ECMP if routing information is used to determine
>>>>> ingress-egress-aggregates since this seems to be messy and error-prone.
>>>>> ====================================================================
>>>>> The following text may clarify at the beginning of Section 3.3.2 the
>>>>> relation
>>>>> between admission control and flow termination to address one of Joel's
>>>>> comments (for both SM and CL):
>>>>> In the presence of light pre-congestion, i.e., in the presence of a
>>>>> small,
>>>>> positive ETM-rate (relative to the overall PCN traffic rate), new
>>>>> flows may
>>>>> already be blocked. However, in the presence of heavy pre-congestion,
>>>>> i.e.,
>>>>> in the presence of a relatively large ETM-rate, termination of some
>>>>> admitted
>>>>> flows is required. Thus, flow blocking is a logical prerequisite for flow
>>>>> termination.
>>>>> ====================================================================
>>>>> The following sentence in 3.3.2 should be corrected (only SM-specific):
>>>>> U represents the average ratio of PCN-supportable-rate to
>>>>> PCN-admissible-rate
>>>>> over all the links of the PCN-domain.
>>>>> ->
>>>>> U represents the ratio of PCN-supportable-rate to PCN-admissible-rate
>>>>> for all
>>>>> the links of the PCN-domain.
>>>>> ====================================================================
>>>>> I also recommend changing the following text as I think it may cause
>>>>> misinterpretations (applies both to SM and CL):
>>>>> If the difference calculated in the second step is positive, the
>>>>> Decision
>>>>> Point SHOULD select PCN-flows to terminate, until it determines that the
>>>>> PCN-traffic admission rate will no longer be greater than the estimated
>>>>> sustainable aggregate rate. If the Decision Point knows the bandwidth
>>>>> required by individual PCN-flows (e.g., from resource signalling used to
>>>>> establish the flows), it MAY choose to complete its selection of
>>>>> PCN-flows to
>>>>> terminate in a single round of decisions.
>>>>> Alternatively, the Decision Point MAY spread flow termination over
>>>>> multiple
>>>>> rounds to avoid over-termination. If this is done, it is RECOMMENDED
>>>>> that
>>>>> enough time elapse between successive rounds of termination to allow the
>>>>> effects of previous rounds to be reflected in the measurements upon
>>>>> which the
>>>>> termination decisions are based. (See [IEEE-Satoh] and sections 4.2
>>>>> and 4.3
>>>>> of [MeLe10].)
>>>>> ->
>>>>> If the difference calculated in the second step is positive (traffic
>>>>> rate to
>>>>> be terminated), the Decision Point SHOULD select PCN-flows to
>>>>> terminate. To
>>>>> that end, the Decision Point MAY use upper rate limits for individual
>>>>> PCN-flows (e.g., from resource signalling used to establish the
>>>>> flows) and
>>>>> select a set of flows whose sum of upper rate limits is up to the
>>>>> traffic
>>>>> rate to be terminated. Then, these flows are terminated. The use of
>>>>> upper
>>>>> limits on flow rates avoids over-termination.
>>>>> Termination may be continuously needed after consecutive measurement
>>>>> intervals for various
>>>>> reasons, e.g., if the used upper rate limits overestimate the actual
>>>>> flow rates.
>>>>> For such cases it is RECOMMENDED that enough time elapses between
>>>>> successive
>>>>> termination events to allow the effects of previous termination events
>>>>> to be
>>>>> reflected in the measurements upon which the termination decisions are
>>>>> based;
>>>>> otherwise, over-termination may occur. See [IEEE-Satoh] and Sections 4.2
>>>>> and
>>>>> 4.3 of [MeLe10].
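
The selection step in the proposed text above can be sketched as follows. This is a minimal illustration, not text from the draft: the function name, the dictionary of per-flow upper rate limits, and the largest-first ordering are all assumptions, since the proposal does not say which flows to prefer. The only property taken from the proposal is that the sum of the upper rate limits of the selected flows stays at or below the traffic rate to be terminated, which is what avoids over-termination.

```python
def select_flows_to_terminate(flow_rate_limits, rate_to_terminate):
    """Pick flows whose upper rate limits sum to at most rate_to_terminate.

    flow_rate_limits: dict mapping a flow id to its upper rate limit
                      (hypothetical input, e.g. from resource signalling).
    rate_to_terminate: the positive difference computed in the second step.
    """
    selected = []
    remaining = rate_to_terminate
    # Assumed policy: consider the largest flows first. The proposed text
    # leaves the choice of flows open.
    ordered = sorted(flow_rate_limits.items(), key=lambda kv: kv[1], reverse=True)
    for flow_id, limit in ordered:
        # Only take a flow if its upper limit still fits into the budget,
        # so the selection can never exceed the rate to be terminated.
        if limit <= remaining:
            selected.append(flow_id)
            remaining -= limit
    return selected


# Hypothetical flows with upper rate limits in arbitrary rate units.
flows = {"a": 5, "b": 3, "c": 2, "d": 4}
to_terminate = select_flows_to_terminate(flows, 8)
```

Because upper limits may overestimate the actual flow rates, the traffic actually removed can be less than the budget, which is why the proposal expects further termination events after later measurement intervals.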
>>>>> ====================================================================
>>>>> [IEEE-Satoh] is not a good key for Daisuke's work as the prefix "IEEE"
>>>>> makes it look like a reference to a standards document.
>>>>> You better use [SaUe10] or [Satoh10]. Applies both to CL and SM.
>>>>> Best wishes,
>>>>> Michael
>>>>> Am 02.01.2012 15:21, schrieb Tom Taylor:
>>>>>> It shall be as you say, subject to comment from my co-authors when
>>>>>> they get back from holiday.
>>>>>> On 01/01/2012 5:43 PM, Joel M. Halpern wrote:
>>>>>>> In-line...
>>>>>>> On 1/1/2012 4:06 PM, Tom Taylor wrote:
>>>>>>>> On 01/01/2012 2:58 PM, Joel M. Halpern wrote:
>>>>>>>>> Thank you for responding promptly Tom. Let me try to elaborate on
>>>>>>>>> the
>>>>>>>>> two issues where I was unclear.
>>>>>>>>> On the ingress-egress-aggregate issue and ECMP, the concern I
>>>>>>>>> have is
>>>>>>>>> relative to the third operational alternative where routing is
>>>>>>>>> used to
>>>>>>>>> determine where the ingress and egress of a flow is. To be blunt,
>>>>>>>>> as far
>>>>>>>>> as I can tell this does not work.
>>>>>>>>> 1) It does not work on the ingress side because traffic from a given
>>>>>>>>> source prefix can come in at multiple places. Some of these
>>>>>>>>> places may
>>>>>>>>> claim reachability to the source prefix. Some may not. While a given
>>>>>>>>> flow will use only one of these paths, there is no way to determine
>>>>>>>>> from
>>>>>>>>> routing information, at the egress, which ingress that flow used.
>>>>>>>>> 2) A site may use multiple exits for a given destination prefix.
>>>>>>>>> Again,
>>>>>>>>> while the site will only use one of these egresses for a given flow,
>>>>>>>>> there is no way for the ingress to know which egress it will be
>>>>>>>>> on the
>>>>>>>>> basis of routing information.
>>>>>>>>> Thus, the text seems to allow for a behavior that simply does not
>>>>>>>>> work.
>>>>>>>> [PTT] I think the disconnect here is that you read the text to say
>>>>>>>> that
>>>>>>>> an individual node uses routing information to determine the IEA.
>>>>>>>> That
>>>>>>>> wasn't the intention. Instead, administrators use routing
>>>>>>>> information to
>>>>>>>> derive filters that are installed at the ingress and egress nodes.
>>>>>>> As far as I can tell, your response describes a situation even less
>>>>>>> effective than what I assumed.
>>>>>>> Firstly, it does not matter whether it is the edge node, the decision
>>>>>>> node, or the human administrator. Routing information is not enough to
>>>>>>> determine what the ingress-egress pairing is. The problems I describe
>>>>>>> above apply no matter who is making the decision.
>>>>>>> Secondly, having a human make the decision means that as soon as
>>>>>>> routing
>>>>>>> changes, the configured filters are wrong.
>>>>>>> I would suggest that the text in question be removed, and replaced
>>>>>>> with
>>>>>>> a warning against attempting what is currently described.
>>>>> My view is also that CL and SM do not work in the presence of ECMP. This
>>>>> should be indicated as a warning.
>>>>>>>>> I am still confused about the relationship of section 3.3.2 to the
>>>>>>>>> behavior you describe. 3.3.2 says that as long as any excess
>>>>>>>>> traffic is
>>>>>>>>> being reported, the decision point shall direct the blocking of
>>>>>>>>> additional flows. That does not match 3.3.1, and does not match your
>>>>>>>>> description.
>>>>>>>> [PTT] I can't see the text in section 3.3.2 that says you continue to
>>>>>>>> block as long as any excess traffic is being reported. What I
>>>>>>>> think it
>>>>>>>> says is that as long as excess traffic is reported, the decision
>>>>>>>> point
>>>>>>>> checks to see whether the traffic being admitted to the aggregate
>>>>>>>> exceeds the supportable level. Excess traffic may be non-zero, yet no
>>>>>>>> termination may be required (i.e., traffic is below the second
>>>>>>>> threshold).
>>>>>>> I think I see what you are saying. If I am reading this correctly, the
>>>>>>> decision process must re-calculate to determine if there is
>>>>>>> termination
>>>>>>> every time it receives a report with non-zero excess and the port is
>>>>>>> already blocked. But it does not have to actually block anything.
>>>>>>> This however seems to depend upon the correct relative
>>>>>>> configuration of
>>>>>>> the limit that flips it into blocked state, the value of U, and maybe
>>>>>>> some other values.
>>>>>>> Put differently, I understand that the two are not contradictory.
>>>>>>> However, since the two things use different calculations, it is not at
>>>>>>> all clear that they are consistent. This may well be acceptable.
>>>>>>> But the
>>>>>>> difference in methods is likely to lead to confusion. So, as a minor
>>>>>>> (rather than major) comment, I would suggest that you provide
>>>>>>> clarifying
>>>>>>> text explaining why it is okay to use one condition to decide if there
>>>>>>> is blocking, but a different condition (which could produce a lower
>>>>>>> threshold) to decide how much to get rid of.
>>>>>>> Yours,
>>>>>>> Joel
>>>>>>>>> Yours,
>>>>>>>>> Joel
>>>>>>>>> On 1/1/2012 2:48 PM, Tom Taylor wrote:
>>>>>>>>>> Thanks for the review, Joel. Comments below, marked with [PTT].
>>>>>>>>>> On 31/12/2011 4:50 PM, Joel M. Halpern wrote:
>>>>>>>>>>> I am the assigned Gen-ART reviewer for this draft. For
>>>>>>>>>>> background on
>>>>>>>>>>> Gen-ART, please see the FAQ at
>>>>>>>>>>> <>.
>>>>>>>>>>> Please resolve these comments along with any other Last Call
>>>>>>>>>>> comments
>>>>>>>>>>> you may receive.
>>>>>>>>>>> Document: draft-ietf-pcn-sm-edge-behaviour-08
>>>>>>>>>>> PCN Boundary Node Behaviour for the Single Marking (SM) Mode of
>>>>>>>>>>> Operation
>>>>>>>>>>> Reviewer: Joel M. Halpern
>>>>>>>>>>> Review Date: 31-Dec-2011
>>>>>>>>>>> IETF LC End Date: 13-Jan-2012
>>>>>>>>>>> IESG Telechat date: N/A
>>>>>>>>>>> Summary: This document is almost ready for publication as an
>>>>>>>>>>> Informational RFC.
>>>>>>>>>>> Question: Given that the document defines a complex set of
>>>>>>>>>>> behaviors,
>>>>>>>>>>> which are mandatory for compliant systems, it seems that this
>>>>>>>>>>> ought to
>>>>>>>>>>> be Experimental rather than Informational. It describes something
>>>>>>>>>>> that
>>>>>>>>>>> could, in theory, later become standards track.
>>>>>>>>>> [PTT] OK, we've wobbled on this one, but we can follow your
>>>>>>>>>> suggestion.
>>>>>>>>>>> Major issues:
>>>>>>>>>>> Section 2 on Assumed Core Network Behavior for SM, in the third
>>>>>>>>>>> bullet,
>>>>>>>>>>> states that the PCN-domain satisfies the conditions specified
>>>>>>>>>>> in RFC
>>>>>>>>>>> 5696. Unfortunately, looking at RFC 5696, I cannot tell what
>>>>>>>>>>> conditions
>>>>>>>>>>> these are. Is this supposed to be a reference to RFC 5559
>>>>>>>>>>> instead? No
>>>>>>>>>>> matter which document it is referencing, please be more specific
>>>>>>>>>>> about
>>>>>>>>>>> which section / conditions are meant.
>>>>>>>>>> [PTT] You are right that RFC 5696 isn't relevant. It's such a long
>>>>>>>>>> time
>>>>>>>>>> since that text was written that I can't recall what the intention
>>>>>>>>>> was.
>>>>>>>>>> My inclination at the moment is simply to delete the bullet.
>>>>>>>>>>> It would have been helpful if the early part of the document
>>>>>>>>>>> indicated
>>>>>>>>>>> that the edge node information about how to determine
>>>>>>>>>>> ingress-egress-aggregates was described in section 5.
>>>>>>>>>>> In conjunction with that, section 5.1.2, third paragraph, seems to
>>>>>>>>>>> describe an option which does not seem to quite work. After
>>>>>>>>>>> describing
>>>>>>>>>>> how to use tunneling, and how to work with signaling, the text
>>>>>>>>>>> refers to
>>>>>>>>>>> inferring the ingress-egress-aggregate from the routing
>>>>>>>>>>> information. In
>>>>>>>>>>> the presence of multiple equal-cost domain exits (which does
>>>>>>>>>>> occur in
>>>>>>>>>>> reality), the routing table is not sufficient information to make
>>>>>>>>>>> this
>>>>>>>>>>> determination. Unless I am very confused (which does happen) this
>>>>>>>>>>> seems
>>>>>>>>>>> to be a serious hole in the specification.
>>>>>>>>>> [PTT] I'm not sure what the issue is here. As I understand it,
>>>>>>>>>> operators
>>>>>>>>>> don't assign packets randomly to a given path in the presence of
>>>>>>>>>> alternatives -- they choose one based on values in the packet
>>>>>>>>>> header.
>>>>>>>>>> The basic intent is that packets of a given microflow all follow
>>>>>>>>>> the
>>>>>>>>>> same path, to prevent unnecessary reordering and minimize
>>>>>>>>>> jitter. The
>>>>>>>>>> implication is that filters can be defined at the ingress nodes to
>>>>>>>>>> identify the packets in a given ingress-egress-aggregate (i.e.
>>>>>>>>>> flowing
>>>>>>>>>> from a specific ingress node to a specific egress node) based on
>>>>>>>>>> their
>>>>>>>>>> header contents. The filters to do the same job at egress nodes
>>>>>>>>>> are a
>>>>>>>>>> different problem, but they are not affected by ECMP.
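
The disagreement above turns on how ECMP picks a path. The following is a minimal sketch under stated assumptions: real routers use vendor-specific hash functions, and the SHA-256 hash, the field layout, and the names `ecmp_next_hop`, `egress-A`, and `egress-B` are hypothetical. It shows the behaviour both sides agree on (all packets of one microflow take the same path) and the point of contention (flows toward the same destination prefix can still leave through different exits, so the prefix alone does not identify the egress).

```python
import hashlib


def ecmp_next_hop(five_tuple, next_hops):
    """Pick one of several equal-cost next hops by hashing header fields."""
    key = "|".join(str(field) for field in five_tuple).encode()
    digest = hashlib.sha256(key).digest()
    # Same header fields give the same index, so a microflow stays on one path.
    index = int.from_bytes(digest[:4], "big") % len(next_hops)
    return next_hops[index]


# Hypothetical domain with two equal-cost exits toward the same prefix.
exits = ["egress-A", "egress-B"]

# Flows that differ only in source port; the destination is identical.
# Fields: source address, destination address, source port,
# destination port, protocol number.
flows = [("10.0.0.1", "192.0.2.7", port, 5060, 17) for port in range(1000)]
chosen = {flow: ecmp_next_hop(flow, exits) for flow in flows}
```

In this sketch a filter that reproduces the hash could map each flow to its exit, but only if it knows the hash inputs and the current set of equal-cost paths, which routing information for the prefix does not provide.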
>>>>>>>>>>> Minor issues:
>>>>>>>>>>> Section 3.3.1 states that the "block" decision occurs when the CLE
>>>>>>>>>>> (excess over total) rate exceeds the configured limit. However,
>>>>>>>>>>> section
>>>>>>>>>>> 3.3.2 states that the decision node must take further steps if
>>>>>>>>>>> the
>>>>>>>>>>> excess rate is non-zero in further reports. Is this inconsistency
>>>>>>>>>>> deliberate? If so, please explain. If not, please fix. (If it is
>>>>>>>>>>> important to drive the excess rate to 0, then why is action only
>>>>>>>>>>> initiated when the ratio is above a configured value, rather than
>>>>>>>>>>> any
>>>>>>>>>>> non-zero value? I can conceive of various reasons. But none are
>>>>>>>>>>> stated.)
>>>>>>>>>> [PTT] We aren't driving the excess rate to zero, but to a value
>>>>>>>>>> equal to
>>>>>>>>>> something less than (U - 1)/U. (The "something less" is because of
>>>>>>>>>> packet dropping at interior nodes.) The assumption is that (U -
>>>>>>>>>> 1)/U is
>>>>>>>>>> greater than CLE-limit. Conceptually, PCN uses two thresholds.
>>>>>>>>>> When the
>>>>>>>>>> CLE is below the first threshold, new flows are admitted. Above
>>>>>>>>>> that
>>>>>>>>>> threshold, they are blocked. When the CLE is above the second
>>>>>>>>>> threshold,
>>>>>>>>>> flows are terminated to bring them down to that threshold. In
>>>>>>>>>> the SM
>>>>>>>>>> mode of operation, the first threshold is specified directly on a
>>>>>>>>>> per-link basis by the value CLE-limit. The second threshold is
>>>>>>>>>> specified
>>>>>>>>>> by the same value (U - 1)/U for all links. With the CL mode of
>>>>>>>>>> operation
>>>>>>>>>> the second threshold is also specified directly for each link.
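
The two-threshold description above can be sketched as follows. This is an illustration under assumptions, not the draft's algorithm: the function name and the string results are hypothetical, the CLE is computed as excess rate over total rate as described in the review, and the effect of packet dropping at interior nodes (the "something less") is ignored. The second threshold follows from the definition of U: at the PCN-supportable-rate, U times the PCN-admissible-rate, the excess over the admissible rate is a fraction (U - 1)/U of the total.

```python
def sm_decision(excess_rate, total_rate, cle_limit, u):
    """Classify an aggregate as 'admit', 'block', or 'terminate'.

    cle_limit: first threshold (CLE-limit in the thread).
    u: domain-wide constant U; the second threshold is (U - 1)/U.
    """
    if u <= 1:
        raise ValueError("U must be greater than 1")
    termination_threshold = (u - 1) / u
    # The thread assumes the second threshold lies above CLE-limit.
    if cle_limit >= termination_threshold:
        raise ValueError("CLE-limit must be below (U - 1)/U")
    cle = excess_rate / total_rate if total_rate > 0 else 0.0
    if cle > termination_threshold:
        return "terminate"   # heavy pre-congestion: block and terminate
    if cle > cle_limit:
        return "block"       # light pre-congestion: block new flows only
    return "admit"


# Hypothetical values: U = 1.25 gives a second threshold of 0.2.
light = sm_decision(excess_rate=10, total_rate=100, cle_limit=0.05, u=1.25)
heavy = sm_decision(excess_rate=30, total_rate=100, cle_limit=0.05, u=1.25)
```

The configuration check makes explicit the dependency raised in the review: the two conditions are consistent only when CLE-limit and U are chosen so that the blocking threshold is the lower of the two.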
>>>>>>>>>>> Nits/editorial comments: