Re: [Gen-art] Review: draft-ietf-pcn-sm-edge-behaviour-08

Tom Taylor <> Wed, 21 March 2012 18:00 UTC

Date: Wed, 21 Mar 2012 14:00:09 -0400
From: Tom Taylor <>
To: Russ Housley <>
Cc: Steven Blake <>, Michael Menth <>, Bob Briscoe <>, David Harrington <>
Subject: Re: [Gen-art] Review: draft-ietf-pcn-sm-edge-behaviour-08

The description of the factor U has been updated. The new text reads as follows:

    1.  [SM-specific] The sustainable aggregate rate (SAR) for the given
        ingress-egress-aggregate is estimated using the formula:

           SAR = U * NM-Rate

        for the latest reported interval, where U is a configurable
        factor greater than one which is the same for all ingress-egress-
        aggregates.  In effect, the value of the PCN-supportable-rate for
        each link is approximated by the expression


        rather than being calculated explicitly.
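
As a minimal sketch of the formula in the updated text (not part of the draft), the SAR estimate can be written in Python. The function name and the input checks are assumptions made for illustration; only the relation SAR = U * NM-Rate and the constraint that U is greater than one come from the text above.

```python
def sustainable_aggregate_rate(u: float, nm_rate: float) -> float:
    """Estimate SAR = U * NM-Rate for one ingress-egress-aggregate.

    u       -- domain-wide configurable factor, required to be > 1
    nm_rate -- NM-Rate from the latest reported interval (hypothetical
               units; any consistent rate unit works)
    """
    if u <= 1.0:
        raise ValueError("U must be greater than one")
    if nm_rate < 0.0:
        raise ValueError("NM-Rate must be non-negative")
    return u * nm_rate
```

Because U is the same for every ingress-egress-aggregate, a Decision Point following this sketch would need only the one configured value of U plus the per-aggregate NM-Rate reports.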

Tom Taylor

On 12/03/2012 2:28 PM, Russ Housley wrote:
> I am very confused about the state of this.  My skimming of the thread seems to indicate at least one unresolved issue.
> Russ
> On Jan 2, 2012, at 1:04 PM, Joel M. Halpern wrote:
>> The clarification on U is very helpful.  I look forward to comments from others on the routing based behavior / ECMP text removal / replacement question.
>> On 1/2/2012 12:58 PM, Michael Menth wrote:
>>> Hi Joel, hi Tom,
>>> On 02.01.2012 18:18, Joel M. Halpern wrote:
>>>> Michael, I am not sure what to make of your recommended text about ECMP.
>>>> ECMP is used by almost all operators. It is generally considered a
>>>> necessary tool in the tool-kit.
>>>> More significantly, at least for the egress understanding of the
>>>> ingress, it is not even the single operator's ECMP, but other
>>>> operators' selections of paths that produce the issue. So even in the
>>>> unlikely event that this operator does not use ECMP, it still is not
>>>> sufficient.
>>> Then I had better leave the ECMP issue for others to answer.
>>> The definition of U can be better corrected as follows (improved
>>> rewording of my previous email):
>>> U represents the average ratio of PCN-supportable-rate to
>>> PCN-admissible-rate over all the links of the PCN-domain.
>>> ->
>>> U is a domain-wide constant which implicitly defines the
>>> PCN-supportable-rate by U*PCN-admissible-rate on all links of the PCN
>>> domain.
>>> Best wishes,
>>> Michael
>>>> Yours,
>>>> Joel
>>>> On 1/2/2012 11:54 AM, Michael Menth wrote:
>>>>> Hi Tom, hi Joel,
>>>>> I wish you a happy new year!
>>>>> Here are my comments to address Joel's concerns:
>>>>> ====================================================================
>>>>> The issue with ECMP: I'd add a comment that CL and SM should not be
>>>>> used in the presence of ECMP if routing information is used to
>>>>> determine ingress-egress-aggregates, since this seems to be messy and
>>>>> error-prone.
>>>>> ====================================================================
>>>>> The following text, placed at the beginning of Section 3.3.2, may
>>>>> clarify the relation between admission control and flow termination
>>>>> to address one of Joel's comments (for both SM and CL):
>>>>> In the presence of light pre-congestion, i.e., in the presence of a
>>>>> small, positive ETM-rate (relative to the overall PCN traffic rate),
>>>>> new flows may already be blocked. However, in the presence of heavy
>>>>> pre-congestion, i.e., in the presence of a relatively large ETM-rate,
>>>>> termination of some admitted flows is required. Thus, flow blocking
>>>>> is a logical prerequisite for flow termination.
>>>>> ====================================================================
>>>>> The following sentence in 3.3.2 should be corrected (only SM-specific):
>>>>> U represents the average ratio of PCN-supportable-rate to
>>>>> PCN-admissible-rate over all the links of the PCN-domain.
>>>>> ->
>>>>> U represents the ratio of PCN-supportable-rate to PCN-admissible-rate
>>>>> for all the links of the PCN-domain.
>>>>> ====================================================================
>>>>> I also recommend changing the following text, as I think it may cause
>>>>> misinterpretation (applies both to SM and CL):
>>>>> If the difference calculated in the second step is positive, the
>>>>> Decision Point SHOULD select PCN-flows to terminate, until it
>>>>> determines that the PCN-traffic admission rate will no longer be
>>>>> greater than the estimated sustainable aggregate rate. If the
>>>>> Decision Point knows the bandwidth required by individual PCN-flows
>>>>> (e.g., from resource signalling used to establish the flows), it MAY
>>>>> choose to complete its selection of PCN-flows to terminate in a
>>>>> single round of decisions.
>>>>> Alternatively, the Decision Point MAY spread flow termination over
>>>>> multiple rounds to avoid over-termination. If this is done, it is
>>>>> RECOMMENDED that enough time elapse between successive rounds of
>>>>> termination to allow the effects of previous rounds to be reflected
>>>>> in the measurements upon which the termination decisions are based.
>>>>> (See [IEEE-Satoh] and sections 4.2 and 4.3 of [MeLe10].)
>>>>> ->
>>>>> If the difference calculated in the second step is positive (traffic
>>>>> rate to be terminated), the Decision Point SHOULD select PCN-flows to
>>>>> terminate. To that end, the Decision Point MAY use upper rate limits
>>>>> for individual PCN-flows (e.g., from resource signalling used to
>>>>> establish the flows) and select a set of flows whose sum of upper
>>>>> rate limits is up to the traffic rate to be terminated. Then, these
>>>>> flows are terminated. The use of upper limits on flow rates avoids
>>>>> over-termination.
>>>>> Termination may be continuously needed after consecutive measurement
>>>>> intervals for various reasons, e.g., if the used upper rate limits
>>>>> overestimate the actual flow rates. For such cases it is RECOMMENDED
>>>>> that enough time elapses between successive termination events to
>>>>> allow the effects of previous termination events to be reflected in
>>>>> the measurements upon which the termination decisions are based;
>>>>> otherwise, over-termination may occur. See [IEEE-Satoh] and Sections
>>>>> 4.2 and 4.3 of [MeLe10].
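
The selection rule proposed above can be sketched in Python as follows. This is an illustration under stated assumptions, not text from the draft: the proposal only says that the sum of the selected flows' upper rate limits should be "up to" the rate to be terminated. The greedy largest-first order, the function name, and the dictionary representation of flows are all hypothetical choices.

```python
from typing import Dict, List


def select_flows_to_terminate(upper_limits: Dict[str, float],
                              rate_to_terminate: float) -> List[str]:
    """Pick flows whose summed upper rate limits stay within the target.

    upper_limits      -- hypothetical map of flow id -> upper rate limit
    rate_to_terminate -- positive difference from the second step
    """
    selected: List[str] = []
    remaining = rate_to_terminate
    # Assumption: try larger flows first; the proposed text does not
    # prescribe any particular order or selection policy.
    ordered = sorted(upper_limits.items(), key=lambda kv: kv[1], reverse=True)
    for flow_id, limit in ordered:
        # Only take a flow if its upper limit still fits, so the total
        # selected never exceeds the rate to be terminated.
        if limit <= remaining:
            selected.append(flow_id)
            remaining -= limit
    return selected
```

Since the selected total never exceeds the target, the sketch errs toward under-termination. If the limits overestimate actual flow rates, further rounds may be needed, which is the case the RECOMMENDED waiting period addresses.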
>>>>> ====================================================================
>>>>> [IEEE-Satoh] is not a good key for Daisuke's work as the prefix "IEEE"
>>>>> makes it look like a reference to a standards document.
>>>>> You had better use [SaUe10] or [Satoh10]. Applies both to CL and SM.
>>>>> Best wishes,
>>>>> Michael
>>>>> On 02.01.2012 15:21, Tom Taylor wrote:
>>>>>> It shall be as you say, subject to comment from my co-authors when
>>>>>> they get back from holiday.
>>>>>> On 01/01/2012 5:43 PM, Joel M. Halpern wrote:
>>>>>>> In-line...
>>>>>>> On 1/1/2012 4:06 PM, Tom Taylor wrote:
>>>>>>>> On 01/01/2012 2:58 PM, Joel M. Halpern wrote:
>>>>>>>>> Thank you for responding promptly, Tom. Let me try to elaborate on
>>>>>>>>> the two issues where I was unclear.
>>>>>>>>> On the ingress-egress-aggregate issue and ECMP, the concern I
>>>>>>>>> have is relative to the third operational alternative where
>>>>>>>>> routing is used to determine where the ingress and egress of a
>>>>>>>>> flow are. To be blunt, as far as I can tell this does not work.
>>>>>>>>> 1) It does not work on the ingress side because traffic from a
>>>>>>>>> given source prefix can come in at multiple places. Some of these
>>>>>>>>> places may claim reachability to the source prefix. Some may not.
>>>>>>>>> While a given flow will use only one of these paths, there is no
>>>>>>>>> way to determine from routing information, at the egress, which
>>>>>>>>> ingress that flow used.
>>>>>>>>> 2) A site may use multiple exits for a given destination prefix.
>>>>>>>>> Again, while the site will only use one of these egresses for a
>>>>>>>>> given flow, there is no way for the ingress to know which egress
>>>>>>>>> it will be on the basis of routing information.
>>>>>>>>> Thus, the text seems to allow for a behavior that simply does not
>>>>>>>>> work.
>>>>>>>> [PTT] I think the disconnect here is that you read the text to say
>>>>>>>> that an individual node uses routing information to determine the
>>>>>>>> IEA. That wasn't the intention. Instead, administrators use
>>>>>>>> routing information to derive filters that are installed at the
>>>>>>>> ingress and egress nodes.
>>>>>>> As far as I can tell, your response describes a situation even less
>>>>>>> effective than what I assumed.
>>>>>>> Firstly, it does not matter whether it is the edge node, the decision
>>>>>>> node, or the human administrator. Routing information is not enough to
>>>>>>> determine what the ingress-egress pairing is. The problems I describe
>>>>>>> above apply no matter who is making the decision.
>>>>>>> Secondly, having a human make the decision means that as soon as
>>>>>>> routing changes, the configured filters are wrong.
>>>>>>> I would suggest that the text in question be removed, and replaced
>>>>>>> with a warning against attempting what is currently described.
>>>>> My view is also that CL and SM do not work in the presence of ECMP. This
>>>>> should be indicated as a warning.
>>>>>>>>> I am still confused about the relationship of section 3.3.2 to the
>>>>>>>>> behavior you describe. 3.3.2 says that as long as any excess
>>>>>>>>> traffic is being reported, the decision point shall direct the
>>>>>>>>> blocking of additional flows. That does not match 3.3.1, and does
>>>>>>>>> not match your description.
>>>>>>>> [PTT] I can't see the text in section 3.3.2 that says you continue to
>>>>>>>> block as long as any excess traffic is being reported. What I
>>>>>>>> think it says is that as long as excess traffic is reported, the
>>>>>>>> decision point checks to see whether the traffic being admitted to
>>>>>>>> the aggregate exceeds the supportable level. Excess traffic may be
>>>>>>>> non-zero, yet no termination may be required (i.e., traffic is
>>>>>>>> below the second threshold).
>>>>>>> I think I see what you are saying. If I am reading this correctly, the
>>>>>>> decision process must re-calculate to determine if there is
>>>>>>> termination every time it receives a report with non-zero excess and
>>>>>>> the port is already blocked. But it does not have to actually block
>>>>>>> anything.
>>>>>>> This however seems to depend upon the correct relative
>>>>>>> configuration of the limit that flips it into blocked state, the
>>>>>>> value of U, and maybe some other values.
>>>>>>> Put differently, I understand that the two are not contradictory.
>>>>>>> However, since the two things use different calculations, it is not at
>>>>>>> all clear that they are consistent. This may well be acceptable.
>>>>>>> But the difference in methods is likely to lead to confusion. So, as
>>>>>>> a minor (rather than major) comment, I would suggest that you
>>>>>>> provide clarifying text explaining why it is okay to use one
>>>>>>> condition to decide if there is blocking, but a different condition
>>>>>>> (which could produce a lower threshold) to decide how much to get
>>>>>>> rid of.
>>>>>>> Yours,
>>>>>>> Joel
>>>>>>>>> Yours,
>>>>>>>>> Joel
>>>>>>>>> On 1/1/2012 2:48 PM, Tom Taylor wrote:
>>>>>>>>>> Thanks for the review, Joel. Comments below, marked with [PTT].
>>>>>>>>>> On 31/12/2011 4:50 PM, Joel M. Halpern wrote:
>>>>>>>>>>> I am the assigned Gen-ART reviewer for this draft. For
>>>>>>>>>>> background on Gen-ART, please see the FAQ at
>>>>>>>>>>> <>.
>>>>>>>>>>> Please resolve these comments along with any other Last Call
>>>>>>>>>>> comments you may receive.
>>>>>>>>>>> Document: draft-ietf-pcn-sm-edge-behaviour-08
>>>>>>>>>>> PCN Boundary Node Behaviour for the Single Marking (SM) Mode of
>>>>>>>>>>> Operation
>>>>>>>>>>> Reviewer: Joel M. Halpern
>>>>>>>>>>> Review Date: 31-Dec-2011
>>>>>>>>>>> IETF LC End Date: 13-Jan-2012
>>>>>>>>>>> IESG Telechat date: N/A
>>>>>>>>>>> Summary: This document is almost ready for publication as an
>>>>>>>>>>> Informational RFC.
>>>>>>>>>>> Question: Given that the document defines a complex set of
>>>>>>>>>>> behaviors, which are mandatory for compliant systems, it seems
>>>>>>>>>>> that this ought to be Experimental rather than Informational. It
>>>>>>>>>>> describes something that could, in theory, later become
>>>>>>>>>>> standards track.
>>>>>>>>>> [PTT] OK, we've wobbled on this one, but we can follow your
>>>>>>>>>> suggestion.
>>>>>>>>>>> Major issues:
>>>>>>>>>>> Section 2 on Assumed Core Network Behavior for SM, in the third
>>>>>>>>>>> bullet, states that the PCN-domain satisfies the conditions
>>>>>>>>>>> specified in RFC 5696. Unfortunately, looking at RFC 5696 I
>>>>>>>>>>> cannot tell what conditions these are. Is this supposed to be a
>>>>>>>>>>> reference to RFC 5559 instead? No matter which document it is
>>>>>>>>>>> referencing, please be more specific about which section /
>>>>>>>>>>> conditions are meant.
>>>>>>>>>> [PTT] You are right that RFC 5696 isn't relevant. It's such a long
>>>>>>>>>> time since that text was written that I can't recall what the
>>>>>>>>>> intention was. My inclination at the moment is simply to delete
>>>>>>>>>> the bullet.
>>>>>>>>>>> It would have been helpful if the early part of the document
>>>>>>>>>>> indicated that the edge node information about how to determine
>>>>>>>>>>> ingress-egress-aggregates was described in section 5.
>>>>>>>>>>> In conjunction with that, section 5.1.2, third paragraph, seems to
>>>>>>>>>>> describe an option which does not seem to quite work. After
>>>>>>>>>>> describing how to use tunneling, and how to work with signaling,
>>>>>>>>>>> the text refers to inferring the ingress-egress-aggregate from
>>>>>>>>>>> the routing information. In the presence of multiple equal-cost
>>>>>>>>>>> domain exits (which does occur in reality), the routing table is
>>>>>>>>>>> not sufficient information to make this determination. Unless I
>>>>>>>>>>> am very confused (which does happen) this seems to be a serious
>>>>>>>>>>> hole in the specification.
>>>>>>>>>> [PTT] I'm not sure what the issue is here. As I understand it,
>>>>>>>>>> operators don't assign packets randomly to a given path in the
>>>>>>>>>> presence of alternatives -- they choose one based on values in
>>>>>>>>>> the packet header. The basic intent is that packets of a given
>>>>>>>>>> microflow all follow the same path, to prevent unnecessary
>>>>>>>>>> reordering and minimize jitter. The implication is that filters
>>>>>>>>>> can be defined at the ingress nodes to identify the packets in a
>>>>>>>>>> given ingress-egress-aggregate (i.e. flowing from a specific
>>>>>>>>>> ingress node to a specific egress node) based on their header
>>>>>>>>>> contents. The filters to do the same job at egress nodes are a
>>>>>>>>>> different problem, but they are not affected by ECMP.
>>>>>>>>>>> Minor issues:
>>>>>>>>>>> Section 3.3.1 states that the "block" decision occurs when the CLE
>>>>>>>>>>> (excess over total) rate exceeds the configured limit. However,
>>>>>>>>>>> section 3.3.2 states that the decision node must take further
>>>>>>>>>>> steps if the excess rate is non-zero in further reports. Is this
>>>>>>>>>>> inconsistency deliberate? If so, please explain. If not, please
>>>>>>>>>>> fix. (If it is important to drive the excess rate to 0, then why
>>>>>>>>>>> is action only initiated when the ratio is above a configured
>>>>>>>>>>> value, rather than any non-zero value? I can conceive of various
>>>>>>>>>>> reasons. But none are stated.)
>>>>>>>>>> [PTT] We aren't driving the excess rate to zero, but to a value
>>>>>>>>>> equal to something less than (U - 1)/U. (The "something less" is
>>>>>>>>>> because of packet dropping at interior nodes.) The assumption is
>>>>>>>>>> that (U - 1)/U is greater than CLE-limit. Conceptually, PCN uses
>>>>>>>>>> two thresholds. When the CLE is below the first threshold, new
>>>>>>>>>> flows are admitted. Above that threshold, they are blocked. When
>>>>>>>>>> the CLE is above the second threshold, flows are terminated to
>>>>>>>>>> bring them down to that threshold. In the SM mode of operation,
>>>>>>>>>> the first threshold is specified directly on a per-link basis by
>>>>>>>>>> the value CLE-limit. The second threshold is specified by the
>>>>>>>>>> same value (U - 1)/U for all links. With the CL mode of operation
>>>>>>>>>> the second threshold is also specified directly for each link.
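
The two-threshold picture described above can be sketched in Python. This is an illustration, not text from the draft: the function names and the returned labels are hypothetical, and the handling of a CLE exactly equal to a threshold is an arbitrary choice that the thread does not settle. The expression (U - 1)/U is consistent with SAR = U * NM-Rate if one ignores packet loss and takes the total rate as NM-Rate plus excess rate: the total exceeds U * NM-Rate exactly when excess/total exceeds (U - 1)/U.

```python
def termination_threshold(u: float) -> float:
    """Second threshold on the CLE, implied by the domain-wide factor U."""
    if u <= 1.0:
        raise ValueError("U must be greater than one")
    return (u - 1.0) / u


def classify_aggregate(cle: float, cle_limit: float, u: float) -> str:
    """Return 'admit', 'block', or 'terminate' for a reported CLE.

    cle       -- congestion level estimate (excess over total rate)
    cle_limit -- first threshold (CLE-limit)
    u         -- domain-wide factor U
    """
    second = termination_threshold(u)
    # The thread states the assumption that (U - 1)/U > CLE-limit.
    if cle_limit >= second:
        raise ValueError("CLE-limit must be below (U - 1)/U")
    if cle > second:
        return "terminate"  # new flows are also blocked in this state
    if cle > cle_limit:
        return "block"
    return "admit"
```

The explicit check that CLE-limit lies below (U - 1)/U makes visible the configuration dependency raised in the review: if the two values were configured inconsistently, termination could start before blocking.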
>>>>>>>>>>> Nits/editorial comments: