Re: [Gen-art] Review: draft-ietf-pcn-sm-edge-behaviour-08

Michael Menth <> Mon, 02 January 2012 17:59 UTC

Date: Mon, 02 Jan 2012 18:58:44 +0100
From: Michael Menth <>
To: "Joel M. Halpern" <>
Cc:, Steven Blake <>,, Bob Briscoe <>, Tom Taylor <>, David Harrington <>
Subject: Re: [Gen-art] Review: draft-ietf-pcn-sm-edge-behaviour-08

Hi Joel, hi Tom,

On 02.01.2012 18:18, Joel M. Halpern wrote:
> Michael, I am not sure what to make of your recommended text about ECMP.
> ECMP is used by almost all operators.  It is generally considered a 
> necessary tool in the tool-kit.
> More significantly, at least for the egress understanding of the 
> ingress, it is not even the single operator's ECMP, but other
> operators' selections of paths that produce the issue.  So even in the
> unlikely event that this operator does not use ECMP, it still is not 
> sufficient.

Then I had better leave the ECMP issue for others to answer.

The definition of U would be better corrected as follows (an improved
rewording of my previous email):

U represents the average ratio of PCN-supportable-rate to 
PCN-admissible-rate over all the links of the PCN-domain.
U is a domain-wide constant which implicitly defines the 
PCN-supportable-rate by U*PCN-admissible-rate on all links of the PCN 

Best wishes,


> Yours,
> Joel
> On 1/2/2012 11:54 AM, Michael Menth wrote:
>> Hi Tom, hi Joel,
>> I wish you a happy new year!
>> Here are my comments to address Joel's concerns:
>> ====================================================================
>> The issue with ECMP: I'd add a comment that CL and SM should not be used in
>> the presence of ECMP if routing information is used to determine
>> ingress-egress-aggregates since this seems to be messy and error-prone.
>> ====================================================================
>> The following text, placed at the beginning of Section 3.3.2, may clarify
>> the relation between admission control and flow termination to address
>> one of Joel's comments (for both SM and CL):
>> In the presence of light pre-congestion, i.e., in the presence of a small,
>> positive ETM-rate (relative to the overall PCN traffic rate), new flows may
>> already be blocked. However, in the presence of heavy pre-congestion, i.e.,
>> in the presence of a relatively large ETM-rate, termination of some admitted
>> flows is required. Thus, flow blocking is a logical prerequisite for flow
>> termination.
>> ====================================================================
>> The following sentence in 3.3.2 should be corrected (only SM-specific):
>> U represents the average ratio of PCN-supportable-rate to PCN-admissible-rate
>> over all the links of the PCN-domain.
>> ->
>> U represents the ratio of PCN-supportable-rate to PCN-admissible-rate for all
>> the links of the PCN-domain.
>> ====================================================================
>> I also recommend changing the following text, as I think it may cause
>> misinterpretations (applies to both SM and CL):
>> If the difference calculated in the second step is positive, the Decision
>> Point SHOULD select PCN-flows to terminate, until it determines that the
>> PCN-traffic admission rate will no longer be greater than the estimated
>> sustainable aggregate rate. If the Decision Point knows the bandwidth
>> required by individual PCN-flows (e.g., from resource signalling used to
>> establish the flows), it MAY choose to complete its selection of PCN-flows to
>> terminate in a single round of decisions.
>> Alternatively, the Decision Point MAY spread flow termination over multiple
>> rounds to avoid over-termination. If this is done, it is RECOMMENDED that
>> enough time elapse between successive rounds of termination to allow the
>> effects of previous rounds to be reflected in the measurements upon which the
>> termination decisions are based. (See [IEEE-Satoh] and sections 4.2 and 4.3
>> of [MeLe10].)
>> ->
>> If the difference calculated in the second step is positive (traffic rate to
>> be terminated), the Decision Point SHOULD select PCN-flows to terminate. To
>> that end, the Decision Point MAY use upper rate limits for individual
>> PCN-flows (e.g., from resource signalling used to establish the flows) and
>> select a set of flows whose sum of upper rate limits is up to the traffic
>> rate to be terminated. Then, these flows are terminated. The use of upper
>> limits on flow rates avoids over-termination.
>> Termination may be continuously needed after consecutive measurement
>> intervals for various reasons, e.g., if the used upper rate limits
>> overestimate the actual flow rates.
>> For such cases it is RECOMMENDED that enough time elapses between successive
>> termination events to allow the effects of previous termination events to be
>> reflected in the measurements upon which the termination decisions are based;
>> otherwise, over-termination may occur. See [IEEE-Satoh] and Sections 4.2 and
>> 4.3 of [MeLe10].
>> ====================================================================
>> [IEEE-Satoh] is not a good key for Daisuke's work as the prefix "IEEE"
>> makes it look like a reference to a standards document.
>> It would be better to use [SaUe10] or [Satoh10]. This applies to both CL and SM.
>> Best wishes,
>> Michael
>> On 02.01.2012 15:21, Tom Taylor wrote:
>>> It shall be as you say, subject to comment from my co-authors when
>>> they get back from holiday.
>>> On 01/01/2012 5:43 PM, Joel M. Halpern wrote:
>>>> In-line...
>>>> On 1/1/2012 4:06 PM, Tom Taylor wrote:
>>>>> On 01/01/2012 2:58 PM, Joel M. Halpern wrote:
>>>>>> Thank you for responding promptly Tom. Let me try to elaborate on the
>>>>>> two issues where I was unclear.
>>>>>> On the ingress-egress-aggregate issue and ECMP, the concern I have is
>>>>>> relative to the third operational alternative where routing is used to
>>>>>> determine where the ingress and egress of a flow is. To be blunt, as far
>>>>>> as I can tell this does not work.
>>>>>> 1) It does not work on the ingress side because traffic from a given
>>>>>> source prefix can come in at multiple places. Some of these places may
>>>>>> claim reachability to the source prefix. Some may not. While a given
>>>>>> flow will use only one of these paths, there is no way to determine from
>>>>>> routing information, at the egress, which ingress that flow used.
>>>>>> 2) A site may use multiple exits for a given destination prefix. Again,
>>>>>> while the site will only use one of these egresses for a given flow,
>>>>>> there is no way for the ingress to know which egress it will be on the
>>>>>> basis of routing information.
>>>>>> Thus, the text seems to allow for a behavior that simply does not work.
>>>>> [PTT] I think the disconnect here is that you read the text to say that
>>>>> an individual node uses routing information to determine the IEA. That
>>>>> wasn't the intention. Instead, administrators use routing information to
>>>>> derive filters that are installed at the ingress and egress nodes.
>>>> As far as I can tell, your response describes a situation even less
>>>> effective than what I assumed.
>>>> Firstly, it does not matter whether it is the edge node, the decision
>>>> node, or the human administrator. Routing information is not enough to
>>>> determine what the ingress-egress pairing is. The problems I describe
>>>> above apply no matter who is making the decision.
>>>> Secondly, having a human make the decision means that as soon as routing
>>>> changes, the configured filters are wrong.
>>>> I would suggest that the text in question be removed, and replaced with
>>>> a warning against attempting what is currently described.
>> My view is also that CL and SM do not work in the presence of ECMP. This
>> should be indicated as a warning.
>>>>>> I am still confused about the relationship of section 3.3.2 to the
>>>>>> behavior you describe. 3.3.2 says that as long as any excess traffic is
>>>>>> being reported, the decision point shall direct the blocking of
>>>>>> additional flows. That does not match 3.3.1, and does not match your
>>>>>> description.
>>>>> [PTT] I can't see the text in section 3.3.2 that says you continue to
>>>>> block as long as any excess traffic is being reported. What I think it
>>>>> says is that as long as excess traffic is reported, the decision point
>>>>> checks to see whether the traffic being admitted to the aggregate
>>>>> exceeds the supportable level. Excess traffic may be non-zero, yet no
>>>>> termination may be required (i.e., traffic is below the second
>>>>> threshold).
>>>> I think I see what you are saying. If I am reading this correctly, the
>>>> decision process must re-calculate to determine if there is termination
>>>> every time it receives a report with non-zero excess and the port is
>>>> already blocked. But it does not have to actually block anything.
>>>> This however seems to depend upon the correct relative configuration of
>>>> the limit that flips it into blocked state, the value of U, and maybe
>>>> some other values.
>>>> Put differently, I understand that the two are not contradictory.
>>>> However, since the two things use different calculations, it is not at
>>>> all clear that they are consistent. This may well be acceptable. But the
>>>> difference in methods is likely to lead to confusion. So, as a minor
>>>> (rather than major) comment, I would suggest that you provide clarifying
>>>> text explaining why it is okay to use one condition to decide if there
>>>> is blocking, but a different condition (which could produce a lower
>>>> threshold) to decide how much to get rid of.
>>>> Yours,
>>>> Joel
>>>>>> Yours,
>>>>>> Joel
>>>>>> On 1/1/2012 2:48 PM, Tom Taylor wrote:
>>>>>>> Thanks for the review, Joel. Comments below, marked with [PTT].
>>>>>>> On 31/12/2011 4:50 PM, Joel M. Halpern wrote:
>>>>>>>> I am the assigned Gen-ART reviewer for this draft. For background on
>>>>>>>> Gen-ART, please see the FAQ at
>>>>>>>> <>.
>>>>>>>> Please resolve these comments along with any other Last Call comments
>>>>>>>> you may receive.
>>>>>>>> Document: draft-ietf-pcn-sm-edge-behaviour-08
>>>>>>>> PCN Boundary Node Behaviour for the Single Marking (SM) Mode of
>>>>>>>> Operation
>>>>>>>> Reviewer: Joel M. Halpern
>>>>>>>> Review Date: 31-Dec-2011
>>>>>>>> IETF LC End Date: 13-Jan-2012
>>>>>>>> IESG Telechat date: N/A
>>>>>>>> Summary: This document is almost ready for publication as an
>>>>>>>> Informational RFC.
>>>>>>>> Question: Given that the document defines a complex set of behaviors,
>>>>>>>> which are mandatory for compliant systems, it seems that this ought to
>>>>>>>> be Experimental rather than Informational. It describes something that
>>>>>>>> could, in theory, later become standards track.
>>>>>>> [PTT] OK, we've wobbled on this one, but we can follow your
>>>>>>> suggestion.
>>>>>>>> Major issues:
>>>>>>>> Section 2 on Assumed Core Network Behavior for SM, in the third bullet,
>>>>>>>> states that the PCN-domain satisfies the conditions specified in RFC
>>>>>>>> 5696. Unfortunately, looking at RFC 5696 I cannot tell what conditions
>>>>>>>> these are. Is this supposed to be a reference to RFC 5559 instead? No
>>>>>>>> matter which document it is referencing, please be more specific about
>>>>>>>> which section / conditions are meant.
>>>>>>> [PTT] You are right that RFC 5696 isn't relevant. It's such a long time
>>>>>>> since that text was written that I can't recall what the intention was.
>>>>>>> My inclination at the moment is simply to delete the bullet.
>>>>>>>> It would have been helpful if the early part of the document indicated
>>>>>>>> that the edge node information about how to determine
>>>>>>>> ingress-egress-aggregates was described in section 5.
>>>>>>>> In conjunction with that, section 5.1.2, third paragraph, seems to
>>>>>>>> describe an option which does not seem to quite work. After describing
>>>>>>>> how to use tunneling, and how to work with signaling, the text refers to
>>>>>>>> inferring the ingress-egress-aggregate from the routing information. In
>>>>>>>> the presence of multiple equal-cost domain exits (which does occur in
>>>>>>>> reality), the routing table is not sufficient information to make this
>>>>>>>> determination. Unless I am very confused (which does happen) this seems
>>>>>>>> to be a serious hole in the specification.
>>>>>>> [PTT] I'm not sure what the issue is here. As I understand it, operators
>>>>>>> don't assign packets randomly to a given path in the presence of
>>>>>>> alternatives -- they choose one based on values in the packet header.
>>>>>>> The basic intent is that packets of a given microflow all follow the
>>>>>>> same path, to prevent unnecessary reordering and minimize jitter. The
>>>>>>> implication is that filters can be defined at the ingress nodes to
>>>>>>> identify the packets in a given ingress-egress-aggregate (i.e. flowing
>>>>>>> from a specific ingress node to a specific egress node) based on their
>>>>>>> header contents. The filters to do the same job at egress nodes are a
>>>>>>> different problem, but they are not affected by ECMP.
>>>>>>>> Minor issues:
>>>>>>>> Section 3.3.1 states that the "block" decision occurs when the CLE
>>>>>>>> (excess over total) rate exceeds the configured limit. However, section
>>>>>>>> 3.3.2 states that the decision node must take further steps if the
>>>>>>>> excess rate is non-zero in further reports. Is this inconsistency
>>>>>>>> deliberate? If so, please explain. If not, please fix. (If it is
>>>>>>>> important to drive the excess rate to 0, then why is action only
>>>>>>>> initiated when the ratio is above a configured value, rather than any
>>>>>>>> non-zero value? I can conceive of various reasons. But none are
>>>>>>>> stated.)
>>>>>>> [PTT] We aren't driving the excess rate to zero, but to a value equal to
>>>>>>> something less than (U - 1)/U. (The "something less" is because of
>>>>>>> packet dropping at interior nodes.) The assumption is that (U - 1)/U is
>>>>>>> greater than CLE-limit. Conceptually, PCN uses two thresholds. When the
>>>>>>> CLE is below the first threshold, new flows are admitted. Above that
>>>>>>> threshold, they are blocked. When the CLE is above the second threshold,
>>>>>>> flows are terminated to bring them down to that threshold. In the SM
>>>>>>> mode of operation, the first threshold is specified directly on a
>>>>>>> per-link basis by the value CLE-limit. The second threshold is specified
>>>>>>> by the same value (U - 1)/U for all links. With the CL mode of operation
>>>>>>> the second threshold is also specified directly for each link.
>>>>>>>> Nits/editorial comments:
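
The two-threshold behaviour described in the quoted thread (admit below CLE-limit, block above it, terminate above (U - 1)/U) and the proposed selection of flows by upper rate limits can be sketched roughly in Python as follows. This is only an illustration under stated assumptions: the function names and flow identifiers are hypothetical, U is assumed to be greater than 1, (U - 1)/U is assumed to exceed CLE-limit as stated in the thread, and the largest-first greedy order is an arbitrary choice that the draft does not prescribe.

```python
from typing import Dict, List


def termination_threshold(u: float) -> float:
    """Second threshold (U - 1) / U, identical for all links in SM mode."""
    if u <= 1.0:
        # Assumption: U > 1, otherwise the threshold would not be positive.
        raise ValueError("U is assumed to be greater than 1")
    return (u - 1.0) / u


def decide(cle: float, cle_limit: float, u: float) -> str:
    """Classify one CLE report as 'admit', 'block', or 'terminate'."""
    if cle > termination_threshold(u):
        return "terminate"  # new flows stay blocked in this state as well
    if cle > cle_limit:
        return "block"
    return "admit"


def select_flows(upper_limits: Dict[str, float], rate_to_terminate: float) -> List[str]:
    """Pick flows whose summed upper rate limits do not exceed rate_to_terminate.

    Largest-first greedy order is an illustrative assumption only.
    """
    selected: List[str] = []
    remaining = rate_to_terminate
    for flow_id, limit in sorted(upper_limits.items(), key=lambda kv: kv[1], reverse=True):
        if limit <= remaining:
            selected.append(flow_id)
            remaining -= limit
    return selected
```

Because the selected upper limits never sum to more than the rate to be terminated, a single round cannot over-terminate. If the limits overestimate the actual flow rates, further rounds may be needed after later measurement intervals.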

Prof. Dr. habil. Michael Menth
University of Tuebingen
Faculty of Science
Department of Computer Science
Chair of Communication Networks
Sand 13, 72076 Tuebingen, Germany
phone: (+49)-7071/29-70505
fax: (+49)-7071/29-5220