Re: [Gen-art] Review: draft-ietf-pcn-sm-edge-behaviour-08

"Joel M. Halpern" <jmh@joelhalpern.com> Mon, 02 January 2012 18:04 UTC

Message-ID: <4F01F1AD.8000806@joelhalpern.com>
Date: Mon, 02 Jan 2012 13:04:29 -0500
From: "Joel M. Halpern" <jmh@joelhalpern.com>
User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:8.0) Gecko/20111105 Thunderbird/8.0
MIME-Version: 1.0
To: Michael Menth <menth@informatik.uni-tuebingen.de>
References: <CAHBDyN6PN-vp9wXo6fF8G4VfODXjkfbWBaJN8EPopeWfOg9PmQ@mail.gmail.com> <4EFF838D.5020704@joelhalpern.com> <BLU0-SMTP18EE1E01EAA97CC44A44FFD8900@phx.gbl> <4F00BAFD.2070201@joelhalpern.com> <4F00CAE1.60103@gmail.com> <4F00E181.7020605@joelhalpern.com> <4F01BD58.1080303@gmail.com> <4F01E15D.6080601@informatik.uni-tuebingen.de> <4F01E6F5.5080701@joelhalpern.com> <4F01F054.2050301@informatik.uni-tuebingen.de>
In-Reply-To: <4F01F054.2050301@informatik.uni-tuebingen.de>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: draft-ietf-pcn-sm-edge-behaviour@tools.ietf.org, Steven Blake <slblake@petri-meat.com>, gen-art@ietf.org, Bob Briscoe <bob.briscoe@bt.com>, Tom Taylor <tom.taylor.stds@gmail.com>, David Harrington <ietfdbh@comcast.net>
Subject: Re: [Gen-art] Review: draft-ietf-pcn-sm-edge-behaviour-08

The clarification on U is very helpful.  I look forward to comments from
others on the routing-based behavior / ECMP text removal / replacement
question.

On 1/2/2012 12:58 PM, Michael Menth wrote:
> Hi Joel, hi Tom,
>
> Am 02.01.2012 18:18, schrieb Joel M. Halpern:
>> Michael, I am not sure what to make of your recommended text about ECMP.
>> ECMP is used by almost all operators. It is generally considered a
>> necessary tool in the tool-kit.
>> More significantly, at least for the egress understanding of the
>> ingress, it is not even the single operator's ECMP, but other
>> operators' selections of paths that produce the issue. So even in the
>> unlikely event that this operator does not use ECMP, it still is not
>> sufficient.
>
> Then I had better leave the ECMP issue for others to answer.
>
> The definition of U is better corrected as follows (an improved
> rewording of my previous email):
>
> U represents the average ratio of PCN-supportable-rate to
> PCN-admissible-rate over all the links of the PCN-domain.
> ->
> U is a domain-wide constant which implicitly defines the
> PCN-supportable-rate as U*PCN-admissible-rate on all links of the
> PCN-domain.
>
> Best wishes,
>
> Michael
>
>
>>
>> Yours,
>> Joel
>>
>> On 1/2/2012 11:54 AM, Michael Menth wrote:
>>> Hi Tom, hi Joel,
>>>
>>> I wish you a happy new year!
>>>
>>> Here are my comments to address Joel's concerns:
>>>
>>> ====================================================================
>>>
>>> The issue with ECMP: I'd add a comment that CL and SM should not be used
>>> in the presence of ECMP if routing information is used to determine
>>> ingress-egress-aggregates, since this seems to be messy and error-prone.
>>>
>>> ====================================================================
>>>
>>> The following text, placed at the beginning of Section 3.3.2, may clarify
>>> the relation between admission control and flow termination and so address
>>> one of Joel's comments (for both SM and CL):
>>>
>>> In the presence of light pre-congestion, i.e., in the presence of a small,
>>> positive ETM-rate (relative to the overall PCN traffic rate), new flows may
>>> already be blocked. However, in the presence of heavy pre-congestion, i.e.,
>>> in the presence of a relatively large ETM-rate, termination of some admitted
>>> flows is required. Thus, flow blocking is a logical prerequisite for flow
>>> termination.
>>>
>>> ====================================================================
>>>
>>> The following sentence in 3.3.2 should be corrected (only SM-specific):
>>>
>>> U represents the average ratio of PCN-supportable-rate to PCN-admissible-rate
>>> over all the links of the PCN-domain.
>>>
>>> ->
>>>
>>> U represents the ratio of PCN-supportable-rate to PCN-admissible-rate for all
>>> the links of the PCN-domain.
>>>
>>> ====================================================================
>>>
>>> I also recommend changing the following text, as I think it may cause
>>> misinterpretation (applies to both SM and CL):
>>>
>>> If the difference calculated in the second step is positive, the Decision
>>> Point SHOULD select PCN-flows to terminate, until it determines that the
>>> PCN-traffic admission rate will no longer be greater than the estimated
>>> sustainable aggregate rate. If the Decision Point knows the bandwidth
>>> required by individual PCN-flows (e.g., from resource signalling used to
>>> establish the flows), it MAY choose to complete its selection of PCN-flows
>>> to terminate in a single round of decisions.
>>>
>>> Alternatively, the Decision Point MAY spread flow termination over multiple
>>> rounds to avoid over-termination. If this is done, it is RECOMMENDED that
>>> enough time elapse between successive rounds of termination to allow the
>>> effects of previous rounds to be reflected in the measurements upon which
>>> the termination decisions are based. (See [IEEE-Satoh] and sections 4.2 and
>>> 4.3 of [MeLe10].)
>>>
>>> ->
>>>
>>> If the difference calculated in the second step is positive (traffic rate
>>> to be terminated), the Decision Point SHOULD select PCN-flows to terminate.
>>> To that end, the Decision Point MAY use upper rate limits for individual
>>> PCN-flows (e.g., from resource signalling used to establish the flows) and
>>> select a set of flows whose sum of upper rate limits is up to the traffic
>>> rate to be terminated. Then, these flows are terminated. The use of upper
>>> limits on flow rates avoids over-termination.
>>>
>>> Termination may be needed repeatedly over consecutive measurement intervals
>>> for various reasons, e.g., if the upper rate limits used overestimate the
>>> actual flow rates. For such cases it is RECOMMENDED that enough time elapse
>>> between successive termination events to allow the effects of previous
>>> termination events to be reflected in the measurements upon which the
>>> termination decisions are based; otherwise, over-termination may occur. See
>>> [IEEE-Satoh] and Sections 4.2 and 4.3 of [MeLe10].
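
The selection step in the proposed text above can be sketched as follows,
assuming each flow's upper rate limit is known (for example from resource
signalling). The function name, the (flow_id, upper_rate_limit) tuples, and
the greedy first-fit order are assumptions made for illustration only; the
text does not prescribe which flows to pick.

```python
from typing import List, Tuple


def select_flows_to_terminate(
    flows: List[Tuple[str, float]], rate_to_terminate: float
) -> List[str]:
    """Pick flows whose upper rate limits sum to at most rate_to_terminate.

    flows: (flow_id, upper_rate_limit) pairs; a hypothetical representation.
    rate_to_terminate: the positive difference from the second step.
    """
    selected: List[str] = []
    total = 0.0
    for flow_id, upper_limit in flows:
        # Skip any flow that would push the sum past the target.
        if total + upper_limit <= rate_to_terminate:
            selected.append(flow_id)
            total += upper_limit
    return selected


flows = [("a", 4.0), ("b", 3.0), ("c", 5.0), ("d", 1.0)]
print(select_flows_to_terminate(flows, 8.0))
```

Because the selected limits never sum to more than the rate to be terminated,
and actual flow rates are at most their limits, this sketch cannot
over-terminate. Any shortfall is left to a later measurement interval.
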
>>>
>>> ====================================================================
>>>
>>> [IEEE-Satoh] is not a good key for Daisuke's work, as the prefix "IEEE"
>>> makes it look like a reference to a standards document.
>>> A better key would be [SaUe10] or [Satoh10]. This applies to both CL and SM.
>>>
>>>
>>>
>>> Best wishes,
>>>
>>> Michael
>>>
>>>
>>> Am 02.01.2012 15:21, schrieb Tom Taylor:
>>>> It shall be as you say, subject to comment from my co-authors when
>>>> they get back from holiday.
>>>>
>>>> On 01/01/2012 5:43 PM, Joel M. Halpern wrote:
>>>>> In-line...
>>>>>
>>>>> On 1/1/2012 4:06 PM, Tom Taylor wrote:
>>>>>>
>>>>>>
>>>>>> On 01/01/2012 2:58 PM, Joel M. Halpern wrote:
>>>>>>> Thank you for responding promptly, Tom. Let me try to elaborate on the
>>>>>>> two issues where I was unclear.
>>>>>>>
>>>>>>> On the ingress-egress-aggregate issue and ECMP, the concern I have is
>>>>>>> relative to the third operational alternative, where routing is used to
>>>>>>> determine where the ingress and egress of a flow are. To be blunt, as
>>>>>>> far as I can tell this does not work.
>>>>>>> 1) It does not work on the ingress side because traffic from a given
>>>>>>> source prefix can come in at multiple places. Some of these places may
>>>>>>> claim reachability to the source prefix. Some may not. While a given
>>>>>>> flow will use only one of these paths, there is no way to determine
>>>>>>> from routing information, at the egress, which ingress that flow used.
>>>>>>> 2) A site may use multiple exits for a given destination prefix. Again,
>>>>>>> while the site will only use one of these egresses for a given flow,
>>>>>>> there is no way for the ingress to know which egress it will be on the
>>>>>>> basis of routing information.
>>>>>>> Thus, the text seems to allow for a behavior that simply does not work.
>>>>>>
>>>>>> [PTT] I think the disconnect here is that you read the text to say that
>>>>>> an individual node uses routing information to determine the IEA. That
>>>>>> wasn't the intention. Instead, administrators use routing information to
>>>>>> derive filters that are installed at the ingress and egress nodes.
>>>>>
>>>>> As far as I can tell, your response describes a situation even less
>>>>> effective than what I assumed.
>>>>> Firstly, it does not matter whether it is the edge node, the decision
>>>>> node, or the human administrator. Routing information is not enough to
>>>>> determine what the ingress-egress pairing is. The problems I describe
>>>>> above apply no matter who is making the decision.
>>>>> Secondly, having a human make the decision means that as soon as routing
>>>>> changes, the configured filters are wrong.
>>>>>
>>>>> I would suggest that the text in question be removed, and replaced with
>>>>> a warning against attempting what is currently described.
>>>>>
>>> My view is also that CL and SM do not work in the presence of ECMP. This
>>> should be indicated as a warning.
>>>
>>>>>>>
>>>>>>> I am still confused about the relationship of section 3.3.2 to the
>>>>>>> behavior you describe. 3.3.2 says that as long as any excess traffic is
>>>>>>> being reported, the decision point shall direct the blocking of
>>>>>>> additional flows. That does not match 3.3.1, and does not match your
>>>>>>> description.
>>>>>>
>>>>>> [PTT] I can't see the text in section 3.3.2 that says you continue to
>>>>>> block as long as any excess traffic is being reported. What I think it
>>>>>> says is that as long as excess traffic is reported, the decision point
>>>>>> checks to see whether the traffic being admitted to the aggregate
>>>>>> exceeds the supportable level. Excess traffic may be non-zero, yet no
>>>>>> termination may be required (i.e., traffic is below the second
>>>>>> threshold).
>>>>>
>>>>> I think I see what you are saying. If I am reading this correctly, the
>>>>> decision process must re-calculate to determine if there is termination
>>>>> every time it receives a report with non-zero excess and the port is
>>>>> already blocked. But it does not have to actually block anything.
>>>>> This however seems to depend upon the correct relative configuration of
>>>>> the limit that flips it into blocked state, the value of U, and maybe
>>>>> some other values.
>>>>> Put differently, I understand that the two are not contradictory.
>>>>> However, since the two things use different calculations, it is not at
>>>>> all clear that they are consistent. This may well be acceptable. But the
>>>>> difference in methods is likely to lead to confusion. So, as a minor
>>>>> (rather than major) comment, I would suggest that you provide clarifying
>>>>> text explaining why it is okay to use one condition to decide if there
>>>>> is blocking, but a different condition (which could produce a lower
>>>>> threshold) to decide how much to get rid of.
>>>>>
>>>>> Yours,
>>>>> Joel
>>>>>
>>>>>>>
>>>>>>> Yours,
>>>>>>> Joel
>>>>>>>
>>>>>>> On 1/1/2012 2:48 PM, Tom Taylor wrote:
>>>>>>>> Thanks for the review, Joel. Comments below, marked with [PTT].
>>>>>>>>
>>>>>>>> On 31/12/2011 4:50 PM, Joel M. Halpern wrote:
>>>>>>>>> I am the assigned Gen-ART reviewer for this draft. For background on
>>>>>>>>> Gen-ART, please see the FAQ at
>>>>>>>>> <http://wiki.tools.ietf.org/area/gen/trac/wiki/GenArtfaq>.
>>>>>>>>>
>>>>>>>>> Please resolve these comments along with any other Last Call comments
>>>>>>>>> you may receive.
>>>>>>>>>
>>>>>>>>> Document: draft-ietf-pcn-sm-edge-behaviour-08
>>>>>>>>> PCN Boundary Node Behaviour for the Single Marking (SM) Mode of
>>>>>>>>> Operation
>>>>>>>>> Reviewer: Joel M. Halpern
>>>>>>>>> Review Date: 31-Dec-2011
>>>>>>>>> IETF LC End Date: 13-Jan-2012
>>>>>>>>> IESG Telechat date: N/A
>>>>>>>>>
>>>>>>>>> Summary: This document is almost ready for publication as an
>>>>>>>>> Informational RFC.
>>>>>>>>>
>>>>>>>>> Question: Given that the document defines a complex set of behaviors,
>>>>>>>>> which are mandatory for compliant systems, it seems that this ought to
>>>>>>>>> be Experimental rather than Informational. It describes something that
>>>>>>>>> could, in theory, later become standards track.
>>>>>>>>
>>>>>>>> [PTT] OK, we've wobbled on this one, but we can follow your suggestion.
>>>>>>>>>
>>>>>>>>> Major issues:
>>>>>>>>> Section 2 on Assumed Core Network Behavior for SM, in the third bullet,
>>>>>>>>> states that the PCN-domain satisfies the conditions specified in RFC
>>>>>>>>> 5696. Unfortunately, looking at RFC 5696 I cannot tell what conditions
>>>>>>>>> these are. Is this supposed to be a reference to RFC 5559 instead? No
>>>>>>>>> matter which document it is referencing, please be more specific about
>>>>>>>>> which section / conditions are meant.
>>>>>>>>
>>>>>>>> [PTT] You are right that RFC 5696 isn't relevant. It's such a long time
>>>>>>>> since that text was written that I can't recall what the intention was.
>>>>>>>> My inclination at the moment is simply to delete the bullet.
>>>>>>>>>
>>>>>>>>> It would have been helpful if the early part of the document indicated
>>>>>>>>> that the edge node information about how to determine
>>>>>>>>> ingress-egress-aggregates was described in section 5.
>>>>>>>>> In conjunction with that, section 5.1.2, third paragraph, seems to
>>>>>>>>> describe an option which does not seem to quite work. After describing
>>>>>>>>> how to use tunneling, and how to work with signaling, the text refers to
>>>>>>>>> inferring the ingress-egress-aggregate from the routing information. In
>>>>>>>>> the presence of multiple equal-cost domain exits (which does occur in
>>>>>>>>> reality), the routing table is not sufficient information to make this
>>>>>>>>> determination. Unless I am very confused (which does happen) this seems
>>>>>>>>> to be a serious hole in the specification.
>>>>>>>>
>>>>>>>> [PTT] I'm not sure what the issue is here. As I understand it, operators
>>>>>>>> don't assign packets randomly to a given path in the presence of
>>>>>>>> alternatives -- they choose one based on values in the packet header.
>>>>>>>> The basic intent is that packets of a given microflow all follow the
>>>>>>>> same path, to prevent unnecessary reordering and minimize jitter. The
>>>>>>>> implication is that filters can be defined at the ingress nodes to
>>>>>>>> identify the packets in a given ingress-egress-aggregate (i.e. flowing
>>>>>>>> from a specific ingress node to a specific egress node) based on their
>>>>>>>> header contents. The filters to do the same job at egress nodes are a
>>>>>>>> different problem, but they are not affected by ECMP.
>>>>>>>>>
>>>>>>>>> Minor issues:
>>>>>>>>> Section 3.3.1 states that the "block" decision occurs when the CLE
>>>>>>>>> (excess over total) rate exceeds the configured limit. However, section
>>>>>>>>> 3.3.2 states that the decision node must take further steps if the
>>>>>>>>> excess rate is non-zero in further reports. Is this inconsistency
>>>>>>>>> deliberate? If so, please explain. If not, please fix. (If it is
>>>>>>>>> important to drive the excess rate to 0, then why is action only
>>>>>>>>> initiated when the ratio is above a configured value, rather than any
>>>>>>>>> non-zero value? I can conceive of various reasons. But none are
>>>>>>>>> stated.)
>>>>>>>>
>>>>>>>> [PTT] We aren't driving the excess rate to zero, but to a value equal to
>>>>>>>> something less than (U - 1)/U. (The "something less" is because of
>>>>>>>> packet dropping at interior nodes.) The assumption is that (U - 1)/U is
>>>>>>>> greater than CLE-limit. Conceptually, PCN uses two thresholds. When the
>>>>>>>> CLE is below the first threshold, new flows are admitted. Above that
>>>>>>>> threshold, they are blocked. When the CLE is above the second threshold,
>>>>>>>> flows are terminated to bring the CLE down to that threshold. In the SM
>>>>>>>> mode of operation, the first threshold is specified directly on a
>>>>>>>> per-link basis by the value CLE-limit. The second threshold is specified
>>>>>>>> by the same value (U - 1)/U for all links. With the CL mode of operation
>>>>>>>> the second threshold is also specified directly for each link.
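
The two thresholds described above can be sketched as follows. This is a
minimal illustration, not text from the draft: it assumes the CLE is the
excess-marked rate divided by the total PCN rate for the aggregate, that U is
greater than 1, and that CLE-limit is below (U - 1)/U. The function and
parameter names are hypothetical, and the effect of packet dropping at
interior nodes is ignored.

```python
def pcn_sm_decision(excess_rate: float, total_rate: float,
                    cle_limit: float, u: float) -> str:
    """Classify an aggregate as 'admit', 'block', or 'terminate'."""
    if u <= 1.0:
        # Assumption: supportable rate exceeds admissible rate, so U > 1.
        raise ValueError("U is assumed to be greater than 1")
    # CLE: share of the aggregate's PCN traffic that is excess-marked.
    cle = excess_rate / total_rate if total_rate > 0 else 0.0
    # Second threshold: the same value (U - 1)/U for all links in SM mode.
    termination_threshold = (u - 1.0) / u
    if cle > termination_threshold:
        return "terminate"  # new flows are blocked in this state as well
    if cle > cle_limit:
        return "block"
    return "admit"


# Example with U = 1.25, so the termination threshold is 0.2.
for excess in (0.0, 10.0, 30.0):
    print(excess, pcn_sm_decision(excess, 100.0, cle_limit=0.05, u=1.25))
```

With these example values, a CLE between CLE-limit and (U - 1)/U blocks new
flows without terminating any, which is the case where excess traffic is
non-zero yet no termination is required.
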
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Nits/editorial comments:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>
>