Re: [Gen-art] Review: draft-ietf-pcn-sm-edge-behaviour-08

Russ Housley <housley@vigilsec.com> Mon, 12 March 2012 22:35 UTC

From: Russ Housley <housley@vigilsec.com>
In-Reply-To: <4F01F1AD.8000806@joelhalpern.com>
Date: Mon, 12 Mar 2012 14:28:42 -0400
Content-Transfer-Encoding: quoted-printable
Message-Id: <841FA158-3616-4C9A-ACF4-D8C7CC2A47D3@vigilsec.com>
References: <CAHBDyN6PN-vp9wXo6fF8G4VfODXjkfbWBaJN8EPopeWfOg9PmQ@mail.gmail.com> <4EFF838D.5020704@joelhalpern.com> <BLU0-SMTP18EE1E01EAA97CC44A44FFD8900@phx.gbl> <4F00BAFD.2070201@joelhalpern.com> <4F00CAE1.60103@gmail.com> <4F00E181.7020605@joelhalpern.com> <4F01BD58.1080303@gmail.com> <4F01E15D.6080601@informatik.uni-tuebingen.de> <4F01E6F5.5080701@joelhalpern.com> <4F01F054.2050301@informatik.uni-tuebingen.de> <4F01F1AD.8000806@joelhalpern.com>
To: Joel M. Halpern <jmh@joelhalpern.com>
X-Mailer: Apple Mail (2.1084)
Cc: draft-ietf-pcn-sm-edge-behaviour@tools.ietf.org, Steven Blake <slblake@petri-meat.com>, Michael Menth <menth@informatik.uni-tuebingen.de>, gen-art@ietf.org, Bob Briscoe <bob.briscoe@bt.com>, Tom Taylor <tom.taylor.stds@gmail.com>, David Harrington <ietfdbh@comcast.net>
Subject: Re: [Gen-art] Review: draft-ietf-pcn-sm-edge-behaviour-08

I am very confused about the state of this.  My skimming of the thread seems to indicate at least one unresolved issue.

Russ

 
On Jan 2, 2012, at 1:04 PM, Joel M. Halpern wrote:

> The clarification on U is very helpful.  I look forward to comments from others on the routing based behavior / ECMP text removal / replacement question.
> 
> On 1/2/2012 12:58 PM, Michael Menth wrote:
>> Hi Joel, hi Tom,
>> 
>> On 02.01.2012 18:18, Joel M. Halpern wrote:
>>> Michael, I am not sure what to make of your recommended text about ECMP.
>>> ECMP is used by almost all operators. It is generally considered a
>>> necessary tool in the tool-kit.
>>> More significantly, at least for the egress understanding of the
>>> ingress, it is not even the single operator's ECMP, but other
>>> operators' selections of paths that produce the issue. So even in the
>>> unlikely event that this operator does not use ECMP, it still is not
>>> sufficient.
>> 
>> Then I had better leave the ECMP issue for others to answer.
>> 
>> The definition of U can be better corrected as follows (improved
>> rewording of my previous email):
>> 
>> U represents the average ratio of PCN-supportable-rate to
>> PCN-admissible-rate over all the links of the PCN-domain.
>> ->
>> U is a domain-wide constant which implicitly defines the
>> PCN-supportable-rate by U*PCN-admissible-rate on all links of the PCN
>> domain.
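
A minimal sketch of the corrected definition above: U is one domain-wide constant, and each link's PCN-supportable-rate follows from its own PCN-admissible-rate. The link names, the rates, and the value of U are hypothetical, and the check that U is at least 1 is an assumption made here, not something stated in the thread.

```python
def supportable_rate(u: float, admissible_rate: float) -> float:
    """PCN-supportable-rate implied by the domain-wide constant U."""
    if u < 1.0:
        # Assumption: the supportable rate is never below the admissible rate.
        raise ValueError("U is assumed to be at least 1")
    return u * admissible_rate


# Hypothetical per-link PCN-admissible-rates (any consistent rate unit).
admissible = {"link-a": 100.0, "link-b": 40.0}
U = 1.25  # hypothetical domain-wide constant

# Same U on every link; only the admissible rate differs per link.
supportable = {link: supportable_rate(U, rate) for link, rate in admissible.items()}
```

Because U is shared, the ratio of the two rates is identical on every link, which is what distinguishes the corrected wording from the "average ratio" wording.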
>> 
>> Best wishes,
>> 
>> Michael
>> 
>> 
>>> 
>>> Yours,
>>> Joel
>>> 
>>> On 1/2/2012 11:54 AM, Michael Menth wrote:
>>>> Hi Tom, hi Joel,
>>>> 
>>>> I wish you a happy new year!
>>>> 
>>>> Here are my comments to address Joel's concerns:
>>>> 
>>>> ====================================================================
>>>> 
>>>> The issue with ECMP: I'd add a comment that CL and SM should not be used
>>>> in the presence of ECMP if routing information is used to determine
>>>> ingress-egress-aggregates since this seems to be messy and error-prone.
>>>> 
>>>> ====================================================================
>>>> 
>>>> The following text at the beginning of Section 3.3.2 may clarify the
>>>> relation between admission control and flow termination, to address
>>>> one of Joel's comments (for both SM and CL):
>>>> 
>>>> In the presence of light pre-congestion, i.e., in the presence of a
>>>> small, positive ETM-rate (relative to the overall PCN traffic rate),
>>>> new flows may already be blocked. However, in the presence of heavy
>>>> pre-congestion, i.e., in the presence of a relatively large ETM-rate,
>>>> termination of some admitted flows is required. Thus, flow blocking is
>>>> a logical prerequisite for flow termination.
>>>> 
>>>> ====================================================================
>>>> 
>>>> The following sentence in 3.3.2 should be corrected (only SM-specific):
>>>> 
>>>> U represents the average ratio of PCN-supportable-rate to
>>>> PCN-admissible-rate over all the links of the PCN-domain.
>>>> 
>>>> ->
>>>> 
>>>> U represents the ratio of PCN-supportable-rate to PCN-admissible-rate
>>>> for all the links of the PCN-domain.
>>>> 
>>>> ====================================================================
>>>> 
>>>> I also recommend changing the following text, as I think it may cause
>>>> misinterpretations (applies both to SM and CL):
>>>> 
>>>> If the difference calculated in the second step is positive, the
>>>> Decision Point SHOULD select PCN-flows to terminate, until it
>>>> determines that the PCN-traffic admission rate will no longer be
>>>> greater than the estimated sustainable aggregate rate. If the Decision
>>>> Point knows the bandwidth required by individual PCN-flows (e.g., from
>>>> resource signalling used to establish the flows), it MAY choose to
>>>> complete its selection of PCN-flows to terminate in a single round of
>>>> decisions.
>>>> 
>>>> Alternatively, the Decision Point MAY spread flow termination over
>>>> multiple rounds to avoid over-termination. If this is done, it is
>>>> RECOMMENDED that enough time elapse between successive rounds of
>>>> termination to allow the effects of previous rounds to be reflected in
>>>> the measurements upon which the termination decisions are based. (See
>>>> [IEEE-Satoh] and sections 4.2 and 4.3 of [MeLe10].)
>>>> 
>>>> ->
>>>> 
>>>> If the difference calculated in the second step is positive (traffic
>>>> rate to be terminated), the Decision Point SHOULD select PCN-flows to
>>>> terminate. To that end, the Decision Point MAY use upper rate limits
>>>> for individual PCN-flows (e.g., from resource signalling used to
>>>> establish the flows) and select a set of flows whose sum of upper rate
>>>> limits is up to the traffic rate to be terminated. Then, these flows
>>>> are terminated. The use of upper limits on flow rates avoids
>>>> over-termination.
>>>> 
>>>> Termination may be continuously needed after consecutive measurement
>>>> intervals for various reasons, e.g., if the used upper rate limits
>>>> overestimate the actual flow rates. For such cases it is RECOMMENDED
>>>> that enough time elapses between successive termination events to
>>>> allow the effects of previous termination events to be reflected in
>>>> the measurements upon which the termination decisions are based;
>>>> otherwise, over-termination may occur. See [IEEE-Satoh] and Sections
>>>> 4.2 and 4.3 of [MeLe10].
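
The selection step in the proposed text can be sketched as follows. This is only one possible reading: the proposal says to pick a set of flows whose upper rate limits sum to at most the rate to be terminated, but it does not say how to pick them. The largest-first greedy order, the function name, and the flow identifiers are assumptions made for illustration.

```python
from typing import Dict, List


def select_flows_to_terminate(upper_limits: Dict[str, float],
                              rate_to_terminate: float) -> List[str]:
    """Pick flows whose summed upper rate limits do not exceed the target.

    Assumption: flows are considered largest-first; the proposed text does
    not prescribe any selection order.
    """
    selected: List[str] = []
    remaining = rate_to_terminate
    by_limit = sorted(upper_limits.items(), key=lambda kv: kv[1], reverse=True)
    for flow_id, limit in by_limit:
        if limit <= remaining:
            # Taking this flow keeps the running sum within the target,
            # so the selection cannot over-terminate.
            selected.append(flow_id)
            remaining -= limit
    return selected


# Hypothetical flows with upper rate limits taken from resource signalling.
flows = {"f1": 5.0, "f2": 3.0, "f3": 2.0, "f4": 4.0}
chosen = select_flows_to_terminate(flows, rate_to_terminate=8.0)
```

Since the upper limits can overestimate the actual flow rates, the traffic actually removed may fall short of the target, which is why the second paragraph of the proposal anticipates further termination events after later measurement intervals.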
>>>> 
>>>> ====================================================================
>>>> 
>>>> [IEEE-Satoh] is not a good key for Daisuke's work as the prefix "IEEE"
>>>> makes it look like a reference to a standards document.
>>>> It would be better to use [SaUe10] or [Satoh10]. Applies both to CL and SM.
>>>> 
>>>> 
>>>> 
>>>> Best wishes,
>>>> 
>>>> Michael
>>>> 
>>>> 
>>>> On 02.01.2012 15:21, Tom Taylor wrote:
>>>>> It shall be as you say, subject to comment from my co-authors when
>>>>> they get back from holiday.
>>>>> 
>>>>> On 01/01/2012 5:43 PM, Joel M. Halpern wrote:
>>>>>> In-line...
>>>>>> 
>>>>>> On 1/1/2012 4:06 PM, Tom Taylor wrote:
>>>>>>> 
>>>>>>> 
>>>>>>> On 01/01/2012 2:58 PM, Joel M. Halpern wrote:
>>>>>>>> Thank you for responding promptly, Tom. Let me try to elaborate on
>>>>>>>> the two issues where I was unclear.
>>>>>>>> 
>>>>>>>> On the ingress-egress-aggregate issue and ECMP, the concern I have
>>>>>>>> is relative to the third operational alternative where routing is
>>>>>>>> used to determine where the ingress and egress of a flow are. To be
>>>>>>>> blunt, as far as I can tell this does not work.
>>>>>>>> 1) It does not work on the ingress side because traffic from a given
>>>>>>>> source prefix can come in at multiple places. Some of these places
>>>>>>>> may claim reachability to the source prefix. Some may not. While a
>>>>>>>> given flow will use only one of these paths, there is no way to
>>>>>>>> determine from routing information, at the egress, which ingress
>>>>>>>> that flow used.
>>>>>>>> 2) A site may use multiple exits for a given destination prefix.
>>>>>>>> Again, while the site will only use one of these egresses for a
>>>>>>>> given flow, there is no way for the ingress to know which egress it
>>>>>>>> will be on the basis of routing information.
>>>>>>>> Thus, the text seems to allow for a behavior that simply does not
>>>>>>>> work.
>>>>>>> 
>>>>>>> [PTT] I think the disconnect here is that you read the text to say
>>>>>>> that an individual node uses routing information to determine the
>>>>>>> IEA. That wasn't the intention. Instead, administrators use routing
>>>>>>> information to derive filters that are installed at the ingress and
>>>>>>> egress nodes.
>>>>>> 
>>>>>> As far as I can tell, your response describes a situation even less
>>>>>> effective than what I assumed.
>>>>>> Firstly, it does not matter whether it is the edge node, the decision
>>>>>> node, or the human administrator. Routing information is not enough to
>>>>>> determine what the ingress-egress pairing is. The problems I describe
>>>>>> above apply no matter who is making the decision.
>>>>>> Secondly, having a human make the decision means that as soon as
>>>>>> routing changes, the configured filters are wrong.
>>>>>> 
>>>>>> I would suggest that the text in question be removed, and replaced
>>>>>> with a warning against attempting what is currently described.
>>>>>> 
>>>> My view is also that CL and SM do not work in the presence of ECMP. This
>>>> should be indicated as a warning.
>>>> 
>>>>>>>> 
>>>>>>>> I am still confused about the relationship of section 3.3.2 to the
>>>>>>>> behavior you describe. 3.3.2 says that as long as any excess traffic
>>>>>>>> is being reported, the decision point shall direct the blocking of
>>>>>>>> additional flows. That does not match 3.3.1, and does not match your
>>>>>>>> description.
>>>>>>> 
>>>>>>> [PTT] I can't see the text in section 3.3.2 that says you continue to
>>>>>>> block as long as any excess traffic is being reported. What I think
>>>>>>> it says is that as long as excess traffic is reported, the decision
>>>>>>> point checks to see whether the traffic being admitted to the
>>>>>>> aggregate exceeds the supportable level. Excess traffic may be
>>>>>>> non-zero, yet no termination may be required (i.e., traffic is below
>>>>>>> the second threshold).
>>>>>> 
>>>>>> I think I see what you are saying. If I am reading this correctly, the
>>>>>> decision process must re-calculate to determine if there is
>>>>>> termination every time it receives a report with non-zero excess and
>>>>>> the port is already blocked. But it does not have to actually block
>>>>>> anything.
>>>>>> This however seems to depend upon the correct relative configuration
>>>>>> of the limit that flips it into blocked state, the value of U, and
>>>>>> maybe some other values.
>>>>>> Put differently, I understand that the two are not contradictory.
>>>>>> However, since the two things use different calculations, it is not at
>>>>>> all clear that they are consistent. This may well be acceptable. But
>>>>>> the difference in methods is likely to lead to confusion. So, as a
>>>>>> minor (rather than major) comment, I would suggest that you provide
>>>>>> clarifying text explaining why it is okay to use one condition to
>>>>>> decide if there is blocking, but a different condition (which could
>>>>>> produce a lower threshold) to decide how much to get rid of.
>>>>>> 
>>>>>> Yours,
>>>>>> Joel
>>>>>> 
>>>>>>>> 
>>>>>>>> Yours,
>>>>>>>> Joel
>>>>>>>> 
>>>>>>>> On 1/1/2012 2:48 PM, Tom Taylor wrote:
>>>>>>>>> Thanks for the review, Joel. Comments below, marked with [PTT].
>>>>>>>>> 
>>>>>>>>> On 31/12/2011 4:50 PM, Joel M. Halpern wrote:
>>>>>>>>>> I am the assigned Gen-ART reviewer for this draft. For background
>>>>>>>>>> on Gen-ART, please see the FAQ at
>>>>>>>>>> <http://wiki.tools.ietf.org/area/gen/trac/wiki/GenArtfaq>.
>>>>>>>>>> 
>>>>>>>>>> Please resolve these comments along with any other Last Call
>>>>>>>>>> comments you may receive.
>>>>>>>>>> 
>>>>>>>>>> Document: draft-ietf-pcn-sm-edge-behaviour-08
>>>>>>>>>> PCN Boundary Node Behaviour for the Single Marking (SM) Mode of
>>>>>>>>>> Operation
>>>>>>>>>> Reviewer: Joel M. Halpern
>>>>>>>>>> Review Date: 31-Dec-2011
>>>>>>>>>> IETF LC End Date: 13-Jan-2012
>>>>>>>>>> IESG Telechat date: N/A
>>>>>>>>>> 
>>>>>>>>>> Summary: This document is almost ready for publication as an
>>>>>>>>>> Informational RFC.
>>>>>>>>>> 
>>>>>>>>>> Question: Given that the document defines a complex set of
>>>>>>>>>> behaviors, which are mandatory for compliant systems, it seems that
>>>>>>>>>> this ought to be Experimental rather than Informational. It
>>>>>>>>>> describes something that could, in theory, later become standards
>>>>>>>>>> track.
>>>>>>>>> 
>>>>>>>>> [PTT] OK, we've wobbled on this one, but we can follow your
>>>>>>>>> suggestion.
>>>>>>>>>> 
>>>>>>>>>> Major issues:
>>>>>>>>>> Section 2 on Assumed Core Network Behavior for SM, in the third
>>>>>>>>>> bullet, states that the PCN-domain satisfies the conditions
>>>>>>>>>> specified in RFC 5696. Unfortunately, looking at RFC 5696 I cannot
>>>>>>>>>> tell what conditions these are. Is this supposed to be a reference
>>>>>>>>>> to RFC 5559 instead? No matter which document it is referencing,
>>>>>>>>>> please be more specific about which section / conditions are meant.
>>>>>>>>> 
>>>>>>>>> [PTT] You are right that RFC 5696 isn't relevant. It's such a long
>>>>>>>>> time since that text was written that I can't recall what the
>>>>>>>>> intention was. My inclination at the moment is simply to delete the
>>>>>>>>> bullet.
>>>>>>>>>> 
>>>>>>>>>> It would have been helpful if the early part of the document
>>>>>>>>>> indicated that the edge node information about how to determine
>>>>>>>>>> ingress-egress-aggregates was described in section 5.
>>>>>>>>>> In conjunction with that, section 5.1.2, third paragraph, seems to
>>>>>>>>>> describe an option which does not seem to quite work. After
>>>>>>>>>> describing how to use tunneling, and how to work with signaling,
>>>>>>>>>> the text refers to inferring the ingress-egress-aggregate from the
>>>>>>>>>> routing information. In the presence of multiple equal-cost domain
>>>>>>>>>> exits (which does occur in reality), the routing table is not
>>>>>>>>>> sufficient information to make this determination. Unless I am very
>>>>>>>>>> confused (which does happen) this seems to be a serious hole in the
>>>>>>>>>> specification.
>>>>>>>>> 
>>>>>>>>> [PTT] I'm not sure what the issue is here. As I understand it,
>>>>>>>>> operators don't assign packets randomly to a given path in the
>>>>>>>>> presence of alternatives -- they choose one based on values in the
>>>>>>>>> packet header. The basic intent is that packets of a given microflow
>>>>>>>>> all follow the same path, to prevent unnecessary reordering and
>>>>>>>>> minimize jitter. The implication is that filters can be defined at
>>>>>>>>> the ingress nodes to identify the packets in a given
>>>>>>>>> ingress-egress-aggregate (i.e. flowing from a specific ingress node
>>>>>>>>> to a specific egress node) based on their header contents. The
>>>>>>>>> filters to do the same job at egress nodes are a different problem,
>>>>>>>>> but they are not affected by ECMP.
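
A rough illustration of the header-based path selection described above, and of why the routing table by itself does not name the egress. The hash function (SHA-256), the per-node seed, the 5-tuple layout, and the exit names are all assumptions made for this sketch; real routers use their own, typically vendor-specific, hash inputs and functions.

```python
import hashlib
from typing import List, Tuple

# (src address, dst address, protocol, src port, dst port); layout is assumed.
FiveTuple = Tuple[str, str, int, int, int]


def ecmp_exit(flow: FiveTuple, exits: List[str], seed: bytes = b"") -> str:
    """Pick one of several equal-cost exits from a hash of header fields."""
    key = seed + repr(flow).encode("utf-8")
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:4], "big") % len(exits)
    return exits[index]


# The routing table only says that both exits are equally good for the prefix.
equal_cost_exits = ["egress-1", "egress-2"]

flow = ("192.0.2.10", "198.51.100.7", 17, 5004, 5004)

# All packets of one microflow hash to the same exit, so there is no reordering.
first_packet_exit = ecmp_exit(flow, equal_cost_exits)
later_packet_exit = ecmp_exit(flow, equal_cost_exits)

# Different microflows towards the same prefix may be spread over the exits.
exits_used = {
    ecmp_exit(("192.0.2.10", "198.51.100.7", 17, port, 5004), equal_cost_exits)
    for port in range(1000, 1100)
}
```

The sketch shows both sides of the exchange: a given microflow stays on one path, yet which path that is depends on the hash inputs and the node doing the hashing, not on the list of equal-cost exits that routing provides.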
>>>>>>>>>> 
>>>>>>>>>> Minor issues:
>>>>>>>>>> Section 3.3.1 states that the "block" decision occurs when the CLE
>>>>>>>>>> (excess over total) rate exceeds the configured limit. However,
>>>>>>>>>> section 3.3.2 states that the decision node must take further steps
>>>>>>>>>> if the excess rate is non-zero in further reports. Is this
>>>>>>>>>> inconsistency deliberate? If so, please explain. If not, please
>>>>>>>>>> fix. (If it is important to drive the excess rate to 0, then why is
>>>>>>>>>> action only initiated when the ratio is above a configured value,
>>>>>>>>>> rather than any non-zero value? I can conceive of various reasons.
>>>>>>>>>> But none are stated.)
>>>>>>>>> 
>>>>>>>>> [PTT] We aren't driving the excess rate to zero, but to a value
>>>>>>>>> equal to something less than (U - 1)/U. (The "something less" is
>>>>>>>>> because of packet dropping at interior nodes.) The assumption is
>>>>>>>>> that (U - 1)/U is greater than CLE-limit. Conceptually, PCN uses two
>>>>>>>>> thresholds. When the CLE is below the first threshold, new flows are
>>>>>>>>> admitted. Above that threshold, they are blocked. When the CLE is
>>>>>>>>> above the second threshold, flows are terminated to bring it back
>>>>>>>>> down to that threshold. In the SM mode of operation, the first
>>>>>>>>> threshold is specified directly on a per-link basis by the value
>>>>>>>>> CLE-limit. The second threshold is specified by the same value
>>>>>>>>> (U - 1)/U for all links. With the CL mode of operation the second
>>>>>>>>> threshold is also specified directly for each link.
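
The two-threshold description above can be sketched as a small decision function. It assumes, as the reply does, that (U - 1)/U is greater than CLE-limit, and it ignores the "something less" correction for packet dropping at interior nodes. The function and enum names and the numeric values are hypothetical.

```python
from enum import Enum


class Action(Enum):
    ADMIT = "admit new flows"
    BLOCK = "block new flows"
    TERMINATE = "block new flows and terminate some admitted flows"


def congestion_level_estimate(excess_rate: float, total_rate: float) -> float:
    """CLE as the ratio of excess-marked traffic to total PCN traffic."""
    return excess_rate / total_rate if total_rate > 0 else 0.0


def sm_decision(cle: float, cle_limit: float, u: float) -> Action:
    """Classify a CLE value against the two conceptual thresholds."""
    # Second threshold in SM mode: the same value (U - 1)/U on all links.
    termination_threshold = (u - 1.0) / u
    if cle > termination_threshold:
        return Action.TERMINATE
    if cle > cle_limit:
        return Action.BLOCK
    return Action.ADMIT


# Hypothetical configuration: CLE-limit = 0.05 and U = 1.25, so (U - 1)/U = 0.2.
light = sm_decision(congestion_level_estimate(1.0, 100.0), 0.05, 1.25)
medium = sm_decision(congestion_level_estimate(10.0, 100.0), 0.05, 1.25)
heavy = sm_decision(congestion_level_estimate(30.0, 100.0), 0.05, 1.25)
```

With these numbers, a CLE between 0.05 and 0.2 is non-zero and blocks new flows but requires no termination, which is the case the reviewer asked about.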
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> Nits/editorial comments:
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>> 
>> 