Re: [Gen-art] Review: draft-ietf-pcn-sm-edge-behaviour-08
Russ Housley <housley@vigilsec.com> Mon, 12 March 2012 22:35 UTC
From: Russ Housley <housley@vigilsec.com>
In-Reply-To: <4F01F1AD.8000806@joelhalpern.com>
Date: Mon, 12 Mar 2012 14:28:42 -0400
To: "Joel M. Halpern" <jmh@joelhalpern.com>
Cc: draft-ietf-pcn-sm-edge-behaviour@tools.ietf.org, Steven Blake <slblake@petri-meat.com>, Michael Menth <menth@informatik.uni-tuebingen.de>, gen-art@ietf.org, Bob Briscoe <bob.briscoe@bt.com>, Tom Taylor <tom.taylor.stds@gmail.com>, David Harrington <ietfdbh@comcast.net>
Subject: Re: [Gen-art] Review: draft-ietf-pcn-sm-edge-behaviour-08
I am very confused about the state of this. My skimming of the thread
seems to indicate at least one unresolved issue.

Russ


On Jan 2, 2012, at 1:04 PM, Joel M. Halpern wrote:

> The clarification on U is very helpful. I look forward to comments
> from others on the routing based behavior / ECMP text removal /
> replacement question.
>
> On 1/2/2012 12:58 PM, Michael Menth wrote:
>> Hi Joel, hi Tom,
>>
>> On 02.01.2012 18:18, Joel M. Halpern wrote:
>>> Michael, I am not sure what to make of your recommended text about
>>> ECMP. ECMP is used by almost all operators. It is generally
>>> considered a necessary tool in the tool-kit.
>>> More significantly, at least for the egress understanding of the
>>> ingress, it is not even the single operator's ECMP, but other
>>> operators' selections of paths that produce the issue. So even in
>>> the unlikely event that this operator does not use ECMP, it still
>>> is not sufficient.
>>
>> Then I had better leave the ECMP issue for others to answer.
>>
>> The definition of U can be better corrected as follows (improved
>> rewording of my previous email):
>>
>> U represents the average ratio of PCN-supportable-rate to
>> PCN-admissible-rate over all the links of the PCN-domain.
>> ->
>> U is a domain-wide constant which implicitly defines the
>> PCN-supportable-rate by U*PCN-admissible-rate on all links of the PCN
>> domain.
>>
>> Best wishes,
>>
>> Michael
>>
>>
>>>
>>> Yours,
>>> Joel
>>>
>>> On 1/2/2012 11:54 AM, Michael Menth wrote:
>>>> Hi Tom, hi Joel,
>>>>
>>>> I wish you a happy new year!
>>>>
>>>> Here are my comments to address Joel's concerns:
>>>>
>>>> ====================================================================
>>>>
>>>> The issue with ECMP: I'd add a comment that CL and SM should not be
>>>> used in the presence of ECMP if routing information is used to
>>>> determine ingress-egress-aggregates since this seems to be messy
>>>> and error-prone.
>>>>
>>>> ====================================================================
>>>>
>>>> The following text, placed at the beginning of Section 3.3.2, may
>>>> clarify the relation between admission control and flow termination
>>>> to address one of Joel's comments (for both SM and CL):
>>>>
>>>> In the presence of light pre-congestion, i.e., in the presence of a
>>>> small, positive ETM-rate (relative to the overall PCN traffic rate),
>>>> new flows may already be blocked. However, in the presence of heavy
>>>> pre-congestion, i.e., in the presence of a relatively large
>>>> ETM-rate, termination of some admitted flows is required. Thus,
>>>> flow blocking is a logical prerequisite for flow termination.
>>>>
>>>> ====================================================================
>>>>
>>>> The following sentence in 3.3.2 should be corrected (only
>>>> SM-specific):
>>>>
>>>> U represents the average ratio of PCN-supportable-rate to
>>>> PCN-admissible-rate over all the links of the PCN-domain.
>>>>
>>>> ->
>>>>
>>>> U represents the ratio of PCN-supportable-rate to
>>>> PCN-admissible-rate for all the links of the PCN-domain.
>>>>
>>>> ====================================================================
>>>>
>>>> I also recommend changing the following text as I think it may
>>>> cause misinterpretations (applies both to SM and CL):
>>>>
>>>> If the difference calculated in the second step is positive, the
>>>> Decision Point SHOULD select PCN-flows to terminate, until it
>>>> determines that the PCN-traffic admission rate will no longer be
>>>> greater than the estimated sustainable aggregate rate. If the
>>>> Decision Point knows the bandwidth required by individual PCN-flows
>>>> (e.g., from resource signalling used to establish the flows), it
>>>> MAY choose to complete its selection of PCN-flows to terminate in a
>>>> single round of decisions.
>>>>
>>>> Alternatively, the Decision Point MAY spread flow termination over
>>>> multiple rounds to avoid over-termination. If this is done, it is
>>>> RECOMMENDED that enough time elapse between successive rounds of
>>>> termination to allow the effects of previous rounds to be reflected
>>>> in the measurements upon which the termination decisions are based.
>>>> (See [IEEE-Satoh] and sections 4.2 and 4.3 of [MeLe10].)
>>>>
>>>> ->
>>>>
>>>> If the difference calculated in the second step is positive
>>>> (traffic rate to be terminated), the Decision Point SHOULD select
>>>> PCN-flows to terminate. To that end, the Decision Point MAY use
>>>> upper rate limits for individual PCN-flows (e.g., from resource
>>>> signalling used to establish the flows) and select a set of flows
>>>> whose sum of upper rate limits is up to the traffic rate to be
>>>> terminated. Then, these flows are terminated. The use of upper
>>>> limits on flow rates avoids over-termination.
>>>>
>>>> Termination may be continuously needed after consecutive
>>>> measurement intervals for various reasons, e.g., if the used upper
>>>> rate limits overestimate the actual flow rates. For such cases it
>>>> is RECOMMENDED that enough time elapses between successive
>>>> termination events to allow the effects of previous termination
>>>> events to be reflected in the measurements upon which the
>>>> termination decisions are based; otherwise, over-termination may
>>>> occur. See [IEEE-Satoh] and Sections 4.2 and 4.3 of [MeLe10].
>>>>
>>>> ====================================================================
>>>>
>>>> [IEEE-Satoh] is not a good key for Daisuke's work as the prefix
>>>> "IEEE" makes it look like a reference to a standards document.
>>>> You better use [SaUe10] or [Satoh10]. Applies both to CL and SM.
>>>>
>>>>
>>>>
>>>> Best wishes,
>>>>
>>>> Michael
>>>>
>>>>
>>>> On 02.01.2012 15:21, Tom Taylor wrote:
>>>>> It shall be as you say, subject to comment from my co-authors when
>>>>> they get back from holiday.
>>>>>
>>>>> On 01/01/2012 5:43 PM, Joel M. Halpern wrote:
>>>>>> In-line...
>>>>>>
>>>>>> On 1/1/2012 4:06 PM, Tom Taylor wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 01/01/2012 2:58 PM, Joel M. Halpern wrote:
>>>>>>>> Thank you for responding promptly Tom. Let me try to elaborate
>>>>>>>> on the two issues where I was unclear.
>>>>>>>>
>>>>>>>> On the ingress-egress-aggregate issue and ECMP, the concern I
>>>>>>>> have is relative to the third operational alternative where
>>>>>>>> routing is used to determine where the ingress and egress of a
>>>>>>>> flow are. To be blunt, as far as I can tell this does not work.
>>>>>>>> 1) It does not work on the ingress side because traffic from a
>>>>>>>> given source prefix can come in at multiple places. Some of
>>>>>>>> these places may claim reachability to the source prefix. Some
>>>>>>>> may not. While a given flow will use only one of these paths,
>>>>>>>> there is no way to determine from routing information, at the
>>>>>>>> egress, which ingress that flow used.
>>>>>>>> 2) A site may use multiple exits for a given destination
>>>>>>>> prefix. Again, while the site will only use one of these
>>>>>>>> egresses for a given flow, there is no way for the ingress to
>>>>>>>> know which egress it will be on the basis of routing
>>>>>>>> information.
>>>>>>>> Thus, the text seems to allow for a behavior that simply does
>>>>>>>> not work.
>>>>>>>
>>>>>>> [PTT] I think the disconnect here is that you read the text to
>>>>>>> say that an individual node uses routing information to
>>>>>>> determine the IEA. That wasn't the intention.
>>>>>>> Instead, administrators use routing information to derive
>>>>>>> filters that are installed at the ingress and egress nodes.
>>>>>>
>>>>>> As far as I can tell, your response describes a situation even
>>>>>> less effective than what I assumed.
>>>>>> Firstly, it does not matter whether it is the edge node, the
>>>>>> decision node, or the human administrator. Routing information is
>>>>>> not enough to determine what the ingress-egress pairing is. The
>>>>>> problems I describe above apply no matter who is making the
>>>>>> decision.
>>>>>> Secondly, having a human make the decision means that as soon as
>>>>>> routing changes, the configured filters are wrong.
>>>>>>
>>>>>> I would suggest that the text in question be removed, and
>>>>>> replaced with a warning against attempting what is currently
>>>>>> described.
>>>>>>
>>>> My view is also that CL and SM do not work in the presence of ECMP.
>>>> This should be indicated as a warning.
>>>>
>>>>>>>>
>>>>>>>> I am still confused about the relationship of section 3.3.2 to
>>>>>>>> the behavior you describe. 3.3.2 says that as long as any
>>>>>>>> excess traffic is being reported, the decision point shall
>>>>>>>> direct the blocking of additional flows. That does not match
>>>>>>>> 3.3.1, and does not match your description.
>>>>>>>
>>>>>>> [PTT] I can't see the text in section 3.3.2 that says you
>>>>>>> continue to block as long as any excess traffic is being
>>>>>>> reported. What I think it says is that as long as excess traffic
>>>>>>> is reported, the decision point checks to see whether the
>>>>>>> traffic being admitted to the aggregate exceeds the supportable
>>>>>>> level. Excess traffic may be non-zero, yet no termination may be
>>>>>>> required (i.e., traffic is below the second threshold).
>>>>>>
>>>>>> I think I see what you are saying.
>>>>>> If I am reading this correctly, the decision process must
>>>>>> re-calculate to determine if there is termination every time it
>>>>>> receives a report with non-zero excess and the port is already
>>>>>> blocked. But it does not have to actually block anything.
>>>>>> This however seems to depend upon the correct relative
>>>>>> configuration of the limit that flips it into blocked state, the
>>>>>> value of U, and maybe some other values.
>>>>>> Put differently, I understand that the two are not contradictory.
>>>>>> However, since the two things use different calculations, it is
>>>>>> not at all clear that they are consistent. This may well be
>>>>>> acceptable. But the difference in methods is likely to lead to
>>>>>> confusion. So, as a minor (rather than major) comment, I would
>>>>>> suggest that you provide clarifying text explaining why it is
>>>>>> okay to use one condition to decide if there is blocking, but a
>>>>>> different condition (which could produce a lower threshold) to
>>>>>> decide how much to get rid of.
>>>>>>
>>>>>> Yours,
>>>>>> Joel
>>>>>>
>>>>>>>>
>>>>>>>> Yours,
>>>>>>>> Joel
>>>>>>>>
>>>>>>>> On 1/1/2012 2:48 PM, Tom Taylor wrote:
>>>>>>>>> Thanks for the review, Joel. Comments below, marked with
>>>>>>>>> [PTT].
>>>>>>>>>
>>>>>>>>> On 31/12/2011 4:50 PM, Joel M. Halpern wrote:
>>>>>>>>>> I am the assigned Gen-ART reviewer for this draft. For
>>>>>>>>>> background on Gen-ART, please see the FAQ at
>>>>>>>>>> <http://wiki.tools.ietf.org/area/gen/trac/wiki/GenArtfaq>.
>>>>>>>>>>
>>>>>>>>>> Please resolve these comments along with any other Last Call
>>>>>>>>>> comments you may receive.
>>>>>>>>>>
>>>>>>>>>> Document: draft-ietf-pcn-sm-edge-behaviour-08
>>>>>>>>>> PCN Boundary Node Behaviour for the Single Marking (SM) Mode
>>>>>>>>>> of Operation
>>>>>>>>>> Reviewer: Joel M. Halpern
>>>>>>>>>> Review Date: 31-Dec-2011
>>>>>>>>>> IETF LC End Date: 13-Jan-2012
>>>>>>>>>> IESG Telechat date: N/A
>>>>>>>>>>
>>>>>>>>>> Summary: This document is almost ready for publication as an
>>>>>>>>>> Informational RFC.
>>>>>>>>>>
>>>>>>>>>> Question: Given that the document defines a complex set of
>>>>>>>>>> behaviors, which are mandatory for compliant systems, it
>>>>>>>>>> seems that this ought to be Experimental rather than
>>>>>>>>>> Informational. It describes something that could, in theory,
>>>>>>>>>> later become standards track.
>>>>>>>>>
>>>>>>>>> [PTT] OK, we've wobbled on this one, but we can follow your
>>>>>>>>> suggestion.
>>>>>>>>>>
>>>>>>>>>> Major issues:
>>>>>>>>>> Section 2 on Assumed Core Network Behavior for SM, in the
>>>>>>>>>> third bullet, states that the PCN-domain satisfies the
>>>>>>>>>> conditions specified in RFC 5696. Unfortunately, looking at
>>>>>>>>>> RFC 5696 I can not tell what conditions these are. Is this
>>>>>>>>>> supposed to be a reference to RFC 5559 instead? No matter
>>>>>>>>>> which document it is referencing, please be more specific
>>>>>>>>>> about which section / conditions are meant.
>>>>>>>>>
>>>>>>>>> [PTT] You are right that RFC 5696 isn't relevant. It's such a
>>>>>>>>> long time since that text was written that I can't recall what
>>>>>>>>> the intention was. My inclination at the moment is simply to
>>>>>>>>> delete the bullet.
>>>>>>>>>>
>>>>>>>>>> It would have been helpful if the early part of the document
>>>>>>>>>> indicated that the edge node information about how to
>>>>>>>>>> determine ingress-egress-aggregates was described in
>>>>>>>>>> section 5.
>>>>>>>>>> In conjunction with that, section 5.1.2, third paragraph,
>>>>>>>>>> seems to describe an option which does not seem to quite
>>>>>>>>>> work.
>>>>>>>>>> After describing how to use tunneling, and how to work with
>>>>>>>>>> signaling, the text refers to inferring the
>>>>>>>>>> ingress-egress-aggregate from the routing information. In
>>>>>>>>>> the presence of multiple equal-cost domain exits (which does
>>>>>>>>>> occur in reality), the routing table is not sufficient
>>>>>>>>>> information to make this determination. Unless I am very
>>>>>>>>>> confused (which does happen) this seems to be a serious hole
>>>>>>>>>> in the specification.
>>>>>>>>>
>>>>>>>>> [PTT] I'm not sure what the issue is here. As I understand it,
>>>>>>>>> operators don't assign packets randomly to a given path in the
>>>>>>>>> presence of alternatives -- they choose one based on values in
>>>>>>>>> the packet header. The basic intent is that packets of a given
>>>>>>>>> microflow all follow the same path, to prevent unnecessary
>>>>>>>>> reordering and minimize jitter. The implication is that
>>>>>>>>> filters can be defined at the ingress nodes to identify the
>>>>>>>>> packets in a given ingress-egress-aggregate (i.e. flowing from
>>>>>>>>> a specific ingress node to a specific egress node) based on
>>>>>>>>> their header contents. The filters to do the same job at
>>>>>>>>> egress nodes are a different problem, but they are not
>>>>>>>>> affected by ECMP.
>>>>>>>>>>
>>>>>>>>>> Minor issues:
>>>>>>>>>> Section 3.3.1 states that the "block" decision occurs when
>>>>>>>>>> the CLE (excess over total) rate exceeds the configured
>>>>>>>>>> limit. However, section 3.3.2 states that the decision node
>>>>>>>>>> must take further steps if the excess rate is non-zero in
>>>>>>>>>> further reports. Is this inconsistency deliberate? If so,
>>>>>>>>>> please explain. If not, please fix.
>>>>>>>>>> (If it is important to drive the excess rate to 0, then why
>>>>>>>>>> is action only initiated when the ratio is above a
>>>>>>>>>> configured value, rather than any non-zero value? I can
>>>>>>>>>> conceive of various reasons. But none are stated.)
>>>>>>>>>
>>>>>>>>> [PTT] We aren't driving the excess rate to zero, but to a
>>>>>>>>> value equal to something less than (U - 1)/U. (The "something
>>>>>>>>> less" is because of packet dropping at interior nodes.) The
>>>>>>>>> assumption is that (U - 1)/U is greater than CLE-limit.
>>>>>>>>> Conceptually, PCN uses two thresholds. When the CLE is below
>>>>>>>>> the first threshold, new flows are admitted. Above that
>>>>>>>>> threshold, they are blocked. When the CLE is above the second
>>>>>>>>> threshold, flows are terminated to bring them down to that
>>>>>>>>> threshold. In the SM mode of operation, the first threshold is
>>>>>>>>> specified directly on a per-link basis by the value CLE-limit.
>>>>>>>>> The second threshold is specified by the same value (U - 1)/U
>>>>>>>>> for all links. With the CL mode of operation the second
>>>>>>>>> threshold is also specified directly for each link.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Nits/editorial comments:
>>>>>>>>>>
>
> _______________________________________________
> Gen-art mailing list
> Gen-art@ietf.org
> https://www.ietf.org/mailman/listinfo/gen-art
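
The two-threshold behaviour described in the quoted [PTT] reply, and the
flow selection by upper rate limits proposed in Michael's replacement
text, can be sketched roughly as below. This is only an illustration
under stated assumptions, not text from the draft: the names Action,
classify and select_flows are hypothetical, the CLE is taken as a plain
fraction (excess over total), and the greedy largest-first selection
order is an arbitrary choice that the thread does not prescribe.

```python
from enum import Enum
from typing import Dict, List


class Action(Enum):
    ADMIT = "admit"          # CLE at or below the first threshold
    BLOCK = "block"          # CLE above CLE-limit, at or below (U - 1)/U
    TERMINATE = "terminate"  # CLE above (U - 1)/U (new flows stay blocked)


def classify(cle: float, cle_limit: float, u: float) -> Action:
    """Apply the two thresholds from the thread to one CLE report.

    Assumes U > 1 and (U - 1)/U > CLE-limit, as stated in the thread.
    """
    if u <= 1.0:
        raise ValueError("U must be greater than 1")
    termination_threshold = (u - 1.0) / u
    if cle_limit >= termination_threshold:
        raise ValueError("(U - 1)/U is assumed to exceed CLE-limit")
    if cle > termination_threshold:
        return Action.TERMINATE
    if cle > cle_limit:
        return Action.BLOCK
    return Action.ADMIT


def select_flows(upper_rate_limits: Dict[str, float],
                 rate_to_terminate: float) -> List[str]:
    """Pick flows whose summed upper rate limits do not exceed the
    traffic rate to be terminated (hypothetical greedy strategy)."""
    selected: List[str] = []
    remaining = rate_to_terminate
    # Largest limits first; skip any flow that would overshoot.
    ordered = sorted(upper_rate_limits.items(),
                     key=lambda item: item[1], reverse=True)
    for flow_id, limit in ordered:
        if limit <= remaining:
            selected.append(flow_id)
            remaining -= limit
    return selected
```

With U = 2 the second threshold is (2 - 1)/2 = 0.5, so a CLE of 0.2
against a CLE-limit of 0.05 blocks new flows without terminating any.
Because the selected limits never sum to more than the rate to be
terminated, a single round may leave some excess, which matches the
thread's remark that termination may be needed again after later
measurement intervals.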