Re: [tsvwg] Response to the detailed review: draft-ietf-tsvwg-circuit-breaker-05
gorry@erg.abdn.ac.uk Mon, 02 November 2015 06:41 UTC
Message-ID: <860a29480ed2b0044c31bac57d4ae5aa.squirrel@erg.abdn.ac.uk>
In-Reply-To: <5636BBF2.5010100@bobbriscoe.net>
References: <55F055AD.3050809@tik.ee.ethz.ch> <55F05D54.5060708@tik.ee.ethz.ch> <55FF2910.7080908@bobbriscoe.net> <BN1PR03MB008FCB491B06E80B6A9A915B64F0@BN1PR03MB008.namprd03.prod.outlook.com> <561DA02D.8020001@bobbriscoe.net> <8f6bae69ac931eff10a0c0d2acfe5b21.squirrel@erg.abdn.ac.uk> <5636BBF2.5010100@bobbriscoe.net>
Date: Mon, 02 Nov 2015 06:41:18 -0000
From: gorry@erg.abdn.ac.uk
To: Bob Briscoe <ietf@bobbriscoe.net>
Archived-At: <http://mailarchive.ietf.org/arch/msg/tsvwg/ltEIW240B-EArstPqLfTT1ijay4>
Cc: gorry@erg.abdn.ac.uk, tsvwg@ietf.org
Subject: Re: [tsvwg] Response to the detailed review: draft-ietf-tsvwg-circuit-breaker-05
List-Id: Transport Area Working Group <tsvwg.ietf.org>
Let's see what we may agree on:

> Gorry,
>
> I've been taking a step back since I reviewed the detail of this draft. I was carried along by the narrative of the draft, and didn't think about the draft as a whole.
>
> I think there was also an ambiguity about the definition of 'flow' in the draft, that I took one way (=microflow). Rereading, I think you mostly meant the other way (=(aggregate OR microflow)). In the latter case I would be very negative about the draft (whereas before I said I was supportive). In the following, I use "flow-termination" with the former meaning; I use "circuit-breaker" with the latter definition.
>
> *1. Circuit-breaking of real-time traffic without regard to microflows considered (very) harmful*
>
> 1a. The main problem: the transport area sends out totally the wrong message if it offers a network circuit-breaker as its apparent recommended remedy for persistent congestion. The IETF transport area has developed a wide range of remedies to this problem over the years, and one of them will nearly always be better than a circuit breaker, where the phrase "nearly always" leaves an extremely small exception space.
>
> Why is the first thing that tsvwg has chosen to expedite, and give the status of BCP, solely about this tiny exception space?

David seemed to talk to this in the TSVWG meeting. I'll leave that to David to comment further.

> 1b. Over the last fifteen years I have researched, designed, built, evaluated and standardised numerous network-operator controlled mechanisms for limiting congestion in networks. Below I outline a number of ways of dealing with persistent congestion while maximizing the satisfaction of everyone. These have been developed in the IETF over the years. Then suddenly we get a draft saying, "it's OK to cut off the whole tunnel if you have to, or at least a large fraction of it," without saying that it will hardly ever be appropriate to take this approach.
>
> I've realised that this is verging on vandalism: promotion of vandalism of customer traffic, and vandalism of all the hard work of dozens/hundreds of people over the years to address these problems in a more considered way.

If the comment is about the statement to reduce to 10%, I'd be happy to add text to say "drop flows" not "packets" when you are able.

> * That work started with pre-congestion notification (PCN) for unresponsive flows, keeping per-flow operations to the edge of a network. That included flow admission control and flow termination.

It was never the intention to replace the role of CC or admission control or diffuser, etc. This isn't the point.

> * We then developed the congestion policer to work at an aggregate level, dealing with a mix of unresponsive and responsive traffic in whole networks, or in tunnels or at single bottlenecks. It was designed to cause the least disruption to each of the flows, when it is impractical to police each flow separately. A congestion policer deals effectively with persistent congestion, whether caused by unresponsive flows, or an excess of responsive flow arrivals, or both. As congestion rises, i) initially it gives responsive flows more drop signals, while only removing a small fraction of unresponsive traffic, so the latter should still be able to continue; ii) if congestion persists, it removes larger and larger fractions of the aggregate, irrespective of whether flows are responsive or unresponsive. iii) It can be complemented by well-known techniques from the 1990s to randomly identify heavy flows to give a compromise between per-flow and aggregate processing.

And again, I have no problem with people using better methods when they can be encouraged to do so.

> * The draft cites the work of Yaakov Stein, David Black and myself <draft-ietf-pals-congcons>, which the authors and WG have finally reached consensus on (after years).
>   Your draft is correct that it proves that typical pseudowires for TDM traffic will become useless in themselves before they consume more capacity than responsive traffic. And the draft refers to this circuit-breaker as a possible solution. But that depends how you define circuit-breaker:
>   - If it indiscriminately removes packets from the pseudowire aggregate, it is /not/ the best solution at all.
>   - The best solution is briefly described in the pals draft (briefly, 'cos remedies were out of scope): removal of inactive voice channels, and admission control of individual flows from the pseudowire. Failing that, the PW should remove random individual flows until congestion is reduced sufficiently.
>   - Indiscriminate removal of packets is only appropriate, if the network cannot see the microflows (e.g. due to encryption at the aggregate level).
>
> If circuit-breaking is defined as per-aggregate not per-microflow, then this draft should not recommend circuit-breaking. It should recommend flow admission control and flow termination, not circuit-breaking.
>
> Cutting off any indiscriminate proportion of the packets without regard to microflows is equivalent to cutting off every microflow within the pseudowire, because real-time flows typically cannot survive with loss levels above a few %. Cutting off the whole pseudowire in this way is vandalism, and only appropriate if the operator is a lazy good-for-nothing outfit that doesn't deserve any customers.

This also seems like a statement to use a better way to discard traffic. That is fine with me.

> 1c. One remedy is to start this draft by saying :
>
> ***The idea of cutting down unresponsive traffic from a tunnel without regard to microflows will rarely if ever be appropriate.***

Sorry, I'm not sure that is a sensible way to start this document.
> If a circuit-breaker is defined to include removing traffic without regard to microflows, then the scope of applicability of this circuit-breaker draft is tiny, and it should be rewritten to reflect that. Ie. the next sentence might as well say "Therefore, don't read this draft. There will nearly always be less harmful ways of addressing persistent tunnel congestion, many of which are already written up in RFCs or drafts."
>
> *2. Scope: Measurement mechanism vs. response behaviour*
>
> Secondly, there is a huge question-mark over the scope of the draft.

I do not think the scope of the draft was in doubt. The scope in my mind has remained the same throughout the process.

> It ranges over the same territory as another draft already adopted as tsvwg chartered work:
>     draft-ietf-tsvwg-tunnel-congestion-feedback
>
> The circuit-breaker draft does not even reference tunnel-congestion-feedback.

David can judge this, -00 was published as a WG item in September 25, 2014
draft-ietf-tsvwg-tunnel-congestion-feedback (-00 draft March 19, 2016)
I don't see this draft as a BCP, I see this as a specification of a method. It was my understanding that this was a congestion manager - although I concede you could also use it as a method of last resort to terminate the tunnel (as in a Circuit Breaker), this seems like an unlikely motivation for doing this work.

> However, the two drafts should be totally complementary. I.e.:
> a) tsvwg-tunnel-congestion-feedback specifies a generic way to get feedback from the egress of a tunnel to a decision point (which can be the tunnel ingress or centralised entity), and it provides mechanism for resilience of the feedback messages, ability to control timeliness vs feedback message overhead, etc.

Agree.
> b) tsvwg-circuit-breaker should be limited in scope to solely specifying one of the possible behaviours in response to congestion measured over the tunnel: i.e. breaking the circuit

That's what was intended.

> c) other drafts can specify other responses, e.g.:
>   * aggregate techniques:
>     - provisioning additional capacity
>     - re-routing part of the load (traffic engineering)
>     - rate policing
>     - congestion policing {Notes 1, 2, 3}
>   * per-flow techniques
>     - flow admission control
>     - flow rate policing
>     - flow congestion policing
>     - flow termination
>
> There is material in circuit-breaker about the measurement and feedback mechanism that contradicts that in tunnel-congestion-feedback. Instead of writing two contradictory standards, the WG should agree the division of scope between the two drafts, and the superset of all the authors and reviewers should work towards a single standard for each scope.

These don't seem like the functions that need to be done by a circuit breaker to me, so I am quite happy with these, but wouldn't wish to see these discussed in the circuit breaker draft.

> The chairs of a WG are entitled to assign authors (including themselves) to a chartered work item to ensure it is well-written, technically sound and moves along in a timely manner.
>
> *Notes*
>
> {Note 1} when the conex WG completed, the responsible AD (Martin S) suggested that drafts like draft-briscoe-conex-policing might be picked up by tsvwg, which would be appropriate, because there is nothing specific to conex in that draft - it assumes some mechanism is getting information about congestion to the ingress of the network, which might be conex or draft-ietf-tsvwg-tunnel-congestion-feedback.

OK. I personally would not call this a circuit breaker.

> {Note 2} draft-briscoe-conex-data-centre specifies the same mechanism as draft-ietf-tsvwg-tunnel-congestion-feedback.
> Again, even tho it has conex in the filename, it was written to give a general tunnel congestion feedback mechanism where hosts do not support conex (and to interwork with hosts that do). It focuses on a data centre deployment scenario, but there is nothing in it that prevents it being applicable in general networks.

OK. I also would not call this a circuit breaker.

> {Note 3} For those not familiar with a congestion policer, it is essentially a token bucket, but rather than the tokens representing an allowance to forward bytes, they represent an allowance to cause congested bytes (ie loss or ECN marking).
> For the case of congestion feedback across a tunnel, when feedback about loss (or ECN) comes back across the tunnel, it drains that amount of bytes from the bucket (where intermittent feedback messages are envisaged, token draining has to be averaged over the next measurement interval). The closer the bucket is to empty the more it blocks packets entering the ingress.
> This has the effect of thinning the ingress traffic so that the congestion level is just below the defined threshold.
>
>
> *Further detailed review comments*
>
> *Abstract*
>
>     ... network tunnels, and other non-congestion controlled applications
>
> I think you mean
>     non-congestion controlled network tunnels
>
> Reasoning: as it stands, this implies tunnels are always a type of non-congestion controlled application, which is clearly not intended.

I can find better words.

> *Introduction*
>
> First para: the term 'flow' needs to be disambiguated here, not just later. Does it mean microflow, or aggregate? I support this draft if it means the former, but not if the latter.

The scope, as stated by David in the TSVWG meeting today, was both.
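
For readers who want something concrete, the congestion policer described in {Note 3} above can be sketched roughly as follows. This is only an illustration under stated assumptions: the class and method names are hypothetical, the linear mapping from bucket level to drop probability is my own choice (Note 3 only says that the emptier the bucket, the more packets are blocked), and feedback is drained immediately rather than averaged over the next measurement interval.

```python
class CongestionPolicer:
    """Token bucket whose tokens are an allowance of congested bytes.

    Hypothetical illustration of {Note 3}; not taken from any draft.
    """

    def __init__(self, fill_rate, depth):
        self.fill_rate = fill_rate  # allowed congested bytes per second (assumed unit)
        self.depth = depth          # bucket size, in congested bytes
        self.tokens = depth         # start with a full allowance

    def advance(self, seconds):
        # Refill the allowance as time passes, capped at the bucket depth.
        self.tokens = min(self.depth, self.tokens + self.fill_rate * seconds)

    def on_feedback(self, congested_bytes):
        # Feedback from the tunnel egress reports bytes lost or ECN-marked;
        # drain that many tokens, never going below empty.
        self.tokens = max(0, self.tokens - congested_bytes)

    def drop_probability(self):
        # Assumed linear mapping: full bucket -> 0.0, empty bucket -> 1.0.
        return 1 - self.tokens / self.depth


policer = CongestionPolicer(fill_rate=2000, depth=8000)
policer.on_feedback(2000)                 # egress reports 2000 congested bytes
after_feedback = policer.drop_probability()
policer.advance(1.0)                      # one second passes with no congestion
after_refill = policer.drop_probability()
```

With these example numbers the ingress starts thinning traffic after the feedback arrives and stops again once the allowance has refilled. A real policer would also need to choose how to spread intermittent feedback over the measurement interval, which this sketch leaves out.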
>     It was countered by the requirement to use congestion control (CC) by the Transmission Control Protocol (TCP) [Jacobsen88 <https://tools.ietf.org/html/draft-ietf-tsvwg-circuit-breaker-07#ref-Jacobsen88>] [RFC1112 <https://tools.ietf.org/html/rfc1112>].
>
> There was no requirement in RFC1112 to use TCP or congestion control.

I'll remove the RFC-REF and leave the paper citation.

>     People have been implementing what this draft characterizes as circuit breakers on an ad hoc basis
>
> This needs references.
>
>     ...either by disabling the flow or by significantly reducing the level of traffic.
>
> Again, pls disambiguate "flow". I supported this draft, because I thought it meant microflow. If it means aggregate, I don't support the draft.

flow or flow aggregate being managed by the circuit breaker

>     reflects a fair use of the available capacity
>
> This new text added in the latest draft you sent me is problematic. It will need to say "according to the policy of the operator of the circuit-breaker, not the IETF".

I will look at this again.

> Two more comments inline...
>
> On 17/10/15 12:32, gorry@erg.abdn.ac.uk wrote:
>> Sorry I've taken so long to reply.
>>
>>> Review by Bob Briscoe: 8th Oct 2015
>>> Gorry,
>>>
>>> Despite being past the WG stage, here's my review anyway. Consider this as early response to IETF last-call.
>>>
>>> In general I support the intent of this draft, but I am concerned at the severity of the problems I have found with it given it is meant to be about to go to the IESG. I am particularly concerned that I have found numerous significant problems with the normative requirements section.
>>>
>>> Have you had a substantial review from anyone before this? The level of review comments on the tsvwg list seemed quite light - picking on issues of particular concern, but not seeming to review the draft as a whole.
>>>
>> I did see some reviews (and have had comments off-list - including from other WGs), but I also REALLY do appreciate this careful read of the whole.
>>
>> See my comments on comments below (a few notes from DB as document shepherd are also included).
>>
>> I'm guilty of reaching the deadline, and therefore have submitted these changes in a new draft (06), some of these points may benefit from further discussion if the new update did not address the issues.
>>
>> Gorry
>>
>> --------
>>
>>> *1. Intro:* Congestion Collapse is a very specific case - CB is much more general.
>>> It is clear from the draft that a CB is intended to mitigate circumstances wider than solely the extreme case of congestion collapse. For instance: a large unresponsive aggregate contributing to a high level of congestion alongside congestion responsive traffic. This is nowhere near congestion collapse, but it would be an applicable case for a circuit-breaker. Congestion collapse is a specific well-defined process that involves a cascade of congestion as a sequence of queues fill in turn moving in the upstream direction. It is due to continual retries or additional load arriving faster than existing flows are departing. {Note 1}
>>>
>> GF: I see and I can do this, and will rephrase in the next version, to avoid using this term.
>> ----
>>> The introduction mentions that TCP-style cc is only an appropriate remedy when long flows dominate. The implication that CB could be used to deal with congestion induced by many short flows is a step too far, IMO. This problem has not even been discussed in the IETF or IRTF to my knowledge, let alone in the context of this draft. In 6.2 this draft all-but says that a CB is a solution to this problem. I strongly object to a BCP making that assertion.
>>> CB would be a very drastic and clumsy solution to that problem.{Note 2}
>>>
>>> It says that the timescale at which a circuit-breaker operates must be seconds or tens of seconds - much longer than the RTT timescale on which TCP, SCTP and DCCP react. This disregards an important type of application response to congestion; it must say that the timescale also has to be longer than the timescale on which certain real-time applications operate their own circuit-breakers i.e. adapt down their codec rates, and eventually close the connection as a form of self-admission control. Applications operate per-flow circuit-breakers typically over the order of seconds or tens of seconds, so network CBs MUST take longer than that - I would say "no less than a minute".
>>>
>>> We MUST not discourage voluntary self-regulation by overriding it (end-to-end principle). I pick up this point later (comments on section 5.1), arguing that the fast-trip CB for RTP should be considered as an application CB, and a network CB should always take longer to trigger than these app CBs.
>>>
>> GF: I think we want to encourage applications to do this, starting by ensuring that they have the opportunity to do so, added some text on this.
>> I revised the upper bound to clarify this, please see if this helps.
>
> Timescales - the multiple tens of seconds you've added is better. Here's a typed record of what I said to you f2f the other day:
>
> A network circuit-breaker needs to allow time for all possible self-regulation techniques (transport and app layer) to complete:
> * TCP-like congestion control (RTT)
> * Smoothed TCP-like cc (tens of RTT)
> * Codec adaptation (tens of seconds)
> * flow termination (app circuit breaker) (~1 minute)
> * flow self-admission control (~minutes)
>
> These are /not/ the timescales written in IETF drafts like rtp-circuit-breakers.
> These are the sort of timescales that commercial real-time apps would be willing to use. There is a difference between theory and practice.
>
> It would be very wrong for a circuit-breaker to fire before allowing all these stages to take effect.
>
> The most appropriate thing the IETF could standardise (in a separate draft) is:
> 1) the max time apps should take for each of these actions {ToDo: check the RTP circuit-breaker draft}
>
> 2) that any timers in any of these mechanisms MUST be randomised

Agree - do you have text to motivate this - it could be a useful contribution.

>
>>
>> GF: I added some more text on the coexistence of fast and other circuit breakers, which may help (within the fast circuit breakers section) - but did not create a new section dedicated to this. If this is desirable we could create a section (electrical circuit breakers have different "curves" for trigger in a similar way to what is described ... but that's just a side observation).
>>
>> GF-XXX: If these comments still apply on the new text, it is worth discussing further.
>> ---
>>> *1.1 Types of CB*
>>>
>>> I saw criticism on the list of the use of the term "protect" in this section. Why hasn't it been changed? As the posting said, a CB does not protect the aggregate that it monitors; rather it /regulates/ the aggregate to protect the rest of the traffic that it is /not/ monitoring.
>>>
>> GF: Totally agree, although guilty of not changing the wording as promised - (people have the same problem describing Electrical circuit breakers) and sorry that I inadvertently repeated this many times. I have rephrased throughout.
>> ---
>     There are various forms of network transport circuit breaker.
>
>     ...
>     Fast-Trip Circuit Breakers:
>
> Repeating what I said before, Fast-Trip is /not/ a type of /network/ circuit breaker. This really must be fixed.

The RTP Circuit Breaker author thinks it is.
>
>
> Bob
>>> *3.1 Functional Components.*
>>>
>>> There is no mention of the problem of synchronising the ingress and egress measurements to allow for transit time. Given you are trying to measure loss, which is a relatively small difference between the traffic entering and leaving, you can get very bad errors if you don't take path delay into account. draft-ietf-tsvwg-tunnel-congestion-feedback describes a nice (and commonly used) stateless way of doing that, by sending the ingress measurement in-band to the egress, which triggers the egress measurement so they are synchronized; allowing for transit time. Then the egress can send them both back to the ingress to be compared and acted on.
>>>
>> GF: I'd hope that measurements over longer periods would not result in inaccuracies high enough to lead to trigger, but added a note to the text.
>> GF: I'm not sure the tunnel method is a CB, it seems more like a congestion control method, which is perhaps why in that case the synchronisation is required.
>> ----
>>> *4. Reqs*
>>>
>>>     There MUST be a control path from the ingress meter and the egress meter to the point of measurement. The Circuit Breaker MUST trigger if this control path fails.
>>>
>>> Either this is unclear terminology, or I strongly disagree. What do you mean by a control path? We should only recommend that the CB triggers due to lack of measurement signals if the measurement signals are carried in-band with the data being monitored. That is only one way of arranging the mechanism. The term control path, sounds like it is out of band.
>> GF: I fell into the trap of a multiply-defined term. This needs resolved. I think we should say "communication path used for control messages", and edit accordingly to avoid this misunderstanding.
>>
>>> If the measurement signals are out of band, the CB MUST NOT trigger due to lack of measurement signals. I would recommend the in-band method, but there are plenty of network designers who will want to do this in centralised out of band ways, so we have to cater for that way of thinking (even tho it's misguided).
>>>
>> DB: I think Bob's request for not triggering the CB if a separate control path fails is reasonable - that may need to be NOC/operator-mediated.
>> GF-XXX: I'm not sure I agree with "MUST NOT", so this needs more thought - comments welcome on this point.
>> ----
>>>     The measurement period MUST be longer than the time that current Congestion Control algorithms need to reduce their rate following detection of congestion.
>>>
>>> This needs to be rewritten. Or just removed. It seems like ideas changed after it was written, and the end was changed but not the normative statement at the beginning. IMO, the measurement period can be arbitrarily short, as long as multiple measurements are combined before triggering the CB.
>>> It talks about unnecessarily penalizing long RTT flows, but the measurement period is nothing to do with the period before there is any penalization (defined later as the triggering interval). There is no problem with short measurement periods as long as any high congestion measured in these periods is averaged over all the measurement periods in the triggering interval.
>>>
>> DB: There seem to be two meanings of "measurement period" here. I've always viewed it as the period of time over which measurements are taken that result in triggering the CB, whereas Bob seems to view it as the period of time over which an individual measurement is taken.
>> I have no problem with the line of Bob's text quoted above, and think some clarification to make it clear that the means of measurement is unspecified (e.g., taking short measurements and combining them is fine), but the period of time that applies to the metric that is used to trigger the CB has to be sufficiently long.
>>
>> GF: My understanding was the "measurement period" are the samples that feed the trigger, so if the ingress or egress meter samples more frequently, these would be combined. I added text to clarify this.
>>> In fact, there should be many measurement intervals per trigger interval, so that there are many opportunities for measurement messages to get through. Otherwise if there are only one or two measurement periods per trigger interval, the possibility of a false trigger due to lost control signals becomes too great.
>>>
>> GF: This is the robustness issue.
>>>     o A Circuit Breaker is REQUIRED to define a threshold to determine whether the measured congestion is considered excessive.
>>>
>>>     o A Circuit Breaker is REQUIRED to define the triggering interval,
>>>
>>> A perfectly good CB could vary the trigger interval and threshold depending on how rapidly congestion is rising, or how high its absolute level is. Indeed one could say it is actually wrong to define a single threshold or a single interval, so these normative statements are overly restrictive and preclude designs that are smarter than just simple fixed threshold.
>>>
>> GF: There are many ways to do this, and individual specs can indeed react using more sophisticated algorithms. At some point these will become more like CC than an envelope CB.
>> DB: A metric and threshold for that metric are needed. A rate-of-increase metric or one based in part on that would be fine.
>> ----
>>> Also, see comment above about allowing time for application CBs, and suggesting one minute minimum.
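
The point discussed above, that short measurement periods are fine provided they are combined over the whole triggering interval before any decision is made, can be sketched like this. It is an illustration only: the class name, the fixed number of samples per interval, the fixed loss threshold and the byte counts are all assumptions, and the loss metric is simply (ingress bytes - egress bytes) / ingress bytes summed over the interval.

```python
from collections import deque


class EnvelopeTrigger:
    """Combine short measurement periods over one triggering interval.

    Hypothetical illustration; not the mechanism of any particular draft.
    """

    def __init__(self, samples_per_interval, loss_threshold):
        # Keep only the most recent samples that make up one triggering interval.
        self.window = deque(maxlen=samples_per_interval)
        self.loss_threshold = loss_threshold

    def add_sample(self, ingress_bytes, egress_bytes):
        # One short measurement period: bytes counted at ingress and at egress.
        self.window.append((ingress_bytes, egress_bytes))

    def should_trigger(self):
        # Never decide on a partial interval.
        if len(self.window) < self.window.maxlen:
            return False
        sent = sum(ingress for ingress, _ in self.window)
        received = sum(egress for _, egress in self.window)
        if sent == 0:
            return False
        # Loss aggregated over the whole triggering interval.
        return (sent - received) / sent > self.loss_threshold


trigger = EnvelopeTrigger(samples_per_interval=4, loss_threshold=0.2)

# One very lossy short period followed by three clean ones.
trigger.add_sample(1000, 500)
for _ in range(3):
    trigger.add_sample(1000, 1000)
burst_result = trigger.should_trigger()

# Loss that persists across the whole interval.
for _ in range(4):
    trigger.add_sample(1000, 700)
persistent_result = trigger.should_trigger()
```

In this example the single lossy period is diluted by the clean ones and does not trip the breaker, whereas loss sustained across every period in the interval does. A smarter design could vary the threshold or interval, as Bob suggests; the fixed values here are just the simplest case.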
>>>
>>>     o A Circuit Breaker SHOULD be constructed so that it does not trigger under light or intermittent congestion, with a default response to a trigger that disables all traffic that contributed to congestion.
>>>
>>> The second half after the comma seems misplaced. If it does not trigger, why does the sentence go on to talk about disabling all traffic that contributed to congestion (which is what an /enabled/ trigger would do)?
>>>
>> GF: Split the second part as a separate clause.
>>>     A reaction that results in a reduction SHOULD result in reducing the traffic by at least a factor of ten,
>>>
>>> What evidence have you got for this 10% number? It seems utterly inappropriate to write a number here. The number depends on what proportion of the traffic on the path between ingress and egress is regulated by the CB. If the proportion is low, it needs to reduce by a lot to make sufficient space for other traffic. If the proportion is high relative to other traffic, it might be sufficient to reduce by 5% to 95% of the previous load. If the tunnel traffic represented say 80% of the load on the path, and it reduced by a factor of 10, that would leave 92% of the path for other traffic, which might be unnecessarily much greater than the normal proportion used by other traffic.
>>>
>> DB: Need to say something here - we can fall back to "order of magnitude" or something like that, but this'll need list discussion.
>> GF: Text updated. This point was previously discussed, but I am happy to receive more feedback, rather than relying on what was said before. As I recall, many were happy with terminating flows once a CB triggered, but this was an attempt to be "kinder" and allow more flexibility - which I personally liked - but still the default reaction to successive triggers needs to be harsh. (It is a "SHOULD" though).
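
The arithmetic behind Bob's 80% example above is easy to check. The helper below is a hypothetical illustration, not text from the draft; it assumes the path was fully loaded before the trigger and that whatever the regulated aggregate gives up becomes available to other traffic.

```python
def path_share_after_trip(tunnel_percent, reduction_factor=10):
    """Return (tunnel share after the reduction, share left for other traffic).

    Both values are percentages of the path load before the trigger.
    Hypothetical helper for illustration only.
    """
    reduced = tunnel_percent / reduction_factor
    return reduced, 100 - reduced


# Tunnel carrying 80% of the load, cut by a factor of ten.
high_share = path_share_after_trip(80)

# Tunnel carrying only 10% of the load, cut by the same factor.
low_share = path_share_after_trip(10)
```

For the 80% case the tunnel drops to 8% of the previous load, leaving 92% for everything else, which matches the figure quoted in the review. For the 10% case the same factor frees only nine percentage points, which is why the appropriate reduction depends on what proportion of the path the CB regulates.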
>> -----
>>>     Manual operator intervention will usually be required to restore a flow.
>>>
>>> This sentence should be toned down to possibly, not usually. A human is no more capable than a machine is of bringing together all the necessary measurements to decide what other courses of action might be possible, and when to release the brakes. I suggest the last para of 5.3.1 starting:
>>>
>>>     "An operator-based response provides opportunity..."
>>>
>>> is more appropriate here, and doesn't really fit where it is.
>>>
>> DB: Operator intervention may be required to restore a flow.
>> GF: Changed as above.
>> ----
>>> Section 4.1 contains no requirements text, only examples. It ought to be moved from the normative requirements section to section 5 (Examples).
>>>
>> GF: Resolved as a section on topologies.
>> ----
>>> *5. Examples:*
>>>
>>> *5.1.1 Fast-Trip CB for RTP*
>>>
>>> The draft needs to make the distinction between an application doing its own circuit breaking vs. functions on the path between the application endpoints (even if in the hosts) doing CB. The extremely important distinction is:
>>> 1a) an app knows when congestion is too high for it to work properly
>>> 1b) functions under the app can only infer congestion is possibly too high for most apps to work properly
>>> 2a) an app may be able to reduce the rate at which it sends data
>>> 2b) a function under an app can only discard data, not remove it at source.
>>>
>> GF: Although this is not the place to delve into details, I added a prefix:
>> "Applications ought to use a full-featured transport (TCP, SCTP, DCCP), and if not, application (e.g. those using UDP and its UDP-Lite variant [RFC3828]) they need to provide appropriate congestion avoidance.
>> [RFC2309] discusses the dangers of congestion-unresponsive flows and states that "all UDP-based streaming applications should incorporate effective congestion avoidance mechanisms". Guidance for applications that do not use congestion-controlled transports is provided in [RFC5405.bis]. Such mechanisms can be designed to react on much shorter timescales than a circuit breaker, that only observes a traffic envelope. These methods can also interact with an application to more effectively control its sending rate."
>> GF: Also updated some other parts of the section.
>> ----
>>> I believe that the requirements in section 4 do not apply to application-controlled circuit-breakers. So, I would not include the "Fast-Trip CB for RTP" as an example of a /network/ transport CB.
>>>
>>> As the requirements say, a network CB should never fast trip.
>> GF: But by design a RTP-CB should also not terminate flows.
>>
>>> By misclassifying RTP CBs as network CBs, you've allowed the timescale for network CBs to trigger after tens of seconds. When a network CB should allow app CBs this long to trigger themselves (as I said earlier).
>> GF: I'm not sure, since understanding the difference between the two is indeed important. Apps that "trigger themselves" describes what the fast-trip CB does (in the absence of CC).
>> -----
>>> *Missing examples:*
>>>
>>> * You might want to point to the flow termination function (as opposed to admission control) in the PCN architecture [RFC5559], which is precisely a network CB. It was precisely developed for cases where failures caused traffic to reroute onto a previously well-provisioned path (see 6.1).
>> DB: I think that's a stretch. This is not among the types listed in 1.1. It might be mentioned as a related concept that is outside the scope of this draft.
>> GF: How different to rsvp and other admission-controlled schemes?
>> - I think we're drifting away here... I did change in the new rev.
>> Unless there is strong support from the WG, I'd rather not add these
>> examples.
>> ------
>>> * Andrew McGregor gave the examples of Google's BwE (bandwidth
>>> enforcer) and B4, but you haven't referred to them. Given they are
>>> documented existence proof of this beast, that seems remiss.
>>>
>> GF: I believe I discussed with Andrew and we didn't at the time have
>> good text to add. This may well have changed, if there are good
>> references please send them for consideration.
>> ------
>>> *7. Security Consid's*
>>>
>>>       The circuit breaker MUST be designed to be robust to packet loss
>>>       that can also be experienced during congestion/overload.
>>>
>>> This implies reliable transmission - i.e. retransmit for ever until
>>> acknowledged. This is NOT a good idea. In
>>> ietf-tsvwg-tunnel-congestion-feedback we propose using SCTP partially
>>> reliable transport. Then if congestion causes messages to be lost,
>>> they don't have to be retransmitted if there are insufficient
>>> resources (thus not risking contributing to congestion collapse - and
>>> here I use the phrase correctly). Because they transmit counters, the
>>> missing counters values do not matter. This is the tried-and-tested
>>> message delivery approach used for IPFIX. The messages can still be
>>> given priority, but should not be retransmitted.
>>>
>> GF: I disagree this is a requirement for reliably transmitting packets.
>> I'd be happy to add text to explain robustness does not imply
>> reliability and that this is likely to be an evil thing to do.
>> GF: Added text on duplicating messages.
>> -------
>>>       Simple protection can be provided by using a
>>>       randomized source port, or equivalent field in the packet header
>>>       (such as the RTP SSRC value and the RTP sequence number)
>>>       expected not to be known to an off-path attacker.
>>>
>>> I think the draft should recommend that for most scenarios, randomized
>>> ports will be insufficient protection for CB control messages, which
>>> should be properly cryptographically authenticated. Otherwise, a
>>> CB-controlled aggregate is too vulnerable to these off-path attacks.
>>>
>> GF-XXX: I don't know why you state that a random port (or protocol
>> field) is insufficient protection from off-path attack, please explain.
>> Are you saying a CB is more vulnerable to attack than other transport
>> traffic? I don't see the problem yet, can you suggest what you would
>> like to see added?
>> --------
>>> *Gap #1:*
>>> The draft seems to think it is so obvious what a CB should measure
>>> that it only says it vaguely as "the level of congestion", and only
>>> suggests the difference between ingress and egress counters as an
>>> example. Some readers might well think like this: Does congestion
>>> level mean the percentage extra bit-rate relative to the aggregate's
>>> expected or maximum bit-rate? That might actually be a correct measure
>>> of congestion in some scenarios, but...
>>>
>>> The draft does not say that the congestion level is defined as dropped
>>> bytes divided by ingress bytes. The draft should spell out that a CB
>>> should measure the volume of bytes dropped and the volume of
>>> ECN-capable bytes marked with CE, and express these as a fraction of
>>> resp. total ingress non-ECT bytes and total ingress ECT bytes
>>> (assuming buffers within the scope of the CB are ECN-enabled). Even
>>> this is problematic, because the assumption in parentheses never
>>> holds, particularly during excessive congestion. It could also discuss
>>> the relative merit of measuring the percentage of packets
>>> dropped/marked instead of bytes.
>>>
>> GF: A detailed discussion of how to measure congestion is out of scope
>> here, IMHO.
>> -----
>>> Also it should mention that care should be taken over how to combine
>>> the measurements.
>>> For instance avoid the common mistake of averaging fractions, because
>>> ave(c1/t1, c2/t2, c3/t3 ...) != (c1 + c2 + c3)/(t1 + t2 + t3).
>>>
>> GF: Added some text - is this sufficient, given there is no intention
>> to specify the algorithm here:
>> "If necessary, MAY combine successive individual meter samples from the
>> ingress and egress to ensure observation of an average over a
>> sufficiently long interval. (Note when meter samples need to be
>> combined, the combination needs to reflect the sum of the individual
>> sample counts divided by the total time/volume over which the samples
>> were measured. Individual samples over different intervals cannot be
>> directly combined to generate an average value.)"
>> --------
>>> *Gap #2:*
>>> All the diags show multiple routers, but the text says congestion can
>>> be measured by comparing ingress and egress traffic. Nowhere does it
>>> say that only traffic with addressing that will have for-certain only
>>> passed through both ends should be measured.
>>>
>> GF: True, I think everyone who previously looked at this thought that
>> section 3.1 described this. I agree though that it needs to be stated,
>> and will add text.
>> --------
>>> {Note 1}: A few years ago I dug deep into the history surrounding the
>>> early congestion collapses on the Internet and found that those
>>> involved were adamant that the term congestion collapse should not be
>>> waved around for dramatic effect, because it has a very specific
>>> definition, as paraphrased above.
>>>
>> GF: I cited an RFC on congestion collapse and did not use the words
>> thereafter.
>> --------
>>> {Note 2}: The credit feature of ConEx was intended to address
>>> short-flow overload if it becomes a problem. Don't get me wrong; I'm
>>> not objecting to the use of CBs for the short-flow problem because I
>>> want you to use my solution.
>>> I'm just using this as an example of a fine-grained way to solve the
>>> problem, rather than the sledge-hammer CB way.
>>> Here's the intuition briefly: With ConEx, you have to attach
>>> 'congestion credit' to the first packets of a flow to cover the risk
>>> of congestion before you have feedback (and if you don't and there is
>>> congestion, your packets are dropped by an audit function). Then
>>> congestion policers at the network ingress can limit the amount of
>>> congestion credit consumed without needing feedback, and thin out
>>> traffic if it consists of large numbers of short flows. If short flows
>>> come to predominate, ConEx credit was also designed to incentivize a
>>> new form of proxy that could regulate short-flows with a push-back
>>> style of congestion control, without a full feedback loop. That would
>>> be far preferable to such a drastic measure as a circuit-breaker. This
>>> aspect of ConEx was not written into the IETF docs, but it is
>>> mentioned in the re-ECN drafts that were the ancestors of ConEx.
>>>
>> GF: I do agree ConEx can manage traffic. I'm confident that a
>> ConEx-controlled flow would not need an additional circuit breaker
>> mechanism - but I see this as a congestion control mechanism though -
>> not a circuit-breaker.
>> ---
>>> *Nits*
>>>
>>> 3.
>>> s/last resort protection to the network paths that these are used./
>>>  /last resort protection to the traffic sharing their network path./
>>>
>> GF: Good you spotted this, changed to: " to provide last resort
>> protection for traffic that shares the network path being used."
>>> s/tunnels encapsulations/
>>>  /tunnel encapsulations/
>> GF: Fixed in latest version
>>
>> ---
>>> 3. What makes a good CB?
>>>
>>>       Circuit Breakers are RECOMMENDED for IETF protocols and tunnels
>>>       that carry non-congestion-controlled Internet flows and for
>>>       traffic aggregates, e.g., traffic sent using a network tunnel.
>>>
>>> Delete "
>>>
>>>       e.g., traffic sent using a network tunnel
>>>
>>> "
>>> Reason: this implies all network tunnels are problematic, whereas the
>>> rest of the sentence adequately says that only tunnels carrying
>>> non-congestion controlled flows are of concern.
>>>
>> GF: Understood, suggest remove from this sentence, and instead place in
>> a separate sentence that follows this saying,
>> "This includes traffic sent using a network tunnel."
>> -----
>>> 4.
>>>
>>> s/monitor the level congestion/
>>>  /monitor the level of congestion
>> GF: Fixed in latest version
>>> 4.1.1
>>> (e.g. to implement a Section 5.1)
>>> ?
>> GF: Fixed tag in latest version
>>> 4.1.2
>>> s/pre-prosvisioned/
>>>  /pre-provisioned/
>>>
>> GF: Was fixed in -06
>>> 6.1
>>>
>>>       One common question is whether a Circuit Breaker is needed when
>>>       a tunnel is deployed in a private network with pre-provisioned
>>>       capacity?
>>> Remove '?' from the end.
>>>
>> GF: Fixed in latest version
>>
>>> 6.2
>>>
>>> s/in the event that persistent congestion occur./
>>>  /in the event that persistent congestion occurs./
>>>
>> GF: Fixed in latest version
>>> Regards
>>>
>>>
>>>
>>> Bob
>>>
>> Gorry
>>
>
> --
> ________________________________________________________________
> Bob Briscoe                               http://bobbriscoe.net/
>
>
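
For anyone following the thread, the point raised above about combining
meter samples (averaging per-interval fractions is not the same as
dividing the summed counts by the summed totals) can be illustrated with
a minimal sketch. This is only an illustration: the function names and
the sample values are hypothetical, and nothing here is taken from the
draft or proposed as its algorithm.

```python
def mean_of_ratios(samples):
    """Average the per-interval fractions c_i / t_i (the mistake)."""
    return sum(c / t for c, t in samples) / len(samples)


def ratio_of_sums(samples):
    """Sum the counts, then divide by the summed totals."""
    total_c = sum(c for c, _ in samples)
    total_t = sum(t for _, t in samples)
    return total_c / total_t


# Hypothetical (bytes_lost, bytes_ingress) pairs for three intervals.
# The middle interval carries far more traffic than the other two.
samples = [(10, 100), (0, 10_000), (50, 100)]

print("mean of ratios:", mean_of_ratios(samples))
print("ratio of sums: ", ratio_of_sums(samples))
```

With these made-up numbers the mean of the per-interval fractions is
about 0.2, while the aggregate loss fraction is 60/10200, under 1%. The
two lightly loaded intervals dominate the first figure even though
almost all of the traffic was carried loss-free.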