Re: [tsvwg] Extensive review of draft-ietf-tsvwg-circuit-breaker-05
gorry@erg.abdn.ac.uk Fri, 09 October 2015 02:56 UTC
Message-ID: <e6b05e949788b5f9cf8cf00c81aff0c8.squirrel@erg.abdn.ac.uk>
In-Reply-To: <56172149.1050307@bobbriscoe.net>
References: <5616376D.4010505@bobbriscoe.net> <561657D9.5040908@erg.abdn.ac.uk> <56172149.1050307@bobbriscoe.net>
Date: Fri, 09 Oct 2015 03:55:35 +0100
From: gorry@erg.abdn.ac.uk
To: Bob Briscoe <ietf@bobbriscoe.net>, tsvwg@ietf.org
Archived-At: <http://mailarchive.ietf.org/arch/msg/tsvwg/BVqyX_BLJXfth_ttCGGfZJsTDAU>
Cc: gorry@erg.abdn.ac.uk
Subject: Re: [tsvwg] Extensive review of draft-ietf-tsvwg-circuit-breaker-05
Thanks for the review. I can't respond to this at this time; I wasn't
expecting this review and will be away. Some parts I agree with, some
parts I don't. I'll get to it in a couple of weeks - I may not have
answers to a few of the things we avoided saying.

Gorry

> Gorry,
>
> Despite being past the WG stage, here's my review anyway. Consider this
> as an early response to IETF last-call.
>
> In general I support the intent of this draft, but I am concerned at the
> severity of the problems I have found with it given it is meant to be
> about to go to the IESG. I am particularly concerned that I have found
> numerous significant problems with the normative requirements section.
>
> Have you had a substantial review from anyone before this? The level of
> review comments on the tsvwg list seemed quite light - picking on issues
> of particular concern, but not seeming to review the draft as a whole.
>
> *1. Intro:*
> Congestion Collapse is a very specific case - CB is much more general.
> It is clear from the draft that a CB is intended to mitigate
> circumstances wider than solely the extreme case of congestion collapse.
> For instance: a large unresponsive aggregate contributing to a high
> level of congestion alongside congestion responsive traffic. This is
> nowhere near congestion collapse, but it would be an applicable case for
> a circuit-breaker. Congestion collapse is a specific well-defined
> process that involves a cascade of congestion as a sequence of queues
> fill in turn moving in the upstream direction. It is due to continual
> retries or additional load arriving faster than existing flows are
> departing. {Note 1}
>
> The introduction mentions that TCP-style cc is only an appropriate
> remedy when long flows dominate. The implication that CB could be used
> to deal with congestion induced by many short flows is a step too far,
> IMO.
> This problem has not even been discussed in the IETF or IRTF to my
> knowledge, let alone in the context of this draft. In 6.2 this draft
> all-but says that a CB is a solution to this problem. I strongly object
> to a BCP making that assertion. CB would be a very drastic and clumsy
> solution to that problem. {Note 2}
>
> It says that the timescale at which a circuit-breaker operates must be
> seconds or tens of seconds - much longer than the RTT timescale on which
> TCP, SCTP and DCCP react. This disregards an important type of
> application response to congestion; it must say that the timescale also
> has to be longer than the timescale on which certain real-time
> applications operate their own circuit-breakers, i.e. adapt down their
> codec rates, and eventually close the connection as a form of
> self-admission control. Applications operate per-flow circuit-breakers
> typically over the order of seconds or tens of seconds, so network CBs
> MUST take longer than that - I would say "no less than a minute".
>
> We MUST NOT discourage voluntary self-regulation by overriding it
> (end-to-end principle). I pick up this point later (comments on section
> 5.1), arguing that the fast-trip CB for RTP should be considered as an
> application CB, and a network CB should always take longer to trigger
> than these app CBs.
>
>
> *1.1 Types of CB*
>
> I saw criticism on the list of the use of the term "protect" in this
> section. Why hasn't it been changed? As the posting said, a CB does not
> protect the aggregate that it monitors; rather it /regulates/ the
> aggregate to protect the rest of the traffic that it is /not/ monitoring.
>
> *3.1 Functional Components.*
>
> There is no mention of the problem of synchronising the ingress and
> egress measurements to allow for transit time.
> Given you are trying to measure loss, which is a relatively small
> difference between the traffic entering and leaving, you can get very
> bad errors if you don't take path delay into account.
> draft-ietf-tsvwg-tunnel-congestion-feedback
> describes a nice (and commonly used) stateless way of doing that, by
> sending the ingress measurement in-band to the egress, which triggers
> the egress measurement so they are synchronized, allowing for transit
> time. Then the egress can send them both back to the ingress to be
> compared and acted on.
>
> *4. Reqs*
>
>     There MUST be a control path from the ingress meter and the egress
>     meter to the point of measurement. The Circuit Breaker MUST
>     trigger if this control path fails.
>
> Either this is unclear terminology, or I strongly disagree. What do you
> mean by a control path? We should only recommend that the CB triggers
> due to lack of measurement signals if the measurement signals are
> carried in-band with the data being monitored. That is only one way of
> arranging the mechanism. The term control path sounds like it is out of
> band. If the measurement signals are out of band, the CB MUST NOT
> trigger due to lack of measurement signals. I would recommend the
> in-band method, but there are plenty of network designers who will want
> to do this in centralised out of band ways, so we have to cater for that
> way of thinking (even though it's misguided).
>
>     The measurement period MUST be longer than the time that current
>     Congestion Control algorithms need to reduce their rate following
>     detection of congestion.
>
> This needs to be rewritten. Or just removed. It seems like ideas changed
> after it was written, and the end was changed but not the normative
> statement at the beginning. IMO, the measurement period can be
> arbitrarily short, as long as multiple measurements are combined before
> triggering the CB.
> It talks about unnecessarily penalizing long RTT
> flows, but the measurement period is nothing to do with the period
> before there is any penalization (defined later as the triggering
> interval). There is no problem with short measurement periods as long as
> any high congestion measured in these periods is averaged over all the
> measurement periods in the triggering interval.
>
> In fact, there should be many measurement intervals per trigger
> interval, so that there are many opportunities for measurement messages
> to get through. Otherwise if there are only one or two measurement
> periods per trigger interval, the possibility of a false trigger due to
> lost control signals becomes too great.
>
>     o A Circuit Breaker is REQUIRED to define a threshold to determine
>       whether the measured congestion is considered excessive.
>
>     o A Circuit Breaker is REQUIRED to define the triggering interval,
>
> A perfectly good CB could vary the trigger interval and threshold
> depending on how rapidly congestion is rising, or how high its absolute
> level is. Indeed one could say it is actually wrong to define a single
> threshold or a single interval, so these normative statements are overly
> restrictive and preclude designs that are smarter than just a simple
> fixed threshold.
>
> Also, see comment above about allowing time for application CBs, and
> suggesting one minute minimum.
>
>     o A Circuit Breaker SHOULD be constructed so that it does not
>       trigger under light or intermittent congestion, with a default
>       response to a trigger that disables all traffic that contributed
>       to congestion.
>
> The second half after the comma seems misplaced. If it does not trigger,
> why does the sentence go on to talk about disabling all traffic that
> contributed to congestion (which is what an /enabled/ trigger would do)?
>
>     A reaction that results in a reduction SHOULD result in
>     reducing the traffic by at least a factor of ten,
>
> What evidence have you got for this 10% number?
> It seems utterly
> inappropriate to write a number here. The number depends on what
> proportion of the traffic on the path between ingress and egress is
> regulated by the CB. If the proportion is low, it needs to reduce by a
> lot to make sufficient space for other traffic. If the proportion is
> high relative to other traffic, it might be sufficient to reduce by 5%
> to 95% of the previous load. If the tunnel traffic represented say 80%
> of the load on the path, and it reduced by a factor of 10, that would
> leave 92% of the path for other traffic, which might be unnecessarily
> much greater than the normal proportion used by other traffic.
>
>     Manual operator
>     intervention will usually be required to restore a flow.
>
> This sentence should be toned down to possibly, not usually. A human is
> no more capable than a machine is of bringing together all the necessary
> measurements to decide what other courses of action might be possible,
> and when to release the brakes. I suggest the last para of 5.3.1 starting:
>
>     "An operator-based response provides opportunity..."
>
> is more appropriate here, and doesn't really fit where it is.
>
> Section 4.1 contains no requirements text, only examples. It ought to be
> moved from the normative requirements section to section 5 (Examples).
>
>
> *5. Examples:*
>
> *5.1.1 Fast-Trip CB for RTP*
>
> The draft needs to make the distinction between an application doing its
> own circuit breaking vs. functions on the path between the application
> endpoints (even if in the hosts) doing CB. The extremely important
> distinction is:
> 1a) an app knows when congestion is too high for it to work properly
> 1b) functions under the app can only infer congestion is possibly too
>     high for most apps to work properly
> 2a) an app may be able to reduce the rate at which it sends data
> 2b) a function under an app can only discard data, not remove it at
>     source.
>
> I believe that the requirements in section 4 do not apply to
> application-controlled circuit-breakers. So, I would not include the
> "Fast-Trip CB for RTP" as an example of a /network/ transport CB.
>
> As the requirements say, a network CB should never fast trip.
> By misclassifying RTP CBs as network CBs, you've allowed the timescale
> for network CBs to trigger after tens of seconds, when a network CB
> should allow app CBs this long to trigger themselves (as I said earlier).
>
>
> *Missing examples:*
>
> * You might want to point to the flow termination function (as opposed
>   to admission control) in the PCN architecture [RFC5559], which is
>   precisely a network CB. It was precisely developed for cases where
>   failures caused traffic to reroute onto a previously well-provisioned
>   path (see 6.1).
> * Andrew McGregor gave the examples of Google's BwE (bandwidth enforcer)
>   and B4, but you haven't referred to them. Given they are a documented
>   existence proof of this beast, that seems remiss.
>
> *7. Security Consid's*
>
>     The circuit breaker MUST be designed to be robust to packet loss that
>     can also be experienced during congestion/overload.
>
>
> This implies reliable transmission - i.e. retransmit for ever until
> acknowledged. This is NOT a good idea. In
> draft-ietf-tsvwg-tunnel-congestion-feedback we propose using SCTP
> partially reliable transport. Then if congestion causes messages to be
> lost, they don't have to be retransmitted if there are insufficient
> resources (thus not risking contributing to congestion collapse - and
> here I use the phrase correctly). Because they transmit counters, the
> missing counter values do not matter. This is the tried-and-tested
> message delivery approach used for IPFIX. The messages can still be
> given priority, but should not be retransmitted.
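
[Editorial sketch, not part of the original review.] The point that
missing counter values do not matter can be illustrated as below. This
assumes each report carries cumulative ingress and egress byte counters
plus a sequence number; the names Report and LossEstimator are
hypothetical and are not taken from the draft, from IPFIX, or from
draft-ietf-tsvwg-tunnel-congestion-feedback.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Report:
    """Hypothetical measurement report carrying cumulative counters."""
    seq: int            # monotonically increasing report number
    ingress_bytes: int  # cumulative bytes counted at the ingress
    egress_bytes: int   # cumulative bytes counted at the egress


class LossEstimator:
    """Estimates the loss fraction between consecutive *received* reports."""

    def __init__(self) -> None:
        self.last: Optional[Report] = None

    def update(self, report: Report) -> Optional[float]:
        prev = self.last
        # Ignore duplicates and reports that arrive out of order.
        if prev is not None and report.seq <= prev.seq:
            return None
        self.last = report
        if prev is None:
            return None  # first report: nothing to compare against yet
        sent = report.ingress_bytes - prev.ingress_bytes
        received = report.egress_bytes - prev.egress_bytes
        if sent <= 0:
            return None  # no traffic in the interval
        # A lost report only widens the interval; no retransmission needed.
        return (sent - received) / sent


if __name__ == "__main__":
    est = LossEstimator()
    est.update(Report(seq=1, ingress_bytes=1000, egress_bytes=1000))
    # Report 2 is lost in transit and never retransmitted.
    print(est.update(Report(seq=3, ingress_bytes=3000, egress_bytes=2800)))
```

Because the counters are cumulative, the estimate after report 3 covers
the whole span since report 1, so the lost report 2 costs time resolution
but not accuracy.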
>
>     Simple protection can be provided by using a
>     randomized source port, or equivalent field in the packet header
>     (such as the RTP SSRC value and the RTP sequence number) expected not
>     to be known to an off-path attacker.
>
> I think the draft should recommend that for most scenarios, randomized
> ports will be insufficient protection for CB control messages, which
> should be properly cryptographically authenticated. Otherwise, a
> CB-controlled aggregate is too vulnerable to these off-path attacks.
>
> *Gap #1:*
> The draft seems to think it is so obvious what a CB should measure
> that it only says it vaguely as "the level of congestion", and only
> suggests the difference between ingress and egress counters as an
> example. Some readers might well think like this: Does congestion level
> mean the percentage extra bit-rate relative to the aggregate's expected
> or maximum bit-rate? That might actually be a correct measure of
> congestion in some scenarios, but...
>
> The draft does not say that the congestion level is defined as dropped
> bytes divided by ingress bytes. The draft should spell out that a CB
> should measure the volume of bytes dropped and the volume of ECN-capable
> bytes marked with CE, and express these as a fraction of, respectively,
> total ingress non-ECT bytes and total ingress ECT bytes (assuming buffers
> within the scope of the CB are ECN-enabled). Even this is problematic,
> because the assumption in parentheses never holds, particularly during
> excessive congestion. It could also discuss the relative merit of
> measuring the percentage of packets dropped/marked instead of bytes.
>
> Also it should mention that care should be taken over how to combine the
> measurements. For instance avoid the common mistake of averaging
> fractions, because ave(c1/t1, c2/t2, c3/t3 ...) != (c1 + c2 + c3)/(t1 +
> t2 + t3).
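
[Editorial sketch, not part of the original review.] The averaging
mistake can be made concrete as follows, taking each measurement period
to yield a (dropped_bytes, ingress_bytes) pair. The function names and
the sample numbers are made up for illustration only.

```python
from typing import Sequence, Tuple

# One (dropped_bytes, ingress_bytes) pair per measurement period.
Sample = Tuple[int, int]


def mean_of_ratios(samples: Sequence[Sample]) -> float:
    """The mistaken combination: average the per-period loss fractions."""
    ratios = [c / t for c, t in samples if t > 0]
    return sum(ratios) / len(ratios) if ratios else 0.0


def ratio_of_sums(samples: Sequence[Sample]) -> float:
    """Total dropped bytes over total ingress bytes across all periods."""
    total_dropped = sum(c for c, _ in samples)
    total_ingress = sum(t for _, t in samples)
    return total_dropped / total_ingress if total_ingress > 0 else 0.0


if __name__ == "__main__":
    # Two nearly idle periods with high loss, one busy period with none.
    samples = [(10, 100), (0, 10_000), (50, 100)]
    print("mean of ratios:", mean_of_ratios(samples))
    print("ratio of sums: ", ratio_of_sums(samples))
```

With these numbers the mean of ratios is about 20%, while only 60 of
10,200 bytes (under 1%) were actually dropped, because the two nearly
idle periods are weighted the same as the busy one.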
>
> *Gap #2:*
> All the diags show multiple routers, but the text says congestion can
> be measured by comparing ingress and egress traffic. Nowhere does it say
> that only traffic with addressing that will have for-certain only passed
> through both ends should be measured.
>
>
> {Note 1}: A few years ago I dug deep into the history surrounding the
> early congestion collapses on the Internet and found that those involved
> were adamant that the term congestion collapse should not be waved
> around for dramatic effect, because it has a very specific definition,
> as paraphrased above.
>
> {Note 2}: The credit feature of ConEx was intended to address short-flow
> overload if it becomes a problem. Don't get me wrong; I'm not objecting
> to the use of CBs for the short-flow problem because I want you to use
> my solution. I'm just using this as an example of a fine-grained way to
> solve the problem, rather than the sledge-hammer CB way.
> Here's the intuition briefly: With ConEx, you have to attach 'congestion
> credit' to the first packets of a flow to cover the risk of congestion
> before you have feedback (and if you don't and there is congestion, your
> packets are dropped by an audit function). Then congestion policers at
> the network ingress can limit the amount of congestion credit consumed
> without needing feedback, and thin out traffic if it consists of large
> numbers of short flows. If short flows come to predominate, ConEx credit
> was also designed to incentivize a new form of proxy that could regulate
> short-flows with a push-back style of congestion control, without a full
> feedback loop. That would be far preferable to such a drastic measure as
> a circuit-breaker. This aspect of ConEx was not written into the IETF
> docs, but it is mentioned in the re-ECN drafts that were the ancestors
> of ConEx.
>
>
>
> *Nits*
>
> 3.
> s/last resort protection to the network paths that these are used./
>  /last resort protection to the traffic sharing their network path./
>
> s/tunnels encapsulations/
>  /tunnel encapsulations/
>
> 3. What makes a good CB?
>
>     Circuit Breakers are RECOMMENDED for IETF protocols and tunnels that
>     carry non-congestion-controlled Internet flows and for traffic
>     aggregates, e.g., traffic sent using a network tunnel.
>
> Delete "e.g., traffic sent using a network tunnel"
> Reason: this implies all network tunnels are problematic, whereas the
> rest of the sentence adequately says that only tunnels carrying
> non-congestion controlled flows are of concern.
>
> 4.
>
> s/monitor the level congestion/
>  /monitor the level of congestion/
>
> 4.1.1
> (e.g. to implement a Section 5.1)
> ?
>
> 4.1.2
> s/pre-prosvisioned/
>  /pre-provisioned/
>
> 6.1
>
>     One common question is whether a Circuit Breaker is needed when a
>     tunnel is deployed in a private network with pre-provisioned
>     capacity?
>
> Remove '?' from the end.
>
> 6.2
>
> s/in the event that persistent congestion occur./
>  /in the event that persistent congestion occurs./
>
>
>
> Regards
>
>
>
> Bob
>
>
> On 08/10/15 12:47, Gorry Fairhurst wrote:
>>> [Gorry, I also have to deliver on my promise on a paragraph for
>>> circuit-breaker. Do you have a deadline for that?]
>>>
>> The circuit-breaker ID is pending start of IETF last call, the
>> deadline for doing an author rev passed, sorry.
>>
>> --
>> ________________________________________________________________
>> Bob Briscoe                               http://bobbriscoe.net/
>