Re: [tsvwg] Extensive review of draft-ietf-tsvwg-circuit-breaker-05

gorry@erg.abdn.ac.uk Fri, 09 October 2015 02:56 UTC

Message-ID: <e6b05e949788b5f9cf8cf00c81aff0c8.squirrel@erg.abdn.ac.uk>
In-Reply-To: <56172149.1050307@bobbriscoe.net>
References: <5616376D.4010505@bobbriscoe.net> <561657D9.5040908@erg.abdn.ac.uk> <56172149.1050307@bobbriscoe.net>
Date: Fri, 09 Oct 2015 03:55:35 +0100
From: gorry@erg.abdn.ac.uk
To: Bob Briscoe <ietf@bobbriscoe.net>, tsvwg@ietf.org
Archived-At: <http://mailarchive.ietf.org/arch/msg/tsvwg/BVqyX_BLJXfth_ttCGGfZJsTDAU>
Cc: gorry@erg.abdn.ac.uk
Subject: Re: [tsvwg] Extensive review of draft-ietf-tsvwg-circuit-breaker-05

Thanks for the review.

I can't respond to this right now; I wasn't expecting this review
and will be away. Some parts I agree with, some parts I don't. I'll get to
it in a couple of weeks, though I may not have answers to a few of the
things we avoided saying.

Gorry

> Gorry,
>
> Despite being past the WG stage, here's my review anyway. Consider this
> an early response to IETF last call.
>
> In general I support the intent of this draft, but I am concerned at the
> severity of the problems I have found with it given it is meant to be
> about to go to the IESG. I am particularly concerned that I have found
> numerous significant problems with the normative requirements section.
>
> Have you had a substantial review from anyone before this? The level of
> review comments on the tsvwg list seemed quite light - picking on issues
> of particular concern, but not seeming to review the draft as a whole.
>
> *1. Intro:*
> Congestion Collapse is a very specific case - CB is much more general.
> It is clear from the draft that a CB is intended to mitigate
> circumstances wider than solely the extreme case of congestion collapse.
> For instance: a large unresponsive aggregate contributing to a high
> level of congestion alongside congestion responsive traffic. This is
> nowhere near congestion collapse, but it would be an applicable case for
> a circuit-breaker. Congestion collapse is a specific well-defined
> process that involves a cascade of congestion as a sequence of queues
> fill in turn moving in the upstream direction. It is due to continual
> retries or additional load arriving faster than existing flows are
> departing. {Note 1}
>
> The introduction mentions that TCP-style cc is only an appropriate
> remedy when long flows dominate. The implication that CB could be used
> to deal with congestion induced by many short flows is a step too far,
> IMO. This problem has not even been discussed in the IETF or IRTF to my
> knowledge, let alone in the context of this draft. In 6.2 this draft
> all-but says that a CB is a solution to this problem. I strongly object
> to a BCP making that assertion. CB would be a very drastic and clumsy
> solution to that problem.{Note 2}
>
> It says that the timescale at which a circuit-breaker operates must be
> seconds or tens of seconds - much longer than the RTT timescale on which
> TCP, SCTP and DCCP react. This disregards an important type of
> application response to congestion; it must say that the timescale also
> has to be longer than the timescale on which certain real-time
> applications operate their own circuit-breakers i.e. adapt down their
> codec rates, and eventually close the connection as a form of
> self-admission control. Applications operate per-flow circuit-breakers
> typically over the order of seconds or tens of seconds, so network CBs
> MUST take longer than that - I would say "no less than a minute".
>
> We MUST NOT discourage voluntary self-regulation by overriding it
> (end-to-end principle). I pick up this point later (comments on section
> 5.1), arguing that the fast-trip CB for RTP should be considered as an
> application CB, and a network CB should always take longer to trigger
> than these app CBs.
>
>
> *1.1 Types of CB*
>
> I saw criticism on the list of the use of the term "protect" in this
> section. Why hasn't it been changed? As the posting said, a CB does not
> protect the aggregate that it monitors; rather it /regulates/ the
> aggregate to protect the rest of the traffic that it is /not/ monitoring.
>
> *3.1 Functional Components*
>
> There is no mention of the problem of synchronising the ingress and
> egress measurements to allow for transit time. Given you are trying to
> measure loss, which is a relatively small difference between the traffic
> entering and leaving, you can get very bad errors if you don't take path
> delay into account. draft-ietf-tsvwg-tunnel-congestion-feedback
> describes a nice (and commonly used) stateless way of doing that, by
> sending the ingress measurement in-band to the egress, which triggers
> the egress measurement so they are synchronized; allowing for transit
> time. Then the egress can send them both back to the ingress to be
> compared and acted on.
>
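
The in-band synchronisation described above can be sketched as follows. This is only a minimal illustration under the assumption that the path preserves packet order; it is not the mechanism as specified in draft-ietf-tsvwg-tunnel-congestion-feedback, and the names `Marker`, `Ingress`, `Egress` and `lost_bytes` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Marker:
    """In-band measurement message carrying the ingress byte count (hypothetical format)."""
    ingress_bytes: int


class Ingress:
    def __init__(self):
        self.sent_bytes = 0

    def send(self, size):
        # A data packet is represented here only by its size in bytes.
        self.sent_bytes += size
        return size

    def make_marker(self):
        # Snapshot the cumulative count and send it along the same path as the data.
        return Marker(self.sent_bytes)


class Egress:
    def __init__(self):
        self.received_bytes = 0

    def receive(self, item):
        if isinstance(item, Marker):
            # The marker's arrival triggers the egress snapshot, so both counts
            # cover the same set of packets regardless of transit time.
            return (item.ingress_bytes, self.received_bytes)
        self.received_bytes += item
        return None


def lost_bytes(prev, curr):
    """Bytes lost between two (ingress, egress) snapshot pairs."""
    return (curr[0] - prev[0]) - (curr[1] - prev[1])
```

The egress would then return each snapshot pair to the ingress, which compares consecutive pairs with `lost_bytes` and acts on the result.
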
> *4. Reqs*
>
>        There MUST be a control path from the ingress meter and the egress
>        meter to the point of measurement.  The Circuit Breaker MUST
>        trigger if this control path fails.
>
> Either this is unclear terminology, or I strongly disagree. What do you
> mean by a control path? We should only recommend that the CB triggers
> due to lack of measurement signals if the measurement signals are
> carried in-band with the data being monitored. That is only one way of
> arranging the mechanism. The term control path, sounds like it is out of
> band. If the measurement signals are out of band, the CB MUST NOT
> trigger due to lack of measurement signals. I would recommend the
> in-band method, but there are plenty of network designers who will want
> to do this in centralised out of band ways, so we have to cater for that
> way of thinking (even though it's misguided).
>
>        The measurement period MUST be longer than the time that current
>        Congestion Control algorithms need to reduce their rate following
>        detection of congestion.
>
> This needs to be rewritten. Or just removed. It seems like ideas changed
> after it was written, and the end was changed but not the normative
> statement at the beginning. IMO, the measurement period can be
> arbitrarily short, as long as multiple measurements are combined before
> triggering the CB. It talks about unnecessarily penalizing long RTT
> flows, but the measurement period has nothing to do with the period
> before there is any penalization (defined later as the triggering
> interval). There is no problem with short measurement periods as long as
> any high congestion measured in these periods is averaged over all the
> measurement periods in the triggering interval.
>
> In fact, there should be many measurement intervals per trigger
> interval, so that there are many opportunities for measurement messages
> to get through. Otherwise if there are only one or two measurement
> periods per trigger interval, the possibility of a false trigger due to
> lost control signals becomes too great.
>
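
A rough sketch of the point about combining many short measurement periods within one triggering interval. All names and parameter values here are hypothetical. The sketch also assumes that an interval with too few surviving reports does not trip the breaker, which corresponds to the out-of-band case argued above; an in-band design might choose the opposite.

```python
from collections import deque


class WindowedBreaker:
    """Trips only on the loss fraction aggregated over a whole triggering interval."""

    def __init__(self, periods_per_interval, loss_threshold, min_reports):
        self.window = deque(maxlen=periods_per_interval)
        self.loss_threshold = loss_threshold
        self.min_reports = min_reports

    def end_of_period(self, report):
        """report is (dropped_bytes, ingress_bytes), or None if the report was lost."""
        self.window.append(report)
        if len(self.window) < self.window.maxlen:
            return False  # a full triggering interval has not yet elapsed
        got = [r for r in self.window if r is not None]
        if len(got) < self.min_reports:
            return False  # too few reports to judge (assumed policy, see lead-in)
        dropped = sum(d for d, _ in got)
        ingress = sum(t for _, t in got)
        # Aggregate bytes over the interval rather than reacting to any one period.
        return ingress > 0 and dropped / ingress > self.loss_threshold
```

With six periods per interval, a single period of heavy loss or a single lost report does not trip the breaker by itself, whereas sustained loss above the threshold does.
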
>     o  A Circuit Breaker is REQUIRED to define a threshold to determine
>        whether the measured congestion is considered excessive.
>
>     o  A Circuit Breaker is REQUIRED to define the triggering interval,
>
> A perfectly good CB could vary the trigger interval and threshold
> depending on how rapidly congestion is rising, or how high its absolute
> level is. Indeed one could say it is actually wrong to define a single
> threshold or a single interval, so these normative statements are overly
> restrictive and preclude designs that are smarter than just simple fixed
> threshold.
>
> Also, see comment above about allowing time for application CBs, and
> suggesting one minute minimum.
>
>     o  A Circuit Breaker SHOULD be constructed so that it does not
>        trigger under light or intermittent congestion, with a default
>        response to a trigger that disables all traffic that contributed
>        to congestion.
>
> The second half after the comma seems misplaced. If it does not trigger,
> why does the sentence go on to talk about disabling all traffic that
> contributed to congestion (which is what an /enabled/ trigger would do)?
>
>        A reaction that results in a reduction SHOULD result in
>        reducing the traffic by at least a factor of ten,
>
> What evidence have you got for this factor-of-ten number? It seems utterly
> inappropriate to write a number here. The number depends on what
> proportion of the traffic on the path between ingress and egress is
> regulated by the CB. If the proportion is low, it needs to reduce by a
> lot to make sufficient space for other traffic. If the proportion is
> high relative to other traffic, it might be sufficient to reduce by 5%
> to 95% of the previous load. If the tunnel traffic represented say 80%
> of the load on the path, and it reduced by a factor of 10, that would
> leave 92% of the path for other traffic, which might be unnecessarily
> far above the proportion normally used by other traffic.
>
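
The arithmetic in the paragraph above can be checked with a small sketch. The function names are hypothetical, and shares are expressed as fractions of the total load on the path before the reduction.

```python
def remaining_cb_share(cb_share, reduction_factor):
    """Fraction of the previous path load still used by the regulated aggregate after the cut."""
    return cb_share / reduction_factor


def freed_share(cb_share, reduction_factor):
    """Fraction of the previous path load freed by cutting the regulated aggregate."""
    return cb_share * (1.0 - 1.0 / reduction_factor)
```

For an aggregate carrying 80% of the load, a factor-of-ten cut leaves it with 8%, so about 92% of the path is left for other traffic. For an aggregate carrying only 5% of the load, the same cut frees just 4.5% of the path, which is why a single fixed factor is hard to justify.
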
>        Manual operator
>        intervention will usually be required to restore a flow.
>
> This sentence should be toned down to possibly, not usually. A human is
> no more capable than a machine is of bringing together all the necessary
> measurements to decide what other courses of action might be possible,
> and when to release the brakes. I suggest the last para of 5.3.1 starting:
>
> "An operator-based response provides opportunity..."
>
> is more appropriate here, and doesn't really fit where it is.
>
> Section 4.1 contains no requirements text, only examples. It ought to be
> moved from the normative requirements section to section 5 (Examples).
>
>
> *5. Examples:*
>
> *5.1.1 Fast-Trip CB for RTP*
>
> The draft needs to make the distinction between an application doing its
> own circuit breaking vs. functions on the path between the application
> endpoints (even if in the hosts) doing CB. The extremely important
> distinction is:
> 1a) an app knows when congestion is too high for it to work properly
> 1b) functions under the app can only infer congestion is possibly too
> high for most apps to work properly
> 2a) an app may be able to reduce the rate at which it sends data
> 2b) a function under an app can only discard data, not remove it at
> source.
>
> I believe that the requirements in section 4 do not apply to
> application-controlled circuit-breakers. So, I would not include the
> "Fast-Trip CB for RTP" as an example of a /network/ transport CB.
>
> As the requirements say, a network CB should never fast trip.
> By misclassifying RTP CBs as network CBs, you've allowed network CBs
> to trigger after only tens of seconds, when a network CB should allow
> app CBs this long to trigger themselves (as I said earlier).
>
>
> *Missing examples:*
>
> * You might want to point to the flow termination function (as opposed
> to admission control) in the PCN architecture [RFC5559], which is
> precisely a network CB. It was developed precisely for cases where
> failures caused traffic to reroute onto a previously well-provisioned
> path (see 6.1).
> * Andrew McGregor gave the examples of Google's BwE (bandwidth enforcer)
> and B4, but you haven't referred to them. Given they are a documented
> existence proof of this beast, that seems remiss.
>
> *7. Security Consid's*
>
>     The circuit breaker MUST be designed to be robust to packet loss that
>     can also be experienced during congestion/overload.
>
>
> This implies reliable transmission - i.e. retransmit for ever until
> acknowledged. This is NOT a good idea. In
> draft-ietf-tsvwg-tunnel-congestion-feedback we propose using SCTP partially
> reliable transport. Then if congestion causes messages to be lost, they
> don't have to be retransmitted if there are insufficient resources (thus
> not risking contributing to congestion collapse - and here I use the
> phrase correctly). Because they transmit counters, the missing counter
> values do not matter. This is the tried-and-tested message delivery
> approach used for IPFIX. The messages can still be given priority, but
> should not be retransmitted.
>
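
A small sketch of why a lost report need not be retransmitted when messages carry counters. It assumes the counters are cumulative byte counts and ignores counter wrap-around; the function name and the sample values are hypothetical.

```python
def loss_between(older, newer):
    """Loss fraction between any two reports of cumulative (ingress_bytes, egress_bytes)."""
    sent = newer[0] - older[0]
    received = newer[1] - older[1]
    return (sent - received) / sent if sent > 0 else 0.0


# Four consecutive reports; suppose the third one is lost in transit.
reports = [(0, 0), (1000, 990), (2000, 1900), (3000, 2800)]
# Comparing the second and fourth reports still covers everything in between.
loss_over_gap = loss_between(reports[1], reports[3])
```

The later report subsumes the information in the missing one, so the only cost of the loss is coarser time resolution.
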
>     Simple protection can be provided by using a
>     randomized source port, or equivalent field in the packet header
>     (such as the RTP SSRC value and the RTP sequence number) expected not
>     to be known to an off-path attacker.
>
> I think the draft should recommend that for most scenarios, randomized
> ports will be insufficient protection for CB control messages, which
> should be properly cryptographically authenticated. Otherwise, a
> CB-controlled aggregate is too vulnerable to these off-path attacks.
>
> *Gap #1:*
> The draft seems to think it is so obvious what a CB should measure
> that it only says it vaguely as "the level of congestion", and only
> suggests the difference between ingress and egress counters as an
> example. Some readers might well think like this: Does congestion level
> mean the percentage extra bit-rate relative to the aggregate's expected
> or maximum bit-rate? That might actually be a correct measure of
> congestion in some scenarios, but...
>
> The draft does not say that the congestion level is defined as dropped
> bytes divided by ingress bytes. The draft should spell out that a CB
> should measure the volume of bytes dropped and the volume of ECN-capable
> bytes marked with CE, and express these as a fraction of resp. total
> ingress non-ECT bytes and total ingress ECT bytes (assuming buffers
> within the scope of the CB are ECN-enabled). Even this is problematic,
> because the assumption in parentheses never holds, particularly during
> excessive congestion. It could also discuss the relative merit of
> measuring the percentage of packets dropped/marked instead of bytes.
>
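
The definition proposed above could be written down roughly as follows. This is a sketch only, under the assumption stated in the text (buffers within the scope of the CB are ECN-enabled, so ECT traffic is marked rather than dropped), which the text itself notes is problematic. The function and parameter names are hypothetical.

```python
def congestion_levels(dropped_bytes, ingress_not_ect_bytes,
                      ce_marked_bytes, ingress_ect_bytes):
    """Return (drop fraction of non-ECT bytes, CE-mark fraction of ECT bytes)."""
    drop_fraction = (dropped_bytes / ingress_not_ect_bytes
                     if ingress_not_ect_bytes else 0.0)
    mark_fraction = (ce_marked_bytes / ingress_ect_bytes
                     if ingress_ect_bytes else 0.0)
    return drop_fraction, mark_fraction
```

A packet-based variant would have the same shape, with packet counts in place of byte volumes.
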
> Also it should mention that care should be taken over how to combine the
> measurements. For instance avoid the common mistake of averaging
> fractions, because ave(c1/t1, c2/t2, c3/t3, ...) != (c1 + c2 + c3 + ...)/
> (t1 + t2 + t3 + ...).
>
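
The difference between the two ways of combining measurements is easy to see numerically. In this sketch each sample is a hypothetical (congested_bytes, total_bytes) pair for one measurement period; the names and values are illustrative only.

```python
def mean_of_ratios(samples):
    """Average of per-period fractions: every period counts equally, however little it carried."""
    return sum(c / t for c, t in samples) / len(samples)


def ratio_of_sums(samples):
    """Total congested bytes over total bytes: each period is weighted by its volume."""
    return sum(c for c, _ in samples) / sum(t for _, t in samples)


# One nearly idle period with 50% loss, then two busy periods with 1% loss.
samples = [(5, 10), (10, 1000), (10, 1000)]
```

Here `mean_of_ratios(samples)` comes out above 17%, while `ratio_of_sums(samples)` is a little over 1%, because the nearly idle period dominates the simple average.
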
> *Gap #2:*
> All the diags show multiple routers, but the text says congestion can
> be measured by comparing ingress and egress traffic. Nowhere does it say
> that only traffic whose addressing guarantees it has passed through
> both ends should be measured.
>
>
> {Note 1}: A few years ago I dug deep into the history surrounding the
> early congestion collapses on the Internet and found that those involved
> were adamant that the term congestion collapse should not be waved
> around for dramatic effect, because it has a very specific definition,
> as paraphrased above.
>
> {Note 2}: The credit feature of ConEx was intended to address short-flow
> overload if it becomes a problem. Don't get me wrong; I'm not objecting
> to the use of CBs for the short-flow problem because I want you to use
> my solution. I'm just using this as an example of a fine-grained way to
> solve the problem, rather than the sledge-hammer CB way.
> Here's the intuition briefly: With ConEx, you have to attach 'congestion
> credit' to the first packets of a flow to cover the risk of congestion
> before you have feedback (and if you don't and there is congestion, your
> packets are dropped by an audit function). Then congestion policers at
> the network ingress can limit the amount of congestion credit consumed
> without needing feedback, and thin out traffic if it consists of large
> numbers of short flows. If short flows come to predominate, ConEx credit
> was also designed to incentivize a new form of proxy that could regulate
> short-flows with a push-back style of congestion control, without a full
> feedback loop. That would be far preferable to such a drastic measure as
> a circuit-breaker. This aspect of ConEx was not written into the IETF
> docs, but it is mentioned in the re-ECN drafts that were the ancestors
> of ConEx.
>
>
>
> *Nits*
>
> 3.
> s/last resort protection to the network paths that these are used./
>   /last resort protection to the traffic sharing their network path./
>
> s/tunnels encapsulations/
>   /tunnel encapsulations/
>
> 3. What makes a good CB?
>
>     Circuit Breakers are RECOMMENDED for IETF protocols and tunnels that
>     carry non-congestion-controlled Internet flows and for traffic
>     aggregates, e.g., traffic sent using a network tunnel.
>
> Delete "
>
> e.g., traffic sent using a network tunnel
>
> "
> Reason: this implies all network tunnels are problematic, whereas the
> rest of the sentence adequately says that only tunnels carrying
> non-congestion controlled flows are of concern.
>
> 4.
>
> s/monitor the level congestion/
>   /monitor the level of congestion/
>
> 4.1.1
> (e.g. to implement a Section 5.1)
> ?
>
> 4.1.2
> s/pre-prosvisioned/
>   /pre-provisioned/
>
> 6.1
>
>     One common question is whether a Circuit Breaker is needed when a
>     tunnel is deployed in a private network with pre-provisioned
>     capacity?
>
> Remove '?' from the end.
>
> 6.2
>
> s/in the event that persistent congestion occur./
>   /in the event that persistent congestion occurs./
>
>
>
> Regards
>
>
>
> Bob
>
>
> On 08/10/15 12:47, Gorry Fairhurst wrote:
>>> [Gorry, I also have to deliver on my promise on a paragraph for
>>> circuit-breaker. Do you have a deadline for that?]
>>>
>> The circuit-breaker ID is pending start of IETF last call, the
>> deadline for doing an author rev passed, sorry.
>>
>> --
>> ________________________________________________________________
>> Bob Briscoe                               http://bobbriscoe.net/
>