[tsvwg] draft-ietf-tsvwg-circuit-breaker-05 & draft-ietf-tsvwg-tunnel-congestion-feedback-00

"Black, David" <david.black@emc.com> Wed, 04 November 2015 22:50 UTC

From: "Black, David" <david.black@emc.com>
To: Bob Briscoe <ietf@bobbriscoe.net>, "tsvwg@ietf.org" <tsvwg@ietf.org>
Date: Wed, 04 Nov 2015 22:48:58 +0000
Archived-At: <http://mailarchive.ietf.org/arch/msg/tsvwg/1no9LSFh-aGFUXe1C3BGPw6cArU>
Subject: [tsvwg] draft-ietf-tsvwg-circuit-breaker-05 & draft-ietf-tsvwg-tunnel-congestion-feedback-00

Bob,

As draft shepherd, let me start with the draft relationship concern:

> 2. Scope: Measurement mechanism vs. response behaviour

> Secondly, there is a huge question-mark over the scope of the draft. It ranges over the same territory as another draft already
> adopted as tsvwg chartered work:
>    draft-ietf-tsvwg-tunnel-congestion-feedback
> The circuit-breaker draft does not even reference tunnel-congestion-feedback. However, the two drafts should be totally complementary.

I believe the drafts are complementary, but I think the relationship
between the drafts is:

a) The tunnel congestion feedback draft is a measurement mechanism draft.
Those measurements can be used for multiple purposes.

b) One of those multiple purposes is to feed a circuit breaker of some
sort.  That is one of many possible uses of those measurements, and
definitely *not* the only use.  I believe that prior discussion of
the tunnel congestion feedback draft is consistent with this (i.e.,
it is not solely for circuit breakers).

c) The circuit breaker draft is about how to use measurements that may
or may not come from the tunnel congestion feedback draft (in particular
that's unlikely to be used for a Managed Circuit Breaker) for a specific
purpose.  All of the measurements involved can be used for other purposes.

So, I think reducing the scope of the circuit breaker draft to exclude
all mention of measurements is inappropriate - as a consumer of measurements
(item c), it seems appropriate for the circuit breaker draft to discuss
the necessary and desirable characteristics of those measurements.

In my "copious spare time" this morning, I will put a/b/c above onto a
slide for discussion in the tsvwg meeting today.

> There is material in circuit-breaker about the measurement and feedback mechanism that contradicts that in tunnel-congestion-feedback

Please cite specific text that needs attention, preferably with respect
to the above a/b/c relationship.  There is more work needed on the
circuit breaker draft for other reasons, some of which were discussed
in the meeting Monday, so there is an opportunity to revise text.

Thanks,
--David

From: Bob Briscoe [mailto:ietf@bobbriscoe.net]
Sent: Sunday, November 01, 2015 8:27 PM
To: gorry@erg.abdn.ac.uk; tsvwg@ietf.org; Black, David
Subject: Re: Response to the detailed review: draft-ietf-tsvwg-circuit-breaker-05

Gorry,

I've been taking a step back since I reviewed the detail of this draft. I was carried along by the narrative of the draft, and didn't think about the draft as a whole.

I think there was also an ambiguity about the definition of 'flow' in the draft, that I took one way (=microflow). Rereading, I think you mostly meant the other way (=(aggregate OR microflow)). In the latter case I would be very negative about the draft (whereas before I said I was supportive). In the following, I use "flow-termination" with the former meaning; I use "circuit-breaker" with the latter definition.

1. Circuit-breaking of real-time traffic without regard to microflows considered (very) harmful

1a. The main problem: the transport area sends out totally the wrong message if it offers a network circuit-breaker as its apparent recommended remedy for persistent congestion. The IETF transport area has developed a wide range of remedies to this problem over the years, and one of them will nearly always be better than a circuit breaker, where the phrase "nearly always" leaves an extremely small exception space.

Why is the first thing that tsvwg has chosen to expedite, and give the status of BCP, solely about this tiny exception space?

1b. Over the last fifteen years I have researched, designed, built, evaluated and standardised numerous network-operator controlled mechanisms for limiting congestion in networks. Below I outline a number of ways of dealing with persistent congestion while maximizing the satisfaction of everyone. These have been developed in the IETF over the years. Then suddenly we get a draft saying, "it's OK to cut off the whole tunnel if you have to, or at least a large fraction of it," without saying that it will hardly ever be appropriate to take this approach.

I've realised that this is verging on vandalism: promotion of vandalism of customer traffic, and vandalism of all the hard work of dozens/hundreds of people over the years to address these problems in a more considered way.

  * That work started with pre-congestion notification (PCN) for unresponsive flows, keeping per-flow operations to the edge of a network. That included flow admission control and flow termination.

  * We then developed the congestion policer to work at an aggregate level, dealing with a mix of unresponsive and responsive traffic in whole networks, or in tunnels or at single bottlenecks. It was designed to cause the least disruption to each of the flows, when it is impractical to police each flow separately. A congestion policer deals effectively with persistent congestion, whether caused by unresponsive flows, or an excess of responsive flow arrivals, or both. As congestion rises, i) initially it gives responsive flows more drop signals, while only removing a small fraction of unresponsive traffic, so the latter should still be able to continue; ii) if congestion persists, it removes larger and larger fractions of the aggregate, irrespective of whether flows are responsive or unresponsive. iii) It can be complemented by well-known techniques from the 1990s to randomly identify heavy flows to give a compromise between per-flow and aggregate processing.

  * The draft cites the work of Yaakov Stein, David Black and myself <draft-ietf-pals-congcons>, which the authors and WG have finally reached consensus on (after years). Your draft is correct that it proves that typical pseudowires for TDM traffic will become useless in themselves before they consume more capacity than responsive traffic. And the draft refers to this circuit-breaker as a possible solution. But that depends how you define circuit-breaker:
   - If it indiscriminately removes packets from the pseudowire aggregate, it is /not/ the best solution at all.
   - The best solution is briefly described in the pals draft (briefly, 'cos remedies were out of scope): removal of inactive voice channels, and admission control of individual flows from the pseudowire. Failing that, the PW should remove random individual flows until congestion is reduced sufficiently.
   - Indiscriminate removal of packets is only appropriate, if the network cannot see the microflows (e.g. due to encryption at the aggregate level).
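As a rough illustration of the last-resort behaviour described above (removing random individual flows until congestion is reduced sufficiently), here is a minimal Python sketch. The function name, the per-flow rate table and the single capacity figure are assumptions made for the example; none of them come from the pals draft.

```python
import random


def terminate_random_flows(flow_rates, capacity, rng=None):
    """Remove randomly chosen flows until the aggregate fits the capacity.

    flow_rates: dict mapping a (hypothetical) flow id to its rate.
    capacity:   rate the pseudowire may use while congestion persists.
    Returns (surviving flows, list of removed flow ids).
    """
    rng = rng or random.Random()
    active = dict(flow_rates)
    removed = []
    while active and sum(active.values()) > capacity:
        # Pick one whole flow at random; the survivors keep all their packets.
        victim = rng.choice(sorted(active))
        removed.append(victim)
        del active[victim]
    return active, removed
```

The point of the sketch is that the flows left standing see no extra loss, whereas thinning packets across the aggregate would push every flow past the few % of loss it can tolerate.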

If circuit-breaking is defined as per-aggregate not per-microflow, then this draft should not recommend circuit-breaking. It should recommend flow admission control and flow termination, not circuit-breaking.

Cutting off any indiscriminate proportion of the packets without regard to microflows is equivalent to cutting off every microflow within the pseudowire, because real-time flows typically cannot survive with loss levels above a few %. Cutting off the whole pseudowire in this way is vandalism, and only appropriate if the operator is a lazy good-for-nothing outfit that doesn't deserve any customers.

1c. One remedy is to start this draft by saying:

***The idea of cutting down unresponsive traffic from a tunnel without regard to microflows will rarely if ever be appropriate.***

If a circuit-breaker is defined to include removing traffic without regard to microflows, then the scope of applicability of this circuit-breaker draft is tiny, and it should be rewritten to reflect that. I.e., the next sentence might as well say "Therefore, don't read this draft. There will nearly always be less harmful ways of addressing persistent tunnel congestion, many of which are already written up in RFCs or drafts."

2. Scope: Measurement mechanism vs. response behaviour

Secondly, there is a huge question-mark over the scope of the draft. It ranges over the same territory as another draft already adopted as tsvwg chartered work:
    draft-ietf-tsvwg-tunnel-congestion-feedback
The circuit-breaker draft does not even reference tunnel-congestion-feedback. However, the two drafts should be totally complementary. I.e.:
   a) tsvwg-tunnel-congestion-feedback specifies a generic way to get feedback from the egress of a tunnel to a decision point (which can be the tunnel ingress or centralised entity), and it provides mechanism for resilience of the feedback messages, ability to control timeliness vs feedback message overhead, etc.
   b) tsvwg-circuit-breaker should be limited in scope to solely specifying one of the possible behaviours in response to congestion measured over the tunnel: i.e., breaking the circuit
   c) other drafts can specify other responses, e.g.:
      * aggregate techniques:
        - provisioning additional capacity
        - re-routing part of the load (traffic engineering)
        - rate policing
        - congestion policing {Notes 1, 2, 3}
      * per-flow techniques
        - flow admission control
        - flow rate policing
        - flow congestion policing
        - flow termination

There is material in circuit-breaker about the measurement and feedback mechanism that contradicts that in tunnel-congestion-feedback. Instead of writing two contradictory standards, the WG should agree the division of scope between the two drafts, and the superset of all the authors and reviewers should work towards a single standard for each scope.

The chairs of a WG are entitled to assign authors (including themselves) to a chartered work item to ensure it is well-written, technically sound and moves along in a timely manner.

Notes

{Note 1} when the conex WG completed, the responsible AD (Martin S) suggested that drafts like draft-briscoe-conex-policing might be picked up by tsvwg, which would be appropriate, because there is nothing specific to conex in that draft - it assumes some mechanism is getting information about congestion to the ingress of the network, which might be conex or draft-ietf-tsvwg-tunnel-congestion-feedback.

{Note 2} draft-briscoe-conex-data-centre specifies the same mechanism as draft-ietf-tsvwg-tunnel-congestion-feedback. Again, even tho it has conex in the filename, it was written to give a general tunnel congestion feedback mechanism where hosts do not support conex (and to interwork with hosts that do). It focuses on a data centre deployment scenario, but there is nothing in it that prevents it being applicable in general networks.

{Note 3} For those not familiar with a congestion policer, it is essentially a token bucket, but rather than the tokens representing an allowance to forward bytes, they represent an allowance to cause congested bytes (i.e., loss or ECN marking).
For the case of congestion feedback across a tunnel, when feedback about loss (or ECN) comes back across the tunnel, it drains that amount of bytes from the bucket (where intermittent feedback messages are envisaged, token draining has to be averaged over the next measurement interval). The closer the bucket is to empty the more it blocks packets entering the ingress.
This has the effect of thinning the ingress traffic so that the congestion level is just below the defined threshold.
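To make Note 3 concrete, here is a minimal Python sketch of a congestion policer. The class name, the refill rate, the bucket depth and the linear ramp from "full bucket, no blocking" to "empty bucket, block everything" are all assumptions chosen for illustration; the averaging of token draining over the next measurement interval is left out to keep the sketch short.

```python
import random


class CongestionPolicer:
    """Token bucket whose tokens are an allowance of congested bytes."""

    def __init__(self, fill_rate, depth, rng=None):
        self.fill_rate = fill_rate   # congested bytes allowed per second (assumed policy)
        self.depth = depth           # bucket depth in congested bytes
        self.tokens = float(depth)   # start with a full allowance
        self.rng = rng or random.Random()

    def tick(self, seconds):
        """Refill the allowance as time passes, up to the bucket depth."""
        self.tokens = min(self.depth, self.tokens + self.fill_rate * seconds)

    def on_feedback(self, congested_bytes):
        """Drain the bucket by the lost or ECN-marked bytes fed back from the egress."""
        self.tokens = max(0.0, self.tokens - congested_bytes)

    def drop_probability(self):
        """The closer the bucket is to empty, the more traffic is blocked."""
        return 1.0 - self.tokens / self.depth

    def admit(self):
        """Decide whether one packet may enter the tunnel ingress."""
        return self.rng.random() >= self.drop_probability()
```

With a full bucket every packet is admitted; as feedback drains the bucket, a growing fraction of the aggregate is thinned at the ingress, whichever flows the packets belong to.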


Further detailed review comments

Abstract

     ... network tunnels, and other non-congestion controlled
   applications
I think you mean
    non-congestion controlled network tunnels

Reasoning: as it stands, this implies tunnels are always a type of non-congestion controlled application, which is clearly not intended.

Introduction

First para: the term 'flow' needs to be disambiguated here, not just later. Does it mean microflow, or aggregate? I support this draft if it means the former, but not if the latter.

   It was countered by the requirement to use congestion
   control (CC) by the Transmission Control Protocol (TCP) [Jacobsen88]
   [RFC1112].
There was no requirement in RFC1112 to use TCP or congestion control.

   People have been implementing what this
   draft characterizes as circuit breakers on an ad hoc basis
This needs references.

    ...either by disabling the flow or by significantly reducing
   the level of traffic.
Again, pls disambiguate "flow". I supported this draft, because I thought it meant microflow. If it means aggregate, I don't support the draft.

   reflects a fair use of the available capacity
This new text added in the latest draft you sent me is problematic. It will need to say "according to the policy of the operator of the circuit-breaker, not the IETF".

Two more comments inline...

On 17/10/15 12:32, gorry@erg.abdn.ac.uk wrote:


Sorry I've taken so long to reply.



Review by Bob Briscoe: 8th Oct 2015

Gorry,



Despite being past the WG stage, here's my review anyway. Consider this
as early response to IETF last-call.

In general I support the intent of this draft, but I am concerned at the
severity of the problems I have found with it given it is meant to be
about to go to the IESG. I am particularly concerned that I have found
numerous significant problems with the normative requirements section.

Have you had a substantial review from anyone before this? The level of
review comments on the tsvwg list seemed quite light - picking on issues
of particular concern, but not seeming to review the draft as a whole.



I did see some reviews (and have had comments off-list - including from
other WGs), but I also REALLY do appreciate this careful read of the whole.

See my comments on comments below (a few notes from DB as document
shepherd are also included).

I'm guilty of reaching the deadline, and therefore have submitted these
changes in a new draft (06). Some of these points may benefit from further
discussion if the new update did not address the issues.



Gorry



--------



1. Intro:

Congestion Collapse is a very specific case - CB is much more general.
It is clear from the draft that a CB is intended to mitigate
circumstances wider than solely the extreme case of congestion collapse.
For instance: a large unresponsive aggregate contributing to a high
level of congestion alongside congestion responsive traffic. This is
nowhere near congestion collapse, but it would be an applicable case for
a circuit-breaker. Congestion collapse is a specific well-defined
process that involves a cascade of congestion as a sequence of queues
fill in turn moving in the upstream direction. It is due to continual
retries or additional load arriving faster than existing flows are
departing. {Note 1}



GF: I see and I can do this, and will rephrase in the next version,
to avoid using this term.

----

The introduction mentions that TCP-style cc is only an appropriate
remedy when long flows dominate. The implication that CB could be used
to deal with congestion induced by many short flows is a step too far,
IMO. This problem has not even been discussed in the IETF or IRTF to my
knowledge, let alone in the context of this draft. In 6.2 this draft
all-but says that a CB is a solution to this problem. I strongly object
to a BCP making that assertion. CB would be a very drastic and clumsy
solution to that problem. {Note 2}



It says that the timescale at which a circuit-breaker operates must be
seconds or tens of seconds - much longer than the RTT timescale on which
TCP, SCTP and DCCP react. This disregards an important type of
application response to congestion; it must say that the timescale also
has to be longer than the timescale on which certain real-time
applications operate their own circuit-breakers i.e. adapt down their
codec rates, and eventually close the connection as a form of
self-admission control. Applications operate per-flow circuit-breakers
typically over the order of seconds or tens of seconds, so network CBs
MUST take longer than that - I would say "no less than a minute".



We MUST NOT discourage voluntary self-regulation by overriding it
(end-to-end principle). I pick up this point later (comments on section
5.1), arguing that the fast-trip CB for RTP should be considered as an
application CB, and a network CB should always take longer to trigger
than these app CBs.



GF: I think we want to encourage applications to do this, starting by
ensuring that they have the opportunity to do so, and I added some text on this.
I revised the upper bound to clarify this, please see if this helps.

Timescales - the multiple tens of seconds you've added is better. Here's a typed record of what I said to you f2f the other day:

A network circuit-breaker needs to allow time for all possible self-regulation techniques (transport and app layer) to complete:
* TCP-like congestion control (RTT)
* Smoothed TCP-like cc (tens of RTT)
* Codec adaptation (tens of seconds)
* flow termination (app circuit breaker) (~1 minute)
* flow self-admission control (~minutes)

These are /not/ the timescales written in IETF drafts like rtp-circuit-breakers. These are the sort of timescales that commercial real-time apps would be willing to use. There is a difference between theory and practice.

It would be very wrong for a circuit-breaker to fire before allowing all these stages to take effect.
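The ordering argument above can be sketched in a few lines of Python. The RTT value and the specific numbers of seconds below are illustrative assumptions standing in for the rough orders of magnitude in the list; they are not proposed values.

```python
# Rough timescale, in seconds, of each self-regulation stage listed above.
# All figures are assumptions for illustration only.
RTT = 0.1
SELF_REGULATION_TIMESCALES = {
    "tcp_like_cc": RTT,                    # one RTT
    "smoothed_tcp_like_cc": 30 * RTT,      # tens of RTTs
    "codec_adaptation": 30.0,              # tens of seconds
    "flow_termination_app_cb": 60.0,       # about a minute
    "flow_self_admission_control": 180.0,  # minutes
}


def min_network_cb_delay(timescales):
    """A network CB has to outwait the slowest self-regulation stage."""
    return max(timescales.values())


def may_trigger(congested_for_seconds, timescales=SELF_REGULATION_TIMESCALES):
    """True only once every stage has had time to take effect."""
    return congested_for_seconds > min_network_cb_delay(timescales)
```

Under these assumed numbers, a network circuit-breaker that has seen 60 seconds of persistent congestion would still hold off, because flow self-admission control has not yet had its chance.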

The most appropriate thing the IETF could standardise (in a separate draft) is:
1) the max time apps should take for each of these actions {ToDo: check the RTP circuit-breaker draft}
2) that any timers in any of these mechanisms MUST be randomised








GF: I added some more text on the coexistence of fast and other circuit
breakers, which may help (within the fast circuit breakers section) - but
did not create a new section dedicated to this. If this is desirable we
could create a section (electrical circuit breakers have different
"curves" for trigger in a similar way to what is described ... but that's
just a side observation).

GF-XXX: If these comments still apply on the new text, it is worth
discussing further.

---

1.1 Types of CB

I saw criticism on the list of the use of the term "protect" in this
section. Why hasn't it been changed? As the posting said, a CB does not
protect the aggregate that it monitors; rather it /regulates/ the
aggregate to protect the rest of the traffic that it is /not/ monitoring.



GF: Totally agree, although guilty of not changing the wording as promised
- (people have the same problem describing Electrical circuit breakers)
and sorry that I inadvertently repeated this many times.  I have rephrased
throughout.

---
There are various forms of network transport circuit breaker.

...

Fast-Trip Circuit Breakers:
Repeating what I said before, Fast-Trip is /not/ a type of /network/ circuit breaker. This really must be fixed.





Bob




3.1 Functional Components.

There is no mention of the problem of synchronising the ingress and
egress measurements to allow for transit time. Given you are trying to
measure loss, which is a relatively small difference between the traffic
entering and leaving, you can get very bad errors if you don't take path
delay into account. draft-ietf-tsvwg-tunnel-congestion-feedback
describes a nice (and commonly used) stateless way of doing that, by
sending the ingress measurement in-band to the egress, which triggers
the egress measurement so they are synchronized; allowing for transit
time. Then the egress can send them both back to the ingress to be
compared and acted on.
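A minimal Python sketch of that in-band synchronisation idea follows. It is not the wire format or procedure of the tunnel-congestion-feedback draft; the Marker, Ingress and Egress names are hypothetical, and the sketch assumes the marker is not reordered relative to the data packets around it.

```python
from dataclasses import dataclass


@dataclass
class Marker:
    ingress_bytes: int  # cumulative bytes sent when the marker was emitted


class Ingress:
    def __init__(self):
        self.sent = 0

    def send(self, size):
        self.sent += size

    def marker(self):
        """Emit an in-band marker carrying the ingress counter."""
        return Marker(self.sent)


class Egress:
    def __init__(self):
        self.received = 0
        self.prev = (0, 0)  # (ingress count, egress count) at the last marker

    def receive(self, size):
        self.received += size

    def on_marker(self, marker):
        """Snapshot the egress counter at the moment the marker arrives.

        The marker travelled the same path as the data, so both counts
        cover the same packets whatever the transit delay.
        """
        prev_in, prev_out = self.prev
        sent = marker.ingress_bytes - prev_in
        got = self.received - prev_out
        self.prev = (marker.ingress_bytes, self.received)
        return sent, got  # fed back to the ingress for comparison


def loss_fraction(sent, got):
    return 0.0 if sent == 0 else (sent - got) / sent
```

Because the egress reads its counter only when the marker arrives, no clock synchronisation between the two ends is needed.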



GF: I'd hope that measurements over longer periods would not result in
inaccuracies high enough to lead to trigger, but added a note to the text.

GF: I'm not sure the tunnel method is a CB, it seems more like a
congestion control method, which is perhaps why in that case the
synchronisation is required.

----

4. Reqs

       There MUST be a control path from the ingress meter and the egress
       meter to the point of measurement.  The Circuit Breaker MUST
       trigger if this control path fails.

Either this is unclear terminology, or I strongly disagree. What do you
mean by a control path? We should only recommend that the CB triggers
due to lack of measurement signals if the measurement signals are
carried in-band with the data being monitored. That is only one way of
arranging the mechanism. The term control path sounds like it is out of
band.

GF: I fell into the trap of a multiply-defined term.  This needs resolved.
I think we should say "communication path used for control messages", and
edit accordingly to avoid this misunderstanding.



If the measurement signals are out of band, the CB MUST NOT
trigger due to lack of measurement signals. I would recommend the
in-band method, but there are plenty of network designers who will want
to do this in centralised out of band ways, so we have to cater for that
way of thinking (even tho it's misguided).
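The distinction being argued for here can be written down as a small decision function. This is only a sketch of the position stated above; the names and the count of missed reports are assumptions, and handing the out-of-band case to an operator is one possible reading rather than agreed text.

```python
from enum import Enum


class Action(Enum):
    NO_ACTION = "no_action"
    ALERT_OPERATOR = "alert_operator"
    TRIGGER = "trigger"


def on_missing_reports(in_band, missed, max_missed):
    """Decide what silence from the meters should mean.

    in_band:    True if measurement signals share the path of the monitored data.
    missed:     consecutive measurement reports that failed to arrive.
    max_missed: tolerance before silence is treated as significant (assumed).
    """
    if missed < max_missed:
        return Action.NO_ACTION
    # In-band silence implies the data path itself is losing packets.
    # Out-of-band silence says nothing about the data path.
    return Action.TRIGGER if in_band else Action.ALERT_OPERATOR
```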



DB: I think Bob's request for not triggering the CB if a separate control
path fails is reasonable - that may need to be NOC/operator-mediated.

GF-XXX: I'm not sure I agree with "MUST NOT", so this needs more thought -
comments welcome on this point.

----

       The measurement period MUST be longer than the time that current
       Congestion Control algorithms need to reduce their rate following
       detection of congestion.

This needs to be rewritten. Or just removed. It seems like ideas changed
after it was written, and the end was changed but not the normative
statement at the beginning. IMO, the measurement period can be
arbitrarily short, as long as multiple measurements are combined before
triggering the CB.

It talks about unnecessarily penalizing long RTT
flows, but the measurement period is nothing to do with the period
before there is any penalization (defined later as the triggering
interval). There is no problem with short measurement periods as long as
any high congestion measured in these periods is averaged over all the
measurement periods in the triggering interval.
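As a sketch of how short measurement periods can feed a longer triggering interval, the Python below keeps a sliding window of per-period samples and compares only the windowed average with the threshold. The class name, the number of periods per interval and the loss threshold are assumptions for illustration.

```python
from collections import deque


class TriggerWindow:
    """Average many short measurement periods over one triggering interval."""

    def __init__(self, periods_per_interval, loss_threshold):
        self.samples = deque(maxlen=periods_per_interval)
        self.loss_threshold = loss_threshold

    def add(self, sent_bytes, lost_bytes):
        """Record one measurement period; the oldest period falls out."""
        self.samples.append((sent_bytes, lost_bytes))

    def average_loss(self):
        sent = sum(s for s, _ in self.samples)
        lost = sum(l for _, l in self.samples)
        return lost / sent if sent else 0.0

    def should_trigger(self):
        # Only decide once a whole triggering interval has been observed.
        full = len(self.samples) == self.samples.maxlen
        return full and self.average_loss() > self.loss_threshold
```

A single period with 30% loss followed by clean periods averages out below a 10% threshold, while sustained loss across the interval does not.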



DB: There seem to be two meanings of "measurement period" here.  I've
always viewed it as the period of time over which measurements are taken
that result in triggering the CB, whereas Bob seems to view it as the
period of time over which an individual measurement is taken.  I have no
problem with the line of Bob's text quoted above, and think some
clarification is needed to make it clear that the means of measurement is
unspecified (e.g., taking short measurements and combining them is fine),
but the period of time that applies to the metric that is used to trigger
the CB has to be sufficiently long.



GF: My understanding was the "measurement period" are the samples that
feed the trigger, so if the ingress or egress meter samples more
frequently, these would be combined. I added text to clarify this.

In fact, there should be many measurement intervals per trigger
interval, so that there are many opportunities for measurement messages
to get through. Otherwise if there are only one or two measurement
periods per trigger interval, the possibility of a false trigger due to
lost control signals becomes too great.



GF: This is the robustness issue.

    o  A Circuit Breaker is REQUIRED to define a threshold to determine
       whether the measured congestion is considered excessive.

    o  A Circuit Breaker is REQUIRED to define the triggering interval,

A perfectly good CB could vary the trigger interval and threshold
depending on how rapidly congestion is rising, or how high its absolute
level is. Indeed one could say it is actually wrong to define a single
threshold or a single interval, so these normative statements are overly
restrictive and preclude designs that are smarter than just simple fixed
threshold.



GF: There are many ways to do this, and individual specs can indeed react
using more sophisticated algorithms. At some point these will become more
like CC than an envelope CB.

DB: A metric and threshold for that metric are needed.  A rate-of-increase
metric or one based in part on that would be fine.

----

Also, see comment above about allowing time for application CBs, and
suggesting one minute minimum.

    o  A Circuit Breaker SHOULD be constructed so that it does not
       trigger under light or intermittent congestion, with a default
       response to a trigger that disables all traffic that contributed
       to congestion.

The second half after the comma seems misplaced. If it does not trigger,
why does the sentence go on to talk about disabling all traffic that
contributed to congestion (which is what an /enabled/ trigger would do)?



GF: Split the second part as a separate clause.

       A reaction that results in a reduction SHOULD result in
       reducing the traffic by at least a factor of ten,

What evidence have you got for this 10% number? It seems utterly
inappropriate to write a number here. The number depends on what
proportion of the traffic on the path between ingress and egress is
regulated by the CB. If the proportion is low, it needs to reduce by a
lot to make sufficient space for other traffic. If the proportion is
high relative to other traffic, it might be sufficient to reduce by 5%
to 95% of the previous load. If the tunnel traffic represented say 80%
of the load on the path, and it reduced by a factor of 10, that would
leave 92% of the path for other traffic, which might be unnecessarily
much greater than the normal proportion used by other traffic.
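The arithmetic behind the 80% example can be checked with a few lines of Python. The helper name is made up for the example, and it assumes the path was fully loaded before the reduction, so that shares of load and shares of the path are the same thing.

```python
def share_after_reduction(tunnel_share, factor):
    """Return (tunnel share, share left for other traffic) of the path
    after the tunnel load is divided by `factor`."""
    reduced = tunnel_share / factor
    return reduced, 1.0 - reduced


# Tunnel at 80% of the path, cut by a factor of ten:
# it shrinks to about 8%, leaving about 92% for other traffic.
tunnel, others = share_after_reduction(0.80, 10)

# The same tunnel reduced by only 5% (to 95% of its previous load):
# it shrinks to about 76%, leaving about 24% for other traffic.
tunnel_5pct, others_5pct = share_after_reduction(0.80, 1 / 0.95)
```

The gap between those two outcomes is why a single fixed reduction factor fits some traffic mixes badly.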



DB: Need to say something here - we can fall back to "order of magnitude"
or something like that, but this'll need list discussion.

GF: Text updated. This point was previously discussed, but I am happy to
receive more feedback, rather than relying on what was said before. As I
recall, many were happy with terminating flows once a CB triggered, but
this was an attempt to be "kinder" and allow more flexibility - which I
personally liked - but still the default reaction to successive triggers
needs to be harsh. (It is a "SHOULD" though).

-----

       Manual operator
       intervention will usually be required to restore a flow.

This sentence should be toned down to possibly, not usually. A human is
no more capable than a machine is of bringing together all the necessary
measurements to decide what other courses of action might be possible,
and when to release the brakes. I suggest the last para of 5.3.1 starting:

"An operator-based response provides opportunity..."

is more appropriate here, and doesn't really fit where it is.

DB:  Operator intervention may be required to restore a flow.

GF: Changed as above.

----

Section 4.1 contains no requirements text, only examples. It ought to be
moved from the normative requirements section to section 5 (Examples).

GF: Resolved as a section on topologies.

----

5. Examples:

5.1.1 Fast-Trip CB for RTP

The draft needs to make the distinction between an application doing its
own circuit breaking vs. functions on the path between the application
endpoints (even if in the hosts) doing CB. The extremely important
distinction is:

1a) an app knows when congestion is too high for it to work properly
1b) functions under the app can only infer congestion is possibly too
high for most apps to work properly
2a) an app may be able to reduce the rate at which it sends data
2b) a function under an app can only discard data, not remove it at
source.



GF: Although this is not the place to delve into details, I added a prefix:

"Applications ought to use a full-featured transport (TCP, SCTP, DCCP),

and if not, applications (e.g. those using UDP and its UDP-Lite variant

[RFC3828]) need to provide appropriate congestion avoidance. [RFC2309]

discusses the dangers of congestion-unresponsive flows and states that

"all UDP-based streaming applications should incorporate effective

congestion avoidance mechanisms". Guidance for applications that do not

use congestion-controlled transports is provided in [RFC5405.bis]. Such

mechanisms can be designed to react on much shorter timescales than a

circuit breaker, that only observes a traffic envelope. These methods can

also interact with an application to more effectively control its sending

rate."

GF: Also updated some other parts of the section.

----

I believe that the requirements in section 4 do not apply to

application-controlled circuit-breakers. So, I would not include the

"Fast-Trip CB for RTP" as an example of a /network/ transport CB.



As the requirements say, a network CB should never fast trip.

GF: But by design an RTP-CB should also not terminate flows.



By misclassifying RTP CBs as network CBs, you've allowed the timescale

for network CBs to trigger after tens of seconds, when a network CB

should allow app CBs this long to trigger themselves (as I said earlier).

GF: I'm not sure, since understanding the difference between the two is

indeed important. Apps that "trigger themselves" describes what the

fast-trip CB does (in the absence of CC).

-----

Missing examples:

* You might want to point to the flow termination function (as opposed

to admission control) in the PCN architecture [RFC5559], which is

precisely a network CB. It was precisely developed for cases where

failures caused traffic to reroute onto a previously well-provisioned

path (see 6.1).

DB: I think that's a stretch.  This is not among the types listed in 1.1.

It might be mentioned as a related concept that is outside the scope of

this draft.

GF: How is this different from RSVP and other admission-controlled schemes? - I

think we're drifting away here... I did change in the new rev. Unless

there is strong support from the WG, I'd rather not add these examples.

------

* Andrew McGregor gave the examples of Google's BwE (bandwidth enforcer)

and B4, but you haven't referred to them. Given they are a documented

existence proof of this beast, that seems remiss.



GF: I believe I discussed with Andrew and we didn't at the time have good

text to add. This may well have changed, if there are good references

please send them for consideration.

------

7. Security Considerations



    The circuit breaker MUST be designed to be robust to packet loss that

    can also be experienced during congestion/overload.



This implies reliable transmission - i.e. retransmit for ever until

acknowledged. This is NOT a good idea. In

ietf-tsvwg-tunnel-congestion-feedback we propose using SCTP partially

reliable transport. Then if congestion causes messages to be lost, they

don't have to be retransmitted if there are insufficient resources (thus

not risking contributing to congestion collapse - and here I use the

phrase correctly). Because they transmit counters, the missing counter

values do not matter. This is the tried-and-tested message delivery

approach used for IPFIX. The messages can still be given priority, but

should not be retransmitted.



GF: I disagree this is a requirement for reliably transmitting packets.

I'd be happy to add text to explain robustness does not imply reliability

and that this is likely to be an evil thing to do.

GF: Added text on duplicating messages.
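
To make the "missing counter values do not matter" argument concrete, here is a minimal sketch. It assumes each report carries a sequence number and a cumulative (running-total) byte counter; neither draft specifies this format, and the class and field names are hypothetical. Counter and sequence wraparound are ignored.

```python
class EgressReportTracker:
    """Keeps the latest egress byte total from unreliable, unordered reports."""

    def __init__(self):
        self.last_seq = -1      # highest sequence number accepted so far
        self.egress_bytes = 0   # most recent cumulative total seen

    def on_report(self, seq, cumulative_bytes):
        # A stale or duplicated report carries no new information, so drop it.
        if seq <= self.last_seq:
            return False
        # A newer report supersedes every report that was lost before it,
        # because the counter is a running total rather than a delta.
        self.last_seq = seq
        self.egress_bytes = cumulative_bytes
        return True


tracker = EgressReportTracker()
# Report 2 is delayed and arrives after report 3; nothing is retransmitted.
for seq, total in [(0, 1000), (1, 2500), (3, 6000), (2, 4000)]:
    tracker.on_report(seq, total)
```

With delta-style reports a single loss would corrupt the total, which is why the sketch assumes running totals.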

-------

    Simple protection can be provided by using a

    randomized source port, or equivalent field in the packet header

    (such as the RTP SSRC value and the RTP sequence number) expected not

    to be known to an off-path attacker.



I think the draft should recommend that for most scenarios, randomized

ports will be insufficient protection for CB control messages, which

should be properly cryptographically authenticated. Otherwise, a

CB-controlled aggregate is too vulnerable to these off-path attacks.



GF-XXX: I don't know why you state that a random port (or protocol field)

is insufficient protection from off-path attack, please explain.  Are you

saying a CB is more vulnerable to attack than other transport traffic? I

don't see the problem yet, can you suggest what you would like to see

added?
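
For reference, one possible shape of a cryptographically authenticated control message is sketched below. It is an illustration under stated assumptions, not something either draft specifies: a pre-shared key between ingress and egress, HMAC-SHA256 as the tag, and a hypothetical 16-byte report body (sequence number plus byte counter). Key distribution and replay checking on the sequence number are not shown.

```python
import hashlib
import hmac
import struct

BODY_LEN = 16   # two unsigned 64-bit fields (assumed report layout)
TAG_LEN = 32    # HMAC-SHA256 output length in bytes


def seal(key, seq, egress_bytes):
    """Serialize a report and append an HMAC tag over the body."""
    body = struct.pack("!QQ", seq, egress_bytes)
    tag = hmac.new(key, body, hashlib.sha256).digest()
    return body + tag


def open_report(key, message):
    """Return (seq, egress_bytes) if the tag verifies, otherwise None."""
    if len(message) != BODY_LEN + TAG_LEN:
        return None
    body, tag = message[:BODY_LEN], message[BODY_LEN:]
    expected = hmac.new(key, body, hashlib.sha256).digest()
    # Constant-time comparison avoids leaking how many tag bytes matched.
    if not hmac.compare_digest(tag, expected):
        return None
    return struct.unpack("!QQ", body)
```

An off-path attacker who guesses the port still cannot forge a report without the key, which is the property a randomized port alone does not give.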

--------

Gap #1:

The draft seems to think it is so obvious what a CB should measure

that it only says it vaguely as "the level of congestion", and only

suggests the difference between ingress and egress counters as an

example. Some readers might well think like this: Does congestion level

mean the percentage extra bit-rate relative to the aggregate's expected

or maximum bit-rate? That might actually be a correct measure of

congestion in some scenarios, but...



The draft does not say that the congestion level is defined as dropped

bytes divided by ingress bytes. The draft should spell out that a CB

should measure the volume of bytes dropped and the volume of ECN-capable

bytes marked with CE, and express these as fractions of, respectively, total

ingress non-ECT bytes and total ingress ECT bytes (assuming buffers

within the scope of the CB are ECN-enabled). Even this is problematic,

because the assumption in parentheses never holds, particularly during

excessive congestion. It could also discuss the relative merit of

measuring the percentage of packets dropped/marked instead of bytes.



GF:  A detailed discussion of how to measure congestion is out of scope

here, IMHO.
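
As a concrete reading of the measure proposed above (dropped bytes over ingress non-ECT bytes, CE-marked bytes over ingress ECT bytes), here is a minimal sketch. The field names are hypothetical, and it inherits the assumption flagged above that buffers within the CB's scope are ECN-enabled, so that drops are attributed to non-ECT traffic only.

```python
from dataclasses import dataclass


@dataclass
class MeterTotals:
    """Byte totals over one measurement interval (hypothetical field names)."""
    ingress_not_ect_bytes: int
    dropped_not_ect_bytes: int
    ingress_ect_bytes: int
    ce_marked_bytes: int


def congestion_fractions(m):
    """Return (loss_fraction, marking_fraction) for the interval.

    Each fraction is computed against its own class of ingress traffic;
    an interval with no traffic of a class reports 0.0 for that class.
    """
    loss = (m.dropped_not_ect_bytes / m.ingress_not_ect_bytes
            if m.ingress_not_ect_bytes else 0.0)
    marking = (m.ce_marked_bytes / m.ingress_ect_bytes
               if m.ingress_ect_bytes else 0.0)
    return loss, marking


totals = MeterTotals(ingress_not_ect_bytes=1000, dropped_not_ect_bytes=20,
                     ingress_ect_bytes=600, ce_marked_bytes=30)
loss_fraction, marking_fraction = congestion_fractions(totals)
```

Counting packets instead of bytes would only change what the four totals hold, not the arithmetic.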

-----

Also it should mention that care should be taken over how to combine the

measurements. For instance avoid the common mistake of averaging

fractions, because ave(c1/t1, c2/t2, c3/t3 ...) != (c1 + c2 + c3)/(t1 +

t2 + t3).



GF: Added some text - is this sufficient, given there is no intention to

specify the algorithm here:

"If necessary, MAY combine successive individual meter samples from the

ingress and egress to ensure observation of an average over a

sufficiently long interval. (Note when meter samples need to be combined,

the combination needs to reflect the sum of the individual sample counts

divided by the total time/volume over which the samples were measured.

Individual samples over different intervals cannot be directly combined

to generate an average value.)"
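
A small worked example of the averaging pitfall, as a sketch only: the sample values are invented for illustration, each sample is a (count, total) pair such as (bytes lost, bytes sent), and every total is assumed to be non-zero.

```python
from fractions import Fraction


def mean_of_ratios(samples):
    """The common mistake: average the per-sample fractions."""
    return sum(Fraction(c, t) for c, t in samples) / len(samples)


def ratio_of_sums(samples):
    """The intended combination: total count over total volume."""
    total_count = sum(c for c, _ in samples)
    total_volume = sum(t for _, t in samples)
    return Fraction(total_count, total_volume)


# Two tiny samples and one large one (invented values).
samples = [(5, 10), (10, 1000), (0, 10)]
wrong = mean_of_ratios(samples)   # 17/100: dominated by the tiny samples
right = ratio_of_sums(samples)    # 1/68: weighted by volume
```

Here the naive average reports 17% where the volume-weighted figure is about 1.5%, because the two 10-unit samples get the same weight as the 1000-unit one.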

--------

Gap #2:

All the diags show multiple routers, but the text says congestion can

be measured by comparing ingress and egress traffic. Nowhere does it say

that only traffic whose addressing guarantees it has passed through

both ends should be measured.



GF: True, I think everyone who previously looked at this thought that

section 3.1 described this. I agree though that it needs to be stated, and will

add text.
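
One way to picture the restriction, as a sketch only: meter just the packets whose addressing ties them to both measurement points. Matching on an outer tunnel header is an assumption made here for illustration; the record type, field names, and addresses are hypothetical.

```python
from collections import namedtuple

# Hypothetical per-packet record as seen by a meter.
Packet = namedtuple("Packet", "outer_src outer_dst size")


def metered_bytes(packets, ingress_addr, egress_addr):
    """Sum bytes only for packets addressed from the ingress to the egress.

    Traffic that merely shares some routers with the tunnel is excluded,
    so ingress and egress counts describe the same set of packets.
    """
    return sum(p.size for p in packets
               if p.outer_src == ingress_addr and p.outer_dst == egress_addr)


seen = [
    Packet("192.0.2.1", "198.51.100.1", 1500),  # ingress to egress: counted
    Packet("192.0.2.1", "203.0.113.9", 400),    # leaves via another exit
    Packet("192.0.2.7", "198.51.100.1", 900),   # entered somewhere else
]
counted = metered_bytes(seen, "192.0.2.1", "198.51.100.1")
```

Without such a filter, the ingress/egress difference would mix real loss with traffic that legitimately entered or left elsewhere.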

--------

{Note 1}: A few years ago I dug deep into the history surrounding the

early congestion collapses on the Internet and found that those involved

were adamant that the term congestion collapse should not be waved

around for dramatic effect, because it has a very specific definition,

as paraphrased above.



GF: I cited an RFC on congestion collapse and did not use the words

thereafter.

--------

{Note 2}: The credit feature of ConEx was intended to address short-flow

overload if it becomes a problem. Don't get me wrong; I'm not objecting

to the use of CBs for the short-flow problem because I want you to use

my solution. I'm just using this as an example of a fine-grained way to

solve the problem, rather than the sledge-hammer CB way.

Here's the intuition briefly: With ConEx, you have to attach 'congestion

credit' to the first packets of a flow to cover the risk of congestion

before you have feedback (and if you don't and there is congestion, your

packets are dropped by an audit function). Then congestion policers at

the network ingress can limit the amount of congestion credit consumed

without needing feedback, and thin out traffic if it consists of large

numbers of short flows. If short flows come to predominate, ConEx credit

was also designed to incentivize a new form of proxy that could regulate

short-flows with a push-back style of congestion control, without a full

feedback loop. That would be far preferable to such a drastic measure as

a circuit-breaker. This aspect of ConEx was not written into the IETF

docs, but it is mentioned in the re-ECN drafts that were the ancestors

of ConEx.



GF: I do agree ConEx can manage traffic. I'm confident that a

ConEx-controlled flow would not need an additional circuit breaker

mechanism - but I see this as a congestion control mechanism though - not

a circuit-breaker.

---

Nits

3.

s/last resort protection to the network paths that these are used./

  /last resort protection to the traffic sharing their network path./



GF: Good you spotted this, changed to: " to provide last resort protection

for traffic that shares the network path being used."

s/tunnels encapsulations/

  /tunnel encapsulations/

GF: Fixed in latest version



---

3. What makes a good CB?



    Circuit Breakers are RECOMMENDED for IETF protocols and tunnels that

    carry non-congestion-controlled Internet flows and for traffic

    aggregates, e.g., traffic sent using a network tunnel.



Delete "e.g., traffic sent using a network tunnel"

Reason: this implies all network tunnels are problematic, whereas the

rest of the sentence adequately says that only tunnels carrying

non-congestion controlled flows are of concern.



GF: Understood, suggest remove from this sentence, and instead place in a

separate sentence that follows this saying,

"This includes traffic sent using a network tunnel. "

-----

4.



s/monitor the level congestion/

  /monitor the level of congestion/

GF: Fixed in latest version

4.1.1

(e.g. to implement a Section 5.1)

?

GF: Fixed tag in latest version

4.1.2

s/pre-prosvisioned/

  /pre-provisioned/



GF: Was fixed in -06

6.1



    One common question is whether a Circuit Breaker is needed when a

    tunnel is deployed in a private network with pre-provisioned

    capacity?

Remove '?' from the end.



GF: Fixed in latest version



6.2



s/in the event that persistent congestion occur./

  /in the event that persistent congestion occurs./



GF: Fixed in latest version

Regards







Bob



Gorry





--

________________________________________________________________

Bob Briscoe                               http://bobbriscoe.net/