Re: [tsvwg] Response to the detailed review: draft-ietf-tsvwg-circuit-breaker-05

Bob Briscoe <ietf@bobbriscoe.net> Mon, 02 November 2015 01:27 UTC

From: Bob Briscoe <ietf@bobbriscoe.net>
To: gorry@erg.abdn.ac.uk, tsvwg@ietf.org, "Black, David" <david.black@emc.com>
References: <55F055AD.3050809@tik.ee.ethz.ch> <55F05D54.5060708@tik.ee.ethz.ch> <55FF2910.7080908@bobbriscoe.net> <BN1PR03MB008FCB491B06E80B6A9A915B64F0@BN1PR03MB008.namprd03.prod.outlook.com> <561DA02D.8020001@bobbriscoe.net> <8f6bae69ac931eff10a0c0d2acfe5b21.squirrel@erg.abdn.ac.uk>
Message-ID: <5636BBF2.5010100@bobbriscoe.net>
Date: Mon, 02 Nov 2015 01:27:14 +0000
In-Reply-To: <8f6bae69ac931eff10a0c0d2acfe5b21.squirrel@erg.abdn.ac.uk>
Archived-At: <http://mailarchive.ietf.org/arch/msg/tsvwg/-Kshjm_6zH2dOGfSDzhSFgh9hdo>

Gorry,

I've been taking a step back since I reviewed the detail of this draft. 
I was carried along by the narrative of the draft, and didn't think 
about the draft as a whole.

I think there was also an ambiguity about the definition of 'flow' in 
the draft, that I took one way (=microflow). Rereading, I think you 
mostly meant the other way (=(aggregate OR microflow)). In the latter 
case I would be very negative about the draft (whereas before I said I 
was supportive). In the following, I use "flow-termination" with the 
former meaning; I use "circuit-breaker" with the latter definition.

*1. Circuit-breaking of real-time traffic without regard to microflows
considered (very) harmful*

1a. The main problem: the transport area sends out totally the wrong 
message if it offers a network circuit-breaker as its apparent 
recommended remedy for persistent congestion. The IETF transport area 
has developed a wide range of remedies to this problem over the years, 
and one of them will nearly always be better than a circuit breaker, 
where the phrase "nearly always" leaves an extremely small exception space.

Why is the first thing that tsvwg has chosen to expedite, and give the 
status of BCP, solely about this tiny exception space?

1b. Over the last fifteen years I have researched, designed, built, 
evaluated and standardised numerous network-operator controlled 
mechanisms for limiting congestion in networks. Below I outline a number 
of ways of dealing with persistent congestion while maximizing the 
satisfaction of everyone. These have been developed in the IETF over the 
years. Then suddenly we get a draft saying, "it's OK to cut off the 
whole tunnel if you have to, or at least a large fraction of it," 
without saying that it will hardly ever be appropriate to take this 
approach.

I've realised that this is verging on vandalism: promotion of vandalism 
of customer traffic, and vandalism of all the hard work of 
dozens/hundreds of people over the years to address these problems in a 
more considered way.

   * That work started with pre-congestion notification (PCN) for 
unresponsive flows, keeping per-flow operations to the edge of a 
network. That included flow admission control and flow termination.

   * We then developed the congestion policer to work at an aggregate 
level, dealing with a mix of unresponsive and responsive traffic in 
whole networks, or in tunnels or at single bottlenecks. It was designed 
to cause the least disruption to each of the flows, when it is 
impractical to police each flow separately. A congestion policer deals 
effectively with persistent congestion, whether caused by unresponsive 
flows, or an excess of responsive flow arrivals, or both. As congestion 
rises, i) initially it gives responsive flows more drop signals, while 
only removing a small fraction of unresponsive traffic, so the latter 
should still be able to continue; ii) if congestion persists, it removes 
larger and larger fractions of the aggregate, irrespective of whether 
flows are responsive or unresponsive. iii) It can be complemented by 
well-known techniques from the 1990s to randomly identify heavy flows to 
give a compromise between per-flow and aggregate processing.

   * The draft cites the work of Yaakov Stein, David Black and myself 
<draft-ietf-pals-congcons>, which the authors and WG have finally 
reached consensus on (after years). Your draft is correct that it proves 
that typical pseudowires for TDM traffic will become useless in 
themselves before they consume more capacity than responsive traffic. 
And the draft refers to this circuit-breaker as a possible solution. But 
that depends how you define circuit-breaker:
    - If it indiscriminately removes packets from the pseudowire 
aggregate, it is /not/ the best solution at all.
    - The best solution is briefly described in the pals draft (briefly, 
'cos remedies were out of scope): removal of inactive voice channels, 
and admission control of individual flows from the pseudowire. Failing 
that, the PW should remove random individual flows until congestion is 
reduced sufficiently.
    - Indiscriminate removal of packets is only appropriate, if the 
network cannot see the microflows (e.g. due to encryption at the 
aggregate level).

If circuit-breaking is defined as per-aggregate not per-microflow, then 
this draft should not recommend circuit-breaking. It should recommend 
flow admission control and flow termination, not circuit-breaking.

Cutting off any indiscriminate proportion of the packets without regard 
to microflows is equivalent to cutting off every microflow within the  
pseudowire, because real-time flows typically cannot survive with loss 
levels above a few %. Cutting off the whole pseudowire in this way is 
vandalism, and only appropriate if the operator is a lazy 
good-for-nothing outfit that doesn't deserve any customers.

1c. One remedy is to start this draft by saying:

***The idea of cutting down unresponsive traffic from a tunnel without 
regard to microflows will rarely if ever be appropriate.***

If a circuit-breaker is defined to include removing traffic without 
regard to microflows, then the scope of applicability of this 
circuit-breaker draft is tiny, and it should be rewritten to reflect 
that. Ie. the next sentence might as well say "Therefore, don't read 
this draft. There will nearly always be less harmful ways of addressing 
persistent tunnel congestion, many of which are already written up in 
RFCs or drafts."

*2. Scope: Measurement mechanism vs. response behaviour*

Secondly, there is a huge question-mark over the scope of the draft. It 
ranges over the same territory as another draft already adopted as tsvwg 
chartered work:
     draft-ietf-tsvwg-tunnel-congestion-feedback
The circuit-breaker draft does not even reference 
tunnel-congestion-feedback. However, the two drafts should be totally 
complementary. I.e.:
    a) tsvwg-tunnel-congestion-feedback specifies a generic way to get 
feedback from the egress of a tunnel to a decision point (which can be 
the tunnel ingress or a centralised entity), and it provides mechanisms for 
resilience of the feedback messages, ability to control timeliness vs 
feedback message overhead, etc.
    b) tsvwg-circuit-breaker should be limited in scope to solely 
specifying one of the possible behaviours in response to congestion 
measured over the tunnel: i.e. breaking the circuit
    c) other drafts can specify other responses, e.g.:
       * aggregate techniques:
         -  provisioning additional capacity
         - re-routing part of the load (traffic engineering)
         - rate policing
         - congestion policing {Notes 1, 2, 3}
       * per-flow techniques
         - flow admission control
         - flow rate policing
         - flow congestion policing
         - flow termination

There is material in circuit-breaker about the measurement and feedback 
mechanism that contradicts that in tunnel-congestion-feedback. Instead 
of writing two contradictory standards, the WG should agree the division 
of scope between the two drafts, and the superset of all the authors and 
reviewers should work towards a single standard for each scope.

The chairs of a WG are entitled to assign authors (including themselves) 
to a chartered work item to ensure it is well-written, technically sound 
and moves along in a timely manner.

*Notes*

{Note 1} when the conex WG completed, the responsible AD (Martin S) 
suggested that drafts like draft-briscoe-conex-policing might be picked 
up by tsvwg, which would be appropriate, because there is nothing 
specific to conex in that draft - it assumes some mechanism is getting 
information about congestion to the ingress of the network, which might 
be conex or draft-ietf-tsvwg-tunnel-congestion-feedback.

{Note 2} draft-briscoe-conex-data-centre specifies  the same mechanism 
as draft-ietf-tsvwg-tunnel-congestion-feedback. Again, even tho it has 
conex in the filename, it was written to give a general tunnel 
congestion feedback mechanism where hosts do not support conex (and to 
interwork with hosts that do). It focuses on a data centre deployment 
scenario, but there is nothing in it that prevents it being applicable 
in general networks.

{Note 3} For those not familiar with a congestion policer, it is 
essentially a token bucket, but rather than the tokens representing an 
allowance to forward bytes, they represent an allowance to cause 
congested bytes (ie loss or ECN marking).
For the case of congestion feedback across a tunnel, when feedback about 
loss (or ECN) comes back across the tunnel, it drains that amount of 
bytes from the bucket (where intermittent feedback messages are 
envisaged, token draining has to be averaged over the next measurement 
interval). The closer the bucket is to empty the more it blocks packets 
entering the ingress.
This has the effect of thinning the ingress traffic so that the 
congestion level is just below the defined threshold.
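
To make Note 3 concrete, here is a minimal sketch of such a policer. The names (CongestionPolicer, fill_rate, depth) and the linear mapping from bucket level to blocking probability are assumptions made for illustration, not taken from any draft. It also leaves out the averaging of token draining over the next measurement interval.

```python
import random


class CongestionPolicer:
    """Token bucket whose tokens are an allowance of congested bytes."""

    def __init__(self, fill_rate, depth):
        self.fill_rate = fill_rate  # congested bytes allowed per second (assumed parameter)
        self.depth = depth          # maximum stored allowance in bytes (assumed parameter)
        self.tokens = float(depth)  # start with a full bucket

    def tick(self, seconds):
        """Refill the allowance for elapsed time, capped at the bucket depth."""
        self.tokens = min(self.depth, self.tokens + self.fill_rate * seconds)

    def on_feedback(self, congested_bytes):
        """Drain the bytes reported as lost or ECN-marked across the tunnel."""
        self.tokens = max(0.0, self.tokens - congested_bytes)

    def drop_probability(self):
        """Assumed linear mapping: full bucket gives 0, empty bucket gives 1."""
        return 1.0 - self.tokens / self.depth

    def admit(self, rng=random.random):
        """Decide whether one packet may enter the tunnel ingress."""
        return rng() >= self.drop_probability()
```

With a full bucket every packet is admitted; as feedback drains the bucket, a growing fraction of the aggregate is blocked at the ingress, which is the thinning effect described above.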


_*Further detailed review comments*_

*Abstract*

      ... network tunnels, and other non-congestion controlled
    applications

I think you mean
     non-congestion controlled network tunnels

Reasoning: as it stands, this implies tunnels are always a type of 
non-congestion controlled application, which is clearly not intended.

*Introduction*

First para: the term 'flow' needs to be disambiguated here, not just 
later. Does it mean microflow, or aggregate? I support this draft if it 
means the former, but not if the latter.

    It was countered by the requirement to use congestion
    control (CC) by the Transmission Control Protocol (TCP) [Jacobsen88]
    [RFC1112].

There was no requirement in RFC1112 to use TCP or congestion control.

    People have been implementing what this
    draft characterizes as circuit breakers on an ad hoc basis

This needs references.

     ...either by disabling the flow or by significantly reducing
    the level of traffic.

Again, pls disambiguate "flow". I supported this draft, because I 
thought it meant microflow. If it means aggregate, I don't support the 
draft.

    reflects a fair use of the available capacity

This new text added in the latest draft you sent me is problematic. It 
will need to say "according to the policy of the operator of the 
circuit-breaker, not the IETF".

Two more comments inline...

On 17/10/15 12:32, gorry@erg.abdn.ac.uk wrote:
> Sorry I've taken so long to reply.
>
>> Review by Bob Briscoe: 8th Oct 2015
>> Gorry,
>>
>> Despite being past the WG stage, here's my review anyway. Consider this
>> as early response to IETF last-call.
>>
>> In general I support the intent of this draft, but I am concerned at the
>> severity of the problems I have found with it given it is meant to be
>> about to go to the IESG. I am particularly concerned that I have found
>> numerous significant problems with the normative requirements section.
>>
>> Have you had a substantial review from anyone before this? The level of
>> review comments on the tsvwg list seemed quite light - picking on issues
>> of particular concern, but not seeming to review the draft as a whole.
>>
> I did see some reviews (and have had comments off-list - including from
> other WGs),
> but I also REALLY do appreciate this careful read of the whole.
>
> See my comments on comments below (a few notes from DB as document
> shepherd are also included).
>
> I'm guilty of reaching the deadline, and therefore have submitted these
> changes in a new draft (06), some of these points may benefit from further
> discussion if the new update did not address the issues.
>
> Gorry
>
> --------
>
>> *1. Intro:*
>> Congestion Collapse is a very specific case - CB is much more general.
>> It is clear from the draft that a CB is intended to mitigate
>> circumstances wider than solely the extreme case of congestion collapse.
>> For instance: a large unresponsive aggregate contributing to a high
>> level of congestion alongside congestion responsive traffic. This is
>> nowhere near congestion collapse, but it would be an applicable case for
>> a circuit-breaker. Congestion collapse is a specific well-defined
>> process that involves a cascade of congestion as a sequence of queues
>> fill in turn moving in the upstream direction. It is due to continual
>> retries or additional load arriving faster than existing flows are
>> departing. {Note 1}
>>
> GF: I see and I can do this, and will rephrase in the next version,
> to avoid using this term.
> ----
>> The introduction mentions that TCP-style cc is only an appropriate
>> remedy when long flows dominate. The implication that CB could be used
>> to deal with congestion induced by many short flows is a step too far,
>> IMO. This problem has not even been discussed in the IETF or IRTF to my
>> knowledge, let alone in the context of this draft. In 6.2 this draft
>> all-but says that a CB is a solution to this problem. I strongly object
>> to a BCP making that assertion. CB would be a very drastic and clumsy
>> solution to that problem.{Note 2}
>>
>> It says that the timescale at which a circuit-breaker operates must be
>> seconds or tens of seconds - much longer than the RTT timescale on which
>> TCP, SCTP and DCCP react. This disregards an important type of
>> application response to congestion; it must say that the timescale also
>> has to be longer than the timescale on which certain real-time
>> applications operate their own circuit-breakers i.e. adapt down their
>> codec rates, and eventually close the connection as a form of
>> self-admission control. Applications operate per-flow circuit-breakers
>> typically over the order of seconds or tens of seconds, so network CBs
>> MUST take longer than that - I would say "no less than a minute".
>>
>> We MUST not discourage voluntary self-regulation by overriding it
>> (end-to-end principle). I pick up this point later (comments on section
>> 5.1), arguing that the fast-trip CB for RTP should be considered as an
>> application CB, and a network CB should always take longer to trigger
>> than these app CBs.
>>
> GF: I think we want to encourage applications to do this, starting by
> ensuring that they have the opportunity to do so, added some text on this.
> I revised the upper bound to clarify this, please see if this helps.

Timescales - the multiple tens of seconds you've added is better. Here's 
a typed record of what I said to you f2f the other day:

A network circuit-breaker needs to allow time for all possible 
self-regulation techniques (transport and app layer) to complete:
* TCP-like congestion control (RTT)
* Smoothed TCP-like cc (tens of RTT)
* Codec adaptation (tens of seconds)
* flow termination (app circuit breaker) (~1 minute)
* flow self-admission control (~minutes)

These are /not/ the timescales written in IETF drafts like 
rtp-circuit-breakers. These are the sort of timescales that commercial 
real-time apps would be willing to use. There is a difference between 
theory and practice.

It would be very wrong for a circuit-breaker to fire before allowing all 
these stages to take effect.

The most appropriate thing the IETF could standardise (in a separate 
draft) is:
1) the max time apps should take for each of these actions {ToDo: check 
the RTP circuit-breaker draft}
2) that any timers in any of these mechanisms MUST be randomised
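
As an illustration of item 2, a randomised timer could be as simple as the sketch below. The function name, the spread parameter and the uniform distribution are all assumptions; the point is only that each instance draws its own timeout around a nominal value, so that many flows or breakers observing the same congestion do not all act at the same instant.

```python
import random


def randomised_timeout(nominal_seconds, spread=0.5, rng=random.uniform):
    """Return a timeout drawn uniformly from nominal * (1 - spread) to nominal * (1 + spread).

    spread is a hypothetical tuning knob in the range [0, 1).
    """
    if not 0.0 <= spread < 1.0:
        raise ValueError("spread must be in [0, 1)")
    low = nominal_seconds * (1.0 - spread)
    high = nominal_seconds * (1.0 + spread)
    return rng(low, high)
```

For example, an app-level flow-termination timer with a nominal value of about one minute would then fire somewhere between 30 and 90 seconds after congestion is first seen.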


>
> GF: I added some more text on the coexistence of fast and other circuit
> breakers, which may help (within the fast circuit breakers section) - but
> did not create a new section dedicated to this. If this is desirable we
> could create a section (electrical circuit breakers have different
> "curves" for trigger in a similar way to what is described ... but that's
> just a side observation).
>
> GF-XXX: If these comments still apply on the new text, it is worth
> discussing further.
> ---
>> *1.1 Types of CB*
>>
>> I saw criticism on the list of the use of the term "protect" in this
>> section. Why hasn't it been changed? As the posting said, a CB does not
>> protect the aggregate that it monitors; rather it /regulates/ the
>> aggregate to protect the rest of the traffic that it is /not/ monitoring.
>>
> GF: Totally agree, although guilty of not changing the wording as promised
> -   (people have the same problem describing Electrical circuit breakers)
> and sorry that I inadvertently repeated this many times.  I have rephrased
> throughout.
> ---
    There are various forms of network transport circuit breaker.

...
Fast-Trip Circuit Breakers:

Repeating what I said before, Fast-Trip is /not/ a type of /network/ 
circuit breaker. This really must be fixed.





Bob
>> *3.1 Functional Components.*
>>
>> There is no mention of the problem of synchronising the ingress and
>> egress measurements to allow for transit time. Given you are trying to
>> measure loss, which is a relatively small difference between the traffic
>> entering and leaving, you can get very bad errors if you don't take path
>> delay into account. draft-ietf-tsvwg-tunnel-congestion-feedback
>> describes a nice (and commonly used) stateless way of doing that, by
>> sending the ingress measurement in-band to the egress, which triggers
>> the egress measurement so they are synchronized; allowing for transit
>> time. Then the egress can send them both back to the ingress to be
>> compared and acted on.
>>
> GF: I'd hope that measurements over longer periods would not result in
> inaccuracies high enough to lead to a trigger, but added a note to the text.
> GF: I'm not sure the tunnel method is a CB, it seems more like a
> congestion control method, which is perhaps why in that case the
> synchronisation is required.
> ----
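
To show why the synchronisation point in 3.1 matters, here is a toy sketch of the in-band approach. The class names, the marker format and the assumptions of a FIFO path and a marker that is never lost are all mine, for illustration only; tunnel-congestion-feedback specifies the real mechanism.

```python
class Ingress:
    """Counts packets sent and can emit an in-band marker carrying that count."""

    def __init__(self):
        self.sent = 0

    def send_packet(self):
        self.sent += 1
        return ("data", None)

    def send_marker(self):
        return ("marker", self.sent)


class Egress:
    """Counts packets received; a marker triggers a synchronised snapshot."""

    def __init__(self):
        self.received = 0
        self.reports = []  # list of (ingress_count, egress_count) pairs

    def receive(self, item):
        kind, value = item
        if kind == "data":
            self.received += 1
        else:
            self.reports.append((value, self.received))


def loss_between(prev_report, report):
    """Packets lost between two synchronised snapshots."""
    sent = report[0] - prev_report[0]
    received = report[1] - prev_report[1]
    return sent - received


def demo():
    ingress, egress, path = Ingress(), Egress(), []
    for i in range(100):
        pkt = ingress.send_packet()
        if i % 10 != 9:            # the path drops every tenth packet
            path.append(pkt)
    path.append(ingress.send_marker())
    in_flight = [ingress.send_packet() for _ in range(50)]  # not yet delivered
    for item in path:
        egress.receive(item)
    synchronised = loss_between((0, 0), egress.reports[-1])
    naive = ingress.sent - egress.received  # counters read at the same instant
    return synchronised, naive, len(in_flight)
```

In this toy run the synchronised comparison sees only the packets actually dropped, whereas reading both counters at the same instant also counts the packets still in transit as lost.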
>> *4. Reqs*
>>
>>         There MUST be a control path from the ingress meter and the egress
>>         meter to the point of measurement.  The Circuit Breaker MUST
>>         trigger if this control path fails.
>>
>> Either this is unclear terminology, or I strongly disagree. What do you
>> mean by a control path? We should only recommend that the CB triggers
>> due to lack of measurement signals if the measurement signals are
>> carried in-band with the data being monitored. That is only one way of
>> arranging the mechanism. The term control path, sounds like it is out of
>> band.
> GF: I fell into the trap of a multiply-defined term.  This needs resolved.
> I think we should say "communication path used for control messages", and
> edit accordingly to avoid this misunderstanding.
>
>> If the measurement signals are out of band, the CB MUST NOT
>> trigger due to lack of measurement signals. I would recommend the
>> in-band method, but there are plenty of network designers who will want
>> to do this in centralised out of band ways, so we have to cater for that
>> way of thinking (even tho it's misguided).
>>
> DB: I think Bob's request for not triggering the CB if a separate control
> path fails is reasonable - that may need to be NOC/operator-mediated.
> GF-XXX: I'm not sure I agree with "MUST NOT", so this needs more thought -
> comments welcome on this point.
> ----
>>         The measurement period MUST be longer than the time that current
>>         Congestion Control algorithms need to reduce their rate following
>>         detection of congestion.
>>
>> This needs to be rewritten. Or just removed. It seems like ideas changed
>> after it was written, and the end was changed but not the normative
>> statement at the beginning. IMO, the measurement period can be
>> arbitrarily short, as long as multiple measurements are combined before
>> triggering the CB.
>> It talks about unnecessarily penalizing long RTT
>> flows, but the measurement period is nothing to do with the period
>> before there is any penalization (defined later as the triggering
>> interval). There is no problem with short measurement periods as long as
>> any high congestion measured in these periods is averaged over all the
>> measurement periods in the triggering interval.
>>
> DB: There seem to be two meanings of "measurement period" here.  I've
> always viewed it as the period of time over which measurements are taken
> that result in triggering the CB, whereas Bob seems to view it as the
> period of time over which an individual measurement is taken.  I have no
> problem with the line of Bob's text quoted above, and think some
> clarification to make it clear that the means of measurement is
> unspecified (e.g., taking short measurements and combining them is fine),
> but the period of time that applies to the metric that is used to trigger
> the CB has to be sufficiently long.
>
> GF: My understanding was the "measurement period" are the samples that
> feed the trigger, so if the ingress or egress meter samples more
> frequently, these would be combined. I added text to clarify this.
>> In fact, there should be many measurement intervals per trigger
>> interval, so that there are many opportunities for measurement messages
>> to get through. Otherwise if there are only one or two measurement
>> periods per trigger interval, the possibility of a false trigger due to
>> lost control signals becomes too great.
>>
> GF: This is the robustness issue.
>>      o  A Circuit Breaker is REQUIRED to define a threshold to determine
>>         whether the measured congestion is considered excessive.
>>
>>      o  A Circuit Breaker is REQUIRED to define the triggering interval,
>>
>> A perfectly good CB could vary the trigger interval and threshold
>> depending on how rapidly congestion is rising, or how high its absolute
>> level is. Indeed one could say it is actually wrong to define a single
>> threshold or a single interval, so these normative statements are overly
>> restrictive and preclude designs that are smarter than just simple fixed
>> threshold.
>>
> GF: There are many ways to do this, and individual specs can indeed react
> using more sophisticated algorithms. At some point these will become more
> like CC than an envelope CB.
> DB: A metric and threshold for that metric are needed.  A rate-of-increase
> metric or one based in part on that would be fine.
> ----
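
The relationship between short measurement periods and a longer triggering interval discussed above can be sketched as follows. The class name, the fixed threshold and the min_reports rule are assumptions for illustration; in particular, this sketch never triggers purely because reports are missing, which is only one of the positions in the exchange above.

```python
from collections import deque


class TriggerWindow:
    """Combine many short measurement periods into one triggering decision."""

    def __init__(self, periods_per_interval, loss_threshold, min_reports):
        self.window = deque(maxlen=periods_per_interval)
        self.loss_threshold = loss_threshold  # assumed fixed loss fraction
        self.min_reports = min_reports        # reports needed before acting

    def add_period(self, report):
        """report is (sent_bytes, lost_bytes), or None if the message was lost."""
        self.window.append(report)

    def should_trigger(self):
        if len(self.window) < self.window.maxlen:
            return False  # a full triggering interval has not yet elapsed
        reports = [r for r in self.window if r is not None]
        if len(reports) < self.min_reports:
            return False  # too few measurements got through to judge
        sent = sum(s for s, _ in reports)
        lost = sum(l for _, l in reports)
        return sent > 0 and lost / sent > self.loss_threshold
```

A single bad period is diluted by the others in the interval, and a few lost reports do not change the outcome, so the length of an individual measurement period stops mattering much.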
>> Also, see comment above about allowing time for application CBs, and
>> suggesting one minute minimum.
>>
>> o  A Circuit Breaker SHOULD be constructed so that it does not
>>         trigger under light or intermittent congestion, with a default
>>         response to a trigger that disables all traffic that contributed
>>         to congestion.
>>
>> The second half after the comma seems misplaced. If it does not trigger,
>> why does the sentence go on to talk about disabling all traffic that
>> contributed to congestion (which is what an /enabled/ trigger would do)?
>>
> GF: Split the second part as a separate clause.
>> A reaction that results in a reduction SHOULD result in
>>         reducing the traffic by at least a factor of ten,
>>
>> What evidence have you got for this 10% number? It seems utterly
>> inappropriate to write a number here. The number depends on what
>> proportion of the traffic on the path between ingress and egress is
>> regulated by the CB. If the proportion is low, it needs to reduce by a
>> lot to make sufficient space for other traffic. If the proportion is
>> high relative to other traffic, it might be sufficient to reduce by 5%
>> to 95% of the previous load. If the tunnel traffic represented say 80%
>> of the load on the path, and it reduced by a factor of 10, that would
>> leave 92% of the path for other traffic, which might be unnecessarily
>> much greater than the normal proportion used by other traffic.
>>
> DB: Need to say something here - we can fall back to "order of magnitude"
> or something like that, but this'll need list discussion.
> GF: Text updated. This point was previously discussed, but I am happy to
> receive more feedback, rather than relying on what was said before. As I
> recall, many were happy with terminating flows once a CB triggered, but
> this was an attempt to be "kinder" and allow more flexibility - which I
> personally liked - but still the default reaction to successive triggers
> needs to be harsh. (It is a "SHOULD" though).
> -----
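
The arithmetic behind the 80% example above can be checked with a few lines. The function name is hypothetical, and it assumes the path load is normalised to 1 and that only the tunnel aggregate changes.

```python
def share_left_for_others(tunnel_share, reduction_factor):
    """Fraction of the previous path load not used by the reduced tunnel aggregate."""
    reduced_tunnel_share = tunnel_share / reduction_factor
    return 1.0 - reduced_tunnel_share
```

With tunnel_share=0.8 and reduction_factor=10 the tunnel drops to 8% of the previous load, leaving 92% for everything else. Reducing the same aggregate to 95% of its previous level (reduction_factor=1/0.95) leaves 24%.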
>>         Manual operator
>>         intervention will usually be required to restore a flow.
>>
>> This sentence should be toned down to possibly, not usually. A human is
>> no more capable than a machine is of bringing together all the necessary
>> measurements to decide what other courses of action might be possible,
>> and when to release the brakes. I suggest the last para of 5.3.1 starting:
>>
>> "An operator-based response provides opportunity..."
>>
>> is more appropriate here, and doesn't really fit where it is.
>>
> DB:  Operator intervention may be required to restore a flow.
> GF: Changed as above.
> ----
>> Section 4.1 contains no requirements text, only examples. It ought to be
>> moved from the normative requirements section to section 5 (Examples).
>>
> GF: Resolved as a section on topologies.
> ----
>> *5. Examples:*
>>
>> *5.1.1 Fast-Trip CB for RTP*
>>
>> The draft needs to make the distinction between an application doing its
>> own circuit breaking vs. functions on the path between the application
>> endpoints (even if in the hosts) doing CB. The extremely important
>> distinction is:
>> 1a) an app knows when congestion is too high for it to work properly
>> 1b) functions under the app can only infer congestion is possibly too
>> high for most apps to work properly
>> 2a) an app may be able to reduce the rate at which it sends data
>> 2b) a function under an app can only discard data, not remove it at
>> source.
>>
> GF: Although this is not the place to delve into details, I added a prefix:
> "Applications ought to use a full-featured transport (TCP, SCTP, DCCP),
> and if not, applications (e.g. those using UDP and its UDP-Lite variant
> [RFC3828]) need to provide appropriate congestion avoidance. [RFC2309]
> discusses the dangers of congestion-unresponsive flows and states that
> "all UDP-based streaming applications should incorporate effective
> congestion avoidance mechanisms". Guidance for applications that do not
> use congestion-controlled transports is provided in [RFC5405.bis]. Such
> mechanisms can be designed to react on much shorter timescales than a
> circuit breaker, that only observes a traffic envelope. These methods can
> also interact with an application to more effectively control its sending
> rate."
> GF: Also updated some other parts of the section.
> ----
>> I believe that the requirements in section 4 do not apply to
>> application-controlled circuit-breakers. So, I would not include the
>> "Fast-Trip CB for RTP" as an example of a /network/ transport CB.
>>
>> As the requirements say, a network CB should never fast trip.
> GF: But by design a RTP-CB should also not terminate flows.
>
>> By misclassifying RTP CBs as network CBs, you've allowed the timescale
>> for network CBs to trigger after tens of seconds, when a network CB
>> should allow app CBs this long to trigger themselves (as I said earlier).
> GF: I'm not sure, since understanding the difference between the two is
> indeed important. Apps that "trigger themselves" describes what the
> fast-trip CB does (in the absence of CC).
> -----
>> *Missing examples:*
>>
>> * You might want to point to the flow termination function (as opposed
>> to admission control) in the PCN architecture [RFC5559], which is
>> precisely a network CB. It was precisely developed for cases where
>> failures caused traffic to reroute onto a previously well-provisioned
>> path (see 6.1).
> DB: I think that's a stretch.  This is not among the types listed in 1.1.
> It might be mentioned as a related concept that is outside the scope of
> this draft.
> GF: How different to rsvp and other admission-controlled schemes? - I
> think we're drifting away here... I did change in the new rev. Unless
> there is strong support from the WG, I'd rather not add these examples.
> ------
>> * Andrew McGregor gave the examples of Google's BwE (bandwidth enforcer)
>> and B4, but you haven't referred to them. Given they are documented
>> existence proof of this beast, that seems remiss.
>>
> GF: I believe I discussed with Andrew and we didn't at the time have good
> text to add. This may well have changed, if there are good references
> please send them for consideration.
> ------
>> *7. Security Consid's*
>>
>>      The circuit breaker MUST be designed to be robust to packet loss that
>>      can also be experienced during congestion/overload.
>>
>> This implies reliable transmission - i.e. retransmit for ever until
>> acknowledged. This is NOT a good idea. In
>> ietf-tsvwg-tunnel-congestion-feedback we propose using SCTP partially
>> reliable transport. Then if congestion causes messages to be lost, they
>> don't have to be retransmitted if there are insufficient resources (thus
>> not risking contributing to congestion collapse - and here I use the
>> phrase correctly). Because they transmit counters, the missing counter
>> values do not matter. This is the tried-and-tested message delivery
>> approach used for IPFIX. The messages can still be given priority, but
>> should not be retransmitted.
>>
> GF: I disagree this is a requirement for reliably transmitting packets.
> I'd be happy to add text to explain robustness does not imply reliability
> and that this is likely to be an evil thing to do.
> GF: Added text on duplicating messages.
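To illustrate the point about counters: a minimal sketch, assuming each control message carries a cumulative byte counter and a sequence number. The report layout and the function name are hypothetical and are not taken from IPFIX or from the tunnel-congestion-feedback draft. A lost report only widens the interval that the next report covers, so nothing needs to be retransmitted.

```python
def deltas_from_reports(reports):
    """Byte deltas between consecutive reports that actually arrived.

    reports: list of (seq, cumulative_bytes) tuples in arrival order.
    Returns a list of (from_seq, to_seq, bytes_in_between).
    """
    out = []
    for (s0, b0), (s1, b1) in zip(reports, reports[1:]):
        out.append((s0, s1, b1 - b0))
    return out


# Hypothetical reports sent by the meter (cumulative counts).
sent = [(1, 1000), (2, 2500), (3, 2600), (4, 4000)]

# Report 3 is lost under congestion and is never retransmitted.
received = [r for r in sent if r[0] != 3]

deltas = deltas_from_reports(received)
total_seen = sum(d for _, _, d in deltas)
total_sent = sent[-1][1] - sent[0][1]
```

The receiver still accounts for every byte between reports 1 and 4; it only loses the finer split of the interval between reports 2 and 4.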
> -------
>>      Simple protection can be provided by using a
>>      randomized source port, or equivalent field in the packet header
>>      (such as the RTP SSRC value and the RTP sequence number) expected not
>>      to be known to an off-path attacker.
>>
>> I think the draft should recommend that for most scenarios, randomized
>> ports will be insufficient protection for CB control messages, which
>> should be properly cryptographically authenticated. Otherwise, a
>> CB-controlled aggregate is too vulnerable to these off-path attacks.
>>
> GF-XXX: I don't know why you state that a random port (or protocol field)
> is insufficient protection from off-path attack, please explain.  Are you
> saying a CB is more vulnerable to attack than other transport traffic. I
> don't see the problem yet, can you suggest what you would like to see
> added?
> --------
>> *Gap #1:*
>> The draft seems to think it is so obvious what a CB should measure
>> that it only says it vaguely as "the level of congestion", and only
>> suggests the difference between ingress and egress counters as an
>> example. Some readers might well think like this: Does congestion level
>> mean the percentage extra bit-rate relative to the aggregate's expected
>> or maximum bit-rate? That might actually be a correct measure of
>> congestion in some scenarios, but...
>>
>> The draft does not say that the congestion level is defined as dropped
>> bytes divided by ingress bytes. The draft should spell out that a CB
>> should measure the volume of bytes dropped and the volume of ECN-capable
>> bytes marked with CE, and express these as a fraction of resp. total
>> ingress non-ECT bytes and total ingress ECT bytes (assuming buffers
>> within the scope of the CB are ECN-enabled). Even this is problematic,
>> because the assumption in parentheses never holds, particularly during
>> excessive congestion. It could also discuss the relative merit of
>> measuring the percentage of packets dropped/marked instead of bytes.
>>
> GF:  A detailed discussion of how to measure congestion is out of scope
> here, IMHO.
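For concreteness, the measure suggested in the review could be sketched as below. This is only an illustration under stated assumptions: the function and parameter names are hypothetical, the byte counts are assumed to be already available from the ingress and egress meters, and the sketch takes the parenthesised assumption at face value (drops are attributed only to non-ECT traffic), which the review itself notes does not hold under excessive congestion.

```python
def congestion_fractions(dropped_bytes, ingress_non_ect_bytes,
                         ce_marked_bytes, ingress_ect_bytes):
    """Return (drop_fraction, mark_fraction) for one measurement interval.

    drop_fraction = dropped bytes / ingress non-ECT bytes
    mark_fraction = CE-marked bytes / ingress ECT bytes
    A fraction is reported as 0.0 when its denominator is zero.
    """
    drop_fraction = (dropped_bytes / ingress_non_ect_bytes
                     if ingress_non_ect_bytes else 0.0)
    mark_fraction = (ce_marked_bytes / ingress_ect_bytes
                     if ingress_ect_bytes else 0.0)
    return drop_fraction, mark_fraction


# Hypothetical counts for one interval.
drop_frac, mark_frac = congestion_fractions(
    dropped_bytes=2000, ingress_non_ect_bytes=100000,
    ce_marked_bytes=500, ingress_ect_bytes=50000)
```

The same shape would apply to packet counts instead of byte counts; which of the two is the better measure is the open question raised in the review.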
> -----
>> Also it should mention that care should be taken over how to combine the
>> measurements. For instance avoid the common mistake of averaging
>> fractions, because ave(c1/t1, c2/t2, c3/t3 ...) != (c1 + c2 + c3)/(t1 +
>> t2 + t3).
>>
> GF: Added some text - is this sufficient, given there is no intention to
> specify the algorithm here:
> "If necessary, MAY combine successive individual meter samples from the
> ingress and egress to ensure observation of an average over a
> sufficiently long interval. (Note when meter samples need to be combined,
> the combination needs to reflect the sum of the  individual sample counts
> divided by the total time/volume over which the samples were measured.
> Individual samples over different intervals can not be directly combined
> to generate an average value.)"
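A small numeric sketch of the averaging pitfall, using made-up sample values: each sample is a hypothetical (c, t) pair, for example lost bytes and ingress bytes over one interval. The function names are illustrative only.

```python
def mean_of_ratios(samples):
    """The common mistake: average the per-sample fractions c/t."""
    return sum(c / t for c, t in samples) / len(samples)


def ratio_of_sums(samples):
    """Combine the counts first: (c1 + c2 + ...) / (t1 + t2 + ...)."""
    return sum(c for c, _ in samples) / sum(t for _, t in samples)


# Two hypothetical samples: a tiny interval with heavy loss,
# then a much larger interval with light loss.
samples = [(5, 10), (10, 1000)]

wrong = mean_of_ratios(samples)   # (0.5 + 0.01) / 2
right = ratio_of_sums(samples)    # 15 / 1010
```

With these values the mean of the fractions is 0.255, while the combined fraction is about 0.015. The small sample dominates the first figure even though it carries only 1% of the traffic.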
> --------
>> *Gap #2:*
>> All the diags show multiple routers, but the text says congestion can
>> be measured by comparing ingress and egress traffic. Nowhere does it say
>> that only traffic with addressing that will have for-certain only passed
>> through both ends should be measured.
>>
> GF: True, I think everyone who previously looked at this thought that
> section 3.1 described this. I agree though that it needs to be stated, and will
> add text.
> --------
>> {Note 1}: A few years ago I dug deep into the history surrounding the
>> early congestion collapses on the Internet and found that those involved
>> were adamant that the term congestion collapse should not be waved
>> around for dramatic effect, because it has a very specific definition,
>> as paraphrased above.
>>
> GF: I cited an RFC on congestion collapse and did not use the words
> thereafter.
> --------
>> {Note 2}: The credit feature of ConEx was intended to address short-flow
>> overload if it becomes a problem. Don't get me wrong; I'm not objecting
>> to the use of CBs for the short-flow problem because I want you to use
>> my solution. I'm just using this as an example of a fine-grained way to
>> solve the problem, rather than the sledge-hammer CB way.
>> Here's the intuition briefly: With ConEx, you have to attach 'congestion
>> credit' to the first packets of a flow to cover the risk of congestion
>> before you have feedback (and if you don't and there is congestion, your
>> packets are dropped by an audit function). Then congestion policers at
>> the network ingress can limit the amount of congestion credit consumed
>> without needing feedback, and thin out traffic if it consists of large
>> numbers of short flows. If short flows come to predominate, ConEx credit
>> was also designed to incentivize a new form of proxy that could regulate
>> short-flows with a push-back style of congestion control, without a full
>> feedback loop. That would be far preferable to such a drastic measure as
>> a circuit-breaker. This aspect of ConEx was not written into the IETF
>> docs, but it is mentioned in the re-ECN drafts that were the ancestors
>> of ConEx.
>>
> GF: I do agree ConEx can manage traffic. I'm confident that a
> ConEx-controlled flow would not need an additional circuit breaker
> mechanism - but I see this as a congestion control mechanism though - not
> a circuit-breaker.
> ---
>> *Nits*
>> 3.
>> s/last resort protection to the network paths that these are used./
>>    /last resort protection to the traffic sharing their network path./
>>
> GF: Good you spotted this, changed to: " to provide last resort protection
> for traffic that shares the network path being used."
>> s/tunnels encapsulations/
>>    /tunnel encapsulations/
> GF: Fixed in latest version
>
> ---
>> 3. What makes a good CB?
>>
>>      Circuit Breakers are RECOMMENDED for IETF protocols and tunnels that
>>      carry non-congestion-controlled Internet flows and for traffic
>>      aggregates, e.g., traffic sent using a network tunnel.
>>
>> Delete "
>>
>> e.g., traffic sent using a network tunnel
>>
>> "
>> Reason: this implies all network tunnels are problematic, whereas the
>> rest of the sentence adequately says that only tunnels carrying
>> non-congestion controlled flows are of concern.
>>
> GF: Understood, suggest remove from this sentence, and instead place in a
> separate sentence that follows this saying,
> "This includes traffic sent using a network tunnel."
> -----
>> 4.
>>
>> s/monitor the level congestion/
>>    /monitor the level of congestion/
> GF: Fixed in latest version
>> 4.1.1
>> (e.g. to implement a Section 5.1)
>> ?
> GF: Fixed tag in latest version
>> 4.1.2
>> s/pre-prosvisioned/
>>    /pre-provisioned/
>>
> GF: Was fixed in -06
>> 6.1
>>
>>      One common question is whether a Circuit Breaker is needed when a
>>      tunnel is deployed in a private network with pre-provisioned
>>      capacity?
>> Remove '?' from the end.
>>
> GF: Fixed in latest version
>
>> 6.2
>>
>> s/in the event that persistent congestion occur./
>>    /in the event that persistent congestion occurs./
>>
> GF: Fixed in latest version
>> Regards
>>
>>
>>
>> Bob
>>
> Gorry
>

-- 
________________________________________________________________
Bob Briscoe                               http://bobbriscoe.net/