Re: [AVTCORE] Alissa Cooper's Discuss on draft-ietf-avtcore-rtp-circuit-breakers-15: (with DISCUSS and COMMENT)

Colin Perkins <> Fri, 10 June 2016 16:42 UTC

Hi Alissa,

My proposal to address this DISCUSS is to change the first two paragraphs of Section 4.5 to be as follows:

          What it means to cease transmission depends on the application.
          The intention is that the application will stop sending RTP data
          packets on a particular 5-tuple (transport protocol, source and
          destination ports, source and destination IP addresses), until
          the network problem that triggered the RTP circuit breaker
          has dissipated.  This could mean stopping a single RTP flow, or
          it could mean that multiple bundled RTP flows are stopped. 
          RTP flows halted by the circuit breaker SHOULD NOT be restarted
          automatically unless the sender has received information that the
          congestion has dissipated, or can reasonably be expected to have
          dissipated.  What could trigger this expectation is necessarily
          application dependent, but could be, for example, an indication
          that a competing flow has finished and freed up some capacity, or
          for an application running on a mobile device, that the device
          moved to a new location so the flow would traverse a different path
          if it were restarted.  Ideally, a human user will be involved in
          the decision to try to restart the flow, since that user will
          eventually give up if the flows repeatedly trigger the circuit
          breaker.  This will help prevent automatic redial systems from
          congesting the network.

          It is recognised that the RTP implementation in some systems might
          not be able to determine if a flow set-up request was initiated by
          a human user, or automatically by some scripted higher-level
          component of the system. These implementations MUST rate limit
          attempts to restart a flow on the same 5-tuple as used
          by a flow that triggered the circuit breaker, so that the reaction 
          to a triggered circuit breaker lasts for at least the triggering 
          interval <xref target="I-D.ietf-tsvwg-circuit-breaker"/>.
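
As a rough illustration of the rate-limiting requirement in the second paragraph, here is a minimal Python sketch. It is not taken from the draft: the names FiveTuple and RestartLimiter are hypothetical, and the hold-off value is an assumption that the application would set to at least the triggering interval. State is keyed on the 5-tuple, so a flow on a different 5-tuple is unaffected.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, NamedTuple


class FiveTuple(NamedTuple):
    """Transport protocol plus source/destination addresses and ports."""
    protocol: str
    src_addr: str
    src_port: int
    dst_addr: str
    dst_port: int


@dataclass
class RestartLimiter:
    """Hypothetical helper: blocks restarts on a 5-tuple during a hold-off."""
    hold_off: float  # seconds; assumed >= the circuit breaker triggering interval
    clock: Callable[[], float] = time.monotonic
    _tripped_at: Dict[FiveTuple, float] = field(default_factory=dict, init=False)

    def circuit_breaker_triggered(self, flow: FiveTuple) -> None:
        # Record when transmission ceased on this 5-tuple.
        self._tripped_at[flow] = self.clock()

    def may_restart(self, flow: FiveTuple) -> bool:
        # A 5-tuple that never triggered the breaker is not limited.
        tripped = self._tripped_at.get(flow)
        if tripped is None:
            return True
        return self.clock() - tripped >= self.hold_off

    def restart(self, flow: FiveTuple) -> bool:
        # Returns False (and changes nothing) if the hold-off has not elapsed.
        if not self.may_restart(flow):
            return False
        self._tripped_at.pop(flow, None)
        return True
```

The limiter only enforces a minimum reaction time. Whether a restart is attempted at all, and on what evidence that the congestion has dissipated, remains an application decision.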

Would that address your concerns?


> On 3 May 2016, at 00:05, Alissa Cooper <> wrote:
>> On May 2, 2016, at 3:28 PM, Colin Perkins <> wrote:
>> Hi,
>>> On 2 May 2016, at 22:49, Alissa Cooper <> wrote:
>>> Alissa Cooper has entered the following ballot position for
>>> draft-ietf-avtcore-rtp-circuit-breakers-15: Discuss
>>> When responding, please keep the subject line intact and reply to all
>>> email addresses included in the To and CC lines. (Feel free to cut this
>>> introductory paragraph, however.)
>>> Please refer to
>>> for more information about IESG DISCUSS and COMMENT positions.
>>> The document, along with other ballot positions, can be found here:
>>> ----------------------------------------------------------------------
>>> DISCUSS:
>>> ----------------------------------------------------------------------
>>> Many thanks for this work. I expect to ballot YES once we discuss and
>>> resolve the issue below.
>>> In Section 4.5, I understand the need to base the re-start of the media
>>> flow on a human user intervention, but I find it puzzling that this is
>>> framed in terms of "restarting the call" rather than "restarting the
>>> flow." The recommendation in Section 8 is that senders MUST treat each
>>> session independently, but ending/restarting "the call" seems to assume
>>> that multiple flows will be treated together.
>>> One situation I'm thinking of is one where my audio and video traffic are
>>> in separate RTP flows and are routed along different paths for whatever
>>> reason. Some network problem is encountered in the video path, triggering
>>> a circuit breaker. The "call" doesn't necessarily need to be terminated
>>> and re-started, because my audio can continue just fine. This is another
>>> case where the application may not want to rely on a human user re-start
>>> (if you leave it up to me whether to re-start my video, I'll certainly
>>> try to re-start it right away).
>> It’s fine if the human user tries to restart the media straight away: if it keeps failing, they’ll eventually give up. The goal is to avoid an automatic restart that never gives up if it keeps failing.
> Fair enough, although in this particular case it doesn’t seem to make a lot of sense to leave it up to the user.
>>> I think the text in this section needs to
>>> be re-phrased to separate the case where a circuit breaker triggering on
>>> a single 5-tuple causes a whole call to end (either because the call
>>> consisted of a single flow or because all of the flows were encountering
>>> congestion and it takes just one circuit breaker to trigger the end of
>>> it) from cases where it causes only that flow to be suspended, and
>>> reference Section 8 to make it clear that the unit of operation for
>>> "ceasing" and "re-starting" is a single flow unless the sender chooses to
>>> group flows.
>> Right - if the flows are bundled together, then the circuit breaker applies to the entire bundle. If they’re sent on separate paths, then it applies to each flow individually. If that’s not clear, I agree that we should fix the text to make it so.
>>> Furthermore (and this is not a DISCUSS point but I leave it here since it
>>> follows from the points above), the normative recommendation in the first
>>> paragraph here doesn't really follow from the discussion of restarting
>>> the call. The recommendation is not to automatically re-start until
>>> indications are received that congestion has improved, which is different
>>> from waiting until a human user re-starts. I think this would be clearer
>>> if the normative recommendation came first and the human user case was
>>> discussed afterward.
>> This is in §4.5? I can rephrase, if it’s clearer.
> Yes, this is all in 4.5
>>> ----------------------------------------------------------------------
>>> COMMENT:
>>> ----------------------------------------------------------------------
>>> (1) Did the WG discuss BCP status for this rather than PS?
>> Not that I recall. Standards track seems more appropriate to me, but BCP would be fine also.
> I could see arguments either way. I asked since draft-ietf-tsvwg-circuit-breaker is approved for BCP.
>>> (2) Section 4.3:
>>> "If such a reduction in
>>> sending rate resolves the congestion problem, the sender MAY
>>> gradually increase the rate at which it sends data after a reasonable
>>> amount of time has passed, provided it takes care not to cause the
>>> problem to recur ("reasonable" is intentionally not defined here)."
>>> In later sections you explain that thresholds are not specified because
>>> they are application-dependent. I think that would be useful to note here
>>> too as the reason for not defining "reasonable," assuming that is the
>>> reason.
>> Sure.
>> -- 
>> Colin Perkins

Colin Perkins