Re: [AVTCORE] Alissa Cooper's Discuss on draft-ietf-avtcore-rtp-circuit-breakers-15: (with DISCUSS and COMMENT)

Alissa Cooper <alissa@cooperw.in> Fri, 10 June 2016 18:55 UTC

From: Alissa Cooper <alissa@cooperw.in>
In-Reply-To: <0A848CFD-C9EF-4590-8E5A-873F9283AABC@csperkins.org>
Date: Fri, 10 Jun 2016 11:55:50 -0700
Message-Id: <9FF5DF7A-4F2F-4F11-964C-E308ACB5A144@cooperw.in>
References: <20160502214947.15809.26879.idtracker@ietfa.amsl.com> <352578AF-85CD-44D0-9D39-A787767E225D@csperkins.org> <F9A52AFD-87B4-46DC-B7F5-8EA383DCA35E@cooperw.in> <0A848CFD-C9EF-4590-8E5A-873F9283AABC@csperkins.org>
To: Colin Perkins <csp@csperkins.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/avt/Xnm09qjhzJoAVKZjbUkfT7h55ME>
Cc: avtcore-chairs@ietf.org, Magnus Westerlund <magnus.westerlund@ericsson.com>, draft-ietf-avtcore-rtp-circuit-breakers@ietf.org, IESG <iesg@ietf.org>, avt@ietf.org
Subject: Re: [AVTCORE] Alissa Cooper's Discuss on draft-ietf-avtcore-rtp-circuit-breakers-15: (with DISCUSS and COMMENT)

> On Jun 10, 2016, at 8:49 AM, Colin Perkins <csp@csperkins.org> wrote:
> 
> Hi Alissa,
> 
> My proposal to address this DISCUSS is to change the first two paragraphs of Section 4.5 to be as follows:
> 
>        <t>
>          What it means to cease transmission depends on the application.
>          The intention is that the application will stop sending RTP data
>          packets on a particular 5-tuple (transport protocol, source and
>          destination ports, source and destination IP addresses), until 
>          whatever network problem triggered the RTP circuit breaker
>          has dissipated.  This could mean stopping a single RTP flow, or
>          it could mean that multiple bundled RTP flows are stopped. 
>          RTP flows halted by the circuit breaker SHOULD NOT be restarted
>          automatically unless the sender has received information that the
>          congestion has dissipated, or can reasonably be expected to have
>          dissipated.  What could trigger this expectation is necessarily
>          application dependent, but could be, for example, an indication
>          that a competing flow has finished and freed up some capacity, or
>          for an application running on a mobile device, that the device
>          moved to a new location so the flow would traverse a different path
>          if it were restarted.  Ideally, a human user will be involved in
>          the decision to try to restart the flow, since that user will
>          eventually give up if the flows repeatedly trigger the circuit
>          breaker. This will help prevent automatic redial systems from
>          congesting the network.
>        </t>
> 
>        <t>
>          It is recognised that the RTP implementation in some systems might
>          not be able to determine if a flow set-up request was initiated by
>          a human user, or automatically by some scripted higher-level
>          component of the system. These implementations MUST rate limit
>          attempts to restart a flow on the same 5-tuple as used
>          by a flow that triggered the circuit breaker, so that the reaction 
>          to a triggered circuit breaker lasts for at least the triggering 
>          interval <xref target="I-D.ietf-tsvwg-circuit-breaker"/>.
>        </t>
> 
> Would that address your concerns?

Yes, thank you.
Alissa
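
The restart rule in the proposed Section 4.5 text can be sketched in code. The following is a minimal illustration only, not text from the draft: the names FiveTuple and RestartGate, the automatic and congestion_cleared parameters, and the choice to refuse every restart attempt (human or automatic) inside the triggering interval are all assumptions made for the example. The triggering interval itself is taken as an input, since its value is defined elsewhere.

```python
from dataclasses import dataclass, field
from typing import Dict, NamedTuple


class FiveTuple(NamedTuple):
    """Hypothetical key for one transport flow (bundled RTP flows share one)."""
    protocol: str
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


@dataclass
class RestartGate:
    """Tracks halted 5-tuples and decides whether a restart attempt may proceed."""
    triggering_interval: float  # seconds; assumed to be supplied by the caller
    _halted_at: Dict[FiveTuple, float] = field(default_factory=dict, init=False)

    def on_breaker_triggered(self, path: FiveTuple, now: float) -> None:
        # Record when the circuit breaker fired for this 5-tuple only.
        self._halted_at[path] = now

    def may_restart(self, path: FiveTuple, now: float,
                    automatic: bool = True,
                    congestion_cleared: bool = False) -> bool:
        halted = self._halted_at.get(path)
        if halted is None:
            return True  # this 5-tuple never triggered the breaker
        if now - halted < self.triggering_interval:
            # Rate limit: the reaction lasts at least the triggering interval.
            # Applying it to every caller is an assumption of this sketch.
            return False
        # Automatic restarts need some indication that congestion has eased;
        # a human-initiated restart is allowed once the interval has passed.
        return (not automatic) or congestion_cleared


gate = RestartGate(triggering_interval=10.0)
video = FiveTuple("udp", "192.0.2.1", 5004, "198.51.100.2", 5004)
audio = FiveTuple("udp", "192.0.2.1", 5006, "198.51.100.2", 5006)
gate.on_breaker_triggered(video, now=100.0)
```

Because the state is keyed by 5-tuple, halting the video flow leaves the audio flow on a different 5-tuple untouched, while flows bundled on one 5-tuple would be halted and gated together.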

> 
> Colin
> 
> 
> 
> 
>> On 3 May 2016, at 00:05, Alissa Cooper <alissa@cooperw.in> wrote:
>> 
>> 
>>> On May 2, 2016, at 3:28 PM, Colin Perkins <csp@csperkins.org> wrote:
>>> 
>>> Hi,
>>> 
>>>> On 2 May 2016, at 22:49, Alissa Cooper <alissa@cooperw.in> wrote:
>>>> 
>>>> Alissa Cooper has entered the following ballot position for
>>>> draft-ietf-avtcore-rtp-circuit-breakers-15: Discuss
>>>> 
>>>> When responding, please keep the subject line intact and reply to all
>>>> email addresses included in the To and CC lines. (Feel free to cut this
>>>> introductory paragraph, however.)
>>>> 
>>>> 
>>>> Please refer to https://www.ietf.org/iesg/statement/discuss-criteria.html
>>>> for more information about IESG DISCUSS and COMMENT positions.
>>>> 
>>>> 
>>>> The document, along with other ballot positions, can be found here:
>>>> https://datatracker.ietf.org/doc/draft-ietf-avtcore-rtp-circuit-breakers/
>>>> 
>>>> 
>>>> 
>>>> ----------------------------------------------------------------------
>>>> DISCUSS:
>>>> ----------------------------------------------------------------------
>>>> 
>>>> Many thanks for this work. I expect to ballot YES once we discuss and
>>>> resolve the issue below.
>>>> 
>>>> In Section 4.5, I understand the need to base the re-start of the media
>>>> flow on a human user intervention, but I find it puzzling that this is
>>>> framed in terms of "restarting the call" rather than "restarting the
>>>> flow." The recommendation in Section 8 is that senders MUST treat each
>>>> session independently, but ending/restarting "the call" seems to assume
>>>> that multiple flows will be treated together.
>>>> 
>>>> One situation I'm thinking of is one where my audio and video traffic are
>>>> in separate RTP flows and are routed along different paths for whatever
>>>> reason. Some network problem is encountered in the video path, triggering
>>>> a circuit breaker. The "call" doesn't necessarily need to be terminated
>>>> and re-started, because my audio can continue just fine. This is another
>>>> case where the application may not want to rely on a human user re-start
>>>> (if you leave it up to me whether to re-start my video, I'll certainly
>>>> try to re-start it right away).
>>> 
>>> It’s fine if the human user tries to restart the media straight away: if it keeps failing, they’ll eventually give up. The goal is to avoid an automatic restart that never gives up if it keeps failing.
>> 
>> Fair enough, although in this particular case it doesn’t seem to make a lot of sense to leave it up to the user.
>> 
>>> 
>>>> I think the text in this section needs to
>>>> be re-phrased to separate the case where a circuit breaker triggering on
>>>> a single 3-tuple causes a whole call to end (either because the call
>>>> consisted of a single flow or because all of the flows were encountering
>>>> congestion and it takes just one circuit breaker to trigger the end of
>>>> it) from cases where it causes only that flow to be suspended, and
>>>> reference Section 8 to make it clear that the unit of operation for
>>>> "ceasing" and "re-starting" is a single flow unless the sender chooses to
>>>> group flows.
>>> 
>>> Right - if the flows are bundled together, then the circuit breaker applies to the entire bundle. If they’re sent on separate paths, then it applies to each flow individually. If that’s not clear, I agree that we should fix the text to make it so.
>>> 
>>>> Furthermore (and this is not a DISCUSS point but I leave it here since it
>>>> follows from the points above), the normative recommendation in the first
>>>> paragraph here doesn't really follow from the discussion of restarting
>>>> the call. The recommendation is not to automatically re-start until
>>>> indications are received that congestion has improved, which is different
>>>> from waiting until a human user re-starts. I think this would be clearer
>>>> if the normative recommendation came first and the human user case was
>>>> discussed afterward.
>>> 
>>> This is in §4.5? I can rephrase, if it’s clearer.
>> 
>> Yes, this is all in 4.5
>> 
>>> 
>>>> ----------------------------------------------------------------------
>>>> COMMENT:
>>>> ----------------------------------------------------------------------
>>>> 
>>>> (1) Did the WG discuss BCP status for this rather than PS?
>>> 
>>> Not that I recall. Standards track seems more appropriate to me, but BCP would be fine also.
>> 
>> I could see arguments either way. I asked since draft-ietf-tsvwg-circuit-breaker is approved for BCP.
>> 
>>> 
>>>> (2) Section 4.3:
>>>> 
>>>> "If such a reduction in
>>>> sending rate resolves the congestion problem, the sender MAY
>>>> gradually increase the rate at which it sends data after a reasonable
>>>> amount of time has passed, provided it takes care not to cause the
>>>> problem to recur ("reasonable" is intentionally not defined here)."
>>>> 
>>>> In later sections you explain that thresholds are not specified because
>>>> they are application-dependent. I think that would be useful to note here
>>>> too as the reason for not defining "reasonable," assuming that is the
>>>> reason.
>>> 
>>> Sure.
>>> 
>>> -- 
>>> Colin Perkins
>>> https://csperkins.org/
>>> 
>>> 
>>> 
>>> 
>> 