Re: [AVTCORE] [tsvwg] [rtcweb] WG Last Call on changes: draft-ietf-avtcore-rtp-circuit-breakers-16

Michael Welzl <michawe@ifi.uio.no> Mon, 20 June 2016 16:35 UTC

From: Michael Welzl <michawe@ifi.uio.no>
In-Reply-To: <CE03DB3D7B45C245BCA0D243277949362F59C41D@MX307CL04.corp.emc.com>
Date: Mon, 20 Jun 2016 18:35:38 +0200
To: "Black, David" <david.black@emc.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/avt/Z-7ADNpMdbUHpMNxSc_MJQ7dVkI>
Cc: "<gorry@erg.abdn.ac.uk> Fairhurst" <gorry@erg.abdn.ac.uk>, Magnus Westerlund <magnus.westerlund@ericsson.com>, tsvwg <tsvwg@ietf.org>, IETF AVTCore WG <avt@ietf.org>, Mirja Kühlewind <mirja.kuehlewind@tik.ee.ethz.ch>, "rtcweb@ietf.org" <rtcweb@ietf.org>, Colin Perkins <csp@csperkins.org>
Subject: Re: [AVTCORE] [tsvwg] [rtcweb] WG Last Call on changes: draft-ietf-avtcore-rtp-circuit-breakers-16

> On 20 June 2016, at 15:16, Black, David <david.black@emc.com> wrote:
> 
>>> But I’m less concerned than David about eventually ignoring it for circuit
>> breaker.
>>> 
>> Agree. Loss is the measurement that a CB MUST respond to.
> 
> Mumble.   I would be OK with a clear discouragement of the use of ECN-CE marks, accompanied by the sort of design rationale given here. Even better would be a clear statement that, for the purpose of the RTP circuit breaker, lost packets have to be actually lost, without getting into whether or not ECN-CE marks are involved - i.e., the RTP circuit breaker is specified against actual drops as a network protection backstop.
> 
> A related concern is that ECN marks may overstate equivalent loss behavior - a simplistic queue management discipline that marks every packet when the queue is over a threshold (NB: this class of marking behavior is NOT RECOMMENDED - a real AQM SHOULD be used) could yield a run of ECN-CE marks that would not correspond to a run of packet drops.   This is among the reasons that TCP reacts to ECN-CE marks only once per RTT, and might be a reason to treat multiple ECN-CE marks in an RTT interval as not representing drops of all packets for the RTP circuit breaker's TCP-equivalent throughput calculation.

I’m not sure we need such complicated logic to find a case where ECN marks are different from packet drops:

Basically, they simply aren’t the same - even marking by a “real” AQM isn’t exactly the same as a packet drop: the marks themselves inform you that an AQM did its job, and with modern AQMs like CoDel or PIE, you’re probably getting this from a shallow queue. Chances are that this is less than a BDP worth of queuing, which is our justification for recommending a different back-off behavior in draft-khademi-tsvwg-ecn-response-00 and draft-khademi-tcpm-alternativebackoff-ecn-00.

So the point is not that AQMs would treat ECN marking and dropping differently - it’s that ECN indicates an AQM, and hence probably a shallow queue. With a drop, you just don’t know.
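
As a rough illustration of why this matters for the circuit breaker's TCP-equivalent throughput calculation, the sketch below compares the estimate when CE marks are counted as losses with the estimate when only real drops are counted. It assumes the simplified (Mathis-style) throughput equation X = s / (R * sqrt(2p/3)); the function names, packet size, RTT and packet counts are made up for illustration and are not taken from the draft.

```python
import math


def tcp_equivalent_rate(packet_size_bytes, rtt_s, loss_fraction):
    """Simplified (Mathis-style) TCP throughput estimate in bytes/second.

    Assumption: X = s / (R * sqrt(2p/3)); a real implementation may use a
    different (e.g. fuller) throughput equation.
    """
    if loss_fraction <= 0:
        return math.inf  # no loss observed: the estimate is unbounded
    return packet_size_bytes / (rtt_s * math.sqrt(2.0 * loss_fraction / 3.0))


def loss_fraction(sent, dropped, ce_marked, count_ce_as_loss):
    """Fraction of packets treated as lost, optionally including CE marks."""
    events = dropped + (ce_marked if count_ce_as_loss else 0)
    return events / sent


# Hypothetical reporting interval: 1000 packets, 2 real drops, 98 CE marks.
SENT, DROPPED, CE_MARKED = 1000, 2, 98
PACKET_SIZE, RTT = 1200, 0.1  # bytes, seconds (illustrative values)

with_ce = tcp_equivalent_rate(
    PACKET_SIZE, RTT, loss_fraction(SENT, DROPPED, CE_MARKED, True))
drops_only = tcp_equivalent_rate(
    PACKET_SIZE, RTT, loss_fraction(SENT, DROPPED, CE_MARKED, False))

print("estimate counting CE as loss:", round(with_ce), "bytes/s")
print("estimate counting drops only:", round(drops_only), "bytes/s")
```

With these made-up numbers, counting the marks as drops lowers the estimate by a factor of sqrt(50), i.e. roughly 7, even though every marked packet was actually delivered.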

Back to the CB, I think an AQM marking at a shallow queue (e.g., CoDel) is indeed quite different from a “broken connection”.
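
For comparison, the once-per-RTT reaction that David mentions could be sketched roughly as follows: every CE mark arriving within one RTT of the mark that opened the current window is folded into a single congestion event. This is only a sketch under assumed names and inputs (a list of CE arrival times in seconds and a fixed RTT); it is neither TCP's actual implementation nor anything specified in the circuit breaker draft.

```python
def congestion_events(ce_mark_times, rtt_s):
    """Count congestion events, with at most one event per RTT window.

    ce_mark_times: arrival times (seconds) of CE-marked packets (hypothetical input).
    rtt_s: round-trip time in seconds, assumed constant for simplicity.
    """
    events = 0
    window_end = None  # end of the RTT window opened by the last counted mark
    for t in sorted(ce_mark_times):
        if window_end is None or t >= window_end:
            events += 1              # first mark outside any open window
            window_end = t + rtt_s   # later marks inside this window are folded in
    return events


# Illustrative run of marks: four close together, then two more later on.
marks = [0.00, 0.01, 0.02, 0.03, 0.25, 0.26]
print(len(marks), "CE marks ->", congestion_events(marks, rtt_s=0.1), "congestion events")
```

A run of back-to-back marks from a simplistic threshold marker then counts once per RTT rather than once per packet.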

Cheers,
Michael


> 
> Thanks, --David
> 
>> -----Original Message-----
>> From: Gorry (erg) [mailto:gorry@erg.abdn.ac.uk]
>> Sent: Saturday, June 18, 2016 2:23 AM
>> To: Mirja Kühlewind
>> Cc: Black, David; Magnus Westerlund; Colin Perkins; rtcweb@ietf.org; IETF
>> AVTCore WG; tsvwg
>> Subject: Re: [tsvwg] [rtcweb] [AVTCORE] WG Last Call on changes: draft-ietf-
>> avtcore-rtp-circuit-breakers-16
>> 
>> I think we SHOULD NOT recommend using ECN marks as inputs to a CB. See
>> below:
>> 
>>> On 17 Jun 2016, at 16:02, Mirja Kühlewind <mirja.kuehlewind@tik.ee.ethz.ch>
>> wrote:
>>> 
>>> +1 to not use normative language here.
>>> 
>>> However, please note that having a high level of ECN-CE marks (without any
>> losses) means that all packets were received correctly. This situation can even
>> occur without high delays (depending on the AQM used), which would just
>> mean the service works perfectly. Therefore for me CE marks are a perfect input
>> signal for a congestion control loop (where the AQM tells the sender to take action
>> - whatever that means).
>> 
>> We may in future figure out ways to detect significant failure using a
>> rate-adaptive transport and ECN, e.g., observing 100% CE marks or something for
>> an RTP flow that is trying to send well below its peak rate decided by CC -- but I
>> think this is speculating at an algorithm, and adding details here is not a good idea,
>> especially as AQM continues to evolve.
>> 
>>> But I’m less concerned than David about eventually ignoring it for circuit
>> breaker.
>>> 
>> Agree. Loss is the measurement that a CB MUST respond to.
>> 
>>> In addition one point on something Magnus wrote earlier:
>>> "If the implementation only have circuit breaker, i.e. no full fledged congestion
>> controller and uses ECN, they can in worst case drive the buffer into the overload
>> regime where it starts dropping packets."
>>> 
>>> I’m not sure about this case. ECN is an input signal for congestion control. If you
>> don’t use congestion control but only a circuit breaker, you should probably not
>> enable ECN. At least it's not clear to me why you would enable it, and it definitely
>> does not conform to the ECN spec. Probably we should say something about this in the
>> draft...?
>>> 
>> Agree, enabling ECN without a responsive CC is going to lead to trouble.
>> 
>>> Mirja
>>> 
>> Gorry
>> 
>>>> On 17.06.2016 at 16:03, Black, David <david.black@emc.com> wrote:
>>>> 
>>>> Colin,
>>>> 
>>>>>>> ...  I view the current text as providing implementers with too much
>>>>>>> latitude to ignore ECN-CE marks (e.g., because an implementer doesn't
>>>>>>> want to think about this problem space in the first place).
>>>>> 
>>>>> I agree, but the argument is that doing so is less harmful than deploying a
>> circuit
>>>>> breaker that triggers too often when ECN is used.
>>>>> 
>>>>> I’m not sure I believe this argument, though, since it seems that any new
>> AQM
>>>>> that applies ECN marks much more often than at present will have to
>> consider
>>>>> backwards compatibility, to work with deployed TCP (e.g., draft-briscoe-
>> tsvwg-
>>>>> aqm-tcpm-rmcat-l4s-problem uses ECT(1) as a signal to use the new marking,
>>>>> while existing implementations set ECT(0)). These compatibility mechanisms
>>>>> would seem to prevent the issues with the circuit breaker too.
>>>> 
>>>> That roughly matches my line of thinking, and I'll observe that the original
>> DCTCP
>>>> protocol design that used more aggressive ECN-CE marking was only safe for
>>>> Controlled Environment deployments.   See the TSVWG rfc5405bis draft for
>> the
>>>> definition of Controlled Environment, and ignore the fact that the rfc5405bis
>>>> draft is a UDP draft - this definition is more broadly applicable.
>>>> 
>>>> Going back over Section 7 in this avtcore draft, my views are:
>>>> 
>>>> [A] None of these drafts justify a "MAY ignore" response to ECN-CE marks:
>>>>   - draft-khademi-tcpm-alternativebackoff-ecn
>>>>   - draft-ietf-rmcat-nada
>>>>   - draft-ietf-rmcat-scream-cc
>>>> 
>>>> [B] In line with Colin's comment on the L4S draft, I think it's incumbent on
>>>> the authors of draft-briscoe-aqm-dualq-coupled to figure out how that will
>>>> coexist with (or avoid) deployed TCP, and this avtcore draft ought not to be
>>>> trying to prejudge what will be done there.
>>>> 
>>>> So, I don't think the current text in Section 7 has justified the unfettered
>>>> "implementations MAY ignore ECN-CE marks" text, as ignoring those marks
>>>> is not consistent with any of the four cited drafts.
>>>> 
>>>> In more detail, I think making changes to normative requirements here based
>>>> on [B] is premature, and I would hope that the rmcat WG could be
>> encouraged
>>>> to consider the RTP circuit breaker in its congestion control drafts, as those CC
>>>> mechanisms are related to the circuit breaker mechanism, hence likely
>>>> to be in related areas of an RTP implementation.
>>>> 
>>>> That leaves draft-khademi-tcpm-alternativebackoff-ecn, which TSVWG
>>>> will be looking at in Berlin.  If a normative statement about ECN-CE reaction
>>>> is going to rest on that draft, then the reference to that draft should be
>>>> normative.  Something about doing that strikes me as premature ...
>>>> 
>>>> I realize that we're trying to predict and accommodate the future, which
>>>> is an imprecise undertaking at best.   As an alternative to the current text,
>>>> would it be reasonable to say (without any RFC 2119 keywords) that the
>>>> best current guidance is still to treat ECN-CE marks as indicating drops,
>>>> with a warning that there is a good possibility of this changing in the
>>>> near future due to all of the work in progress cited in Section 7?
>>>> 
>>>> Thanks, --David
>>>> 
>>>>> -----Original Message-----
>>>>> From: Colin Perkins [mailto:csp@csperkins.org]
>>>>> Sent: Friday, June 17, 2016 6:14 AM
>>>>> To: John Leslie; Black, David
>>>>> Cc: rtcweb@ietf.org; IETF AVTCore WG; tsvwg
>>>>> Subject: Re: [rtcweb] [AVTCORE] [tsvwg] WG Last Call on changes: draft-ietf-
>>>>> avtcore-rtp-circuit-breakers-16
>>>>> 
>>>>> 
>>>>>> On 16 Jun 2016, at 23:25, John Leslie <john@jlc.net> wrote:
>>>>>> 
>>>>>> Black, David <david.black@emc.com> wrote:
>>>>>>> 
>>>>>>> ...  I view the current text as providing implementers with too much
>>>>>>> latitude to ignore ECN-CE marks (e.g., because an implementer doesn't
>>>>>>> want to think about this problem space in the first place).
>>>>> 
>>>>> I agree, but the argument is that doing so is less harmful than deploying a
>> circuit
>>>>> breaker that triggers too often when ECN is used.
>>>>> 
>>>>> I’m not sure I believe this argument, though, since it seems that any new
>> AQM
>>>>> that applies ECN marks much more often than at present will have to
>> consider
>>>>> backwards compatibility, to work with deployed TCP (e.g., draft-briscoe-
>> tsvwg-
>>>>> aqm-tcpm-rmcat-l4s-problem uses ECT(1) as a signal to use the new marking,
>>>>> while existing implementations set ECT(0)). These compatibility mechanisms
>>>>> would seem to prevent the issues with the circuit breaker too.
>>>>> 
>>>>>> Understand, we have at least two proposals to make ECN-CE more
>> frequent
>>>>>> than packet drop would be for non-ECN packets: possibly substantially
>>>>>> more frequent. Unless both are killed off, ECN-CE will show up frequently
>>>>>> enough that closing the flow on ECN-CE would kill too many connections.
>>>>>> 
>>>>>> If you want circuit-breaking on such connections, there are two ways:
>>>>>> 1. convince the forwarding nodes to drop packets if their queue exceeds
>>>>>> design capacity; or
>>>>>> 2. require the sender to send enough not-ECN-capable packets so that our
>>>>>> receiver will see enough packet-drops when a circuit-breaker should
>>>>>> activate.
>>>>>> 
>>>>>> (I prefer the first option; but I wouldn't object to the second.)
>>>>>> 
>>>>>> There really isn't any way for our circuit-breaker to know _how_much_
>>>>>> more frequent the ECN-CE marks may be. :^(
>>>>> 
>>>>> This is a problem, both for the circuit breaker, and for the algorithms being
>>>>> defined in RMCAT. We do need some understanding of what the expected
>> marking
>>>>> rates are likely to be, so congestion control and circuit breakers can be
>> defined.
>>>>> 
>>>>>> We _will_ be sorry if we
>>>>>> allot the same frequency of CE packets as packet-drops to trigger the
>>>>>> circuit-breaker.
>>>>>> 
>>>>>>> Could someone propose initial text to qualify the current "MAY ignore"
>>>>>>> statement?
>>>>>> 
>>>>>> Essentially, for the second option, you might propose text to the
>>>>>> effect of:
>>>>>> ]
>>>>>> ] If too many ECN-CE packets are received, the sender SHOULD send some
>>>>>> ] not-ECN-capable packets to determine whether enough packets along the
>>>>>> ] path are being dropped to justify activating our circuit-breaker.
>>>>>> 
>>>>>> I’m not enthusiastic about adding that; but it would resolve the issue.
>>>>> 
>>>>> I’m not convinced this would work. The circuit breaker is looking at long term
>>>>> trends, and in order to have enough not-ECT packets to determine if it
>> should
>>>>> trigger, you’d essentially have to run without ECN for some seconds.
>>>>> 
>>>>> --
>>>>> Colin Perkins
>>>>> https://csperkins.org/
>>>> 
>