Re: [AVTCORE] [rtcweb] [tsvwg] WG Last Call on changes: draft-ietf-avtcore-rtp-circuit-breakers-16

Michael Welzl <michawe@ifi.uio.no> Mon, 27 June 2016 22:29 UTC

To: Colin Perkins <csp@csperkins.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/avt/9DhIy5S5OC1m660Fmo8q-O9fW0g>
Cc: "Black, David" <david.black@emc.com>, "De Schepper, Koen (Nokia - BE)" <koen.de_schepper@nokia-bell-labs.com>, "rtcweb@ietf.org" <rtcweb@ietf.org>, tsvwg <tsvwg@ietf.org>, IETF AVTCore WG <avt@ietf.org>

> On 28. jun. 2016, at 00.02, Colin Perkins <csp@csperkins.org> wrote:
> 
> 
>> On 27 Jun 2016, at 21:52, Michael Welzl <michawe@ifi.uio.no> wrote:
>> 
>> David,
>> 
>> 
>>> On 27. jun. 2016, at 22.09, Black, David <david.black@emc.com> wrote:
>>> 
>>>> As long as an AQM is marking at the same rate as dropping
>>> 
>>> That's an interesting assumption - it should be true for AQMs vetted
>>> here in the past, but there are easy ways for it not to hold (e.g., if dropping
>>> or marking is based on queue occupancy, it is possible that dropping
>>> reduces queue occupancy in a fashion that marking does not).
>>> 
>>> For ECN "classic" (see RFC 3168), where ECN-CE markings are treated
>>> as drop-equivalent, that is for congestion control purposes, which is similar
>>> to (but not the same as) the throughput estimation usage for the RTP circuit
>>> breaker. I'll note that ECN "classic" was designed for congestion control
>>> algorithms to react to ECN-CE marks once per RTT, independent of how
>>> many ECN-CE marks are observed in an RTT.
>>> 
>>> Gorry wrote:
>>> 
>>>>> in this context we should use ECN to drive a CC algorithm and we should be
>>>>> cautious to avoid requiring its use within a Circuit Breaker - optional
>>>>> use, if you understand how to interpret a reaction to many CE-marks as
>>>>> excessive congestion, is permitted.
>>> 
>>> Something like that may be workable, starting with a clear distinction between
>>> the use of ECN by CC (routine, active at all times) and ECN by a circuit
>>> breaker (monitors for evidence that things have gotten bad, only activated
>>> when things get bad).   This would baseline the RTP circuit breaker on drops
>>> and allow use of ECN as additional evidence of problems, in contrast to
>>> congestion control where ECN-CE is effectively treated as drop-equivalent.
>>> 
>>> I'm not quite sure how to specify "use of ECN as additional evidence" of
>>> "excessive congestion" as drop-equivalence is about the best we have
>>> for current guidance.
>> 
>> I fail to parse that sentence, so maybe I’m getting you wrong, but anyway I wonder: what’s even the point of this?
>> Why even bother considering CE-marks as information for a circuit breaker?
> 
> Because the alternative is that we only break the circuit once the queue has been driven into overflow, and packets have been lost. We want to avoid that, since it causes latency, and too much latency is very bad for the user experience. 

Well - the better way out would be for the application to react. Maybe this is me misunderstanding the circuit breaker, but I thought it was more of a last resort… surely you don’t want to be trigger-happy with such a thing?

>> CE-marks may *not* indicate *excessive* congestion - and since you say “additional evidence”: I don’t think that a combination of loss and CE-marks makes this any better? CE-marks may be produced by a shallow queue, which can be rather “mild” congestion, at least in the light of what a circuit breaker should consider…
> 
> Surely this is just arguing for a different threshold for a circuit breaker triggered by ECN-CE marks (using a modern, small queue, AQM) than for one triggered by loss (or ECN marks considered equivalent to loss)? 

If you have room for yet another code point, for the circuit breaker only? :) Or maybe I just misunderstand you here?

> If I understand the L4S proposal correctly, that would be to treat ECN-CE marks on ECT(0) marked flows as equivalent to loss, but to treat ECN-CE marks on ECT(1) marked flows with a (much) higher threshold.

L4S would not change anything about how ECT(0) marked flows are treated, and would CE-mark packets carrying ECT(1) based on the instantaneous queue - i.e. at a much *lower* threshold. But that’s not the issue - I agree there’s no problem with L4S.
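To illustrate the difference between marking on an instantaneous queue and marking on a smoothed one, here is a toy sketch. It is not the L4S or any real AQM algorithm: the function name `marks`, the queue samples, both thresholds and the RED-style moving average are assumptions chosen only to show the effect.

```python
def marks(queue_samples, threshold, smoothing=None):
    """Return one mark/no-mark decision per queue sample.

    smoothing=None decides on the instantaneous queue length; otherwise an
    exponentially weighted moving average with that weight is used
    (hypothetical, RED-style smoothing).
    """
    decisions = []
    avg = 0.0
    for q in queue_samples:
        if smoothing is None:
            level = q
        else:
            avg = (1.0 - smoothing) * avg + smoothing * q
            level = avg
        decisions.append(level > threshold)
    return decisions


# Hypothetical queue lengths (packets) containing one short burst.
burst = [0, 0, 12, 15, 3, 0, 0, 0]

instant = marks(burst, threshold=5)                    # shallow, instantaneous
smoothed = marks(burst, threshold=20, smoothing=0.2)   # deeper, averaged

print("instantaneous marks:", sum(instant))
print("smoothed marks:     ", sum(smoothed))
```

With these made-up numbers the short burst is marked only by the instantaneous, shallow-threshold variant, which is why such marks say little about *excessive* congestion.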

The compatibility problem does exist with the ABE proposal, which works off ECT(0).

The ABE proposal exploits a very simple fact: that CE-marks are, by definition, *not* the same as loss (see David Black’s previous email, where he says “if dropping or marking is based on queue occupancy, it is possible that dropping reduces queue occupancy in a fashion that marking does not”). Indeed, queue dynamics play out differently when packets are dropped or marked (see Section 7, with Figures 13/14, in https://www.duo.uio.no/bitstream/handle/10852/37381/khademi-AQM_Kids_TR434.pdf).

Losses may stem from a DropTail (FIFO) queue somewhere along the path, whereas CE-marks are very likely to be caused only by an AQM algorithm. TCP’s built-in reaction to loss (halving the window) yields full link utilization only when there’s at least a bandwidth-delay product (BDP) worth of queuing. That is a lot of latency - when a queue of that size is full, it doubles the RTT. Modern AQM mechanisms strive to maintain a much smaller average queue size, and this is where they mark packets.

So: if we react to CE-marks the same way as to loss, AQMs such as CoDel and PIE leave the link underutilized.

Thus, it makes more sense to interpret the signal for what it is: an indication that there was congestion, but from a queue that might be much smaller than a BDP.
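The arithmetic behind the utilization argument can be sketched roughly as follows. This is a back-of-the-envelope model under stated assumptions, not a simulation: a single long-lived flow, a window that peaks at one BDP plus the queue, and a multiplicative decrease by `beta`. The function name `min_utilization` and the value `beta=0.8` are illustrative assumptions.

```python
def min_utilization(buffer_in_bdp: float, beta: float = 0.5) -> float:
    """Link utilization right after one multiplicative decrease.

    Windows are expressed in units of one BDP. The flow's window is assumed
    to peak at BDP + queue (queue full, or AQM marking point reached) and is
    then multiplied by beta.
    """
    peak_window = 1.0 + buffer_in_bdp
    window_after_backoff = beta * peak_window
    # The link stays fully used only while at least one BDP is in flight.
    return min(1.0, window_after_backoff)


for queue in (1.0, 0.5, 0.1):
    for beta in (0.5, 0.8):
        print(f"queue={queue:.1f} BDP  beta={beta}  "
              f"utilization after backoff={min_utilization(queue, beta):.2f}")
```

Under these assumptions, halving the window keeps the link full only when the queue holds a whole BDP; with a queue of a tenth of a BDP, utilization drops to roughly 55% right after the backoff, while a milder backoff loses much less.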

> Assuming, in all cases, that there’s a parallel congestion control algorithm running

If you assume that there’s a parallel congestion control algorithm running, I understand even less why you want to feed ECN CE-marks into the circuit breaker. The congestion control algorithm should already deal with them.
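To make concrete what feeding CE-marks into the circuit breaker would change, here is a rough sketch. It assumes, as far as I recall the draft, that the congestion circuit breaker compares the sending rate against ten times a simplified TCP throughput estimate, s / (R * sqrt(2p/3)); please check the draft for the exact rule. The sketch looks at a single reporting interval only (the draft requires several consecutive ones), and all function names and traffic numbers are hypothetical.

```python
import math


def tcp_friendly_rate(packet_size_bytes: float, rtt_s: float, p: float) -> float:
    """Simplified TCP throughput estimate in bytes/second: s / (R * sqrt(2p/3))."""
    return packet_size_bytes / (rtt_s * math.sqrt(2.0 * p / 3.0))


def breaker_would_trigger(send_rate, packet_size, rtt_s, lost, ce_marked,
                          received, count_ce_as_loss, factor=10.0):
    """True if, in this one interval, send_rate exceeds factor * the estimate.

    ce_marked packets are a subset of the received packets.
    """
    total = lost + received
    events = lost + (ce_marked if count_ce_as_loss else 0)
    if total == 0 or events == 0:
        return False
    p = events / total
    return send_rate > factor * tcp_friendly_rate(packet_size, rtt_s, p)


# Hypothetical interval: 4 Mbit/s of 1200-byte packets, 100 ms RTT,
# 1 packet lost and 200 of the 999 received packets CE-marked.
args = dict(send_rate=500_000, packet_size=1200, rtt_s=0.1,
            lost=1, ce_marked=200, received=999)

print("loss only:      ", breaker_would_trigger(count_ce_as_loss=False, **args))
print("CE counted too: ", breaker_would_trigger(count_ce_as_loss=True, **args))
```

With these made-up numbers the breaker condition holds only when CE-marks are counted as losses, even though just one packet was actually dropped.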

> (and RMCAT has figured out the right congestion response for that; the proposals now treat ECN-CE and loss very similarly).

I disagree that this is the “right” congestion response. It’s a workable one, sure. Nothing extremely terrible will happen if congestion controllers treat ECN-CE and loss similarly - it just yields unnecessarily poor utilization with ECN and modern AQMs (unless one also backs off by less than TCP would in response to loss, which is fine if there’s an AQM in place but may be quite bad otherwise).

Bottom line: a CE-mark really does mean something different from a loss, and it seems wrong to me to act as if that weren’t the case just because we’ve always done so.

Cheers,
Michael
