Re: [AVTCORE] [rtcweb] [tsvwg] WG Last Call on changes: draft-ietf-avtcore-rtp-circuit-breakers-16

"Black, David" <david.black@emc.com> Tue, 28 June 2016 01:04 UTC

From: "Black, David" <david.black@emc.com>
To: Michael Welzl <michawe@ifi.uio.no>, Colin Perkins <csp@csperkins.org>
Date: Tue, 28 Jun 2016 01:04:11 +0000
Archived-At: <https://mailarchive.ietf.org/arch/msg/avt/TyjDciUXkbgLFIhVrH4eNv_utuI>
Cc: "rtcweb@ietf.org" <rtcweb@ietf.org>, tsvwg <tsvwg@ietf.org>, IETF AVTCore WG <avt@ietf.org>
Subject: Re: [AVTCORE] [rtcweb] [tsvwg] WG Last Call on changes: draft-ietf-avtcore-rtp-circuit-breakers-16

Trying to shorten up this thread again ...

> >>> I'm not quite sure how to specify "use of ECN as additional evidence" of
> >>> "excessive congestion" as drop-equivalence is about the best we have
> >>> for current guidance.
> >>
> >> I fail to parse that sentence, so maybe I’m getting you wrong, but anyway I
> >> wonder: what’s even the point of this?
> >> Why even bother considering CE-marks as information for a circuit breaker?
> >
> > Because the alternative is that we only break the circuit once the queue has
> > been driven into overflow, and packets have been lost. We want to avoid that,
> > since it causes latency, and too much latency is very bad for the user experience.
> 
> Well - the better way out would be for the application to react. Maybe this is me
> misunderstanding the circuit breaker, but I did think it’s more like a last resort…
> you just don’t want to be trigger-happy with such a thing?

Well, the RTP circuit breaker draft is not trigger-happy: for its congestion circuit
breaker to trip, RTP has to be sending at 10x the rate that TCP would send under
those conditions, based on the TCP throughput equation.  See:

https://tools.ietf.org/html/draft-ietf-avtcore-rtp-circuit-breakers-16#section-4.3

The issue here is: when calculating the comparable TCP throughput, how are ECN-CE
marks used to determine the loss rate input to the TCP throughput equation?  Do
ECN-CE marked packets count as having arrived or having been dropped?
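
To make the question concrete, here is a minimal Python sketch of how the two
interpretations change the loss-rate input. It assumes the simplified TCP
throughput equation X = s / (R * sqrt(2*b*p/3)) with b = 1; the function names,
parameter names, and example numbers are hypothetical and chosen only for
illustration, not taken from the draft.

```python
import math


def tcp_throughput(s, rtt, p, b=1):
    """Simplified TCP throughput estimate in bytes per second.

    s   -- packet size in bytes (assumed value supplied by the caller)
    rtt -- round-trip time in seconds
    p   -- loss event fraction in the range (0, 1]
    b   -- packets acknowledged per ACK (assumed to be 1 here)
    """
    if p <= 0:
        # With no loss input the equation places no bound on the rate.
        return math.inf
    return s / (rtt * math.sqrt(2 * b * p / 3))


def loss_fraction(sent, dropped, ce_marked, ce_as_loss):
    """Loss-rate input, counting ECN-CE marks as lost or as delivered."""
    lost = dropped + (ce_marked if ce_as_loss else 0)
    return lost / sent


# Hypothetical interval: 1000 packets sent, 5 dropped, 10 ECN-CE marked.
p_marks_lost = loss_fraction(1000, 5, 10, ce_as_loss=True)
p_marks_delivered = loss_fraction(1000, 5, 10, ce_as_loss=False)
rate_marks_lost = tcp_throughput(1200, 0.1, p_marks_lost)
rate_marks_delivered = tcp_throughput(1200, 0.1, p_marks_delivered)
```

Counting marks as losses raises p and therefore lowers the comparable TCP rate,
which makes the 10x threshold easier to exceed.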

When things are relatively stable and the ECN-CE marks are being used to nudge
the sender's rate based on what the network can absorb, whether ECN-CE marks
count as losses or not is probably immaterial - the 10x divergence from the TCP
throughput equation's rate is not going to arise, and the circuit breaker won't trip.
The circuit breaker is only supposed to trip when things are seriously wrong.

(1) If the RTP congestion circuit breaker trips based on ECN-CE marks alone,
something feels intuitively wrong - how'd we get to RTP running at 10x the
comparable TCP sending rate with no losses?  Perhaps the circuit breaker
shouldn't trip on ECN-CE marks alone?

(2) At the other extreme, the congestion circuit breaker clearly has to trip if RTP
gets to 10x the comparable TCP sending rate based on losses alone.  This is the
baseline for the circuit breaker to provide network protection as intended.

So, going back to Gorry's suggestion to use ECN-CE marks as "additional evidence,"
here's a straw proposal to shoot at: factor in ECN-CE marks as additional losses
*only when* losses are already occurring.

For example, we could specify that for the RTP congestion circuit breaker to trip, the
RTP sending rate has to be:
	- 10x the equivalent TCP sending rate based on counting ECN-CE marked
		packets as lost; AND
	- 3x the equivalent sending rate based on actual drops (i.e., counting
		ECN-CE marked packets as delivered).
The "3x" above is an off-the-top-of-my-head factor that attempts to roughly
equally weight the inputs (3 is close to the square root of 10) - pick a different
number if that weighting feels wrong.
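
The two-part condition above can be sketched as follows. This is only an
illustration of the straw proposal, again assuming the simplified TCP
throughput equation with b = 1; all names are hypothetical, and the sketch
ignores whatever reporting-interval and timing rules the draft applies before
a breaker actually trips.

```python
import math


def tcp_rate(s, rtt, p):
    """Simplified TCP throughput estimate (bytes/second), assuming b = 1."""
    if p <= 0:
        return math.inf  # no loss input, so no finite TCP-equivalent rate
    return s / (rtt * math.sqrt(2 * p / 3))


def breaker_trips(send_rate, s, rtt, sent, dropped, ce_marked,
                  ce_factor=10.0, drop_factor=3.0):
    """Return True only if BOTH straw-proposal conditions hold.

    ce_factor   -- multiple applied when ECN-CE marks count as losses (10x)
    drop_factor -- multiple applied when only real drops count (the "3x")
    """
    if sent <= 0:
        return False
    p_with_ce = (dropped + ce_marked) / sent   # marks counted as lost
    p_drops_only = dropped / sent              # marks counted as delivered
    over_ce = send_rate > ce_factor * tcp_rate(s, rtt, p_with_ce)
    over_drops = send_rate > drop_factor * tcp_rate(s, rtt, p_drops_only)
    return over_ce and over_drops


# Hypothetical numbers: 1200-byte packets, 100 ms RTT, 1000 packets sent.
marks_only = breaker_trips(400_000, 1200, 0.1, 1000, dropped=0, ce_marked=150)
drops_and_marks = breaker_trips(400_000, 1200, 0.1, 1000, dropped=15, ce_marked=135)
```

With no drops the second condition can never be met, so marks alone do not trip
the breaker; once drops are present, marks lower the first threshold and act as
the additional evidence.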

This would force drops to occur and then consider ECN-CE marks as additional evidence
that something is wrong in the network.

Another possible rationale for this mixing is that if drops start occurring, then many of
the new and proposed uses of ECN that treat ECN-CE marks as less than loss-equivalent
are outside their intended operating envelopes/regions.

Thanks, --David