Re: [rmcat] Review of Congestion Control Requirements For RMCAT - draft-ietf-rmcat-cc-requirements-00

Randell Jesup <randell-ietf@jesup.org> Fri, 15 November 2013 21:17 UTC

Message-ID: <52868F5C.2030706@jesup.org>
Date: Fri, 15 Nov 2013 16:17:16 -0500
From: Randell Jesup <randell-ietf@jesup.org>
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:25.0) Gecko/20100101 Thunderbird/25.0
MIME-Version: 1.0
To: rmcat@ietf.org
References: <AE7F97DB5FEE054088D82E836BD15BE92014261D@xmb-aln-x05.cisco.com>
In-Reply-To: <AE7F97DB5FEE054088D82E836BD15BE92014261D@xmb-aln-x05.cisco.com>
Content-Type: text/plain; charset="ISO-8859-1"; format="flowed"
Content-Transfer-Encoding: quoted-printable

On 11/12/2013 11:27 AM, Bill Ver Steeg (versteb) wrote:
> Randell-
>
> As I promised in Vancouver, here are my notes on draft-ietf-rmcat-cc-requirements-00
>
> In summary, this is quite a nice start. It is a difficult document to write, as it is intentionally quite broad. On the whole, I like it.

That's a great start for a review!  ;-)   Thanks!  Now on to the meat...

> The main concern is my item #1
>
> 1- In the abstract, do we want to mention that the RMCAT CC algorithm needs to be robust in the presence of a wide range of cross traffic? If not, we certainly need to state it strongly in the introduction and the requirements. Perhaps a paragraph similar to
> "While the requirements for RMCAT differ from the requirements for the other flow types, these other flow types will be present in the network. The RMCAT congestion control algorithm must work properly when these other flow types are present as cross traffic on the network." should be in either the abstract or in the introduction. I would also like to see a statement that the RMCAT CC algorithm should drive scavenger flows (like LEDBAT) to nearly 0, taking that BW for the RMCAT/TCP/"normal" flows. These points are referenced in item #11, but it is a rather thin section late in the document. IMHO, this should be one of the primary points of the requirements document.

Certainly upping the emphasis on robustness in the face of a variety of 
cross traffic is good - though it's not something that we can guarantee 
is achievable (just as we can't guarantee success in competing with 
greedy TCP flows).

As for driving LEDBAT to 0... I'm afraid that might be desirable but
unachievable.  We may well find that LEDBAT puts a floor on how low
we can drive delay, set by the target-delay constant hard-coded into
LEDBAT implementations (100ms(!) in the spec, 25ms in Apple's kernel
implementation, used for things like Apple's Time Machine(?) auto-backup
stuff).  I'm not sure what BitTorrent is using, but I suspect it's 100ms.
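
As a rough illustration of why the target could act as a floor, here is a
minimal sketch of the per-ACK window update described in the LEDBAT spec
(RFC 6817).  The function name, parameter names, and default values (MSS,
gain) are assumptions chosen for illustration; this is not any particular
implementation.

```python
# Sketch of the LEDBAT per-ACK window update (after RFC 6817).
# Names and defaults below are illustrative assumptions.

def ledbat_cwnd_update(cwnd, queuing_delay_ms, bytes_acked,
                       target_ms=100.0, gain=1.0, mss=1460):
    """Return the new congestion window (in bytes) after one ACK."""
    # off_target stays positive while the measured queuing delay is
    # below the target, so the sender keeps growing its window until
    # the standing queue reaches the target.
    off_target = (target_ms - queuing_delay_ms) / target_ms
    return cwnd + gain * off_target * bytes_acked * mss / cwnd


if __name__ == "__main__":
    cwnd = 20 * 1460
    for target in (100.0, 25.0):
        new = ledbat_cwnd_update(cwnd, queuing_delay_ms=50.0,
                                 bytes_acked=1460, target_ms=target)
        print(f"target={target:.0f}ms  cwnd change={new - cwnd:+.1f} bytes")
```

With a 50ms standing queue, the 100ms target still produces a positive
window change while the 25ms target produces a negative one.  So a
delay-based flow that tries to hold the queue below the LEDBAT target
would see LEDBAT keep pushing the queue back up toward it.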


> There are some minor issues, which I discuss below.
>
> 2- Requirements section, bullet 1 - Do we need to elaborate the other cases in which the algorithm needs to adjust the BW? We state (in 1a) that topology changes need to be handled. Do we need to state <1b - Changes in cross traffic > and <1c - changes in offered load from the application sending data over RMCAT>

I'll check the text; I don't have it in front of me at the moment.

> 3- Requirements section, bullet 2a - If we are enumerating types of traffic that we need to be concerned with, I would be more concerned with MPEG DASH style Adaptive BitRate video than generic web browsing. Web browsing is bursty, but at least it is well-bounded in time. ABR flows are bursty, cyclical, and persistent. I would either remove the reference to web browsing or change it to include other bursty flow types (I note that OTT ABR traffic is now more than 50% of the peak load on many networks).

Good point, though MPEG DASH is not the dominant transfer protocol for
such data (or not yet).  The major players are implementing DASH in JS
(i.e. under provider/application control), which adds another wrinkle
in verifying this.  But the idea that we have to coexist with such
traffic is important, and HTTP streaming, DASH, and proprietary
protocols from Apple/MS/etc. do pose real problems.  HTTP streaming,
for example, may look like a very hungry realtime protocol, or it may
look like a periodic maximum-transfer TCP flow (Gettys showed a Netflix
BW vs. time graph with a ~10 second square wave of bandwidth use).
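
To make the square-wave case concrete, here is a small sketch of an on/off
cross-traffic model that a test harness could use.  Everything in it is a
hypothetical assumption: the 10-second period, 50% duty cycle, peak rate,
and link capacity are placeholder numbers, and the model ignores TCP
dynamics entirely.

```python
# Toy on/off ("square wave") cross-traffic model.
# All names and numbers are illustrative assumptions.

def square_wave_rate_kbps(t_s, period_s=10.0, duty=0.5, peak_kbps=8000.0):
    """Offered load of a hypothetical on/off streaming flow at time t_s."""
    phase = (t_s % period_s) / period_s   # position within the cycle, 0..1
    return peak_kbps if phase < duty else 0.0


def available_kbps(t_s, link_kbps=10000.0, **wave):
    """Capacity left for other flows if the streaming flow gets its peak."""
    return max(0.0, link_kbps - square_wave_rate_kbps(t_s, **wave))


if __name__ == "__main__":
    for t in range(0, 20, 2):
        print(t, square_wave_rate_kbps(float(t)), available_kbps(float(t)))
```

Even this crude model shows the difficulty: the capacity left over swings
between two very different levels every few seconds, so a controller has
to re-converge repeatedly rather than settle once.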

> 4- Requirements section 3 - do we need to mention that there is a temporal component to information sharing across streams? In other words, we may consider a previous 5-tuple's experience as a baseline to seed a CC algorithm, particularly if it is the exact same source addr/dest addr/dest port/protocol? This is hinted in item #12 in the document, but the guidance is quite thin in that section. Temporal hinting is particularly valuable if the old session data was from 1 second ago. It is less true if it was 1 day ago, but may still have some value as an initial seed for the CC algorithm. 3b is also a bit awkward, as multiple "flows" on a given 5-tuple introduce some difficult SSRC concepts that are currently being discussed in the AVT group. We either need to define "flow" or describe the concept in broader terms. I am reluctant to open that can of worms in this document, but if we are not clear it will be the source of endless debate/confusion. I hate glossaries in the front of RFCs, but we may have to do that.
>
> 5- In section 4, I do not see any details on how ECN, delay and loss indicate congestion. Even though we all understand that this is a complex relationship and will be difficult to characterize in a requirements document, I think that a few paragraphs of detail are in order. This level of detail would be important to a less experienced reader. This is my major concern with the draft, and I could write a paragraph or two to include this discussion (if there is agreement that this should be discussed in this draft).

Let me take a try at it.  Understandability also can help avoid 
confusion among experts and implementations.

>   
>
> 6- Is 5a a requirement, or are we starting to discuss the solution space here? I suspect that we will end up mandating AVPF, but we may be getting ahead of ourselves here. Perhaps we should soften this to a suggestion to examine RFC4585 rather than a MUST use statement.

Hmmm.  Let me think.  I think we'd said that these algorithms, *if* they
use RTCP, would need AVPF in order to respond within orders of magnitude
of the RTT.  There are other ways than RTCP to provide the needed
feedback, though, and that isn't to say that a new profile (AVPG ;-) )
couldn't also meet the needs here.

> 7- If we are elaborating AQM schemes, we should include PIE (I know, a shameless plug for the one I am working on - but I think it is a valid comment nonetheless). And to be rigorous, RED, CoDel and PIE are buffer management algorithms that operate on a given queue. One can also apply multiple queues (FQ being one variant of multiple queues) to the problem as well.  To be more clear, we should mention queuing algorithms like RED, CoDel and PIE and then mention that each of these algorithms can be optionally mapped to multiple queues.  If this is going to cause a debate of some sort, we could also just reference the broad class of AQMs and the broad class of queue allocation schemes without elaborating the specific algorithms.

The specifics were more to indicate which ones we thought were most
important to work/test with, so perhaps we should go to a more generic
statement and move mentions of specifics to Varun's document.

> 8- I am reluctant to bring this into the conversation........ Once we mention multiple queues, we may want to mention that the multiple queues may represent different QOS buckets, and the activity in one queue may impact the drain rate of lower priority queues, whereas the activity of a lower priority queue will have a (very?) small impact on the drain rate of a higher queue. I am reluctant to start this train of thought in this document, so perhaps we do not mention multiple queues of multiple priorities. If we are to mention different priority flows, perhaps we use DSCP markings or VLAN tagging examples - but once again this quickly goes down a very deep rat hole. We may just want to state that there are often multiple priorities of flows, using the traditional SP voice, SP video, general purpose data, and scavenger class as examples......... I am torn on this one, and could be convinced to not include this in the document.

Fundamentally, we have 4 classes of packets we want to get across the
network in WebRTC (which also came up in the W3C TPAC WebRTC discussions
this week with regard to DSCP markings/etc.): faster-than-audio
(typically low bandwidth and/or intermittent), audio, video, and
best-effort (maybe split up into interactive use and bulk-transfer).
Outside of WebRTC we would also have scavenger (below immediate
bulk-transfer).  Typical browsing traffic/etc. would fall into the same
general classes as best-effort and bulk-transfer.  Note that "audio" and
"video" don't mean they have to be exclusively packets of that type,
just that they would have network priorities of that type.

Interestingly (or problematically), splitting the congestion regimes by 
DSCP markings really needs to be done unless you know (or strongly 
believe) they're being ignored at the bottleneck (note: not necessarily 
ignored everywhere).  But splitting the congestion controllers has the 
side effect (if they are in one queue at the bottleneck) of reducing the 
feedback (per controller) and slowing the ability to notice changes in 
the bottleneck (and to respond).
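
A back-of-the-envelope sketch of that side effect, assuming (purely for
illustration) that each packet yields one feedback sample and that a
controller needs a fixed number of samples before it can detect a change.
The packet rates and the sample count are made-up numbers.

```python
# Rough estimate of how splitting controllers by DSCP class slows
# detection.  Assumes one feedback sample per packet; all names and
# numbers are illustrative.

def seconds_to_collect(samples_needed, packets_per_second):
    """Time for one controller to gather the samples it needs."""
    return samples_needed / packets_per_second


def detection_times(class_pps, samples_needed=20):
    """Return (shared_controller_time, {class: split_controller_time})."""
    shared = seconds_to_collect(samples_needed, sum(class_pps.values()))
    split = {name: seconds_to_collect(samples_needed, pps)
             for name, pps in class_pps.items()}
    return shared, split


if __name__ == "__main__":
    shared, split = detection_times({"audio": 50, "video": 150})
    print("shared controller:", shared, "s")
    for name, t in split.items():
        print(f"{name} controller: {t:.3f} s")
```

Under these assumptions, every per-class controller takes at least as long
as a single shared controller to gather the same number of samples, and the
low-rate class (audio here) is the slowest to notice a change.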

> Bill Ver Steeg

Thanks again for a thoughtful review!

-- 
Randell Jesup
randell-ietf@jesup.org