Re: [rmcat] Review of Congestion Control Requirements For RMCAT

Dave Taht <dave.taht@gmail.com> Thu, 21 November 2013 22:49 UTC

MIME-Version: 1.0
In-Reply-To: <52868F5C.2030706@jesup.org>
References: <AE7F97DB5FEE054088D82E836BD15BE92014261D@xmb-aln-x05.cisco.com> <52868F5C.2030706@jesup.org>
Date: Thu, 21 Nov 2013 14:49:22 -0800
Message-ID: <CAA93jw6cidDM=kMF91DEMO-6h=-f686kvZN=6mdJzGWqWtQbAA@mail.gmail.com>
From: Dave Taht <dave.taht@gmail.com>
To: Randell Jesup <randell-ietf@jesup.org>
Content-Type: multipart/alternative; boundary="f46d043890796d3d6004ebb7b51d"
Cc: "rmcat@ietf.org" <rmcat@ietf.org>
Subject: Re: [rmcat] Review of Congestion Control Requirements For RMCAT - draft-ietf-rmcat-cc-requirements-00
Precedence: list

I could have sworn I'd commented on this already, but I don't see it.
On Nov 15, 2013 1:17 PM, "Randell Jesup" <randell-ietf@jesup.org> wrote:

> On 11/12/2013 11:27 AM, Bill Ver Steeg (versteb) wrote:
>
>> Randell-
>>
>> As I promised in Vancouver, here are my notes on
>> draft-ietf-rmcat-cc-requirements-00.
>>
>> In summary, this is quite a nice start. It is a difficult document to
>> write, as it is intentionally quite broad. On the whole, I like it.
>>
>
> That's a great start for a review!  ;-)   Thanks!  Now on to the meat...
>
>> The main concern is my item #1
>>
>> 1- In the abstract, do we want to mention that the RMCAT CC algorithm
>> needs to be robust in the presence of a wide range of cross traffic? If
>> not, we certainly need to state it strongly in the introduction and the
>> requirements. Perhaps a paragraph similar to
>> "While the requirements for RMCAT differ from the requirements for the
>> other flow types, these other flow types will be present in the network.
>> The RMCAT congestion control algorithm must work properly when these other
>> flow types are present as cross traffic on the network." should be in
>> either the abstract or in the introduction. I would also like to see a
>> statement that the RMCAT CC algorithm should drive scavenger flows (like
>> LEDBAT) to nearly 0, taking that BW for the RMCAT/TCP/"normal" flows. These
>> points are referenced in item #11, but it is a rather thin section late in
>> the document. IMHO, this should be one of the primary points of the
>> requirements document.
>>
>
> Certainly upping the emphasis on robustness in the face of a variety of
> cross traffic is good - though it's not something that we can guarantee is
> achievable (just as we can't guarantee success in competing with greedy TCP
> flows).
>
> As for driving LEDBAT to 0.... I'm afraid that might be desirable, but
> unachievable.  We may well find that LEDBAT will put a floor on how low we
> can drive delay, driven by the delay constant hard-coded into LEDBAT
> implementations (100ms(!) in the spec, 25ms in Apple's kernel
> implementation used for things like Apple's TimeMachine(?) auto-backup
> stuff).  I'm not sure what BitTorrent is using, but I suspect it's 100ms.
>

It is impossible to drive LEDBAT to 0. 100ms is what BitTorrent uses, yes.
Yet it is highly desirable to get queuing latency below 100ms, and 5ms is
achievable with modern AQM/packet scheduling technology, which turns all
delay-based TCPs back into loss-based ones. I can't remember if I already
posted this (forgive me if I have), but see:

http://perso.telecom-paristech.fr/~drossi/paper/rossi13tma-b.pdf

and the followup modelling it, which I have not thoroughly digested yet:

http://arxiv.org/pdf/1303.6817.pdf
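
To make the "floor" point concrete, here is a minimal sketch of the LEDBAT
window update as I read it in RFC 6817. The function and variable names are
mine, and the 5ms and 150ms queuing delays are just illustrative inputs, not
measurements. With an AQM holding queuing delay near 5ms, off_target stays
positive and the window keeps growing until a loss occurs, which is the
"delay-based turns into loss-based" effect. Without an AQM, the window only
starts shrinking once the queue exceeds the 100ms target.

```python
TARGET = 0.100   # seconds; RFC 6817 requires TARGET <= 100 ms
GAIN = 1.0       # RFC 6817 requires GAIN <= 1
MSS = 1500       # bytes; illustrative segment size
MIN_CWND = 2     # segments; RFC 6817 minimum window


def ledbat_cwnd_update(cwnd, queuing_delay, bytes_acked,
                       target=TARGET, gain=GAIN, mss=MSS):
    """Return the new congestion window (bytes) after one ACK.

    queuing_delay is the sender's estimate of one-way queuing delay,
    i.e. current delay minus base delay, in seconds.
    """
    # Positive while the queue is below the target, negative above it.
    off_target = (target - queuing_delay) / target
    cwnd = cwnd + gain * off_target * bytes_acked * mss / cwnd
    # Never go below the minimum window.
    return max(cwnd, MIN_CWND * mss)


if __name__ == "__main__":
    start = 10 * MSS
    # AQM keeps the queue at ~5 ms: window grows.
    print(ledbat_cwnd_update(start, 0.005, MSS))
    # Queue at 150 ms, above the 100 ms target: window shrinks.
    print(ledbat_cwnd_update(start, 0.150, MSS))
```

This leaves out loss handling and the base-delay filtering, both of which
the spec also defines.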



>
>> There are some minor issues, which I discuss below.
>>
>> 2- Requirements section, bullet 1 - Do we need to elaborate the other
>> cases in which the algorithm needs to adjust the BW? We state (in 1a) that
>> topology changes need to be handled. Do we need to state <1b - Changes in
>> cross traffic > and <1c - changes in offered load from the application
>> sending data over RMCAT>
>>
>
> I'll check the text; I don't have it in front of me at the moment
>
>> 3- Requirements section, bullet 2a - If we are enumerating types of
>> traffic that we need to be concerned with, I would be more concerned with MPEG
>> DASH style Adaptive BitRate video than generic web browsing. Web browsing
>> is bursty, but at least it is well-bounded in time. ABR flows are bursty,
>> cyclical, and persistent. I would either remove the reference to web
>> browsing or change it to include other bursty flow types (I note that OTT
>> ABR traffic is now more than 50% of the peak load on many networks).
>>
>
> Good point, though MPEG DASH is not the dominant transfer protocol for
> such data (or not yet).  The major players are implementing DASH in JS
> (i.e. under provider/application control), so that adds another wrinkle in
> verifying this - but the idea that we have to coexist with such traffic is
> important, and HTTP streaming and DASH and proprietary protocols from
> Apple/MS/etc do pose real problems - HTTP streaming for example may look
> like a very hungry realtime protocol, or it may look like a periodic
> maximum-transfer TCP flow (Gettys showed a NetFlix BW vs Time graph that
> showed a ~10 second square-wave of bandwidth use, for example).
>

It is my hope Netflix has changed their setup by now. There was some
awesome followup work after the "confused, timid and unstable" paper:

http://www.stanford.edu/~huangty/imc012-huang.pdf

here:

http://reproducingnetworkresearch.wordpress.com/2013/03/13/cs244-13-rising-from-the-depths-observing-and-implementing-improvements-in-online-video-bitrate-selection/

So I think that data on Netflix's behavior needs to be revisited. (I have
retired that slide.)

http://www.ietf.org/mail-archive/web/aqm/current/msg00195.html
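
For anyone who has not seen the square-wave pattern mentioned in the quoted
text above, here is a rough sketch of where it comes from, assuming a
chunked HTTP client in steady state that fetches one chunk per chunk
duration at full link rate and then idles. The function name and the
numbers (10 second chunks, 3 Mbit/s video, 10 Mbit/s link) are hypothetical
and are not taken from any Netflix measurement.

```python
def on_off_cycle(chunk_seconds, video_bps, link_bps):
    """Return (on_time, off_time) in seconds for one chunk cycle.

    Assumes the client requests one chunk every chunk_seconds and the
    transfer runs at the full link rate while it is active.
    """
    chunk_bits = chunk_seconds * video_bps
    on_time = chunk_bits / link_bps
    # If the link is slower than the video rate there is no idle period.
    off_time = max(chunk_seconds - on_time, 0.0)
    return on_time, off_time


if __name__ == "__main__":
    on_time, off_time = on_off_cycle(10, 3e6, 10e6)
    print(f"on for {on_time:.1f}s, off for {off_time:.1f}s")
```

From the point of view of a competing realtime flow, the available capacity
in this toy model swings between nearly nothing and the whole link once per
chunk period.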



>
>> 4- Requirements section 3 - do we need to mention that there is a
>> temporal component to information sharing across streams? In other words,
>> we may consider a previous 5-tuple's experience as a baseline to seed a CC
>> algorithm, particularly if it is the exact same source addr/dest addr/dest
>> port/protocol? This is hinted in item #12 in the document, but the guidance
>> is quite thin in that section. Temporal hinting is particularly valuable if
>> the old session data was from 1 second ago. It is less true if it was 1 day
>> ago, but may still have some value as an initial seed for the CC algorithm.
>> 3b is also a bit awkward, as multiple  "flows" on a given 5-tuple
>> introduces some difficult SSRC concepts that are currently being discussed
>> in the AVT group. We either need to define "flow" or describe the concept
>> in broader terms. I am reluctant to open that can of worms in this
>> document, but if we are not clear it will be the source of endless
>> debate/confusion. I hate glossaries in the front of RFCs, but we may have
>> to do that.
>>
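
One way to picture the temporal hinting in item 4 is a small cache keyed on
source addr/dest addr/dest port/protocol whose stored rate is trusted less
as it ages. This is only a sketch under assumptions: the class name, the
exponential decay and the 60 second half-life are all hypothetical, and
nothing in the draft specifies them.

```python
class PathHistory:
    """Remember the last good rate per path and age it out over time."""

    def __init__(self, half_life_s=60.0):
        # Hypothetical: trust in an old estimate halves every half_life_s.
        self.half_life_s = half_life_s
        self._last = {}  # key -> (rate_bps, time_recorded)

    def record(self, key, rate_bps, now):
        """Store the rate a finished session settled on."""
        self._last[key] = (rate_bps, now)

    def seed(self, key, default_bps, now):
        """Return a starting rate for a new session on this path."""
        entry = self._last.get(key)
        if entry is None:
            return default_bps
        rate_bps, seen_at = entry
        age = max(now - seen_at, 0.0)
        weight = 0.5 ** (age / self.half_life_s)
        # Blend the old estimate with the conservative default.
        return weight * rate_bps + (1.0 - weight) * default_bps


if __name__ == "__main__":
    history = PathHistory(half_life_s=60.0)
    key = ("10.0.0.1", "10.0.0.2", 5004, "udp")
    history.record(key, 2_000_000, now=0.0)
    print(history.seed(key, 300_000, now=1.0))      # close to the old rate
    print(history.seed(key, 300_000, now=86400.0))  # back to the default
```

A session that ended a second ago seeds the new one at almost the old rate,
while one from a day ago contributes essentially nothing in this model.
Whether day-old data should keep some residual weight, as the text above
suggests, is a policy choice this sketch does not settle.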
>> 5- In section 4, I do not see any details on how ECN, delay and loss
>> indicate congestion. Even though we all understand that this is a complex
>> relationship and will be difficult to characterize in a requirements
>> document, I think that a few paragraphs of detail are in order. This level
>> of detail would be important to a less experienced reader. This is my major
>> concern with the draft, and I could write a paragraph or two to include
>> this discussion (if there is agreement that this should be discussed in this
>> draft).
>>
>
> Let me take a try at it.  Understandability also can help avoid confusion
> among experts and implementations.
>

I'm all ears. We stumbled on defining this too, in the aqm wg.


>
>
>> 6- Is 5a a requirement, or are we starting to discuss the solution space
>> here? I suspect that we will end up mandating AVPF, but we may be getting
>> ahead of ourselves here. Perhaps we should soften this to a suggestion to
>> examine RFC4585 rather than a MUST use statement.
>>
>
> Hmmm.  Let me think.  I think we'd said that these algorithms *if* they
> use RTCP would need AVPF in order to respond within orders of magnitude of
> the RTT.  But there are other ways to provide the feedback needed than
> RTCP.  But that isn't to say that a new profile (AVPG ;-) ) couldn't also
> meet the needs here.
>
>> 7- If we are elaborating AQM schemes, we should include PIE (I know, a
>> shameless plug for the one I am working on - but I think it is a valid
>> comment nonetheless). And to be rigorous, RED, CoDel and PIE are buffer
>> management algorithms that operate on a given queue. One can also apply
>> multiple queues (FQ being one variant of multiple queues) to the problem as
>> well.  To be more clear, we should mention queuing algorithms like RED,
>> CoDel and PIE and then mention that each of these algorithms can be
>> optionally mapped to multiple queues.  If this is going to cause a debate
>> of some sort, we could also just reference the broad class of AQMs and the
>> broad class of queue allocation schemes without elaborating the specific
>> algorithms.
>>
>
> The specifics were more to indicate which ones we thought were most
> important to work/test with, so perhaps we should go to a more generic
> statement and move mentions of specifics to Varun's document.
>
>> 8- I am reluctant to bring this into the conversation........ Once we
>> mention multiple queues, we may want to mention that the multiple queues
>> may represent different QOS buckets, and the activity in one queue may
>> impact the drain rate of lower priority queues, whereas the activity of a
>> lower priority queue will have a (very?) small impact on the drain rate of
>> a higher queue. I am reluctant to start this train of thought in this
>> document, so perhaps we do not mention multiple queues of multiple
>> priorities. If we are to mention different priority flows, perhaps we use
>> DSCP markings or VLAN tagging examples - but once again this quickly goes
>> down a very deep rat hole. We may just want to state that there are often
>> multiple priorities of flows, using the traditional SP voice, SP video,
>> general purpose data, and scavenger class as examples......... I am torn on
>> this one, and could be convinced to not include this in the document.
>>
>
> Fundamentally, we have 4 classes of packets we want to get across the
> network in webrtc (which also came up in the W3 TPAC WebRTC discussions
> this week with regard to DSCP markings/etc): Faster-than-audio (typically
> low bandwidth and/or intermittent), audio, video, and best-effort (maybe
> split up into interactive use, and bulk-transfer).  Outside of webrtc we
> would also have scavenger (below immediate bulk-transfer).  Typical
> browsing traffic/etc would fall into the same general classes as best
> effort and bulk-transfer.  Note that "audio" and "video" don't mean they
> have to be exclusively packets of that type, just that they would have
> network priorities of that type.
>
> Interestingly (or problematically), splitting the congestion regimes by
> DSCP markings really needs to be done unless you know (or strongly believe)
> they're being ignored at the bottleneck (note: not necessarily ignored
> everywhere).  But splitting the congestion controllers has the side effect
> (if they are in one queue at the bottleneck) of reducing the feedback (per
> controller) and slowing the ability to notice changes in the bottleneck
> (and to respond).
>
>  Bill VerStee
>>
>
> Thanks again for a thoughtful review!
>
> --
> Randell Jesup
> randell-ietf@jesup.org
>
>
>
