Re: [Moq] Latency @ Twitch

Luke Curley <kixelated@gmail.com> Tue, 09 November 2021 22:58 UTC

To: "Mo Zanaty (mzanaty)" <mzanaty=40cisco.com@dmarc.ietf.org>
Cc: Bernard Aboba <bernard.aboba@gmail.com>, Justin Uberti <juberti@alphaexplorationco.com>, Ian Swett <ianswett@google.com>, "Ali C. Begen" <ali.begen@networked.media>, MOQ Mailing List <moq@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/moq/FGNVhgUkL9m4DhEP_bfn-cbBU08>

Maybe a dumb thought, but is the PROBE_RTT phase required when sufficiently
application limited, as is primarily the case for live video? If I
understand correctly, it's meant to drain the queue to remeasure the
minimum RTT, but that doesn't seem necessary when the queue is constantly
being drained due to a lack of data to send.
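
To make that concrete, here is the kind of check I have in mind (a rough
sketch with made-up names, not taken from any real QUIC stack):

    # Only bother entering PROBE_RTT when the min_rtt sample is stale AND the
    # queue has not already been drained by application-limited sending.
    def should_enter_probe_rtt(now, min_rtt_timestamp, bytes_in_flight, bdp,
                               min_rtt_expiry=10.0, drain_fraction=0.5):
        min_rtt_stale = (now - min_rtt_timestamp) > min_rtt_expiry
        already_drained = bytes_in_flight < drain_fraction * bdp
        return min_rtt_stale and not already_drained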

Either way, the issue is that existing TCP algorithms don't care about the
live video use-case, and those are the ones that have been ported to QUIC
thus far. But as Justin mentioned, this doesn't actually matter for the
sake of standardizing a video-over-QUIC protocol, provided the building
blocks are in place.

The real question is: do QUIC ACKs contain enough signal to implement an
adequate live video congestion control algorithm? If not, how can we
increase that signal, potentially taking cues from RMCAT (e.g., RTT on a
per-packet basis)?
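
For instance, the kind of per-ACK signal I mean (illustrative only; the
field names below are assumptions, not actual QUIC frame fields):

    # An RTT sample corrected for the receiver's reported ack delay, plus a
    # BBR-style delivery-rate sample, handed to a live-video CC on each ACK.
    def on_ack(acked, ack_time, ack_delay, cc):
        rtt_sample = ack_time - acked.send_time - ack_delay
        # Bytes delivered since this packet was sent, divided by how long
        # that delivery took.
        elapsed = ack_time - acked.delivered_time_at_send
        bw_sample = ((acked.delivered_at_ack - acked.delivered_at_send) / elapsed
                     if elapsed > 0 else None)
        cc.on_sample(rtt_sample, bw_sample, app_limited=acked.is_app_limited)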

On Tue, Nov 9, 2021, 10:27 AM Mo Zanaty (mzanaty) <mzanaty=
40cisco.com@dmarc.ietf.org> wrote:

> All current QUIC CCs (BBRv1/2, CUBIC, NewReno, etc.) are not well suited
> for real-time media, even for a rough “envelope” or “circuit-breaker”.
> RMCAT CCs are explicitly designed for real-time media, but, of course, rely
> on RTCP feedback, so must be adapted to QUIC feedback.
>
>
>
> Mo
>
>
>
>
>
> On 11/9/21, 1:13 PM, "Bernard Aboba" <bernard.aboba@gmail.com> wrote:
>
>
>
> Justin said:
>
>
>
> "As others have noted, BBR does not work great out of the box for realtime
> scenarios."
>
>
>
> [BA] At the ICCRG meeting on Monday, there was an update on BBR2:
>
>
> https://datatracker.ietf.org/meeting/112/materials/slides-112-iccrg-bbrv2-update-00.pdf
>
>
>
> While there are some improvements, issues such as "PROBE_RTT" and rapid
> ramp-up after loss remain, and overall it doesn't seem like BBR2 is going
> to help much with realtime scenarios. Is that fair?
>
>
>
> On Tue, Nov 9, 2021 at 12:46 PM Justin Uberti <
> juberti@alphaexplorationco.com> wrote:
>
> Ultimately we found that it wasn't necessary to standardize the CC as long
> as the behavior needed from the remote side (e.g., feedback messaging)
> could be standardized.
>
>
>
> As others have noted, BBR does not work great out of the box for realtime
> scenarios. The last time this was discussed, the prevailing idea was to
> allow the QUIC CC to be used as a sort of circuit-breaker, but within that
> envelope the application could use whatever realtime algorithm it preferred
> (e.g., goog-cc).
>
>
>
> On Thu, Nov 4, 2021 at 3:58 AM Piers O'Hanlon <piers.ohanlon@bbc.co.uk>
> wrote:
>
>
>
> On 3 Nov 2021, at 21:46, Luke Curley <kixelated@gmail.com> wrote:
>
>
>
> Yeah, there's definitely some funky behavior in BBR when application
> limited but it's nowhere near as bad as Cubic/Reno. With those
> algorithms you need to burst enough packets to fully utilize the congestion
> window before it can be grown. With BBR I believe you need to burst just
> enough to fully utilize the pacer, and even then this condition
> <https://source.chromium.org/chromium/chromium/src/+/master:net/third_party/quiche/src/quic/core/congestion_control/bbr_sender.cc;l=393> lets
> you use application-limited samples if they would increase the send rate.
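>
> Roughly, the check there is (paraphrasing in pseudocode, not the actual
> quiche source; max_filter stands for any windowed max-filter, and its
> interface is made up):
>
>     # A bandwidth sample taken while app-limited is only used if it would
>     # raise the current estimate; normal samples always feed the filter.
>     def maybe_update_max_bandwidth(sample_bw, is_app_limited, max_filter):
>         if not is_app_limited or sample_bw > max_filter.best():
>             max_filter.update(sample_bw)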
>
>
>
> And there’s also the idle cwnd collapse/reset behaviour to consider if
> you’re sending a number of frames together and their inter-data gap exceeds
> the RTO - I’m not quite sure how the various QUIC stacks have translated
> RFC2861/7661 advice on this…?
>
>
>
> I started with BBR first because it's simpler, but I'm going to try out
> BBR2 at some point because of the aforementioned PROBE_RTT issue. I don't
> follow the congestion control space closely enough; are there any notable
> algorithms that would better fit the live video use-case?
>
>
>
> I guess Google’s Goog_CC appears to be widely used in the WebRTC space (e.g.
> WebRTC
> <https://webrtc.googlesource.com/src/+/refs/heads/main/modules/congestion_controller/goog_cc>
> and aiortc
> <https://github.com/aiortc/aiortc/blob/1a192386b721861f27b0476dae23686f8f9bb2bc/src/aiortc/rate.py#L271>),
> despite the draft
> <https://datatracker.ietf.org/doc/html/draft-ietf-rmcat-gcc> never making
> it to RFC status… There's also SCReAM
> <https://datatracker.ietf.org/doc/rfc8298/>, which has an open-source
> implementation <https://github.com/EricssonResearch/scream>, but I'm not
> sure how widely deployed it is.
>
>
>
>
>
> On Wed, Nov 3, 2021 at 2:12 PM Ian Swett <ianswett@google.com> wrote:
>
> From personal experience, BBR has some issues with application limited
> behavior, but it is still able to grow the congestion window, at least
> slightly, so it's likely an improvement over Cubic or Reno.
>
>
>
> On Wed, Nov 3, 2021 at 4:40 PM Luke Curley <kixelated@gmail.com> wrote:
>
> I think resync points are an interesting idea although we haven't
> evaluated them. Twitch did push for S-frames in AV1 which will be another
> option in the future instead of encoding a full IDR frame at these resync
> boundaries.
>
>
>
> An issue is you have to make the hard decision to abort the current
> download and frantically try to pick up the pieces before the buffer
> depletes. It's a one-way door (maybe your algorithm overreacted) and you're
> going to be throwing out some media just to redownload it at a lower
> bitrate.
>
>
>
> Ideally, you could download segments in parallel without causing
> contention. The idea is to spend any available bandwidth on the new segment
> to fix the problem, and any excess bandwidth on the old segment in
> the event it arrives before the player buffer actually depletes. That's
> more or less the core concept for what we've built using QUIC, and it's
> compatible with resync points if we later go down that route.
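>
> As a toy sketch of that prioritization (made-up names, not our actual code):
>
>     from dataclasses import dataclass
>
>     @dataclass
>     class SegmentStream:
>         name: str
>         priority: int   # lower value = more urgent
>         pending: int    # bytes left to send
>
>     def next_stream(streams):
>         # Strict priority: the replacement segment preempts the stalled one;
>         # the old segment only gets whatever bandwidth is left over.
>         sendable = [s for s in streams if s.pending > 0]
>         return min(sendable, key=lambda s: s.priority, default=None)
>
>     streams = [SegmentStream("old-high-bitrate", priority=1, pending=800_000),
>                SegmentStream("new-low-bitrate", priority=0, pending=300_000)]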
>
>
>
>
>
> And you're exactly right, Piers. The fundamental issue is that a web player
> lacks the low-level timing information required to infer the delivery rate.
> You would want something like BBR's rate estimation
> <https://datatracker.ietf.org/doc/html/draft-cheng-iccrg-delivery-rate-estimation>, which
> inspects the time delta between packets to determine the delivery rate. That
> gets really difficult when the OS and browser delay flushing data to the
> application, be it for performance reasons or due to packet loss (holding
> data back to preserve ordering, i.e. head-of-line blocking).
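>
> The gist of that estimator, as I read the draft (simplified; variable names
> are mine):
>
>     # delivery_rate = data_delivered / delivery_elapsed, where the elapsed
>     # time is the max of the send-side and ack-side intervals so that short
>     # bursts don't inflate the sample.
>     def delivery_rate_sample(prior_delivered, prior_delivered_time,
>                              first_sent_time, sent_time, now, now_delivered):
>         data_delivered = now_delivered - prior_delivered
>         send_elapsed = sent_time - first_sent_time
>         ack_elapsed = now - prior_delivered_time
>         delivery_elapsed = max(send_elapsed, ack_elapsed)
>         return (data_delivered / delivery_elapsed
>                 if delivery_elapsed > 0 else None)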
>
>
>
> I did run into CUBIC/Reno not being able to grow the congestion window
> when frames are sent one at a time (application-limited). I don't believe
> BBR suffers from the same problem though due to the aforementioned rate
> estimator.
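>
> (The CUBIC/Reno behavior I mean is roughly this gate on window growth; a
> simplification, not any particular stack:)
>
>     # Only grow cwnd when the window was actually the limiting factor, so
>     # frame-at-a-time (app-limited) sending stalls growth.
>     def on_ack(state, acked_bytes):
>         if state.bytes_in_flight + acked_bytes < state.cwnd:
>             return                                  # app-limited: no growth
>         if state.cwnd < state.ssthresh:
>             state.cwnd += acked_bytes               # slow start
>         else:
>             state.cwnd += state.mss * acked_bytes // state.cwnd  # cong. avoidance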
>
>
>
> On Wed, Nov 3, 2021 at 10:05 AM Ali C. Begen <ali.begen@networked.media>
> wrote:
>
>
>
> > On Nov 3, 2021, at 6:50 PM, Piers O'Hanlon <piers.ohanlon@bbc.co.uk>
> wrote:
> >
> >
> >
> >> On 2 Nov 2021, at 20:39, Ali C. Begen <ali.begen=
> 40networked.media@dmarc.ietf.org> wrote:
> >>
> >>
> >>
> >>> On Nov 2, 2021, at 3:39 AM, Luke Curley <kixelated@gmail.com> wrote:
> >>>
> >>> Hey folks, I wanted to quickly summarize the problems we've run into
> at Twitch that have led us to QUIC.
> >>>
> >>>
> >>> Twitch is a live one-to-many product. We primarily focus on video
> quality due to the graphical fidelity of video games. Viewers can
> participate in a chat room, which the broadcaster reads and can respond to
> via video. This means that latency is also somewhat important to facilitate
> this social interaction.
> >>>
> >>> A looong time ago we were using RTMP for both ingest and distribution
> (Flash player). We switched to HLS for distribution to gain the benefit of
> 3rd party CDNs, at the cost of dramatically increasing latency. A later
> project lowered the latency of HLS using chunked-transfer delivery, very
> similar to LL-DASH (and not LL-HLS). We're still using RTMP for
> contribution.
> >>>
> > I guess Apple do also have their BYTERANGE/CTE mode for LL-HLS which is
> pretty similar to LL-DASH.
>
> Yes, Apple can list the parts (chunks in LL-DASH) as byte ranges in the
> playlist, but the frequent playlist refresh and part retrieval process is
> unavoidable in LL-HLS, which is one of the main differences from LL-DASH
> (no manifest refresh needed, and one request per segment rather than per chunk).
>
> >
> >>>
> >>> To summarize the issues with our current distribution system:
> >>>
> >>> 1. HLS suffers from head-of-line blocking.
> >>> During congestion, the current segment stalls and is delivered slower
> than the encoded bitrate. The player has no recourse but to wait for the
> segment to finish downloading, risking depleting the buffer. It can switch
> down to a lower rendition at segment boundaries, but these boundaries occur
> too infrequently (every 2s) to handle sudden congestion. Trying to switch
> earlier, either by canceling the current segment or downloading the lower
> rendition in parallel, only exacerbates the issue.
> >>
> > Isn't the HoL limitation more down to the use of HTTP/1.1?
> >
> >> DASH has the concept of Resync points that were designed exactly for
> this purpose (allowing you to emergency downshift in the middle of a
> segment).
> >>
> > I was curious if there are any studies or experience of how resync
> points perform in practice?
>
> Resync points are pretty fresh out of the oven. dash.js has them on the
> roadmap but not yet implemented (and we also need to generate test
> streams), so there is no data from real clients yet. But I suppose you can
> imagine how in-segment switching can help during sudden bandwidth drops,
> especially for long segments.
>
> >
> >>> 2. HLS has poor "auto" quality (ABR).
> >>> The player is responsible for choosing the rendition to download. This
> is a problem when media is delivered frame-by-frame (i.e. HTTP
> chunked-transfer), as we're effectively application-limited by the encoder
> bitrate. The player can only measure the arrival timestamp of data and does
> not know when the network can sustain a higher bitrate without just trying
> it. We hosted an ACM challenge for this issue in particular.
> >>
> > The limitation here may also be down to the lack of access to
> sufficiently accurate timing information about data arrivals in the browser
> - unfortunately the Streams API, which provides data from the fetch API,
> doesn’t directly timestamp data arrivals, so the JS app has to timestamp
> them itself, which can suffer from noise (scheduling delays, etc.) -
> especially a problem for small/fast data arrivals.
>
> Yes, you need to get rid of that noise (see LoL+).
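>
> (As a toy illustration of the kind of filtering involved, not the actual
> LoL+ code: keep per-chunk throughput samples and discard implausibly small
> inter-arrival gaps that come from data being flushed in a burst.)
>
>     def filtered_throughput_kbps(chunks, min_gap_ms=1.0):
>         # chunks: list of (arrival_time_ms, size_bytes) for one segment
>         samples = []
>         for (t_prev, _), (t_now, size) in zip(chunks, chunks[1:]):
>             gap_ms = t_now - t_prev
>             if gap_ms >= min_gap_ms:        # discard batched flushes
>                 samples.append(size * 8 / gap_ms)   # bits/ms == kbit/s
>         return sum(samples) / len(samples) if samples else None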
>
> > I guess another issue could be that if the system is only sending single
> frames then the network transport may be operating in application limited
> mode so the cwnd doesn’t grow sufficiently to take advantage of the
> available capacity.
>
> Unless the video bitrate is too low, this should not be an issue most of
> the time.
>
> >
> >> That exact challenge had three competing solutions, two of which are
> now part of the official dash.js code. And yes, the player can figure out what
> the network can sustain *without* trying higher bitrate renditions.
> >>
> https://github.com/Dash-Industry-Forum/dash.js/wiki/Low-Latency-streaming
> >> Or read the paper that even had “twitch” in its title here:
> https://ieeexplore.ieee.org/document/9429986
> >>
> > There was a recent study that seems to show that none of the current
> algorithms are that great for low latency, and the two new dash.js ones
> appear to lead to much higher levels of rebuffering:
> > https://dl.acm.org/doi/pdf/10.1145/3458305.3478442
>
> Brightcove’s paper uses the LoL and L2A algorithms from the challenge,
> where low latency was the primary goal. For Twitch’s own evaluation, I
> suggest you watch:
> https://www.youtube.com/watch?v=rcXFVDotpy4
> We later addressed the rebuffering issue and developed LoL+, which is the
> version now included in dash.js and explained at the ieeexplore link I gave
> above.
>
> Copying the authors in case they want to add anything for the paper you
> cited.
>
> -acbegen
>
>
> >
> > Piers
> >
> >>> I believe this is why LL-HLS opts to burst small chunks of data
> (sub-segments) at the cost of higher latency.
> >>>
> >>>
> >>> Both of these necessitate a larger player buffer, which increases
> latency. The contribution system has its own problems, but let me sync up
> with that team first before I try to enumerate them.