Re: [Moq] Warp

Hey Hang,

ABR is the primary mechanism for HLS/DASH to deal with congestion. Warp
adds the ability to skip video at the end of media segments until the ABR
algorithm kicks in. So all things remaining the same, Warp would be a
better user experience.

That being said, we still want to reduce latency. The way to minimize
latency is to transfer each frame to the player as it is encoded, provided
any dependencies are transferred first (GoP structure).

This poses a problem for client-side ABR
<https://blog.twitch.tv/en/2020/01/15/twitch-invites-you-to-take-on-our-acm-mmsys-2020-grand-challenge/>.
Measuring the arrival time of frames on the client side is not enough
signal to determine the connection bandwidth, making switching up
renditions quite difficult. There are three solutions to this problem that
I've seen: 1. hold back enough media to burst the connection (LL-HLS), or
2. run a speed test on the connection every so often (our LHLS solution),
or 3. hope that machine learning can save the day.
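
To illustrate why frame arrival times are a weak signal, here is a minimal
sketch under simplifying assumptions: constant-size frames, a link with a
fixed rate, and no loss. The function names and the numbers are made up for
the example and are not part of Warp. When frames are sent as they are
encoded, the client sees roughly the encoded bitrate whether the link is
slow or fast.

```python
def simulate_arrivals(frame_sizes_bytes, fps, link_bps):
    """Send each frame as soon as it is encoded over a fixed-rate link.

    Returns the time (seconds) at which each frame finishes arriving.
    """
    arrivals = []
    link_free_at = 0.0
    for i, size in enumerate(frame_sizes_bytes):
        encoded_at = i / fps                      # frame i exists at this time
        start = max(encoded_at, link_free_at)     # wait for the link if busy
        link_free_at = start + size * 8 / link_bps
        arrivals.append(link_free_at)
    return arrivals


def observed_throughput_bps(frame_sizes_bytes, arrival_times_s):
    """Naive client-side estimate: bits received after the first arrival,
    divided by the time between the first and last arrival."""
    elapsed = arrival_times_s[-1] - arrival_times_s[0]
    if elapsed <= 0:
        raise ValueError("need at least two frames with distinct arrival times")
    return sum(frame_sizes_bytes[1:]) * 8 / elapsed


# Hypothetical stream: ~2 Mb/s at 30 fps, sent over a 5 Mb/s and a 50 Mb/s link.
sizes = [8333] * 60
for link_bps in (5_000_000, 50_000_000):
    estimate = observed_throughput_bps(sizes, simulate_arrivals(sizes, 30, link_bps))
    print(f"link {link_bps / 1e6:.0f} Mb/s -> observed {estimate / 1e6:.2f} Mb/s")
```

Both links produce about the same estimate (roughly 2 Mb/s), so the client
cannot tell whether there is room to switch up.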

With Warp we used a fourth option: 4. have the sender perform ABR. The
sender knows the send rate, knows the queue size, and is the entity
actively limiting the amount of data that can be sent. This is not ideal
for a CDN because they're traditionally designed to be stateless with a
standardized API, but I certainly think it's a solvable problem.
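
A rough sketch of what sender-side ABR could look like, assuming the sender
can read a bitrate estimate from its congestion controller at each segment
boundary. The rendition ladder, the `headroom` factor, and the class names
are hypothetical and only there to make the example run; this is not our
actual implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rendition:
    name: str
    bitrate_bps: int


def pick_rendition(renditions, estimated_bps, headroom=0.8):
    """Pick the highest-bitrate rendition that fits within a fraction
    (headroom, an assumed safety margin) of the estimated send rate.
    Falls back to the lowest rendition if nothing fits."""
    budget = estimated_bps * headroom
    ordered = sorted(renditions, key=lambda r: r.bitrate_bps)
    choice = ordered[0]
    for rendition in ordered:
        if rendition.bitrate_bps <= budget:
            choice = rendition
    return choice


class SenderAbr:
    """Re-evaluates the rendition only when a new segment is about to start."""

    def __init__(self, renditions):
        self.renditions = list(renditions)
        self.current = min(self.renditions, key=lambda r: r.bitrate_bps)

    def on_segment_boundary(self, estimated_bps):
        # estimated_bps would come from the congestion controller (assumption).
        self.current = pick_rendition(self.renditions, estimated_bps)
        return self.current


ladder = [
    Rendition("360p", 1_000_000),
    Rendition("720p", 3_000_000),
    Rendition("1080p", 6_000_000),
]
abr = SenderAbr(ladder)
print(abr.on_segment_boundary(5_000_000).name)
```

With a 5 Mb/s estimate and the assumed 0.8 headroom the budget is 4 Mb/s, so
the sketch selects the 720p rendition for the next segment.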

On Mon, Feb 14, 2022 at 3:02 AM shihang (C) <shihang9@huawei.com> wrote:

> @Luke, I wonder whether the timely bandwidth estimate is needed for ABR,
> given that the client has a 2-5s buffer anyway (when facing congestion,
> which is the primary scenario of Warp). Client-side ABR is more scalable
> than sender-side ABR, right? Is the computation overhead of sender-side ABR
> one of the obstacles when deploying to the CDN?
>
>
>
> Best Regards,
>
> Hang
>
>
>
>
>
> *From:* Moq <moq-bounces@ietf.org> *On Behalf Of* Luke Curley
> *Sent:* February 12, 2022 13:20
> *To:* Law, Will <wilaw@akamai.com>
> *Cc:* MOQ Mailing List <moq@ietf.org>
> *Subject:* Re: [Moq] Warp
>
>
>
> ...and to clarify what I mean by "CDN support", I mean using HTTP/3
> requests instead of QUIC streams. A client could request each HLS/DASH
> segment in parallel providing the Warp priority as a header
> <https://datatracker.ietf.org/doc/html/draft-ietf-httpbis-priority>. You
> effectively get the same segment data and prioritization but encapsulated
> in an HTTP response.
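
As a sketch of the header idea above: the httpbis priority draft defines a
`Priority` header field with an urgency parameter `u` (0 to 7, lower is more
urgent) and an incremental flag `i`. The mapping from a Warp segment to an
urgency value below is purely an assumption for illustration; the Warp draft
does not define one.

```python
def priority_header(urgency: int, incremental: bool = False) -> str:
    """Format the value of a Priority header field (u=0..7, optional i)."""
    if not 0 <= urgency <= 7:
        raise ValueError("urgency must be between 0 and 7")
    value = f"u={urgency}"
    if incremental:
        value += ", i"
    return value


def segment_urgency(is_audio: bool, age_in_segments: int) -> int:
    """Hypothetical mapping: audio is most urgent, then the newest video
    segment, with older video segments getting progressively less urgent."""
    if is_audio:
        return 0
    return min(1 + age_in_segments, 7)


# Headers a client might attach to three parallel segment requests.
for label, is_audio, age in [("audio", True, 0), ("video new", False, 0), ("video old", False, 2)]:
    print(label, "->", "Priority:", priority_header(segment_urgency(is_audio, age)))
```

Only eight urgency levels are available, so a real mapping would have to
decide how to collapse many outstanding segments into that range.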
>
>
>
> However, it does get quite a bit more complicated than that. The biggest
> issue is that prioritization is not guaranteed, especially when multiple
> connections are involved (e.g. different hostnames). It's also very
> difficult for the server to provide a timely bandwidth estimate for ABR. We
> opted to take the simpler route and push via WebTransport instead of
> pulling via HTTP/3.
>
> On Fri, Feb 11, 2022, 5:32 PM Luke Curley <kixelated@gmail.com> wrote:
>
> Hey Will,
>
>
>
> Unlike HLS, the media sender is responsible for ABR. Our server pulls the
> estimated bitrate directly from the QUIC congestion controller (BBR, Cubic,
> etc) and switches renditions at segment boundaries. This is a dramatic
> improvement over client-side ABR because it's the actual rate at which
> media can be sent. It's also the primary challenge with using Warp over
> HTTP/3 with CDN support.
>
>
>
> Also I want to clarify that this draft is not complete. I wanted to focus
> on what I felt were the core concepts that would shape a WG. That may have
> been a mistake because it's come up a few times... and in fact, the client
> can create streams. These are used to send messages like
> load/play/pause/track but somehow I completely neglected to document it.
>
>
>
>
>
> On Fri, Feb 11, 2022 at 4:27 PM Law, Will <wilaw@akamai.com> wrote:
>
> @Luke – how does WARP handle throughput variation across the connection
> (the equivalent of ABR with HAS)? The draft indicates that older frames are
> dropped in the face of congestion. This implies that resolution and encoded
> bitrate remain constant and that it’s the rendered frame rate that drops on
> the client to compensate for any throughput degradation. If that is
> correct, then at what point can the client decide I’m tired of receiving
> the 4K feed at 8fps, I’d rather get 1080p at 30fps? Conceivably it could
> request the server to begin sending a lower resolution/bitrate stream of
> data, however the established streams are unidirectional and no control
> back-channel is defined. It could also tune-in to a new QUIC stream at the
> appropriate bitrate, if there was some standard metadata to define what was
> available and how to access it. Do you consider discovery and service
> description to be out of scope of this core protocol definition? If so, has
> any thought been given to extending WARP so that it includes service
> discovery and description and perhaps a control back-channel?
>
>
>
> Cheers
>
> Will
>
>
>
>
>
> *From: *Luke Curley <kixelated@gmail.com>
> *Date: *Friday, February 11, 2022 at 1:11 PM
> *To: *Sergio Garcia Murillo <sergio.garcia.murillo@gmail.com>
> *Cc: *MOQ Mailing List <moq@ietf.org>
> *Subject: *Re: [Moq] Warp
>
>
>
> Hey Sergio,
>
>
>
> Warp has flexible latency depending on the broadcaster and viewer(s).
>
>
>
> The broadcaster chooses their encoding settings, for example using
> b-frames (higher latency/quality) or using a larger look-ahead buffer
> (better compression and rate control). The viewer dynamically chooses their
> buffer size, dictating how long to wait before skipping the end of a
> segment.
>
>
>
> With a perfect network, Warp would transfer each video frame from the
> encoder to decoder as they are generated. However, congestion makes that
> impossible, which is why it's necessary to have a dynamic player buffer for
> smooth playback. For example, a viewer with a reliable connection may have
> a 500ms buffer, while a viewer with a cellular connection may have a 2s
> buffer, while a viewer in a developing country may have a 5s buffer, while
> a service that archives the stream may have a 30s buffer for maximum
> reliability.
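
To make the buffer trade-off concrete, here is a small sketch assuming a
simplified player model: each frame must arrive by its presentation time
plus the viewer's buffer, and the first frame that misses that deadline
causes the rest of the segment to be skipped. The function name and the
timings are invented for the example.

```python
def frames_played_before_skip(pts_s, arrival_s, buffer_s):
    """Count how many frames of a segment are played before the tail is skipped.

    pts_s:     presentation time of each frame, in decode order (seconds)
    arrival_s: time each frame actually arrived (seconds, same clock)
    buffer_s:  how long this viewer is willing to wait behind live
    """
    played = 0
    for pts, arrival in zip(pts_s, arrival_s):
        deadline = pts + buffer_s
        if arrival > deadline:
            break  # too late for this viewer: skip the rest of the segment
        played += 1
    return played


# One segment whose third frame is delayed by a burst of congestion.
pts = [0.0, 0.5, 1.0, 1.5]
arrivals = [0.1, 0.6, 2.5, 2.6]

for buffer_s in (0.5, 2.0):
    played = frames_played_before_skip(pts, arrivals, buffer_s)
    print(f"{buffer_s}s buffer: played {played} of {len(pts)} frames")
```

The same delivery gives two of four frames to the viewer with a 500ms
buffer and all four to the viewer with a 2s buffer, without the sender
doing anything different.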
>
>
>
> The broadcaster and any intermediate proxies do not know or care about
> each viewer's desired latency. They just create QUIC streams, transmit
> packets based on stream priority, and eventually close any streams if they
> reach some maximum upper bound. This makes it ideal for video distribution
> especially when multiple caches and proxies are involved.
>
>
>
> On Fri, Feb 11, 2022 at 11:59 AM Sergio Garcia Murillo <
> sergio.garcia.murillo@gmail.com> wrote:
>
> Hi Luke,
>
>
>
> QUICK question, what is the target glass to glass latency for WARP?
>
>
>
> Best regards
>
> Sergio
>
>
>
>
>
> On Fri, Feb 11, 2022, 20:22, Luke Curley <kixelated@gmail.com> wrote:
>
> Hey MOQ, I just published a draft for Warp
> <https://datatracker.ietf.org/doc/draft-lcurley-warp/>.
> Here's a quick FAQ:
>
>
>
> *What is Warp?*
>
> Twitch has developed a new video distribution protocol to replace our
> custom low-latency HLS stack. Warp uses QUIC streams to deliver media
> segments, prioritizing streams based on content and age. This allows
> viewers to skip old video content during congestion instead of buffering;
> improving the user experience and reducing latency.
>
>
>
> *What about contribution?*
>
> Warp is very similar to Facebook's RUSH
> <https://www.ietf.org/archive/id/draft-kpugin-rush-00.html> and
> can be used as a contribution protocol. There's a few fundamental
> differences, like the prioritization scheme and transferring media as
> segments. This first version of the draft focuses on these core differences
> and omits anything else that could be a distraction.
>
>
>
> *Why not WebRTC?*
>
> We initially used WebRTC (both media and data channels) for
> last-mile delivery but the user experience was significantly worse than our
> existing stack. There were so many minor issues, primarily caused by
> WebRTC's focus on real-time latency and the inability to control the client
> (browser) behavior. I personally had to scrap years of work on a custom
> SFU. 😔
>
>
>
> *Why not use datagrams?*
>
> Warp uses QUIC streams because it dramatically simplifies the protocol. We
> get the full benefit of QUIC's fragmentation, congestion control, flow
> control, recovery, cancellation, multiplexing, etc. Using datagrams gives
> you extra flexibility but it also means you have to reimplement everything
> on every platform.
>
>
>
> *Why not use HTTP?*
>
> Good question! The key to Warp is the prioritization mechanism, which
> could work with HTTP/3 and possibly HTTP/2. Twitch has the benefit of
> running our own network so it was just simpler to make a push-based
> protocol using QUIC and WebTransport. I've got some ideas for a more
> complicated HTTP solution that would enable CDN support.
>
>
>
> *How is media delivered?*
>
> Warp sends each segment (group of pictures) over a QUIC stream. Audio and
> newer video segments are prioritized, causing older video segments to
> starve during congestion. Either side can cancel the stream to effectively
> drop the tail of a segment. Media is quite linear by nature and most frames
> need to be processed in decode order.
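
A minimal sketch of the prioritization described above, assuming strict
priority between streams: audio first, then newer segments before older
ones, with a fixed byte budget standing in for what the congestion window
allows. The `Stream` type and the exact ordering key are assumptions for the
example and are not taken from the draft.

```python
from dataclasses import dataclass


@dataclass
class Stream:
    kind: str      # "audio" or "video"
    sequence: int  # segment number; higher means newer
    pending: int   # bytes still waiting to be sent


def send_order(streams):
    """Audio before video, and newer segments before older ones."""
    return sorted(streams, key=lambda s: (s.kind != "audio", -s.sequence))


def drain(streams, budget_bytes):
    """Spend the byte budget in priority order; whatever is left over
    on lower-priority streams simply waits (starves) until next time."""
    sent = {}
    for stream in send_order(streams):
        if budget_bytes <= 0:
            break
        n = min(stream.pending, budget_bytes)
        stream.pending -= n
        budget_bytes -= n
        sent[(stream.kind, stream.sequence)] = n
    return sent


streams = [
    Stream("video", 1, 5000),  # older segment, still unfinished
    Stream("video", 2, 5000),  # newest video segment
    Stream("audio", 2, 500),
]
print(drain(streams, budget_bytes=4000))
```

With a 4000-byte budget the audio stream is sent in full, the newest video
segment gets the remainder, and the older video segment gets nothing, which
is the starvation behavior that lets a viewer skip old video during
congestion.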
>
>
>
> *Why not drop individual frames?*
>
> We decided that it wasn't worth dropping non-reference frames, given their
> infrequency and relatively small size for high quality media. Our hardware
> encodes (QuickSync) have only reference frames and we've seen software
> encodes with only 3% non-reference frames by file size. And of course,
> dropping reference frames will cause artifacting or freezing so that wasn't
> an option.
>
>
> *How could this be improved?*
>
> We want to experiment with layered coding (ex. SVC) at some point in the
> future. This would involve transferring non-reference frames/slices on a
> different QUIC stream so they can be deprioritized. Simulcast would work
> the same way: transfer each rendition on a different QUIC stream
> prioritized based on the resolution.
>
>
>
> *Why use fMP4?*
>
> HLS and DASH support CMAF: a standard for fragmenting MP4 files. Warp uses
> this file format so we can deliver the same segment data regardless of the
> delivery protocol. The Warp MP4 atom uses JSON because I was too lazy to do
> things "properly" for this first draft. The wire format doesn't matter!
>
> --
> Moq mailing list
> Moq@ietf.org
> https://www.ietf.org/mailman/listinfo/moq