Re: [Hls-interest] Image-based subtitles and trickplay tracks

Boy van Dijk <boy@unified-streaming.com> Thu, 28 May 2020 08:53 UTC

From: Boy van Dijk <boy@unified-streaming.com>
Date: Thu, 28 May 2020 10:52:46 +0200
Message-ID: <CADxNcn+E1X+Q_-mwij-D26n3E59qj=j_MVC11V0U0-RbBWQgLQ@mail.gmail.com>
To: Roger Pantos <rpantos=40apple.com@dmarc.ietf.org>
Cc: "Law, Will" <wilaw=40akamai.com@dmarc.ietf.org>, "hls-interest@ietf.org" <hls-interest@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/hls-interest/xQXh81Oqh6cUFkgoklXfTSaXg68>
Subject: Re: [Hls-interest] Image-based subtitles and trickplay tracks

Hi all,

Thank you for your valuable insights, Roger, also on behalf of Unified
Streaming.

We have a demo available that showcases a tiled-thumbnail trickplay track
packaged as JPEG in CMAF. Dash.js already supports this:

https://reference.dashif.org/dash.js/v3.1.1/samples/dash-if-reference-player/index.html?url=https://demo.unified-streaming.com/video/tears-of-steel/tears-of-steel-tiled-thumbnails-static.mpd
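As a rough illustration of how a player might address such a tiled track, here is a minimal sketch that maps a playback time to a tile image and a crop offset inside it. The `TileGrid` fields, the left-to-right, top-to-bottom tile order, and the example numbers are all assumptions for illustration; they are not taken from the demo stream or from dash.js.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TileGrid:
    """Hypothetical description of a tiled thumbnail track (illustrative only)."""
    columns: int           # thumbnails per row inside one tile image
    rows: int              # thumbnail rows inside one tile image
    thumb_width: int       # pixel width of one thumbnail
    thumb_height: int      # pixel height of one thumbnail
    thumb_duration: float  # seconds of media covered by one thumbnail


def locate_thumbnail(grid: TileGrid, t: float) -> tuple[int, int, int]:
    """Return (tile_index, x, y) for the thumbnail that covers time t in seconds.

    Assumes thumbnails are laid out left to right, then top to bottom,
    and that tile images are numbered from 0.
    """
    if t < 0:
        raise ValueError("time must be non-negative")
    per_tile = grid.columns * grid.rows
    thumb_index = int(t // grid.thumb_duration)
    tile_index, within_tile = divmod(thumb_index, per_tile)
    row, col = divmod(within_tile, grid.columns)
    return tile_index, col * grid.thumb_width, row * grid.thumb_height


# Example: 10x5 tiles of 320x180 thumbnails, one thumbnail every 2 seconds.
grid = TileGrid(columns=10, rows=5, thumb_width=320, thumb_height=180,
                thumb_duration=2.0)
print(locate_thumbnail(grid, 125.0))
```

With one request per tile image, the client fetches a single resource for every `columns * rows` thumbnails and crops locally.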

Hope this is helpful.

Regards,
Boy

On Wed, May 27, 2020 at 8:19 PM Roger Pantos
<rpantos=40apple.com@dmarc.ietf.org> wrote:

>
>
> On May 26, 2020, at 2:52 PM, Law, Will <wilaw=40akamai.com@dmarc.ietf.org>
> wrote:
>
> Some additional data for this thread from a recent study just conducted
> for Apple device playback against Akamai CDN by the Client Optimization
> Team last month. You can see that for this ATV playback session, over the
> time period monitored, there were 22K requests in total, of which 98% were
> small range requests for thumbnails/trickplay.
>
> Every request has a fulfilment cost at some point. Tiled images would
> serve to lower the request rate against the edge, in this case, by an order
> of magnitude.   From a CDN perspective, we would lend our support towards
> the development of an optional tiled-image-based thumbnail solution for HLS.
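To make the request-rate argument concrete, here is a back-of-the-envelope sketch using the figures quoted above (22K requests, 98% of them for thumbnails/trickplay). The tile size of 10 thumbnails per image is an assumption, as is the simplification that every thumbnail request maps to exactly one thumbnail and that every tile is fully used.

```python
import math


def requests_with_tiling(total_requests: int,
                         thumbnail_share: float,
                         thumbs_per_tile: int) -> int:
    """Estimate the total request count if thumbnails were fetched as tiles.

    thumbs_per_tile is a hypothetical packaging choice, not a measured value.
    """
    thumb_requests = round(total_requests * thumbnail_share)
    other_requests = total_requests - thumb_requests
    tiled_requests = math.ceil(thumb_requests / thumbs_per_tile)
    return other_requests + tiled_requests


# 22,000 requests, 98% thumbnails, 10 thumbnails per tile (assumed).
print(requests_with_tiling(22_000, 0.98, 10))
```

Under these assumptions the thumbnail requests drop tenfold, while the session total drops somewhat less because the non-thumbnail requests are unchanged.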
>
>
> Be careful about that data. A lot of those i-frame requests are being done
> to support scanning playback, where i-frame loads are sparse. For example,
> at 96x only every ~16th frame is requested.
>
> Having said that, you can aggregate multiple JPEGs into a single fmp4 file
> pretty easily, which would set up the client for the same kind of
> optimizations to lower the request rate if it wanted to load every i-frame.
>
>
> Roger.
>
>
> Cheers
> Will
>
>
> <image001.png>
>
> *From: *"Weil, Nicolas" <nicoweil@elemental.com>
> *Date: *Tuesday, May 26, 2020 at 12:22 PM
> *To: *"May, Bill" <Bill.May@disneystreaming.com>, Roger Pantos <
> rpantos=40apple.com@dmarc.ietf.org>
> *Cc: *"hls-interest@ietf.org" <hls-interest@ietf.org>
> *Subject: *Re: [Hls-interest] Image-based subtitles and trickplay tracks
>
> Comments inline.
>
> *From:* May, Bill <Bill.May@disneystreaming.com>
> *Sent:* Tuesday, May 26, 2020 9:31 AM
> *To:* Roger Pantos <rpantos=40apple.com@dmarc.ietf.org>
> *Cc:* Weil, Nicolas <nicoweil@elemental.com>; hls-interest@ietf.org
> *Subject:* RE: [Hls-interest] Image-based subtitles and trickplay tracks
>
>
>
>
> On May 22, 2020, at 11:17 AM, Roger Pantos <
> rpantos=40apple.com@dmarc.ietf.org> wrote:
>
> Hello Nicolas. Thanks for bringing these up. I have some questions:
>
>
>
> On May 20, 2020, at 3:12 PM, Weil, Nicolas <nicoweil@elemental.com> wrote:
>
> Hello,
>
> We are often seeing two image-related topics causing interoperability
> problems as they are not currently covered by the HLS spec. Normalizing the
> implementations around an official specification for these two points would
> be great:
>
> *Image-based subtitles tracks*
> For workflow reasons and charset reasons, some content owners don't
> include text-based subtitles in the live channels sources that they provide
> to distributors, but rather image-based subtitles (like DVB-Sub). While
> it's possible to transform these subtitles into IMSC1 Image Profile as per
> DASH-IF IOP section 6.4.4, there is no equivalent IMSC1 Image Profile
> support in the HLS RFC, which means that companies will continue to rely on
> proprietary forks of the HLS RFC to support these use cases. Even if it
> wasn't supported by Apple players, it would be tremendously helpful for
> interoperability in the rest of the HLS ecosystem.
>
>
> I'd like to understand how widely validated the Image Profile of IMSC1 has
> been. Can anyone volunteer some examples where it’s been commercially
> deployed successfully? (Specifically IMSC1, vs. some other fork of TTML.)
> [NW] IMSC1 Image Profile is now supported by ATSC3, DASH-IF IOP (with
> support in dash.js) and IMF (with support in IMFTool
> <https://github.com/IMFTool/IMFTool>,
> whose development was initially sponsored by Netflix and other studios).
>
>
> Another question: there are requirements in the US at least that require
> the ability to change font sizes, colors, etc.   And, TBH, these are
> changes that help people world-wide.
>
>
> How would you meet those requirements with bit mapped subtitles?  Wouldn’t
> it be better to work to eliminate bitmapped subtitles completely?
> [NW] I believe these font size/color change requirements can be satisfied
> with IMSC1 Text Profile, which has been supported in rfc8216bis since 2017.
> As much as I’d like to get rid of bitmap subtitles, sometimes the content
> owners cannot provide anything other than bitmaps in the source feed. And
> it’s very challenging to apply a reliable OCR pass to them for all target
> languages (Latin/Cyrillic/Asian/… charsets). IMSC1 Image Profile has
> decent industry support, and the Text Profile is already supported in HLS,
> so I would expect it to be a natural extension for HLS to also support the
> Image Profile.
>
>
> *Image-based trickplay tracks*
> For player resources optimization reasons, the use of a video track as a
> trickplay artefact is not always possible, and a lot of player providers
> recommend the use of image thumbnails tracks instead of special low
> framerate video tracks. DASH-IF IOP section 6.2.6 covers this use case but
> there is no equivalent support in the HLS RFC. There is the Image Media
> playlists HLS extension proposal from Roku/Disney/WarnerMedia here
> https://github.com/image-media-playlist/spec but
> its relevance/adoption is currently limited by the fact that it's not part
> of the RFC. Same logic here: even if not supported by Apple players which
> don't need it as they can leverage I-frame tracks, it would be super useful
> for the rest of the HLS ecosystem to get this officially part of the RFC.
>
>
> I'd like to better understand what’s driving this. Is the limitation
> essentially one of not being able to support an AVC decoder for i-frame
> display?
>
> If that’s the case then it seems that putting JPEG images into fMP4
> containers and using EXT-X-I-FRAME-STREAM-INF would be a smaller extension
> to HLS, both in terms of departure from the existing approach and less new
> spec to invent.
>
> One of the things I don’t love about the image-media-playlist spec is that
> it doesn’t follow the regular HLS timing model, where the media
> presentation time is defined in the media data itself. Instead it relies on
> precise synchronization of the EXTINF values, which seems like a recipe for
> long-term accumulation of floating point error, as well as difficult to
> achieve with multiple geographically-dispersed packagers for live.
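The accumulation concern can be illustrated with a small sketch. It models only one source of error: the rounding that occurs when a segment duration is written as a decimal EXTINF value with a fixed number of digits. Exact rational arithmetic is used so that binary floating-point effects are deliberately excluded. The frame rate, frames per segment, three-decimal precision, and segment count are assumptions chosen for illustration.

```python
from fractions import Fraction


def accumulated_drift(segment_duration: Fraction,
                      segments: int,
                      decimals: int = 3) -> Fraction:
    """Drift (seconds) after summing EXTINF values rounded to `decimals` digits.

    A client that derives presentation time by adding up the written EXTINF
    values ends up this far from the true media timeline.
    """
    written = round(segment_duration, decimals)  # value as it appears in the playlist
    return (written - segment_duration) * segments


# Hypothetical stream: 30000/1001 fps video, 48 frames per image segment.
frame_duration = Fraction(1001, 30000)
segment_duration = 48 * frame_duration  # exactly 1.6016 s

drift = accumulated_drift(segment_duration, segments=10_000)
print(float(drift))  # seconds of drift after 10,000 segments
```

Carrying the presentation time in the media data on an integer timescale avoids this, because each timestamp is stated independently rather than reconstructed by summation.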
>
>
>
> The limitation is exactly that.  A second decoder (AVC or HEVC) is not
> available on many devices.  This also makes mid-fragment switching
> difficult and makes switching between codecs impossible.
>
> The image-media-playlist spec does rely somewhat on floating point; no
> more so than a seek to date or seek to time does in a regular HLS playlist,
> however.  I’m not sure that anyone is asking for precise millisecond
> switching from these images to regular AV.
>
> I see 2 solutions to this problem: give a PTS/timescale in the HLS
> playlist (something like we did for transport stream to webVTT timing, but
> in the playlist), or wrap the jpeg in some sort of wrapper with timing
> (fmp4?).  It would be good, if that is the route, to have guidance from
> Apple on what specification to use.
>
> The thing about JPEGs is that they are easy; almost any software decodes
> them; wrapping them in FMP4 doesn’t make it easier or better.
>
> [NW] Thumbnails in the DASH-IF IOP are simple JPEG images, which makes them
> easy to produce and to manipulate on the service side (like aggregating
> several live thumbnails into tiles of thumbnails when a program is
> transitioned from Live to VOD). Using the same simple image container would
> allow direct interoperability with DASH, without requiring an additional
> CMAF+DASH specification cycle. As regards HLS, I was hoping that the use of
> EXT-X-PROGRAM-DATE-TIME would become mandatory, as per the preliminary LL-HLS
> specification. That would give us the millisecond-accurate time reference
> that we need to avoid drift if we keep images in a simple image container.
>
> --
> Hls-interest mailing list
> Hls-interest@ietf.org
> https://www.ietf.org/mailman/listinfo/hls-interest
>