Re: [Hls-interest] Image-based subtitles and trickplay tracks

Roger Pantos <rpantos@apple.com> Wed, 27 May 2020 18:09 UTC

From: Roger Pantos <rpantos@apple.com>
Message-id: <018C9F98-F179-489E-AB9A-A23087087B52@apple.com>
Content-type: multipart/alternative; boundary="Apple-Mail=_C96C665C-7C60-458D-A22D-11EFEFB43DBC"
MIME-version: 1.0 (Mac OS X Mail 13.4 \(3608.80.23.2.2\))
Date: Wed, 27 May 2020 11:09:19 -0700
In-reply-to: <3F521CDD-81F1-4CED-B54A-4978C0435A6F@disneystreaming.com>
Cc: "hls-interest@ietf.org" <hls-interest@ietf.org>
To: "May, Bill" <Bill.May=40disneystreaming.com@dmarc.ietf.org>
References: <4959700c860e48079af52488074e2236@EX13D02EUB001.ant.amazon.com> <CCB77CF4-0CF3-4D73-B93F-DC3160BD3B0B@apple.com> <3F521CDD-81F1-4CED-B54A-4978C0435A6F@disneystreaming.com>
X-Mailer: Apple Mail (2.3608.80.23.2.2)
X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:6.0.216, 18.0.687 definitions=2020-05-27_03:2020-05-27, 2020-05-27 signatures=0
Archived-At: <https://mailarchive.ietf.org/arch/msg/hls-interest/7lEusdUJxyt_uck2Bo94MEOctKs>
Subject: Re: [Hls-interest] Image-based subtitles and trickplay tracks


> On May 26, 2020, at 9:31 AM, May, Bill <Bill.May=40disneystreaming.com@dmarc.ietf.org> wrote:
> 
> 
> 
>> On May 22, 2020, at 11:17 AM, Roger Pantos <rpantos=40apple.com@dmarc.ietf.org> wrote:
>> 
>> Hello Nicolas. Thanks for bringing these up. I have some questions:
>> 
>>> On May 20, 2020, at 3:12 PM, Weil, Nicolas <nicoweil@elemental.com> wrote:
>>> 
>>> Hello,
>>>  
>>> We are often seeing two image-related topics causing interoperability problems as they are not currently covered by the HLS spec. Normalizing the implementations around an official specification for these two points would be great:
>>>  
>>> Image-based subtitles tracks
>>> For workflow reasons and charset reasons, some content owners don't include text-based subtitles in the live channel sources that they provide to distributors, but rather image-based subtitles (like DVB-Sub). While it's possible to transform these subtitles into the IMSC1 Image Profile as per DASH-IF IOP section 6.4.4, there is no equivalent IMSC1 Image Profile support in the HLS RFC, which means that companies will continue to rely on proprietary forks of the HLS RFC to support these use cases. Even if it wasn't supported by Apple players, it would be tremendously helpful for interoperability in the rest of the HLS ecosystem.
>> 
>> I'd like to understand how widely validated the Image Profile of IMSC1 has been. Can anyone volunteer some examples where it’s been commercially deployed successfully? (Specifically IMSC1, vs. some other fork of TTML.)
> 
> Another question: there are requirements in the US at least that require the ability to change font sizes, colors, etc. And, TBH, these are changes that help people worldwide.
> 
> How would you meet those requirements with bitmapped subtitles? Wouldn’t it be better to work to eliminate bitmapped subtitles completely?
> 
>> 
>>>  
>>> Image-based trickplay tracks
>>> For player resources optimization reasons, the use of a video track as a trickplay artefact is not always possible, and a lot of player providers recommend the use of image thumbnail tracks instead of special low framerate video tracks. DASH-IF IOP section 6.2.6 covers this use case but there is no equivalent support in the HLS RFC. There is the Image Media playlists HLS extension proposal from Roku/Disney/WarnerMedia here https://github.com/image-media-playlist/spec but its relevance/adoption is currently limited by the fact that it's not part of the RFC. Same logic here: even if not supported by Apple players which don't need it as they can leverage I-frame tracks, it would be super useful for the rest of the HLS ecosystem to get this officially part of the RFC.
>> 
>> I'd like to better understand what’s driving this. Is the limitation essentially one of not being able to support an AVC decoder for i-frame display? 
>> 
>> If that’s the case then it seems that putting JPEG images into fMP4 containers and using EXT-X-I-FRAME-STREAM-INF would be a smaller extension to HLS, both in terms of departure from the existing approach and less new spec to invent.
>> 
>> One of the things I don’t love about the image-media-playlist spec is that it doesn’t follow the regular HLS timing model, where the media presentation time is defined in the media data itself. Instead it relies on precise synchronization of the EXTINF values, which seems like a recipe for long-term accumulation of floating point error, as well as difficult to achieve with multiple geographically-dispersed packagers for live.
>> 
> 
> The limitation is exactly that. A second decoder (AVC or HEVC) is not available on many devices. This also makes mid-fragment switching difficult and switching between codecs impossible.
> 
> The image-media-playlist spec does rely somewhat on floating point; no more so than a seek to date or seek to time does in a regular HLS playlist, however. I’m not sure that anyone is asking for precise millisecond switching from these images to regular AV.

The difference is that with seek to date or seek to time you only need to get close enough to download the media segment. After that the precise timestamp is in the media, which will correct any accumulated floating-point error.

I don’t agree that “close is good enough” for all HLS clients - I know of many apps where a disagreement between the i-frame in the scrubber and what is full-screen would be a must-fix bug.
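
To make the accumulation concern concrete, here is a minimal sketch of how a position derived by summing EXTINF values can drift away from the timestamps carried in the media. Everything numeric in it is an assumption chosen for illustration (29.97 fps content, 50 frames per segment, a packager that writes EXTINF with three decimal places); none of it comes from the image-media-playlist spec or the HLS RFC.

```python
from fractions import Fraction

TIMESCALE = 30000          # assumed media timescale (ticks per second)
FRAME_TICKS = 1001         # one frame at 30000/1001 (29.97) fps
FRAMES_PER_SEGMENT = 50    # hypothetical segment size
EXTINF_DECIMALS = 3        # hypothetical precision written into the playlist


def segment_start_from_media(index):
    """Exact start of segment `index` in ticks, as integer media timestamps would carry it."""
    return index * FRAMES_PER_SEGMENT * FRAME_TICKS


def segment_start_from_extinf(index):
    """Start of segment `index` in seconds, obtained by summing rounded EXTINF durations."""
    exact = Fraction(FRAMES_PER_SEGMENT * FRAME_TICKS, TIMESCALE)
    extinf = round(float(exact), EXTINF_DECIMALS)  # what ends up in the playlist
    total = 0.0
    for _ in range(index):
        total += extinf  # each addition carries the per-segment rounding forward
    return total


def drift_seconds(index):
    """Difference between the media timeline and the EXTINF-derived timeline."""
    media_seconds = segment_start_from_media(index) / TIMESCALE
    return media_seconds - segment_start_from_extinf(index)


if __name__ == "__main__":
    for n in (100, 1000, 10000):
        print(n, drift_seconds(n))
```

Under these assumed numbers each segment loses about a third of a millisecond, so after 10000 segments the EXTINF-derived position is a little over three seconds behind the media timeline. A client that reads the timestamp from the media can correct for this after the download; a client that only has the EXTINF sum cannot.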

> 
> I see 2 solutions to this problem: give a PTS/timescale in the HLS playlist (something like we did for transport stream to WebVTT timing, but in the playlist), or wrap the JPEG in some sort of wrapper with timing (fMP4?). It would be good, if that is the route, to have guidance from Apple on what specification to use.

I see the second approach as requiring (considerably) less new specification. It would basically be a SHOULD that clients accept jpeg-in-fMP4 for i-frame-only playlists. 

ISOBMFF is a little light on jpeg-in-mp4 specification — essentially because it’s just MPEG-4 with a particular codec — but there’s certainly Annex H of ISO/IEC 23008-12.

> 
> The thing about JPEGs is that they are easy; almost any software decodes them; wrapping them in fMP4 doesn’t make it easier or better.

It’s about equally easy to wrap JPEG in fMP4, particularly for anything that is already producing fMP4. ffmpeg, for instance, has supported it for ages.

What fMP4 brings to the party is the well-trodden MPEG-4 / CMAF timing model, which means you don’t need to invent something new (and possibly broken in ways that are, at least initially, difficult to appreciate).
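
As a rough illustration of what that timing model gives a client, the sketch below reads the integer baseMediaDecodeTime from a 'tfdt' (Track Fragment Decode Time) box and converts it to seconds using the track timescale. The helper names are hypothetical, the timescale is assumed to have been read from the track's 'mdhd' box already, and the box is built in memory purely for demonstration. It also assumes an intra-only track with no composition offsets or edit lists, so that decode time and presentation time coincide.

```python
import struct
from fractions import Fraction


def make_tfdt(base_media_decode_time, version=1):
    """Build a 'tfdt' box in memory (illustration only, not a muxer)."""
    fmt = ">Q" if version == 1 else ">I"  # 64-bit time for version 1, 32-bit for version 0
    payload = bytes([version, 0, 0, 0]) + struct.pack(fmt, base_media_decode_time)
    return struct.pack(">I4s", 8 + len(payload), b"tfdt") + payload


def parse_tfdt(box):
    """Return the integer baseMediaDecodeTime from a complete 'tfdt' box."""
    size, box_type = struct.unpack_from(">I4s", box, 0)
    if box_type != b"tfdt" or size != len(box):
        raise ValueError("not a complete tfdt box")
    version = box[8]  # FullBox header: 1 byte version, 3 bytes flags
    fmt = ">Q" if version == 1 else ">I"
    (decode_time,) = struct.unpack_from(fmt, box, 12)
    return decode_time


def fragment_start_seconds(decode_time, timescale):
    """Exact fragment start time as a rational number of seconds."""
    return Fraction(decode_time, timescale)


if __name__ == "__main__":
    box = make_tfdt(500500000)
    print(fragment_start_seconds(parse_tfdt(box), 30000))
```

Because the time is an integer count of timescale ticks carried in the fragment itself, it does not depend on how many playlist entries preceded it or on which packager wrote them.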


Roger.
