Re: [Hls-interest] Image-based subtitles and trickplay tracks
Alex Giladi <alex.giladi@gmail.com> Fri, 29 May 2020 16:22 UTC
From: Alex Giladi <alex.giladi@gmail.com>
Date: Fri, 29 May 2020 10:21:48 -0600
Message-ID: <CAF-MBSJVm5_csH+K0OdJ5R8i4zaCt55pqVfOGiFZjkpVLQ5Hvg@mail.gmail.com>
To: Nigel Megitt <nigel.megitt@bbc.co.uk>
Cc: "Weil, Nicolas" <nicoweil@elemental.com>, "May, Bill" <Bill.May@disneystreaming.com>, "hls-interest@ietf.org" <hls-interest@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/hls-interest/REwn7mbw7olEJBtyqai1-yrMUMs>
Subject: Re: [Hls-interest] Image-based subtitles and trickplay tracks

Hi Nigel,

This is very interesting! How much bandwidth do the DVB subtitles take
per language?

Best,
Alex.

On Fri, May 29, 2020 at 10:09 AM Nigel Megitt <nigel.megitt@bbc.co.uk> wrote:

> A somewhat off-topic point, but something I'd like to pick up on:
>
> On 26/05/2020, 13:22, "Weil, Nicolas" <nicoweil@elemental.com> wrote:
>
> > Another question: there are requirements in the US at least that
> > require the ability to change font sizes, colors, etc. And, TBH,
> > these are changes that help people world-wide.
>
> That isn't the way it looks from this side of the Atlantic! Subtitles
> globally have a lot of localised cultural idioms, and allowing
> unconstrained modification of settings like this, or making it seem
> okay for the client device or player code to decide can cause well
> thought-through authorial choices to be discarded in ways that break
> the experience for viewers.
>
> For example in the UK colours are used to indicate changes of speaker,
> sometimes multiple times per line, and in France I understand they are
> used to indicate different sources or types of sound.
>
> In examining the claim about the benefit of those options, I've so far
> been unable to find good published evidence demonstrating the benefit
> of all of those customisations, but I did commission research for the
> BBC that indicates some user preferences around customising text size.
>
> The point is, please don't design technical solutions on the
> _assumption_ that those US requirements are needed or wanted globally.
> Of course I'm not saying there's anything wrong with designing
> technical solutions to _accommodate_ such requirements.
>
> Returning to the main thread, the BBC does broadcast bitmap subtitles
> according to the DVB specifications mentioned elsewhere, and that is
> considered a reasonable accessibility solution for televisions, for
> audience members who cannot hear the sound.
> I realise this does not answer Roger's question about use of image
> based subtitles based on IMSC Image Profile specifically.
>
> One important factor in favour of bitmap subtitles is that the client
> side work needed to modify the image to include the subtitles is
> minimised, which can help with synchronisation requirements. For
> example there is no question about installing fonts, using processor
> cycles to layout and rasterise text etc. For lower-end devices, this
> can be a helpful part of the solution.
>
> Nigel
>
> On 26/05/2020, 13:22, "Weil, Nicolas" <nicoweil@elemental.com> wrote:
>
> Comments inline.
>
> From: May, Bill <Bill.May@disneystreaming.com>
> Sent: Tuesday, May 26, 2020 9:31 AM
> To: Roger Pantos <rpantos=40apple.com@dmarc.ietf.org>
> Cc: Weil, Nicolas <nicoweil@elemental.com>; hls-interest@ietf.org
> Subject: RE: [Hls-interest] Image-based subtitles and trickplay tracks
>
> On May 22, 2020, at 11:17 AM, Roger Pantos
> <rpantos=40apple.com@dmarc.ietf.org> wrote:
>
> Hello Nicholas. Thanks for bringing these up. I have some questions:
>
> On May 20, 2020, at 3:12 PM, Weil, Nicolas <nicoweil@elemental.com>
> wrote:
>
> Hello,
>
> We are often seeing two image-related topics causing interoperability
> problems as they are not currently covered by the HLS spec.
> Normalizing the implementations around an official specification for
> these two points would be great:
>
> Image-based subtitles tracks
> For workflow reasons and charset reasons, some content owners don't
> include text-based subtitles in the live channels sources that they
> provide to distributors, but rather image-based subtitles (like
> DVB-Sub). While it's possible to transform these subtitles as IMSC1
> Image Profile as per DASH-IF IOP section 6.4.4, there is no equivalent
> IMSC1 Image Profile support in the HLS RFC, which means that companies
> will continue to rely on proprietary forks of the HLS RFC to support
> these use cases. Even if it wasn't supported by Apple players, it
> would be tremendously helpful for interoperability in the rest of the
> HLS ecosystem.
>
> I'd like to understand how widely validated the Image Profile of IMSC1
> has been. Can anyone volunteer some examples where it’s been
> commercially deployed successfully? (Specifically IMSC1, vs. some
> other fork of TTML.)
>
> [NW] IMSC1 Image profile is now supported by ATSC3, DASH-IF IOP (with
> support in dash.js) and IMF (with support in IMFTool
> <https://github.com/IMFTool/IMFTool> which development has been
> sponsored by Netflix and other studios initially).
>
> Another question: there are requirements in the US at least that
> require the ability to change font sizes, colors, etc. And, TBH, these
> are changes that help people world-wide.
>
> How would you meet those requirements with bit mapped subtitles?
> Wouldn’t it be better to work to eliminate bitmapped subtitles
> completely?
>
> [NW] I believe these font size/color change requirements can be
> satisfied with IMSC1 Text Profile which is supported in rfc8216bis
> since 2017.
> As much as I’d like to get rid of bitmap subtitles, sometimes the
> content owners cannot provide anything else than bitmaps in the source
> feed. And it’s very challenging to apply a reliable OCR pass on it,
> for all target languages (Latin/Cyrillic/Asian/… charsets). IMSC1
> Image Profile has got a decent industry support, and the Text Profile
> is already supported in HLS, so I would expect it to be a natural
> extension for HLS to support also the Image Profile.
>
> Image-based trickplay tracks
> For player resources optimization reasons, the use of a video track as
> a trickplay artefact is not always possible, and a lot of player
> providers recommend the use of image thumbnails tracks instead of
> special low framerate video tracks. DASH-IF IOP section 6.2.6 covers
> this use case but there is no equivalent support in the HLS RFC. There
> is the Image Media playlists HLS extension proposal from
> Roku/Disney/WarnerMedia here
> https://github.com/image-media-playlist/spec but its
> relevance/adoption is currently limited by the fact that it's not part
> of the RFC. Same logic here: even if not supported by Apple players
> which don't need it as they can leverage I-frame tracks, it would be
> super useful for the rest of the HLS ecosystem to get this officially
> part of the RFC.
>
> I'd like to better understand what’s driving this. Is the limitation
> essentially one of not being able to support an AVC decoder for
> i-frame display?
>
> If that’s the case then it seems that putting JPEG images into fMP4
> containers and using EXT-X-I-FRAME-STREAM-INF would be a smaller
> extension to HLS, both in terms of departure from the existing
> approach and less new spec to invent.
>
> One of the things I don’t love about the image-media-playlist spec is
> that it doesn’t follow the regular HLS timing model, where the media
> presentation time is defined in the media data itself. Instead it
> relies on precise synchronization of the EXTINF values, which seems
> like a recipe for long-term accumulation of floating point error, as
> well as difficult to achieve with multiple geographically-dispersed
> packagers for live.
>
> The limitation is exactly that. A second decoder (AVC or HEVC) is not
> available on many devices. This also makes mid-fragment switching
> difficult as well and makes switching between codecs impossible as
> well.
>
> The image-media-playlist spec does rely somewhat on floating point; no
> more so than a seek to date or seek to time does in a regular HLS
> playlist, however. I’m not sure that anyone is asking for precise
> millisecond switching from these images to regular AV.
>
> I see 2 solutions to this problem: give a PTS/timescale in the HLS
> playlist (something like we did for transport stream to WebVTT timing,
> but in the playlist), or wrap the JPEG in some sort of wrapper with
> timing (fMP4?). It would be good, if that is the route, to have
> guidance from Apple on what specification to use.
>
> The thing about JPEGs is that they are easy; almost any software
> decodes them; wrapping them in fMP4 doesn’t make it easier or better.
>
> [NW] Thumbnails in the DASH-IF IOP are simple JPEG images and it makes
> it easy to produce and to manipulate on the service side (like
> aggregating several live thumbnails into tiles of thumbnails when a
> program is transitioned from Live to VOD). Using the same simple image
> container would allow direct interoperability with DASH, without
> requiring an additional CMAF+DASH specification cycle.
> As regards HLS, I was hoping that the use of EXT-X-PROGRAM-DATE-TIME
> would become mandatory, as per the preliminary LL-HLS specification.
> That would give us the millisecond-accurate time reference that we
> need to avoid drifts if we keep images in a simple image container.
>
> --
> Hls-interest mailing list
> Hls-interest@ietf.org
> https://www.ietf.org/mailman/listinfo/hls-interest
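
Illustrative sketch (assumed numbers, not taken from any participant in the thread): the timing concern quoted above, where image positions depend on a running sum of EXTINF values, can be made concrete with a little arithmetic. The segment length (94 AAC frames at 48 kHz), the three-decimal EXTINF precision, and the helper name `extinf_drift` are all assumptions chosen for illustration; none of them comes from the HLS RFC or the image-media-playlist proposal.

```python
from fractions import Fraction


def extinf_drift(ticks_per_segment: int, timescale: int,
                 segments: int, decimals: int = 3) -> Fraction:
    """Seconds by which summed, rounded EXTINF values differ from media time.

    A negative result means the playlist clock runs behind the media clock.
    """
    # Exact media duration of one segment, as integer ticks over a timescale.
    true_duration = Fraction(ticks_per_segment, timescale)
    # Assumption: the packager writes EXTINF rounded to `decimals` places.
    extinf = round(true_duration, decimals)
    # Every segment contributes the same rounding error, so it adds up linearly.
    return (extinf - true_duration) * segments


# Assumed example: 94 AAC frames of 1024 samples at 48 kHz per segment
# (about 2.005333 s each), 43,200 segments, i.e. roughly 24 hours of live.
drift = extinf_drift(ticks_per_segment=94 * 1024, timescale=48_000,
                     segments=43_200)
print(float(drift))  # drift in seconds between playlist time and media time
```

Under these assumptions the summed playlist time ends up more than ten seconds away from the media time after a day. A position expressed as integer ticks over a timescale, or re-anchored to a wall-clock reference such as EXT-X-PROGRAM-DATE-TIME, would not depend on such a running sum.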