Re: [Cellar] QuickTime timecode tracks in Matroska, was Ancillary data in Matroska

Dave Rice <dave@dericed.com> Sat, 29 April 2017 02:05 UTC

Archived-At: <https://mailarchive.ietf.org/arch/msg/cellar/lF22_tp0gmHKNmgvFnEkNr1TPiY>

> On Apr 26, 2017, at 3:51 AM, Steve Lhomme <slhomme@matroska.org> wrote:
> 
> A bit late to the party, but I agree with the move to a more specific
> track type for timecodes rather than the generic ancillary data.
> 
> 2017-03-20 22:07 GMT+01:00 Jerome Martinez <jerome@mediaarea.net>:
>> Le 20/03/2017 à 21:37, Dave Rice a écrit :
>>> 
>>> [...]
>>>>> 
>>>>> I’ve only seen a single value stored, but then sometimes an edit list
>>>>> used to alter the ordering of the timecode (in cases where it's
>>>>> non-sequential). For Matroska, possibly it could simply store a new timecode
>>>>> Block at any point where it is not sequential.
>>>> 
>>>> What about seeking?
>>> 
>>> Isn't the seeking similar to QuickTime?
>> 
>> 
>> The difference is that in QuickTime, a wrong "seek table" makes playback
>> impossible, so developers are more careful about it. And in the file I
>> remember (I must still find it), there was only one offset in the offset
>> table of QuickTime and a single "chunk" of time code with a byte size of 4
>> bytes per frame (to be confirmed, as you say you saw edit lists instead).
>> 
>>> In QuickTime you'd need the offset table of the timecode track to
>>> understand the number and location of all timecode samples. In Matroska you
>>> would use the associated CuePoints of the timecode track for the same
>>> reference. By parsing Cues the reader should be able to determine if the
>>> timecode is stripped or not and if not then have the CueTimes for each
>>> associated timecode Cluster.
>>> 
>>>> If you have a timecode block at the first frame and also at the middle of
>>>> the file, how can a player know that there is a timecode block breaking the
>>>> sequential order without parsing the whole file when there is a seek request
>>>> to the last minute of content? In that case, time code real value would be
>>>> set to "waiting for new block" (and this block never comes). I don't think
>>>> that forcing full parsing of the file with time code for seeking is a good
>>>> solution.
>>> 
>>> Same. But in the case of video if you seek to a P-frame then you have to
>>> go backwards and decode from the I-frame. Similarly if seeking to a
>>> timepoint without a timecode Cluster (analogous to a P-frame), then the
>>> reader would have to use the Cues to decode the prior timecode Cluster
>>> (analogous to I-frame) in order to determine the timecode value of the seek
>>> point.
>> 
>> 
>> Possible.
>> But we need to be careful: what does it mean?
>> Examples of issues:
>> 1/ can a player consider that if there is only one CuePoint with CueTrack =
>> the ID of the time code track, it means that this is a stripped content and
>> time codes are in sequential order?
>> 2/ can a player consider that if there are only 2 CuePoints with CueTrack =
>> the ID of the time code track, it means that this is a stripped content and
>> time codes are in sequential order except for one place and there is a
>> single discontinuity?
>> 3/ or must a player consider that if there are only 2 CuePoints (let's say frame 0
>> and frame 1000) with CueTrack = the ID of the time code track, it means that
>> it must seek to 0 if the requested frame is 999 (so a loooooong read of the
>> file on disk before being able to play frame 999)?
>> If answers are 2/ yes 3/ no, I understand that the only method for storing a
>> time code track with all values not sequential is to have a CuePoint for
>> each time code frame (which makes the Cues huge, 15-20 bytes of CuePoint per
>> time code frame).
>> 
>> Please provide examples of how you imagine CuePoints in the cases:
>> 1/ sequential time codes during 2000 frames
>> 2/ sequential time codes during 2000 frames except one discontinuity at
>> frame 1000
>> 3/ 2000 different time codes
>> and where a player must seek if the request is to seek to frame 999.
> 
> Are timecode frames supposed to match a specific video frame or
> audio frame? Or are they loose? In the former case it may be best for them
> to be attached as a BlockAdditional. Otherwise they should always be in
> sequential/display order, just like audio is. Even if video is not.

I'd propose to limit the use case to the former. The idea of storing timecode data within BlockAdditional is interesting, but would that limit the metadata possibilities? For instance, if timecode data is stored as its own TrackEntry, then a file could contain two timecode TrackEntries: one carrying the timecode values from the original source materials (likely non-continuous), and another carrying a new continuous timecode that served some purpose in a video production. With these timecodes as two TrackEntries with their own Clusters, they could have distinct names and could be independently described with flags, so one may be enabled and default while the other is disabled.

With timecode in BlockAdditional, there could potentially be two sets of timecode (two BlockMore), but they would be indistinguishable and could not be controlled or labelled independently.
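
To make the two-TrackEntry idea concrete, here is a rough sketch in Python. It is not a Matroska parser or muxer: the `TimecodeTrack` class, the track numbers, and the selection rule in `pick_timecode_track` are all hypothetical, and only the Name, FlagEnabled and FlagDefault concepts are borrowed from TrackEntry.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TimecodeTrack:
    """Hypothetical stand-in for a TrackEntry that carries timecode."""
    track_number: int
    name: str           # mirrors the TrackEntry Name element
    flag_enabled: bool  # mirrors FlagEnabled
    flag_default: bool  # mirrors FlagDefault


# Two independently labelled timecode tracks in the same file (made-up values).
tracks = [
    TimecodeTrack(3, "Source timecode (non-continuous)",
                  flag_enabled=False, flag_default=False),
    TimecodeTrack(4, "Production timecode (continuous)",
                  flag_enabled=True, flag_default=True),
]


def pick_timecode_track(tracks: List[TimecodeTrack]) -> Optional[TimecodeTrack]:
    """Assumed reader policy: prefer an enabled default track,
    else the first enabled track, else nothing."""
    enabled = [t for t in tracks if t.flag_enabled]
    for t in enabled:
        if t.flag_default:
            return t
    return enabled[0] if enabled else None


selected = pick_timecode_track(tracks)
print(selected.name if selected else "no timecode track enabled")
```

Nothing comparable is available to distinguish two BlockMore payloads attached to the same Block, which is the limitation described above.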

> Having a separate track means the timestamps may not match exactly the
> video or audio tracks. So in case of seeking/discontinuity it's like
> you said, there's a state where you need to wait for the next Timecode
> block to know where you are.

This is the case if there's one timecode value stored per video frame, but it need not be. We also discussed using QuickTime's strategy of storing an integer as a timecode while the timecode track's private data notes how to convert the integer into a timecode (flags like max24, timebase, etc.). In this case a timecode value would only need to be stored at points where the timecode is non-continuous.
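
As a rough illustration of that strategy, the Python sketch below converts an integer frame counter to a non-drop-frame HH:MM:SS:FF string and resolves the timecode of any frame from a sparse list of stored values. The function names, the 25 fps rate, and the `stored` list are assumptions made for the example. The list is an in-memory stand-in, not a description of how Cues or Blocks would actually be laid out, and drop-frame counting is not handled.

```python
from bisect import bisect_right


def frames_to_timecode(count: int, fps: int, max_24h: bool = True) -> str:
    """Convert a non-drop-frame frame counter into HH:MM:SS:FF.

    max_24h is an assumed equivalent of a "24 hour max" flag: when set,
    the counter wraps around after 24 hours.
    """
    if max_24h:
        count %= 24 * 3600 * fps
    ff = count % fps
    total_seconds = count // fps
    ss = total_seconds % 60
    mm = (total_seconds // 60) % 60
    hh = total_seconds // 3600
    return f"{hh:02d}:{mm:02d}:{ss:02d}:{ff:02d}"


# Sparse storage: (video frame index, timecode counter), sorted by frame index.
# A value is stored at frame 0 and again at the single discontinuity, frame 1000.
stored = [(0, 90000), (1000, 180000)]


def timecode_at(frame: int, stored, fps: int) -> str:
    """Find the last stored value at or before `frame` and count forward."""
    positions = [pos for pos, _ in stored]
    i = bisect_right(positions, frame) - 1
    if i < 0:
        raise ValueError("no timecode stored at or before this frame")
    pos, counter = stored[i]
    return frames_to_timecode(counter + (frame - pos), fps)


print(timecode_at(999, stored, 25))   # last frame before the discontinuity
print(timecode_at(1000, stored, 25))  # first frame after it
```

Under these assumptions a seek to frame 999 only needs the value stored at frame 0 plus arithmetic. The reader still has to know that no other timecode value lies between frame 0 and frame 999, which is the role the Cues would have to play.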

> But in general a Timecode block
> should/must be placed before the audio/video it corresponds to. It's
> the same requirement there is in WebM that audio blocks are muxed
> before video blocks. Timecode would have even higher priority.

Does Matroska have the same ordering requirement? I don't see one. Is it valid to store Clusters in reverse timestamp order?

Dave Rice