Re: [Cellar] QuickTime timecode tracks in Matroska, was Ancillary data in Matroska

Steve Lhomme <slhomme@matroska.org> Tue, 16 May 2017 11:50 UTC

From: Steve Lhomme <slhomme@matroska.org>
Date: Tue, 16 May 2017 13:47:10 +0200
Message-ID: <CAOXsMFJN6+0v-Y=J_=Zw22ZYumXXGVfvXKMPMaWuNYmNeneypg@mail.gmail.com>
To: Dave Rice <dave@dericed.com>
Cc: Codec Encoding for LossLess Archiving and Realtime transmission <cellar@ietf.org>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Archived-At: <https://mailarchive.ietf.org/arch/msg/cellar/IoZueqKHJwpdrbqaL2uKO2xjWaU>
Subject: Re: [Cellar] QuickTime timecode tracks in Matroska, was Ancillary data in Matroska

2017-04-29 4:03 GMT+02:00 Dave Rice <dave@dericed.com>:
>
>> On Apr 26, 2017, at 3:51 AM, Steve Lhomme <slhomme@matroska.org> wrote:
>>
>> A bit late on the party but I agree with the move to a more specific
>> track type for timecodes rather than the generic ancillary data.
>>
>> 2017-03-20 22:07 GMT+01:00 Jerome Martinez <jerome@mediaarea.net>:
>>> Le 20/03/2017 à 21:37, Dave Rice a écrit :
>>>>
>>>> [...]
>>>>>>
>>>>>> I’ve only seen a single value stored but then sometimes an edit list
>>>>>> used to alter the ordering of the timecode (in case when its
>>>>>> non-sequential). For Matroska possibly it could simply store a new timecode
>>>>>> Block at any case when it is not sequential.
>>>>>
>>>>> What about seeking?
>>>>
>>>> Isn't the seeking similar to QuickTime.
>>>
>>>
>>> Difference is that in QuickTime, a wrong "seek table" makes the playback
>>> impossible, so developers are more careful about it. and in the file I
>>> remember (I must still find it), there was only one offset in the offset
>>> table of QuickTime and a single "chunk" of time code with byte size of 4
>>> bytes per frame (to be confirmed as you say you saw edit lists instead)
>>>
>>>> In QuickTime you'd need the offset table of the timecode track to
>>>> understand the number and location of all timecode samples. In Matroska you
>>>> would use the associated CuePoints of the timecode track for the same
>>>> reference. By parsing Cues the reader should be able to determine if the
>>>> timecode is stripped or not and if not then have the CueTimes for each
>>>> associated timecode Cluster.
>>>>
>>>>> If you have a timecode block at the first frame and also at the middle of
>>>>> the file, how a player can know that there is a timecode block breaking the
>>>>> sequential order without parsing the whole file when there is a seek request
>>>>> to the last minute of content? In that case, time code real value would be
>>>>> set to "waiting for new block" (and this block never comes). I don't think
>>>>> that forcing full parsing of the file with time code for seeking is a good
>>>>> solution.
>>>>
>>>> Same. But in the case of video if you seek to a P-frame then you have to
>>>> go backwards and decode from the I-frame. Similarly if seeking to a
>>>> timepoint without a timecode Cluster (analogous to a P-frame), then the
>>>> reader would have to use the Cues to decode the prior timecode Cluster
>>>> (analogous to I-frame) in order to determine the timecode value of the seek
>>>> point.
>>>
>>>
>>> Possible.
>>> But we need to be careful, what does it means?
>>> examples of issue:
>>> 1/ can a player consider that if there is only one CuePoint with CueTrack =
>>> the ID of the time code track, it means that this is a stripped content and
>>> time codes are in sequential order?
>>> 2/ can a player consider that if there are only 2 CuePoints with CueTrack =
>>> the ID of the time code track, it means that this is a stripped content and
>>> time codes are in sequential order except for one place and there is a
>>> single discontinuity?
>>> 3/ or must a player consider that if there only 2 CuePoints (let say frame 0
>>> and frame 1000) with CueTrack = the ID of the time code track, it means that
>>> it must seek to 0 if the requested frame is 999 (so a loooooong read of the
>>> file on disk before being able to play frame 999)
>>> If answers are 2/ yes 3/ no, I understand that the only method for storing a
>>> time code track with all values not sequential is to have a CuePoint for
>>> each time code frame (which makes the Cues huge, 15-20 bytes of CuePoint per
>>> time code frame).
>>>
>>> Please provide example about how you imagine CuePoints in the cases:
>>> 1/ sequential time codes during 2000 frames
>>> 2/ sequential time codes during 2000 frames except one discontinuity at
>>> frame 1000
>>> 3/ 2000 different times codes
>>> and where a player must seek if the request is to seek to frame 999.
>>
>> Are timecode frames supposed to be matching a specific video frame or
>> audio frame ? Or they are loose ? If the former case it may be best to
>> be attached as a BlockAdditional. Otherwise they should always be in
>> sequential/display order, just like audio is. Even if video is not.
>
> I'd propose to limit the use case to the former. The idea of storing timecode data within BlockAdditional is interesting, but would that limit the metadata possibilities? For instance, if timecode data is stored as its own TrackEntry, then potentially a file may contain two timecode TrackEntries, one depicting timecode values from the original source materials used (likely non-continuous), whereas another TrackEntry could represent a new continuous timecode track that served some purpose in a video production. With these timecodes as two TrackEntries with their own Clusters, they could have distinct names and could be independently described with flags, so one may be enabled and default while the other is disabled.

In general it's not a good idea to have two places to define the same
thing. I suppose a "final" timecode compared to a "source material"
timecode would make sense though. In that case it's better to use a
separate track.

> With timecode in BlockAddition, there could potentially be two sets of timecode (two BlockMore), but they would be indistinguishable and could not be controlled or labelled independently.

Correct, there's no way to tag a BlockAdditional or set one as default.

>> Having a separate track means the timestamps may not match exactly the
>> video or audio tracks. So in case of seeking/discontinuity it's like
>> you said, there's a state where you need to wait for the next Timecode
>> block to know where you are.
>
> This is the case if there's one timecode value stored per video frame, but this may not be the case. We also discussed using QuickTime's strategy of storing an integer as a timecode while the timecode track's private data notes what equation to use to convert the integer into a timecode (flags like max24, timebase, etc). In this case a timecode value would only need to be stored at a point in time when the timecode is non-continuous.

That could be nice. But if there's a discontinuity it's tricky to know a
particular timecode when seeking. That would require (as in mandatory)
that such a track have Cue entries for every discontinuity. There's no
such requirement for other tracks for now.
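
To make the seeking case concrete, here is a rough sketch in Python of how a
reader could resolve a timecode if every discontinuity had a Cue entry. It is
only an illustration under assumptions: the TimecodeCue class, the function
names and the non-drop-frame conversion are all hypothetical, nothing here is
defined by Matroska or QuickTime. It assumes the integer counter increments
by one per frame between stored timecode Blocks.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class TimecodeCue:
    """Hypothetical record: one stored timecode Block that has a Cue entry."""
    frame: int    # index of the frame where the timecode Block is stored
    counter: int  # integer timecode value stored in that Block


def format_timecode(counter: int, fps: int, max24: bool = True) -> str:
    """Convert an integer frame counter to HH:MM:SS:FF (non-drop-frame only)."""
    frames = counter % fps
    total_seconds = counter // fps
    hours = total_seconds // 3600
    if max24:
        hours %= 24  # assumed meaning of a max24-style flag: wrap at 24 hours
    minutes = total_seconds // 60 % 60
    seconds = total_seconds % 60
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}:{frames:02d}"


def timecode_at(cues: list[TimecodeCue], frame: int, fps: int) -> str:
    """Resolve the timecode of `frame` from the last cue at or before it.

    `cues` must be sorted by frame and contain one entry per discontinuity.
    """
    i = bisect_right([c.frame for c in cues], frame) - 1
    if i < 0:
        raise ValueError("no timecode Block at or before the requested frame")
    cue = cues[i]
    # Between two stored Blocks the timecode is assumed to be sequential.
    return format_timecode(cue.counter + (frame - cue.frame), fps)


# 2000 frames at 25 fps with a single discontinuity at frame 1000.
cues = [TimecodeCue(frame=0, counter=0), TimecodeCue(frame=1000, counter=90000)]
print(timecode_at(cues, 999, fps=25))
print(timecode_at(cues, 1000, fps=25))
```

With only two Cue entries the reader never has to parse the whole file: a
seek to frame 999 uses the entry at frame 0, and a seek to frame 1000 or later
uses the entry at frame 1000. If a discontinuity has no Cue entry, this lookup
silently returns the wrong value, which is why the Cue entries would have to
be mandatory.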

>> But in general a Timecode block
>> should/must be placed before the audio/video it corresponds to. It's
>> the same requirement there is in WebM that audio blocks are muxed
>> before video blocks. Timecode would have even higher priority.
>
> Does Matroska have the same ordering requirement?

No, it is only for WebM.

> I don't see them. Is it valid to store Clusters in reverse timestamp order?

Clusters must have strictly incrementing timestamps. I thought it was
written somewhere but it's not.
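
As a minimal sketch of what a validator could check for that rule (the
function name is hypothetical, and the input is assumed to be the Cluster
timestamps in the order the Clusters appear in the file):

```python
def clusters_strictly_increasing(timestamps: list[int]) -> bool:
    """Return True if each Cluster timestamp is greater than the previous one."""
    return all(a < b for a, b in zip(timestamps, timestamps[1:]))


print(clusters_strictly_increasing([0, 1000, 2000]))
print(clusters_strictly_increasing([0, 2000, 1000]))
```

Equal timestamps on two consecutive Clusters fail the check too, since the
rule is strict.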

> Dave Rice
>



-- 
Steve Lhomme
Matroska association Chairman