Re: [Cellar] Matroska Elements to support frame side data

Dave Rice <dave@dericed.com> Sun, 11 November 2018 23:15 UTC

From: Dave Rice <dave@dericed.com>
In-Reply-To: <CAOXsMFJ-dEfn3k6xBwT1XQcKX_hvL1Om+0UVJGpscVgFDrB3xw@mail.gmail.com>
Date: Sun, 11 Nov 2018 18:15:30 -0500
Cc: Moritz Bunkus <moritz=40bunkus.org@dmarc.ietf.org>, Codec Encoding for LossLess Archiving and Realtime transmission <cellar@ietf.org>, Tobias Rapp <t.rapp@noa-archive.com>
Content-Transfer-Encoding: quoted-printable
Message-Id: <69462008-FABB-476E-8032-EEE21A832A9B@dericed.com>
References: <24ED459F-2375-4934-9156-E64BB1A8AC05@dericed.com> <62624df3-77e8-3234-8496-30354570abee@noa-archive.com> <1E9938AC-B503-42A4-B1A9-43345F81B30F@dericed.com> <87r2fz7u01.fsf@bunkus.org> <CAOXsMFJ-dEfn3k6xBwT1XQcKX_hvL1Om+0UVJGpscVgFDrB3xw@mail.gmail.com>
To: Steve Lhomme <slhomme@matroska.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/cellar/mqJjNbbHboWu4fANlFMSHMDpMsA>
Subject: Re: [Cellar] Matroska Elements to support frame side data

Hi Steve,

Initially I looked at the BlockAdditions Element as a possibility to store side data, but the definition didn’t seem to allow it. Since the specification defers to the encoding ("Interpreted by the codec as it wishes"), there doesn’t seem to be any way for the Matroska specification to define a BlockAddID value (except for 0, which is reserved as a reference to the main block).

However, this approach works for me if others accept the break in backward compatibility.

Several other definitions would require updates. Currently we have:
BlockAdditions: "Contain additional blocks to complete the main one. An EBML parser that has no knowledge of the Block structure could still see and use/skip these data."
 BlockAddID: "An ID to identify the BlockAdditional level."
 BlockAdditional: "Interpreted by the codec as it wishes (using the BlockAddID)."
AlphaMode: "Alpha Video Mode. Presence of this Element indicates that the BlockAdditional Element could contain Alpha data."

Notes about what definition updates would be needed:
BlockAdditions’s definition would have to be rewritten, as the data wouldn’t necessarily "complete" the main block but might instead supplement or describe it.
BlockAddID should include an enumerated list of integers and reference a registry of what each value means.
BlockAdditional’s definition would have to be updated, since only in some cases would the BlockAdditional content be interpreted by the codec.
AlphaMode’s definition seems to imply that one of the BlockAdditional Elements contains alpha but gives no way to identify which one, so this is the largest backward compatibility issue. Are there Matroska demuxers which would try to use timecode or rawcooked data as if it were alpha data?

So IIUC BlockAddID=1 is reserved for the context of the associated codec mapping (as done in A_WAVPACK4). Then we reserve BlockAddID=2 for alpha, and 0 is reserved since BlockAdditionID uses 0 to reference the main block. Any other type of data stored in BlockAdditional would then use the next available unsigned integer, starting from 3, and would require a BlockAdditionMapping. I suggest that BlockAddID = 0, 1, and 2 would not require a BlockAdditionMapping, since the specification would reserve them for particular purposes.
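
To make that allocation rule concrete, here is a minimal Python sketch. The names RESERVED_BLOCK_ADD_IDS, needs_mapping, and next_free_block_add_id are hypothetical and only illustrate the proposal under discussion; none of this is defined by the Matroska specification.

```python
# Hypothetical illustration of the proposed BlockAddID allocation; not spec text.
RESERVED_BLOCK_ADD_IDS = {
    0: "main block (as referenced by BlockAdditionID)",
    1: "codec-defined data (e.g. A_WAVPACK4)",
    2: "alpha",
}


def needs_mapping(block_add_id):
    """Return True if this BlockAddID would require a BlockAdditionMapping."""
    if block_add_id < 0:
        raise ValueError("BlockAddID is an unsigned integer")
    return block_add_id not in RESERVED_BLOCK_ADD_IDS


def next_free_block_add_id(used_ids):
    """Return the lowest BlockAddID at or above 3 that is not in used_ids."""
    candidate = max(RESERVED_BLOCK_ADD_IDS) + 1
    while candidate in used_ids:
        candidate += 1
    return candidate
```

Under these assumptions a muxer adding two new kinds of side data to a track would be handed 3 and then 4, and would write one BlockAdditionMapping for each.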

But if we already define BlockAddID 0, 1, and 2 in the specification, then why not continue and reserve 3 for rawcooked, 4 for timecode, and so on? That would eliminate the need for new BlockAdditionMapping elements and for storing that data in them, and demuxers could then tell whether they can use the data by simply checking the BlockAddID rather than looking up the corresponding value in BlockAddIDName.
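
As a rough comparison of the two lookups from a demuxer's point of view, consider the Python sketch below. FIXED_REGISTRY, kind_from_registry, and kind_from_mapping are hypothetical names, and the registry values 3 and 4 are only the ones floated in this message, not assigned values.

```python
# Hypothetical sketch: two ways a demuxer could classify a BlockAdditional.
# The values 3 and 4 follow the suggestion in this message only.
FIXED_REGISTRY = {1: "codec", 2: "alpha", 3: "rawcooked", 4: "timecode"}


def kind_from_registry(block_add_id):
    """Option A: the BlockAddID alone identifies the data type."""
    return FIXED_REGISTRY.get(block_add_id)  # None if the ID is unknown


def kind_from_mapping(block_add_id, track_mappings):
    """Option B: consult the per-track BlockAdditionMapping entries.

    track_mappings is a list of dicts with the keys 'BlockAddIDValue'
    and 'BlockAddIDName', mirroring the elements in the quoted proposal.
    """
    for mapping in track_mappings:
        if mapping["BlockAddIDValue"] == block_add_id:
            return mapping["BlockAddIDName"]
    return None  # no mapping declared for this ID on the track
```

Option A needs no track header data at all, while Option B lets each file assign IDs freely at the cost of one extra lookup and the mapping elements themselves.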

Also, what is the need for MaxBlockAdditionID? And what is the reason to keep BlockAdditionID low? Is it simply to keep it to a single byte?
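
On the single-byte point, the arithmetic is simple: an EBML unsigned integer is stored as big-endian octets, so any value up to 255 fits in one octet of element data. A small sketch follows; the helper name uint_data_octets is hypothetical, and it counts only the element data, not the Element ID or size fields.

```python
def uint_data_octets(value):
    """Octets needed to store an unsigned integer as big-endian element data.

    This sketch always uses at least one octet, even for the value 0.
    """
    if value < 0:
        raise ValueError("value must be an unsigned integer")
    return max(1, (value.bit_length() + 7) // 8)
```

So the saving from keeping the IDs below 256 would be at most a few octets per BlockMore.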

Dave

> On Nov 11, 2018, at 10:40 AM, Steve Lhomme <slhomme@matroska.org> wrote:
> 
> I would go with the Track way as well. Primarily because storing a
> string (which pretty much never changes) in each Block is a huge
> waste.
> 
> Adding extra data per Block is already supported using BlockAdditions.
> There's already BlockAddID, which corresponds to Moritz' BlockMetadataID,
> and BlockAdditional, which corresponds to BlockMetadataString /
> BlockMetadataBinary / BlockMetadataUInteger / BlockMetadataSInteger /
> BlockMetadataFloat. It's like a codec where the CodecID defines how
> the data in the binary blob should be interpreted.
> 
> The current system states that the additions are left to
> interpretation by the codec. It was originally designed to hold the
> lossless complement to lossy versions of Musepack. So in that case
> it's really meant to be passed to the codec. I think we can expand
> this system while keeping this behaviour as the default (albeit not
> used anywhere) and have different ones on demand.
> There's also an AlphaMode that also uses BlockAdditions to store the
> alpha track. With pretty much no info on how to do it....
> 
> As noted, timecode may be a separate track (as originally intended) if
> it is not related to the video frames (i.e. the timestamps don't
> match).
> 
> This would look like this:
> - Musepack lossless complement:
> Segment\Tracks\TrackEntry\MaxBlockAdditionID: 1
> Segment\Tracks\TrackEntry\BlockAdditionMapping\BlockAddIDValue: 1
> (same as BlockAddID) (default)
> Segment\Tracks\TrackEntry\BlockAdditionMapping\BlockAddIDName:
> "complement" (default)
> Segment\Tracks\TrackEntry\BlockAdditionMapping\BlockAddIDType: 0
> (Codec Complement data) (default)
> Segment\Cluster\BlockGroup\BlockAdditions\BlockMore\BlockAddID: 1
> Segment\Cluster\BlockGroup\BlockAdditions\BlockMore\BlockAdditional:
> lossless part interpreted by the codec
> 
> - Alpha layer:
> Segment\Tracks\TrackEntry\MaxBlockAdditionID: 2
> Segment\Tracks\TrackEntry\BlockAdditionMapping\BlockAddIDValue: 2
> (same as BlockAddID)
> Segment\Tracks\TrackEntry\BlockAdditionMapping\BlockAddIDName: "alpha"
> Segment\Tracks\TrackEntry\BlockAdditionMapping\BlockAddIDType: 1
> (Alpha layer data)
> Segment\Cluster\BlockGroup\BlockAdditions\BlockMore\BlockAddID: 2
> Segment\Cluster\BlockGroup\BlockAdditions\BlockMore\BlockAdditional:
> alpha mask to apply on the video track
> 
> - RawCooked DPX data
> Segment\Tracks\TrackEntry\MaxBlockAdditionID: 3
> Segment\Tracks\TrackEntry\BlockAdditionMapping\BlockAddIDValue: 3
> (same as BlockAddID)
> Segment\Tracks\TrackEntry\BlockAdditionMapping\BlockAddIDName: "rawcooked"
> Segment\Tracks\TrackEntry\BlockAdditionMapping\BlockAddIDType:
> 0x1234567 (rawcooked identifier)
> Segment\Cluster\BlockGroup\BlockAdditions\BlockMore\BlockAddID: 3
> Segment\Cluster\BlockGroup\BlockAdditions\BlockMore\BlockAdditional:
> DPX data defined by RawCooked
> 
> - Timecode storing
> Segment\Tracks\TrackEntry\MaxBlockAdditionID: 3
> Segment\Tracks\TrackEntry\BlockAdditionMapping\BlockAddIDValue: 3
> (same as BlockAddID)
> Segment\Tracks\TrackEntry\BlockAdditionMapping\BlockAddIDName: "timecode"
> Segment\Tracks\TrackEntry\BlockAdditionMapping\BlockAddIDType:
> 0x890ABCD (SMPTE TC identifier, can be another ID for different kind
> of timecode)
> Segment\Cluster\BlockGroup\BlockAdditions\BlockMore\BlockAddID: 3
> Segment\Cluster\BlockGroup\BlockAdditions\BlockMore\BlockAdditional:
> Timecode storage
> 
> That means the alpha mode would not be backward compatible with
> existing files, because it requires non-default values. But I don't
> think anyone ever used this improperly defined feature.
> 
> The value of MaxBlockAdditionID is kept low on purpose. BlockAddID 1
> was always for codec complement and thus 2 for the AlphaMode. But we
> don't need to go much higher than that now that we have a mapping. If
> there are Timecode AND Rawcooked it would be like this:
> Segment\Tracks\TrackEntry\MaxBlockAdditionID: 4
> Segment\Tracks\TrackEntry\BlockAdditionMapping\BlockAddIDValue: 3
> Segment\Tracks\TrackEntry\BlockAdditionMapping\BlockAddIDName: "rawcooked"
> Segment\Tracks\TrackEntry\BlockAdditionMapping\BlockAddIDType:
> 0x1234567 (rawcooked identifier)
> Segment\Tracks\TrackEntry\BlockAdditionMapping\BlockAddIDValue: 4
> Segment\Tracks\TrackEntry\BlockAdditionMapping\BlockAddIDName: "timecode"
> Segment\Tracks\TrackEntry\BlockAdditionMapping\BlockAddIDType:
> 0x890ABCD (SMPTE TC identifier, can be another ID for different kind
> of timecode)
> Segment\Cluster\BlockGroup\BlockAdditions\BlockMore\BlockAddID: 3
> Segment\Cluster\BlockGroup\BlockAdditions\BlockMore\BlockAdditional:
> DPX data defined by RawCooked
> Segment\Cluster\BlockGroup\BlockAdditions\BlockMore\BlockAddID: 4
> Segment\Cluster\BlockGroup\BlockAdditions\BlockMore\BlockAdditional:
> Timecode storage
> 
> On Mon, Nov 5, 2018 at 10:29, Moritz Bunkus
> <moritz=40bunkus.org@dmarc.ietf.org> wrote:
>> 
>> Hey,
>> 
>>> In other thoughts on this suggestion, I think it could make it difficult
>>> to easily understand if a file has a particular type of side data. For
>>> instance if only a few Clusters somewhere in the Segment contain a
>>> certain type of side data, it would require parsing every Cluster to know
>>> what types of side data are available. This uncertainty wouldn’t be the
>>> same issue if the side data was itself a Track.
>> 
>> It's not entirely necessary to use a full track for side data. We can simply
>> signal the presence of side data in the track headers and refer to it from
>> the side data in the block groups. This would also mean we only have to
>> store the string identifying the side data type once (in the track headers)
>> instead of in each block.
>> 
>> (I'll use "BlockMetadata" as the basis for all element names
>> here. Initially I proposed "FrameMetadata", but "BlockMetadata" is fine
>> with me, too.)
>> 
>> For example:
>> 
>> Tracks
>> +- TrackEntry
>>    +- TrackBlockMetadata (Master)
>>       +- TrackBlockMetadataType (String, required)
>>       +- TrackBlockMetadataID (Unsigned Integer, required)
>> 
>> …
>> 
>> Cluster
>> +- BlockGroup
>>    +- BlockMetadata (Master)
>>       +- BlockMetadataID (Unsigned Integer, required, refers to existing
>>          TrackBlockMetadataID in track headers)
>>       +- BlockMetadataString (Unicode String, optional)
>>       +- BlockMetadataBinary (Binary, optional)
>>       +- BlockMetadataUInteger (Unsigned Integer, optional)
>>       +- BlockMetadataSInteger (Signed Integer, optional)
>>       +- BlockMetadataFloat (Float, optional)
>> 
>> with the restriction that exactly one of (BlockMetadataString,
>> BlockMetadataBinary, BlockMetadataUInteger, BlockMetadataSInteger,
>> BlockMetadataFloat) must exist.
>> 
>> Advantages as I see them:
>> 
>> • Less overhead (the type string is stored only once, in the track headers)
>> • Quicker parsing (no repeated string parsing required)
>> • Presence of meta data is known upfront
>> • Not using a full-blown track for meta data would alleviate the need to
>>  specify how all those track and block features (e.g. BlockDuration,
>>  TrackDefaultDuration…) apply to a "meta data track".
>> 
>> Kind regards,
>> mosu
>> 
> 
> 
> 
> -- 
> Steve Lhomme
> Matroska association Chairman