Re: [Cellar] Ancillary data in Matroska

Dave Rice <dave@dericed.com> Sat, 07 January 2017 16:57 UTC

From: Dave Rice <dave@dericed.com>
In-Reply-To: <2651a6f3-9c9a-6f76-2ee7-4d5c23b1ce57@mediaarea.net>
Date: Sat, 07 Jan 2017 11:57:16 -0500
Message-Id: <BFB215F1-1254-409C-817A-7AB1A772437A@dericed.com>
References: <2651a6f3-9c9a-6f76-2ee7-4d5c23b1ce57@mediaarea.net>
To: Jerome Martinez <jerome@mediaarea.net>
Archived-At: <https://mailarchive.ietf.org/arch/msg/cellar/KkkHuCmCo-zq7Id_4LMB2kF5qzU>
Cc: cellar@ietf.org
Subject: Re: [Cellar] Ancillary data in Matroska

Hi all,

> On Dec 27, 2016, at 3:22 PM, Jerome Martinez <jerome@mediaarea.net> wrote:
> 
> We need additional codec to be supported in Matroska in order to be able to transwrap in a lossless manner content from older containers to Matroska, especially time codes (all the codecs defined in the patch can contain 1 or more time codes)
> The global idea is to be able to transwrap e.g. MOV (from Apple), MXF (from SMPTE), GXF (from Grass Valley, also known as SMPTE ST 360 and SMPTE RDD 14, found in some broadcaster archives), LXF (from Leitch/Harris, found in some broadcaster archives) without losing metadata in some sidecar non visible streams.

I think this is an interesting approach, and it’s probably better than defining yet another timecode containment system.

> After agreeing on the format, we will have to update tools e.g. FFmpeg in order to support such lossless transwrap.
> 
> 
> I added an "Ancillary" section with several mappings in the attached patch, please comment.
> 
> 
> Comments:
> 
> The frequent remark is "Why do you need so many time code formats, choose one and stop using the others": in the real world, people consider that one format is the best one, but not the same format as others, and they use only this time code for their work and don't plan to switch. You can have 10+ time codes in a file and each one is important for someone, as well as the sidecar information transported at the same time (never same between each format). There is no consensus about that for years, and I don't think we could achieve a consensus from the different actors (lot of them are not interested by our work at IETF) so the idea here is to be able to move from one container to another one without losing any metadata, absolutely not deciding about a reference time code (and other metadata) format, as Matroska does for e.g. video (Matroska does not say that VP9 should be used instead of AVC, it supports both).

Agreed.

> "N_" prefix arbitrary used (2nd letter of "Ancillary" because first letter is already used)

Please also add a patch for the TrackType Element in ebml_matroska.xml as we’ll need an unsigned integer to represent the ‘ancillary’ type of track.

> N_QUICKTIME: is a pure copy of V_QUICKTIME and A_QUICKTIME already supported in Matroska, N_ would be used which would be used for e.g. tmcd codec (QuickTime time code). Exactly same principles as video and audio. Note that classic usage is to store only the first frame content and other content is computed from it, so a player would have to read the first frame even if there is a direct seek request to another place; this could be avoided with some "hack" in the track header, could be the next step after this one is accepted.

This would need a clear caveat in the definition that this implementation only supports storage of the first timecode value and computation of each subsequent value. A transwrap from a mov file with a timecode track that uses an edit list to handle non-continuous timecode could not be lossless in this case, since the edit list would be lost and the new timecode track in the Matroska file would appear continuous.
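
To make the "store the first value, compute the rest" behaviour concrete, here is a minimal Python sketch. It is only an illustration, not part of the patch: the function names are hypothetical, it assumes the stored value is an absolute frame count, it handles drop frame only for the 30 fps nominal (29.97) case, and it assumes a wrap at 24 hours. Because every value is derived from the start value and a frame index, a discontinuity described by an edit list has nowhere to go.

```python
def frames_to_timecode(frame: int, fps: int = 30, drop_frame: bool = False) -> str:
    """Format an absolute frame count as HH:MM:SS:FF (';' before FF for drop frame)."""
    if drop_frame:
        if fps != 30:
            raise ValueError("this sketch only handles 30 fps nominal drop frame")
        drop = 2                          # frame numbers skipped per minute
        per_min = fps * 60 - drop         # real frames in a "dropping" minute
        per_10min = fps * 600 - drop * 9  # real frames in ten minutes
        tens, rem = divmod(frame, per_10min)
        # Re-insert the skipped frame numbers so plain division works below.
        frame += drop * 9 * tens
        if rem > drop:
            frame += drop * ((rem - drop) // per_min)
    frame %= fps * 3600 * 24              # assumption: wrap at 24 hours
    ff = frame % fps
    ss = frame // fps % 60
    mm = frame // (fps * 60) % 60
    hh = frame // (fps * 3600)
    sep = ";" if drop_frame else ":"
    return f"{hh:02d}:{mm:02d}:{ss:02d}{sep}{ff:02d}"


def timecode_at(start_frame: int, index: int, fps: int = 30,
                drop_frame: bool = False) -> str:
    """Timecode of the frame `index` frames after the stored first value."""
    return frames_to_timecode(start_frame + index, fps, drop_frame)
```

A player seeking to frame `index` would call `timecode_at(start_frame, index, ...)` after reading the first stored value; the result is always continuous by construction.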

Also, the reference to the tmcd atom is not clear enough. Is it tref/tmcd, gmhd/tmcd, or stsd/tmcd?

If this is stsd/tmcd (I presume), then what time scale should be used? The time scale stored in tmcd or the time scale of the Matroska track?

I like that copying stsd/tmcd copies in the timecode label as well.

> VBI and ANC: used to transport time codes (LTC, VITC, ATC...), Recording Information, bar and pan/scan data, captions (North American CEA-608, CEA-708, European and Australian WSS/Teletext, Japanese ARIB B37...), camera acquisition dynamic metadata, Audio Metadata, Film Transfer and Video Production Information (in theory, I never saw it)...
> The "perfect" solution would be to decode VBI and ANC, and put each format in its own specification, but it would be an huge task and we still would have to store opaque content (content is present in the VBI/ANC but we ignore all from it right now), so it is better to start by the beginning and we consider all opaque without trying to decode the data, as do other containers.

I agree with either storing it and treating it as opaque, or leaving it within the video and using the PixelCrop elements to note that the lines carrying the VBI data aren’t intended for presentation.

> +**Codec ID:** N_VBI  
> +**Codec Name:** Vertical Blanking Interval  
> +**Description:** used in SDTV, see https://en.wikipedia.org/wiki/Vertical_blanking_interval for more information. Each Matroska block contains a SMPTE ST 436 VBI Frame Element.  

OK, in the patch you reference storage of VBI as SMPTE 436M frame elements (example attached to https://trac.ffmpeg.org/ticket/726). Is the N_ prefix still relevant, given that the encoding would contain captioning data as well? Perhaps “S_”?

> N_STRIPPEDTIMECODE: the Cluster part will contain no data, but despite of that it must be a complete standalone track, with Track Id and so on, as it is a standalone track in the source container.

Do you mean that there would be an N_STRIPPEDTIMECODE track but it would reference no Clusters, or that the N_STRIPPEDTIMECODE track would reference Clusters that contain no SimpleBlock and no BlockGroup? If the latter, where would the value of the timecode be stored?

> +**Codec ID:** N_STRIPPEDTIMECODE  
> +**Codec Name:** Stripped time code  
> +**Description:** track containing no data in the Cluster elements, as all is in the track header: the CodecPrivate element contains, in big endian: 4 bytes Rate numerator, 4 bytes rate denominator, 4 bytes count of consecutive time codes / frames (the "duration", 0 if unknown), 1 byte time code drop frame flag (0 is no, 1 is yes, others are reserved), 4 byte the start time code (in frames). Note: this permits to map a MXF time code track or a GXF stripped time code track to Matroska.  

“no data in the Cluster elements” is not clear enough. The structure of CodecPrivate would not carry flags such as whether the timecode is limited to 24 hours or not. If that’s similar to the MXF and GXF limitations then I guess it’s fine. Is there any reference for the CodecPrivate structure? From the writing it’s not clear if this is a new structure or copied from some other timecode storage form.
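
For discussion, the CodecPrivate layout described in the patch text could be read and written roughly as below. This is a sketch under assumptions: the patch does not say whether the 4-byte fields are signed, so I treat them as unsigned, and the class and field names are mine, not from the patch. The fields add up to 17 bytes.

```python
import struct
from dataclasses import dataclass

# Big endian, no padding: rate numerator, rate denominator, count,
# drop frame flag, start time code in frames (4 + 4 + 4 + 1 + 4 bytes).
# Assumption: all integer fields are unsigned.
_LAYOUT = struct.Struct(">IIIBI")


@dataclass
class StrippedTimecode:
    rate_num: int
    rate_den: int
    count: int          # number of consecutive time codes, 0 if unknown
    drop_frame: bool
    start_frame: int    # start time code, in frames

    def pack(self) -> bytes:
        """Serialize to the 17-byte CodecPrivate payload."""
        return _LAYOUT.pack(self.rate_num, self.rate_den, self.count,
                            int(self.drop_frame), self.start_frame)

    @classmethod
    def unpack(cls, data: bytes) -> "StrippedTimecode":
        """Parse a CodecPrivate payload, rejecting reserved flag values."""
        if len(data) != _LAYOUT.size:
            raise ValueError(f"expected {_LAYOUT.size} bytes, got {len(data)}")
        num, den, count, flag, start = _LAYOUT.unpack(data)
        if flag not in (0, 1):
            raise ValueError("drop frame flag values other than 0 and 1 are reserved")
        return cls(num, den, count, bool(flag), start)
```

Writing it out this way also shows what is absent: there is no room for a 24-hour wrap flag or any other flag beyond drop frame.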

I suggest we be consistent throughout the patch on capitalization (Frame vs frame) and spacing (timecode vs time code).

Also, I’m worried about the confusion this could create between the Matroska Timecode* Elements and the concept of timecode (copies to external reference systems from source objects). An earlier discussion suggested calling the focus of this patch Reference Timecode to distinguish it from Matroska’s own Timecode. I see that SMPTE is starting to refer to timecode as “Time Label”, which may also be appropriate here.

> N_SDTICP: nearly 1:1 copy of SMPTE ST 385 elements. In order to have all data and being able to have only 1 Matroska block for all SDTI-CP elements, I kept also the 2 last bytes of the KLV "Key" and the KLV "Length" in addition of the KLV "Value". Note that something weird, this is a standalone track in source container but the difference with other tracks (including ANC or VBI) is that this stream has no track header so no track number, and Matroska specs makes TrackNumber mandatory, I wonder what should be used during transwrap (fake number?)

I’m not familiar with this type of timecode. Would using the highest existing TrackNumber + 1 suffice?
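
In other words, something as simple as the following during transwrap (a hypothetical helper, assuming the muxer already knows the TrackNumbers assigned to the other tracks; it starts at 1 because TrackNumber cannot be 0):

```python
def next_free_track_number(existing: list[int]) -> int:
    """Return the highest TrackNumber already in use plus one (1 if none)."""
    return max(existing, default=0) + 1
```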

Dave Rice