Re: [Cellar] AV1 mapping Matroska

Andreas Rheinhardt <> Sat, 30 June 2018 21:03 UTC


Steve Lhomme:
>> 3. Given that AV1 seems to only use one Sequence Header OBU at a time
>> couldn't one use the CodecState and CueCodecState elements to designate
>> where the currently active Sequence Header OBU can be found so that one
>> doesn't need to repeat the Sequence Header with every keyframe?
> Yes, that would be a good use for when seeking is used. But these
> elements were ruled out as deprecated because they were of no use
> until now. I know VLC doesn't read them for example.
Are you sure that it is deprecated? I couldn't find anything that says
this and the current version of the Codec Specs contains this: "When the
Initialisation is updated within a track then that updated
Initialisation data MUST be written into the CodecState Element of the
first Cluster to require it." (Btw: This requirement is not fulfilled
for the way H.264 is commonly muxed into Matroska.) Your draft also
includes nothing that indicates that CodecState is deprecated.

Anyway, the fact that this element is not part of WebM means that it
can't be used in Matroska either if one (sensibly) wants only one codec
mapping for both.
>> (Btw: There are currently conflicting recommendations regarding the
>> existence of Sequence Header OBU if there is actually only one Sequence
>> Header OBU. On the one hand, they should be omitted, on the other hand
>> every block marked as keyframe should start with a Sequence Header OBU.)
> You mean in my document or the AV1 specs ?
In your document. I have already made a PR for this. Timothy B.
Terriberry has misunderstood what I had in mind: At one place the draft
said that every keyframe should have a sequence header OBU and at
another place it said that there shouldn't be in-band sequence header
OBUs if all of them coincide. This is contradictory in case all of the
sequence header OBUs coincide.

2. How does one signal the difference between a real KEY_FRAME and an
INTRA_ONLY_FRAME (that isn't a valid random access point) when using
blocks (that don't have a keyframe flag)? According to your draft, the
relevant `BlockGroup` won't have any `ReferenceBlocks` so that both
frame types are indistinguishable at the container level when using
`BlockGroup`s (if the muxer knows how to handle AV1 it should of course
not reference an INTRA_ONLY_FRAME in the cues, but it would really be
advantageous to be able to infer this without resorting to cues which
are optional anyway and unavailable for e.g. live-streaming).
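
To make the ambiguity concrete, here is a minimal Python sketch. The
frame_type values are those of the AV1 bitstream specification; the
function names and the "one reference for everything else" rule are
hypothetical placeholders standing in for the mapping described above,
not code from the draft or from any muxer.

```python
from enum import IntEnum


class FrameType(IntEnum):
    # frame_type values as defined in the AV1 bitstream specification
    KEY_FRAME = 0
    INTER_FRAME = 1
    INTRA_ONLY_FRAME = 2
    SWITCH_FRAME = 3


def reference_block_count(frame_type: FrameType) -> int:
    """Hypothetical: how many ReferenceBlock elements a muxer writes.

    Assumption: intra-coded frames get none; every other frame type is
    modelled as having exactly one, which is a simplification.
    """
    if frame_type in (FrameType.KEY_FRAME, FrameType.INTRA_ONLY_FRAME):
        return 0
    return 1


def looks_like_random_access_point(reference_blocks: int) -> bool:
    # This is all a demuxer can see at the container level when
    # BlockGroups are used: there is no keyframe flag to consult.
    return reference_blocks == 0


for ft in (FrameType.KEY_FRAME, FrameType.INTRA_ONLY_FRAME):
    # Both frame types yield the same answer, although only KEY_FRAME
    # is a valid random access point.
    print(ft.name, looks_like_random_access_point(reference_block_count(ft)))
```

Under these assumptions both frame types print True, so a demuxer
would need some additional signal to tell them apart.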

3. There is something wrong with the invisible flag:
a) According to 7.5. every temporal unit contains at least one frame for
which [show_frame] || [show_existing_frame] equals 1, i.e. for each
Matroska block there is an output frame (even if said output frame is
only from a frame buffer and not from data directly contained in the
Matroska block). This implies that actually no block should have the
invisible flag bit set. And if it is set, it actually means that the
frame that should normally be output due to said block should not be
displayed (regardless of whether the frame that would normally be output
is one of the already existing ([show_existing_frame] == 1) frames or
not). Otherwise we'd be breaking the semantics of the invisible flag
(and therefore make it impossible to hide individual frames of an
already encoded AV1 track).
b) If one nevertheless wants to map the invisible flag to properties of
the track, it IMO shouldn't be done the way it is currently proposed:
i) It is possible that [show_frame] equals 1 (meaning the frame should
be output immediately) and [showable_frame] is 1 (namely if
[frame_type] is not KEY_FRAME). In your draft the invisible bit should
be set to 1 for such a `Block` meaning that the frame should not be
displayed although it should be immediately output. This is obviously wrong.
ii) Apart from that there is the problem that a temporal unit can
contain more than one frame.
iii) A frame might also be displayed later if [showable_frame] equals 1.
So the closest to an invisible frame seems to be a frame with
[show_frame] and [showable_frame] equal to zero. So a temporal unit that
contains only frames with [show_frame] and [showable_frame] equal to 0
seems to be a sensible choice to merit the invisible flag. (Such a
temporal unit must contain a frame with [show_existing_frame] equal to 1.)
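
The rule suggested in b) iii) could be sketched as follows in Python.
This is only an illustration: `FrameHeader` and `merits_invisible_flag`
are hypothetical names, the header is reduced to the three syntax
elements discussed here, and treating a temporal unit without any coded
frame as not invisible is my own assumption.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class FrameHeader:
    # Only the syntax elements relevant to this discussion.
    show_existing_frame: int
    show_frame: int = 0
    showable_frame: int = 0


def merits_invisible_flag(headers: Iterable[FrameHeader]) -> bool:
    """True if every coded frame in the temporal unit has
    show_frame == 0 and showable_frame == 0."""
    # Headers with show_existing_frame == 1 only re-output a frame that
    # was decoded earlier; they carry no new coded frame.
    coded = [h for h in headers if not h.show_existing_frame]
    # Assumption: a temporal unit without coded frames is not invisible.
    if not coded:
        return False
    return all(h.show_frame == 0 and h.showable_frame == 0 for h in coded)


# A hidden, never-showable frame plus a show_existing_frame header.
hidden_unit = [FrameHeader(0, 0, 0), FrameHeader(1)]
# The case from i): shown immediately, with showable_frame equal to 1.
shown_unit = [FrameHeader(0, 1, 1)]

print(merits_invisible_flag(hidden_unit))
print(merits_invisible_flag(shown_unit))
```

With this rule the case from i) is never flagged as invisible, and a
frame that may still be displayed later (iii) is not flagged either.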

4. There is currently no hard requirement that only real random access
points get flagged as keyframes in Matroska/WebM. The current wording is
only a "SHOULD". Is this really intended?

5. Given that changing extradata is not that uncommon in H.264 I don't
deem it a good idea to restrict Matroska to one coded video sequence.
This might end up excluding a lot of content from being muxed into
Matroska (and it makes it much harder to append different tracks).

Andreas Rheinhardt