Re: [Cellar] AV1 mapping Matroska

Steve Lhomme <slhomme@matroska.org> Mon, 02 July 2018 08:40 UTC

From: Steve Lhomme <slhomme@matroska.org>
Date: Mon, 02 Jul 2018 10:40:49 +0200
Message-ID: <CAOXsMF+bfe68dBGkXON8G3BkUBO=47uFmVEaUhLZBFSbkctx2g@mail.gmail.com>
To: Andreas Rheinhardt <andreas.rheinhardt@googlemail.com>
Cc: Codec Encoding for LossLess Archiving and Realtime transmission <cellar@ietf.org>
Content-Type: text/plain; charset="UTF-8"
Archived-At: <https://mailarchive.ietf.org/arch/msg/cellar/zyfSLeIf1Mzdb0ZAegTqgg1tf9Y>
Subject: Re: [Cellar] AV1 mapping Matroska

2018-06-30 23:02 GMT+02:00 Andreas Rheinhardt
<andreas.rheinhardt@googlemail.com>:
> Hello,
>
> 1.
> Steve Lhomme:
>>> 3. Given that AV1 seems to only use one Sequence Header OBU at a time
>>> couldn't one use the CodecState and CueCodecState elements to designate
>>> where the currently active Sequence Header OBU can be found so that one
>>> doesn't need to repeat the Sequence Header with every keyframe?
>>
>> Yes, that would be a good use for when seeking is used. But these
>> elements were ruled out as deprecated because they were of no use
>> until now. I know VLC doesn't read them for example.
> Are you sure that it is deprecated? I couldn't find anything that says

No, you're right, it's not deprecated. Only CueRefCodecState is.

> this and the current version of the Codec Specs contains this: "When the
> Initialisation is updated within a track then that updated
> Initialisation data MUST be written into the CodecState Element of the
> first Cluster to require it." (Btw: This requirement is not fulfilled
> for the way H.264 is commonly muxed into Matroska.) Your draft also
> includes nothing that indicates that CodecState is deprecated.
>
> Anyway, the fact that this element is not part of WebM means that it
> can't be used in Matroska either if one (sensibly) wants only one codec
> mapping for both.

Correct.

>>> (Btw: There are currently conflicting recommendations regarding the
>>> existence of Sequence Header OBU if there is actually only one Sequence
>>> Header OBU. On the one hand, they should be omitted, on the other hand
>>> every block marked as keyframe should start with a Sequence Header OBU.)
>>
>> You mean in my document or the AV1 specs ?
> In your document. I have already made a PR for this. Timothy B.
> Terriberry has misunderstood what I had in mind: At one place the draft
> said that every keyframe should have a sequence header OBU and at
> another place it said that there shouldn't be in-band sequence header
> OBUs if all of them coincide. This is contradictory in case all of the
> sequence header OBUs coincide.

OK I will try to clarify this.

> 2. How does one signal the difference between a real KEY_FRAME and an
> INTRA_ONLY_FRAME (that isn't a valid random access point) when using
> blocks (that don't have a keyframe flag)? According to your draft, the
> relevant `BlockGroup` won't have any `ReferenceBlocks` so that both
> frame types are indistinguishable at the container level when using
> `BlockGroup`s (if the muxer knows how to handle AV1 it should of course
> not reference an INTRA_ONLY_FRAME in the cues, but it would really be
> advantageous to be able to infer this without resorting to cues which
> are optional anyway and unavailable for e.g. live-streaming).

Indeed, there's an issue here. I think we had a similar discussion in
the past and we concluded that a ReferenceBlock with a value of 0 can
be used, meaning the block references itself. I should add that to the
document.
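
A minimal sketch of how a reader could apply that convention. The
`BlockGroup` class and `is_random_access_point` function are
hypothetical names for illustration, not part of libmatroska or the
draft:

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BlockGroup:
    """Hypothetical stand-in for a parsed Matroska BlockGroup."""
    timestamp: int
    # Values of the ReferenceBlock children, if any (relative timestamps).
    reference_blocks: List[int] = field(default_factory=list)


def is_random_access_point(group: BlockGroup) -> bool:
    # No ReferenceBlock at all: the block is treated as a keyframe.
    # Any ReferenceBlock, including the self-reference 0 suggested for an
    # INTRA_ONLY_FRAME, marks the block as not being a keyframe.
    return len(group.reference_blocks) == 0


key_frame = BlockGroup(timestamp=0)
intra_only = BlockGroup(timestamp=40, reference_blocks=[0])
inter = BlockGroup(timestamp=80, reference_blocks=[-40])

for group in (key_frame, intra_only, inter):
    print(group.timestamp, is_random_access_point(group))
```

This way a demuxer would not need the Cues to tell the two frame types
apart, which matters for live streaming.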

> 3. There is something wrong with the invisible flag:
> a) According to 7.5. every temporal unit contains at least one frame for
> which [show_frame] || [show_existing_frame] equals 1, i.e. for each
> Matroska block there is an output frame (even if said output frame is
> only from a frame buffer and not from data directly contained in the
> Matroska block). This implies that actually no block should have the
> invisible flag bit set. And if it is set, it actually means that the
> frame that should normally be output due to said frame should not be
> displayed (regardless of whether the frame that would normally be output
> is one of the already existing ([show_existing_frame] == 1) frames or
> not). Otherwise we'd be breaking the semantics of the invisible flag
> (and therefore make it impossible to hide individual frames of an
> already encoded AV1 track).

Indeed, it's either "Each temporal unit must have exactly one shown
frame." or "Every layer that has a coded frame in a temporal unit must
have exactly one shown frame that is the last frame of that layer in
the temporal unit."

So each temporal unit has at least one visible frame, and since we map
1 Temporal Unit = 1 Block, we don't have to extract the invisible part.
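
As a rough sketch of that reasoning, assuming a hypothetical
`FrameHeader` record that only carries the two AV1 header flags
discussed here (the helper names are illustrative as well):

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FrameHeader:
    """Hypothetical subset of an AV1 frame header (flag names from the spec)."""
    show_frame: bool = False
    show_existing_frame: bool = False


def has_shown_frame(temporal_unit: List[FrameHeader]) -> bool:
    # True when at least one frame in the temporal unit is output.
    return any(f.show_frame or f.show_existing_frame for f in temporal_unit)


def invisible_flag(temporal_unit: List[FrameHeader]) -> int:
    # With 1 Temporal Unit = 1 Block, a conformant unit always yields an
    # output frame, so the invisible flag of the Block stays 0.
    if not has_shown_frame(temporal_unit):
        raise ValueError("temporal unit has no shown frame")
    return 0


# A hidden reference frame followed by a shown frame in the same unit.
unit = [FrameHeader(show_frame=False), FrameHeader(show_frame=True)]
print(invisible_flag(unit))
```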


> b) If one nevertheless wants to map the invisible flag to properties of
> the track, it IMO shouldn't be done the way it is currently proposed:
> i) It is possible that [show_frame] equals 1 (meaning the frame should
> immediately be output) and [showable_frame] is 1 (namely if
> [frame_type] is not KEY_FRAME). In your draft the invisible bit should
> be set to 1 for such a `Block` meaning that the frame should not be
> displayed although it should be immediately output. This is obviously wrong.
> ii) Apart from that there is the problem that a temporal unit can
> contain more than one frame.
> iii) A frame might also be displayed later if [showable_frame] equals 1.
> So the closest to an invisible frame seems to be a frame with
> [show_frame] and [showable_frame] equal to zero. So a temporal unit that
> contains only frames with [show_frame] and [showable_frame] equal to 0
> seems to be a sensible choice to merit the invisible flag. (Such a
> temporal unit must contain a frame with [show_existing_frame] equal to 1.)
>
> 4. There is currently no hard requirement that only real random access
> points get flagged as keyframes in Matroska/WebM. The current wording is
> only a "SHOULD". Is this really intended?

I'll double check on that.

> 5. Given that changing extradata are not that uncommon in H.264 I don't
> deem it a good idea to restrict Matroska to one coded video sequence.
> This might end up excluding a lot of content from being muxed into
> Matroska (and it makes it much harder to append different tracks).

I'm looking at how it's officially handled in MP4. It seems they have
different modes, but the "avc1" only has the SPS and PPS in the
extradata and not in the stream. Also they have this note:

"NOTE 1 It is recommended that when several parameter sets are used
and parameter set updating is desired, a separate parameter set
elementary stream be used"

Meaning a single file should not have changing parameters other than
the ones expressed in the header.
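
A muxer could test for that case along these lines. This is only a
sketch: `has_single_configuration` is a hypothetical helper, and
comparing raw bytes is a simplifying assumption, since real parameter
sets would have to be parsed and matched by ID:

```python
from typing import Iterable


def has_single_configuration(header_params: bytes,
                             in_band: Iterable[bytes]) -> bool:
    # True when every in-band parameter set repeats the one stored in the
    # track header, i.e. the parameters never change within the file.
    return all(params == header_params for params in in_band)


header = b"\x01\x02\x03"  # placeholder bytes, not a real SPS/PPS
print(has_single_configuration(header, [header, header]))
print(has_single_configuration(header, [header, b"\x09"]))
```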

Thanks a lot for your feedback!

> Andreas Rheinhardt



-- 
Steve Lhomme
Matroska association Chairman