[Cellar] shepherd review of Matroska: part 1

Michael Richardson <mcr+ietf@sandelman.ca> Mon, 06 June 2022 13:25 UTC

From: Michael Richardson <mcr+ietf@sandelman.ca>
To: cellar@ietf.org
MIME-Version: 1.0
Content-Type: multipart/signed; boundary="=-=-="; micalg="pgp-sha512"; protocol="application/pgp-signature"
Date: Mon, 06 Jun 2022 09:24:51 -0400
Message-ID: <1216608.1654521891@dooku>
Archived-At: <https://mailarchive.ietf.org/arch/msg/cellar/hoyvAqjTxYNpQHdGU59TQXLEbRI>
Subject: [Cellar] shepherd review of Matroska: part 1
Precedence: list

This email is based upon my top-to-bottom read of the document as part of the
Shepherd review. I'm concerned mostly about things that will attract
IESG comments.

I shall come back to this email and reply with issue numbers. I'm working
mostly offline as I read this, but my preference would be to discuss most
issues on the list.

I got as far as Section 6 in this email, and I continued in more emails.

section 1:
I'm concerned about the claim "THE" format.
I suspect that it may have already achieved this, but I wonder about how to
support that statement.

"Menus (like DVDs have)" <- is an informative reference useful?

the section:
"Matroska is an open standards project. This means for personal use it is
absolutely free to use and that the technical specifications describing the
bitstream are open to everybody, even to companies that would like to support
it in their products. "

I'm not sure of the relevance of "personal use". In fact, it's a bit of a concern!

I suggest we write:
"Matroska is an open standards project published as an IETF Standards Track
RFC. As per the terms of BCP78 [RFC5378], the technical specifications
describing the bitstream are open to everybody. This specification falls
under the terms of BCP79 with respect to IPR claims. While there are
patent claims associated with some CODECs that can be contained within
a Matroska container, the container itself is free of all such claims"

(Assuming that this is true)

section 2:

"This document covers Matroska versions 1, 2, 3 and 4. Matroska v4 is the
current version. Matroska 1 to 3 are no longer maintained. No new elements
are expected in files with theses version numbers. There MAY be further
additions to Matroska v4."

Should we reference places/points when versions 1, 2, 3 were
published/stabilized? Something like:

} This document covers Matroska versions 1, 2, 3 and 4.
} Matroska v4 is the current version.
} Matroska v1 first appeared around YEAR in PRODUCT [reference].
} Matroska v2 first appeared around YEAR in PRODUCT [reference].
} Matroska v3 first appeared around YEAR in PRODUCT [reference].
} Matroska 1 to 3 are no longer maintained.
} No new elements are expected in files with version numbers 1,2, or 3.
} New additions are expected to be added to Matroska v4 via the Extensions
} mechanisms described in Section 25. IANA Considerations.

section 4.4:
"The Matroska EBML Schema defines eight Top Level Elements: SeekHead,
Info, Tracks, Chapters, Cluster, Cues, Attachments, and Tags."

maybe insert forward references? Maybe arrange it as a bullet list?
The Matroska EBML Schema defines eight Top Level Elements:
* SeekHead []
* Info [],
* Tracks [],
* Chapters [],
* Cluster [],
* Cues [],
* Attachments [],
* Tags []
which may appear in any order, and may be repeated.

That would flow better into the next paragraph, which explains that there is an
index. The next paragraph says that the MetaSeek is RECOMMENDED (which is
equivalent to SHOULD).
When we write SHOULD there are usually some exception conditions under which the
writer might omit that. I think that it is a MUST that the reader/player be
able to cope without the MetaSeek?

Figure 4 suggests that only Audio/Video components are supported. I think
that's not the case. Could the diagram imply something else, or some text
say so?
Please review all the SHOULD, I suspect many of them are MUSTs.
(think: is there a reasonable exception for the writer?)
Perhaps, instead of specifying what the file SHOULD contain (with a vague
exception), can we say what a reader MUST tolerate?

For instance:
There SHOULD be one or more BlockGroup or SimpleBlock Element in each
Cluster Element.

Is there a case where a Cluster Element might contain ZERO BlockGroups or
SimpleBlocks? Maybe just saying:

A Cluster Element contains one or more BlockGroup or SimpleBlock Elements.

My guess:
In some situations (live recordings which are never unpaused???) a Cluster
Element MAY be empty if no data was collected.

Another:
The Timestamp Element SHOULD be the first Element in the Cluster.

Stupid question: can a file contain audio and video in different codecs?
For instance, can the English audio be in CODEC A, while the German is in
CODEC B? Can more than one audio or video track be played at the same time?

section 5.
I really hate the layout of these sections.
The empty paragraph line between items makes for far too much vertical space
in the HTML, and the same in the TXT. I wonder if we can turn these into
some kind of table.
Could the definition go just after the name?

Like, I'd want to read it like this in text:

name: Segment id: 0x18538067
path: \Segment
definition:
The Root Element that contains all other Top-Level Elements (Elements
defined only at Level 1). A Matroska file is composed of 1 Segment.

minOccurs: 1 maxOccurs: 1 type: master
unknownsizeallowed: 1

Usage notes:
BLAH BLAH BLAH

I think that "unknownsizeallowed" is really a true/false?
What is the difference between Usage Notes and Rationale?
(Does it have an "e" in English?)
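
For context on "unknownsizeallowed": in EBML (RFC 8794) an Element starts with
a variable-length Element ID followed by a variable-length Element Data Size,
and a size whose value bits are all ones means "unknown". A minimal sketch of
reading that header follows. The function names and the sample bytes are
hypothetical, chosen only for illustration; the Segment ID is the one quoted
above.

```python
def _vint_length(first_byte: int) -> int:
    """Length in octets of a variable-size integer, from its first octet."""
    if first_byte == 0:
        raise ValueError("no length marker in first octet")
    # Number of leading zero bits plus one.
    return 8 - first_byte.bit_length() + 1


def read_element_id(data: bytes, pos: int = 0):
    """Return (element_id, next_pos). The ID keeps its marker bit."""
    length = _vint_length(data[pos])
    return int.from_bytes(data[pos:pos + length], "big"), pos + length


def read_element_size(data: bytes, pos: int):
    """Return (size, next_pos); size is None when the size is unknown."""
    length = _vint_length(data[pos])
    raw = int.from_bytes(data[pos:pos + length], "big")
    mask = (1 << (7 * length)) - 1  # value bits, marker bit stripped
    value = raw & mask
    return (None if value == mask else value), pos + length


# Hypothetical sample: a Segment header with an unknown size.
segment_header = bytes.fromhex("18538067" "01ffffffffffffff")
element_id, pos = read_element_id(segment_header)
size, pos = read_element_size(segment_header, pos)
```

Here the size comes back as None, which is the case the "unknownsizeallowed"
attribute permits or forbids for a given Element.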

About:
definition:
The Segment Position of the Element.

maybe it could say relative to what?
I think the units are bytes?
Should the units be part of the type?

5.1.2.1. SegmentUID Element
name: SegmentUID

I wonder if it's really a *uuid*?
Should SimpleBlock be listed after Block?

}5.1.3.5.2.2. BlockAddID Element
}name: BlockAddID
}definition:
}An ID to identify the BlockAdditional level. If BlockAddIDType of the
}corresponding block is 0, this value is also the value of BlockAddIDType for
}the meaning of the content of BlockAdditional.

Something tricky here, needs more explanation, or a forward reference to other
discussion...

}5.1.4.1.17. TrackTimestampScale Element
}name: TrackTimestampScale
}definition: DEPRECATED, DO NOT USE.

should it go into the Appendix?

}5.1.4.1.22. LanguageIETF Element
}name: LanguageIETF

I suggest you name it LanguageBCP47, which would be a better reference.
(or LanguageIETFBCP47 )

}5.1.4.1.30. TrackTranslate Element
}rationale:
}Chapter Codec may need to address content in specific track, but they may not know of the way to

s/Chapter Codec/The Chapter Codec/
s/in specific track/in a specific track/
s/they/it/

} 5.1.4.1.31.3. StereoMode Element
has an enumerated value. It may need to have some IANA Considerations.
I'm also concerned that section 18.10 has some v2-compatibility stuff kinda
hidden in it.
Should 0x53B9 be mentioned in this section too?

5.1.4.1.31.8...etc.
"The value of this Element SHOULD be kept the same when making a direct stream copy to another file."
I wonder if the elements that should be kept identical when making a direct
stream copy need to be marked more clearly.
Or maybe I should ask: when would the element not be copied exactly?
Maybe we can save some "ink" here and make this the default?

}5.1.4.1.31.14. UncompressedFourCC Element
}usage notes:
} This Element MUST NOT be used if the CodecID Element of the TrackEntry is
} set to "V_UNCOMPRESSED".
}
}Table 12: UncompressedFourCC implementation notes
}attribute note
}minOccurs UncompressedFourCC MUST be set (minOccurs=1) in TrackEntry,
} when the CodecID Element of the TrackEntry is set to "V_UNCOMPRESSED".

If I read correctly, the two statements are contradictory?
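
To make that reading concrete, here is a small sketch that encodes the two
quoted statements as predicates and checks whether any TrackEntry with
CodecID "V_UNCOMPRESSED" can satisfy both. The function names are hypothetical,
and the predicates reflect my reading of the quoted text, not necessarily what
the authors intended.

```python
def usage_note_ok(codec_id: str, has_fourcc: bool) -> bool:
    # Usage note as quoted: MUST NOT be used if CodecID is "V_UNCOMPRESSED".
    return not (codec_id == "V_UNCOMPRESSED" and has_fourcc)


def table_12_ok(codec_id: str, has_fourcc: bool) -> bool:
    # Table 12 as quoted: MUST be set when CodecID is "V_UNCOMPRESSED".
    return has_fourcc if codec_id == "V_UNCOMPRESSED" else True


# Try both possibilities: UncompressedFourCC present or absent.
satisfiable = [
    present
    for present in (True, False)
    if usage_note_ok("V_UNCOMPRESSED", present)
    and table_12_ok("V_UNCOMPRESSED", present)
]
```

Under this reading the list is empty, so one of the two statements presumably
has its condition inverted.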

} 5.1.4.1.31.22. ChromaSitingHorz Element
thru ....31.27
and: 5.1.4.1.31.41. ProjectionType Element
5.1.4.1.33.4. TrackPlaneType Element
5.1.4.1.34.3. ContentEncodingScope Element
5.1.4.1.34.4. ContentEncodingType Element
5.1.4.1.34.6. ContentCompAlgo Element
5.1.4.1.34.9. ContentEncAlgo Element
5.1.7.1.4.18. ChapProcessTime Element
5.1.8.1.1.1. TargetTypeValue Element - looks like gaps matter?
do these need any allocation rules?

} Green X chromaticity coordinate, as defined by CIE 1931.
"CIE 1931" should be a reference.

}5.1.4.1.34.2. ContentEncodingOrder Element
}name: ContentEncodingOrder
}definition:
}Tell in which order to apply each ContentEncoding of the
}ContentEncodings. The decoder/demuxer MUST start with the ContentEncoding
}with the highest ContentEncodingOrder and work its way down to the
}ContentEncoding with the lowest ContentEncodingOrder. This value MUST be
}unique over for each ContentEncoding found in the ContentEncodings of this
}TrackEntry.

The English needs some work.
Maybe "Tells in which..."
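
As I read the quoted definition, a reader would do something like the
following sketch: reject duplicate order values, then undo the encodings from
the highest ContentEncodingOrder down to the lowest. The dictionary layout and
the function name are hypothetical, for illustration only.

```python
def decoding_sequence(content_encodings):
    """Return the encodings in the order a demuxer would undo them."""
    orders = [enc["order"] for enc in content_encodings]
    if len(set(orders)) != len(orders):
        raise ValueError(
            "ContentEncodingOrder must be unique within a TrackEntry"
        )
    # Highest ContentEncodingOrder first, lowest last.
    return sorted(content_encodings, key=lambda enc: enc["order"], reverse=True)


# Hypothetical TrackEntry: compressed first (order 0), then encrypted (order 1).
encodings = [
    {"order": 0, "kind": "compression"},
    {"order": 1, "kind": "encryption"},
]
sequence = [enc["kind"] for enc in decoding_sequence(encodings)]
```

With this input the reader decrypts first and decompresses second, which is the
reverse of the order in which the writer applied them.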

} 5.1.4.1.34.9. ContentEncAlgo Element
questions will be asked about 1DES, and 3DES.
1) We need a note about how legacy files might be encrypted, and we need
those definitions for history.
2) Questions will be asked about how the keys were created.
I haven't read the Security Considerations, but it needs to say something.
3) We should probably have a suggestion on how encryption keys are
represented in text.
Consider one system that takes things in hex, and another that
uses strings, and that all hex values are valid strings.
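
A short sketch of the ambiguity in point 3: the same text yields different key
bytes depending on whether a tool treats it as hex or as a literal string. The
key text and function names below are made up for illustration.

```python
def key_from_hex(text: str) -> bytes:
    """Interpret the text as hexadecimal digits."""
    return bytes.fromhex(text)


def key_from_string(text: str) -> bytes:
    """Interpret the same text as a literal ASCII string."""
    return text.encode("ascii")


user_input = "0123456789abcdef"  # hypothetical key text typed by a user
hex_key = key_from_hex(user_input)        # 8 octets
string_key = key_from_string(user_input)  # 16 octets
```

Both calls succeed without error, so two tools could silently derive different
keys from the same user input unless the text representation is specified.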

} 5.1.4.1.34.10. ContentEncKeyID Element
Will need a reference to public key formats used.

} 5.1.4.1.34.11. ContentEncAESSettings Element
is this number of bits?

} 5.1.4.1.34.12. AESSettingsCipherMode Element
will need an IANA registry.

} 5.1.8.1.1.2. TargetType Element
might need IANA Considerations, but table looks broken.

value label
COLLECTION COLLECTION
EDITION EDITION
ISSUE ISSUE
...

=== got to section 6.

--
Michael Richardson <mcr+IETF@sandelman.ca>, Sandelman Software Works
-= IPv6 IoT consulting =-
