Re: [codec] AD Evaluation of draft-ietf-codec-oggopus-09
"Timothy B. Terriberry" <tterribe@xiph.org> Fri, 11 December 2015 22:16 UTC
Return-Path: <tterribe@xiph.org>
X-Original-To: codec@ietfa.amsl.com
Delivered-To: codec@ietfa.amsl.com
Received: from localhost (ietfa.amsl.com [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id C60541A9068; Fri, 11 Dec 2015 14:16:50 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -2.613
X-Spam-Level:
X-Spam-Status: No, score=-2.613 tagged_above=-999 required=5 tests=[BAYES_50=0.8, HELO_MISMATCH_ORG=0.611, HOST_MISMATCH_COM=0.311, RCVD_IN_DNSWL_HI=-5, SPF_SOFTFAIL=0.665] autolearn=ham
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 7ZPJryj_zlgU; Fri, 11 Dec 2015 14:16:43 -0800 (PST)
Received: from smtp.mozilla.org (mx1.scl3.mozilla.com [63.245.214.155]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id A76FE1A908A; Fri, 11 Dec 2015 14:16:43 -0800 (PST)
Received: from localhost (localhost6.localdomain [127.0.0.1]) by mx1.mail.scl3.mozilla.com (Postfix) with ESMTP id 34751C1852; Fri, 11 Dec 2015 22:16:43 +0000 (UTC)
X-Virus-Scanned: amavisd-new at mozilla.org
Received: from smtp.mozilla.org ([127.0.0.1]) by localhost (mx1.mail.scl3.mozilla.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id hUXwMJwIPICA; Fri, 11 Dec 2015 22:16:41 +0000 (UTC)
Received: from [10.130.20.84] (unknown [207.190.10.41]) (Authenticated sender: tterriberry@mozilla.com) by mx1.mail.scl3.mozilla.com (Postfix) with ESMTPSA id AF02EC1812; Fri, 11 Dec 2015 22:16:40 +0000 (UTC)
Message-ID: <566B4B47.9010809@xiph.org>
Date: Fri, 11 Dec 2015 14:16:39 -0800
From: "Timothy B. Terriberry" <tterribe@xiph.org>
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:29.0) Gecko/20100101 SeaMonkey/2.26
MIME-Version: 1.0
To: Ben Campbell <ben@nostrum.com>, draft-ietf-codec-oggopus.all@ietf.org, codec@ietf.org
References: <86ACD2D0-02B6-473E-9E35-B9980166D9A0@nostrum.com>
In-Reply-To: <86ACD2D0-02B6-473E-9E35-B9980166D9A0@nostrum.com>
Content-Type: multipart/mixed; boundary="------------090501040108020500070205"
Archived-At: <http://mailarchive.ietf.org/arch/msg/codec/INlUbGEghHFVDYA5tLjnRm3zlZg>
Subject: Re: [codec] AD Evaluation of draft-ietf-codec-oggopus-09
X-BeenThere: codec@ietf.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: Codec WG <codec.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/codec>, <mailto:codec-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/codec/>
List-Post: <mailto:codec@ietf.org>
List-Help: <mailto:codec-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/codec>, <mailto:codec-request@ietf.org?subject=subscribe>
X-List-Received-Date: Fri, 11 Dec 2015 22:16:50 -0000
Ben Campbell wrote: > Here is my AD evaluation of draft-ietf-codec-oggopus-09. Codec > encapsulation is not my strong suit, so I suspect many of my comments > can be answered with "this is the way we normally do this thing". But > I'd still like to discuss these before going to IETF last call. Thank you for the very thorough review. Comments in-line below (as an individual). > - section 2, last paragraph: > > Is this paragraph necessary or useful? (And is it really impossible to > create a non-compliant implementation that meets all the MUSTs?) I > suggest removing it. This was copied from RFC 5334. I don't think anyone has any particular attachment to it, so let's just remove it. > - section 3: > This section seems to use 2119 keywords to describe what I _think_ are > existing Ogg requirements. If so, they should use descriptive text, > unless they are in the form of attributed quotes. I think I can attribute this mostly to paranoia over RFC 3533 being informational, and using descriptive text of its own for what should be normative requirements. E.g., "Ogg also allows but does not require secondary header packets after the bos page for logical bitstreams and these must also precede any data packets in any logical bitstream. These subsequent header packets are framed into an integral number of pages, which will not contain any data packets." (from RFC333) vs. "The second packet in the logical Ogg bitstream MUST contain the comment header ... It MAY span multiple pages, beginning on the second page of the logical stream. However many pages it spans, the comment header packet MUST finish the page on which it completes." (from this draft) Unless someone objects, though, I'll try to rewrite the latter without the 2119 keywords. > - 3, last paragraph: > -- "implementations need to be prepared": > maybe that should be a MUST or SHOULD? I think the intent was just to point out that someone might violate the preceding SHOULD. It's not clear what actionable thing we'd be saying they MUST do (it would depend on the particular implementation). > -- "last page SHOULD NOT be a continued packet": > Why not MUST NOT? Can you describe reasons someone might reasonably > violate this? The intent was to allow but discourage page-level editing of files (i.e., cutting and splicing files by copying complete pages without examining the packets on them). Even just saving a live stream that drops at some point, you might reasonably expect to just be able to copy the data you received to a file. A MUST here would mean you would have to parse the pages, identify the continued packets from the lacing values, remove the extraneous data from the last page if you found some, repack the page buffer, and recompute the page checksum. That's a lot more machinery, and people would likely just ignore the requirement in practice. See also the discussion at <https://www.ietf.org/mail-archive/web/codec/current/msg03145.html>. > - 4.2, 2nd paragraph: > -- "samples which SHOULD be skipped": Why not MUST? (also, s/which/that) Some specific applications might have a reason for looking at that data (for example, when you have multiple files that you know were produced by chopping up a single existing file without re-encoding). In practice we haven't needed anything stronger to convince people to implement this correctly (especially when we point them to <https://people.xiph.org/~greg/opus_testvectors/correctness_trimming_nobeeps.opus>). > -- "SHOULD NOT be played": Is this redundant with SHOULD be skipped? Yes. Deleted. > - 4.3: Should the reference to [vorbis-trim] be normative? Can the > feature be implemented without understanding the referenced doc? The answer to the second question is "yes", which I think makes the answer to the first question "no". The reason this is here is to tell people familiar with how Vorbis works what mechanism they should use to achieve the same outcome in Opus. But if you're not familiar with Vorbis, then you can safely ignore the whole paragraph. > - 4.6: Should the reference to [seeking] be normative? Can the section > be implemented without understanding it? It is possible to implement seeking without knowing anything other than what is described in RFC 3533, but the linked documentation does help considerably. I would argue that the section itself is not normative, in that you could implement seeking with some other method (for example, linearly scanning the whole file on open to build an index, possibly caching such indexes for files that are opened frequently: this would not be my recommended approach, but people have done it in practice). You could also choose to implement only approximate seeking by just guessing a file position and decoding from there (also done in practice by some players, and a common method for other formats like MP3). Or you could choose not to implement seeking at all. > - 5.1, list entry 5: What is the unit for "Input Sample Rate"? Hz. It was specified in Figure 1, but you're right, it should be in the text, too. Added the sentence, "This is the sample rate of the original input (before encoding), in Hz." > - 5.1, entry 6, sentence starting with "Virtually all players..." > Is this a normative statement, or a statement of fact? If the former, > consider dropping “Virtually all” Dropped "virtually all". > - 5.1, entry 6, last paragraph: > -- "which MUST be omitted when the channel mapping family is 0": That > seems redundant to the previous paragraph. Yes. Deleted the sentence in the previous paragraph. > -- "Implementations SHOULD reject ID headers" > What does it mean, in context, to "reject" headers? The same thing as rejecting a stream (i.e., don't try to play/decode it). Changed to say that instead. > - 5.1.1.4: Why not MUST? I think the idea was to allow assigning reserved values without explicitly updating this specification. Even though I expect we _would_ want to update this specification, given how long it's taken to get this draft published, I'm wary of adding more pressure to people not to deploy new mappings (because this RFC says they MUST not), even when it looks like they'll be relatively stable and interoperable. At least not for the sole reason that they haven't gotten an RFC number yet. > - 5.2, last two paragraph: Why are the two SHOULDs not MUSTs? For the first SHOULD, this is the same as above: some other specification may eventually define the format of the data here and how to modify it, and I didn't think we'd want to make people supporting that specification in violation of this specification. I'm also leaning towards keeping this one a SHOULD since a) the eventual format of this data may apply to all codecs which share the vorbiscomment style of metadata, and b) we would probably try to do this informally first to gain implementation experience, and only bring it to the IETF if there is sufficient interest. I didn't want to make updating this document a prerequisite for that. See also the discussion in this thread: <https://www.ietf.org/mail-archive/web/codec/current/msg03061.html> (skip to <https://www.ietf.org/mail-archive/web/codec/current/msg03107.html> for the resolution of the debate). For the second SHOULD, it felt like the right strength for a judgment call like "excessive". I have no objection to changing it to MUST. > - 5.2.1, 2nd to last paragraph, last sentence: > I don’t think muxers assume things—what is the required concrete behavior? I replaced this with "A muxer SHOULD place the gain it wants other tools to use by default into the 'output gain' field, and not the comment tag." > - 5.2.1, last paragraph: Why is the SHOULD NOT not MUST NOT? In some contexts, there may be no confusion with multiple normalization schemes. The language in this section was debated a good deal on the list (see threads <https://www.ietf.org/mail-archive/web/codec/current/msg02930.html>, <https://www.ietf.org/mail-archive/web/codec/current/msg03012.html>, and <https://www.ietf.org/mail-archive/web/codec/current/msg03053.html>). I believe everyone is satisfied with the current compromise, and would prefer not to change the strength of any normative statements in it to avoid re-igniting the debate. > -6, first paragraph: > Why are the two SHOULDs not MUSTs? What does it mean to reject a packet? For the first SHOULD, the fact that most decoders are likely to reject such packets is enough to discourage encoders from emitting them, and I was wary of imposing a MUST restriction when RFC 6716 chose not to do so. For the second, this is the same as the previous comment about excessive allocations. I have no objection to changing this one, either. "Reject a packet" is just shorthand for "treat them as if it was a malformed Opus packet with an invalid TOC sequence". I've added explicit text saying the latter. > -7: "reference implementation" could use a citation. Should Added a reference to RFC 6716. > [linear-predection] be normative? Can this be implemented without > understanding that? If it does need to be normative, can you find a more > stable reference? Linear prediction is something that would be understood by "a person having ordinary skill in the art" (to borrow a phrase). We just wanted to include a reference to help those who might not be DSP engineers but wanted to learn more. > - 9, first paragraph: > This seems to defer at least some security considerations to 4732. If > so, this should be a normative reference. If, on the other hand, 4732 > just provides additional information that is not necessary to understand > this section, then the wording should be adjusted. Upgraded the reference to normative. > - 13: > Is this needed? Are the IETF BCP 78 and 79 provisions incorrect for this? This text was specifically requested by the Debian project (and I support it personally). The same issue came up for RFC 6716 (for the same reasons). See the discussion at <https://www.ietf.org/mail-archive/web/codec/current/msg02845.html> and <https://www.ietf.org/mail-archive/web/ietf/current/msg73116.html>. Reviewing those conversations, I do notice that the language of that section was changed after last call for RFC 6716, but the corresponding language in our source repository was not updated (because it was added to the document boilerplate manually by the RFC Editor, and not included in a separate section). > Editorial > ========= > - Section 1: Consider spelling out multiplexer (muxer) and demultiplexer > (demuxer) on first mention. Done. > - 3: Please expand TOC, SILK, and CELT on first mention. Doing this directly in the current text was a little bit awkward. I reworded it to "...An implementation of this specification SHOULD treat any Opus packet whose duration is different from that of the first Opus packet in an Ogg packet as if it were a malformed Opus packet with an invalid Table Of Contents (TOC) sequence. "The TOC sequence at the beginning of each Opus packet indicates the coding mode, audio bandwidth, channel count, duration (frame size), and number of frames per packet, as described in Section 3.1 of [RFC6716]. The coding mode is one of SILK, Hybrid, or Constrained Energy Lapped Transform (CELT). The combination of coding mode, audio bandwidth, and frame size is referred to as the configuration of an Opus packet." I did not expand SILK because it is not an acronym. > - 5, steps 2 and 3: > These steps in combination seem to say “ decode at the highest available > supported rate that the hardware can handle.” If so, the wording could > be simplified. It is a little bit more complicated. There are two rates here: one used for decoding, and one used for playback. The difference between steps 2 and 3 is that in step 3, the hardware's highest available sample rate is not one of the rates supported by Opus decoding. For example, if the highest available sample rate the hardware can handle is 32 kHz, then this is not a "supported rate" (i.e., not supported by Opus decoding), so step 2 does not apply and step 3 does. Step 3 says to decode at the next highest "supported rate", which would be 48 kHz, and then resample to a rate the hardware can handle (in this case 32 kHz). I don't think that's equivalent to your summary of the two steps. > - 7, 2nd paragraph: > I don’t think that’s literally true, you’ve defined 0, 1, and 255. The > rest are reserved. (I guess one could argue the guidance to treat > reserved values as 255 supports this statement, but I think that’s a > stretch. Yes, this is really true for the currently defined mapping families. Changed to "Each currently specified value of this octet indicates..." > -5.1.1.2: Do you expect the reader to know what Vorbis channel order is? > Also, please expand FLAC on first mention. It's a fairly common term among those who work with audio formats, but I added a "(see below)" for those who don't, since the table below it lists what that order is. > - 5.2.1, paragraph starting with " An Ogg Opus stream MUST NOT have more > than one of each tag, and if present their values MUST be an integer" > > each of these two new tags, or all tags? (I assume the former from the > numeric values) Also, singular/plural mismatch in "their values MUST be > an integer" "each of these tags" seems to resolve both issues. > -10, last paragraph: > > Updates, or extends? If the former, the draft needs an “Updates RFC > 5334” field in the initial heading. Added the field. A complete diff of the changes since -09 so far is attached (and, as always, can be found in the Opus git repository: <https://git.xiph.org/?p=opus.git>). I'll post again when I finish reviewing the changes for normative language for generic Ogg requirements and the extra copyright grant.
- [codec] AD Evaluation of draft-ietf-codec-oggopus… Ben Campbell
- Re: [codec] AD Evaluation of draft-ietf-codec-ogg… Timothy B. Terriberry
- Re: [codec] AD Evaluation of draft-ietf-codec-ogg… Ron
- Re: [codec] AD Evaluation of draft-ietf-codec-ogg… Timothy B. Terriberry
- Re: [codec] AD Evaluation of draft-ietf-codec-ogg… Ben Campbell
- Re: [codec] AD Evaluation of draft-ietf-codec-ogg… Ben Campbell
- Re: [codec] AD Evaluation of draft-ietf-codec-ogg… Timothy B. Terriberry
- Re: [codec] AD Evaluation of draft-ietf-codec-ogg… Ben Campbell
- Re: [codec] AD Evaluation of draft-ietf-codec-ogg… Ben Campbell
- Re: [codec] AD Evaluation of draft-ietf-codec-ogg… Timothy B. Terriberry
- Re: [codec] AD Evaluation of draft-ietf-codec-ogg… Ron
- Re: [codec] AD Evaluation of draft-ietf-codec-ogg… Timothy B. Terriberry
- Re: [codec] AD Evaluation of draft-ietf-codec-ogg… Ben Campbell
- Re: [codec] AD Evaluation of draft-ietf-codec-ogg… Ben Campbell
- Re: [codec] AD Evaluation of draft-ietf-codec-ogg… Ron
- Re: [codec] AD Evaluation of draft-ietf-codec-ogg… Timothy B. Terriberry
- Re: [codec] AD Evaluation of draft-ietf-codec-ogg… Ben Campbell
- Re: [codec] AD Evaluation of draft-ietf-codec-ogg… Timothy B. Terriberry