Re: [codec] draft-ietf-codec-oggopus: attached pictures

"Timothy B. Terriberry" <> Mon, 25 August 2014 22:46 UTC

With my individual hat on...

Mark Harris wrote:
> Attached pictures are not mentioned at all in the current Ogg Opus
> draft (draft-ietf-codec-oggopus-04), which simply defers to the
> vorbis-comment specification for the format of metadata, with a few
> specific differences such as new gain tags.

The theory was that it wasn't necessary to re-invent the wheel here. 
Applications which already support Vorbis-style album art can add 
support for Opus album art with as little as one line of code (because 
they can, and do, share comment-parsing routines among 
Vorbis/Theora/Speex/etc.). This is not just a matter of engineering 
simplicity, but also speed of adoption.
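
To make the code-reuse point concrete, here is a minimal sketch of a comment parser shared between Vorbis and Opus. The function names are hypothetical and not taken from any real library. The layout assumed is the vorbis-comment one (little-endian 32-bit lengths), and only the packet prefix differs between the two formats.

```python
import struct

def parse_comment_body(buf, pos):
    """Shared routine: parse vendor string and comments starting at pos.

    Returns (vendor, comments, end_pos). Values are assumed to be UTF-8.
    """
    (vendor_len,) = struct.unpack_from("<I", buf, pos)
    pos += 4
    vendor = buf[pos:pos + vendor_len].decode("utf-8")
    pos += vendor_len
    (count,) = struct.unpack_from("<I", buf, pos)
    pos += 4
    comments = []
    for _ in range(count):
        (length,) = struct.unpack_from("<I", buf, pos)
        pos += 4
        raw = buf[pos:pos + length]
        pos += length
        # Field names are case-insensitive ASCII; normalize to upper case.
        name, _sep, value = raw.partition(b"=")
        comments.append((name.decode("ascii").upper(), value.decode("utf-8")))
    return vendor, comments, pos

def parse_vorbis_comments(packet):
    if packet[:7] != b"\x03vorbis":
        raise ValueError("not a Vorbis comment header")
    return parse_comment_body(packet, 7)[:2]

def parse_opus_comments(packet):
    # The only Opus-specific part: a different magic prefix.
    if packet[:8] != b"OpusTags":
        raise ValueError("not an Opus comment header")
    return parse_comment_body(packet, 8)[:2]
```

An application that already has the first two functions needs little more than the last one to read Opus tags, including any album art carried in them.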

>   (1) Specify that comments may contain either UTF-8 or binary data,
> according to some rule.  For example, if the name of the tag begins
> with "@" then its value is binary data and not intended to be
> displayed as text, and is otherwise UTF-8.  There is no technical

This has backwards-compatibility problems. Is this unique to Opus or 
would we expect other formats to adopt the same strategy? (see above 
about code-reuse). What would we do about existing tags that might 
already start with an '@'? How would we represent a field name that we 
want to start with an '@' (for whatever reason)?

We could also simply use a more compact way than BASE64 of embedding 
binary data in UTF-8, but such encodings have higher implementation 
complexity (and having two encodings to choose from is higher still).

>       Alternatively, ordinary tag names could be used and the value
> itself could be used to indicate that it is non-UTF-8 binary data, for
> example by including a null character prefix before the binary data,
> or a 0xff byte prefix (which is not valid UTF-8), or a null byte
> delimiter in place of "=".  However using distinct tag names seems
> cleaner.

Existing, naive tools are likely to mis-handle all of those things 
(e.g., copying the tag as the empty string, rejecting it as invalid and 
removing it, rejecting the whole comment packet as invalid, etc.).
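
As an illustration of the failure modes I have in mind, here is a small sketch of a hypothetical naive tool that treats each value as a NUL-terminated string and requires valid UTF-8. The function is invented for this example and does not describe any particular tool.

```python
def naive_read_value(raw):
    """Mimic a naive tool: stop at the first NUL, then require valid UTF-8.

    Returns the decoded text, or None if the tool would reject the value.
    """
    text = raw.split(b"\x00", 1)[0]  # C-string semantics
    try:
        return text.decode("utf-8")
    except UnicodeDecodeError:
        return None  # rejected as invalid (the tag, or the whole packet)

jpeg_magic = b"\xff\xd8\xff\xe0"  # first bytes of a typical JPEG file

nul_prefixed = naive_read_value(b"\x00" + jpeg_magic)  # copied as ""
ff_prefixed = naive_read_value(b"\xff" + jpeg_magic)   # rejected: None
```

In the first case the picture is silently replaced by an empty string when the file is rewritten, and in the second the value never gets past the UTF-8 check.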

>   (2) Immediately following the comments, in the same packet, allow a
> picture count followed by a length and binary data for each picture,

This is more practical, though we probably want to leave room to define 
additional types of binary data later (e.g., use the FLAC 
METADATA_BLOCK_HEADER and/or BLOCK_TYPE values). Again, I'm interested 
in the issues of code re-use for other formats. Notably, this is hard to 
make compatible with Vorbis because it encodes one non-zero "framing 
bit" after the main comment data (and mandates it be checked).
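
A minimal sketch of the framing-bit problem follows. It assumes, purely for illustration, that the picture count in proposal (2) is a 32-bit little-endian integer; the proposal does not specify a width. The helper name is hypothetical.

```python
import struct

def vorbis_framing_ok(packet, end_of_comments):
    """Vorbis requires a set framing bit right after the comment data.

    Vorbis packs bits LSB first, so this is the low bit of the next byte.
    """
    return end_of_comments < len(packet) and (packet[end_of_comments] & 1) == 1

comments = b"...comment data..."  # stand-in for vendor string and comments
end = len(comments)

# A valid Vorbis header: the framing bit follows the comments.
vorbis_packet = comments + b"\x01"

# Proposal (2) applied naively: a picture count (here 0) follows the comments.
with_pictures = comments + struct.pack("<I", 0)

framing_vorbis = vorbis_framing_ok(vorbis_packet, end)
framing_pictures = vorbis_framing_ok(with_pictures, end)
```

A Vorbis parser that checks the bit, as it must, rejects the second packet, so the trailing data would need a different position or layout in Vorbis than in Opus.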

>   (3) Specify a way to store attached pictures in the file outside of
> the Opus stream.  This is the way that containers such as Quicktime
> and Matroska work, but to do that with Ogg would require another
> stream that contains the pictures, since the Ogg container itself does
> not provide metadata.

This can already be done by specifying a relative URL instead of actual 
picture data (by setting the MIME type appropriately). This has the 
advantage that a single file can be used for multiple tracks, which 
provides considerably more space savings than avoiding BASE64 encoding 
as soon as you have more than one track from the same album. Support for 
and usage of this approach is almost non-existent. Whether that's 
because the convenience to users of having everything embedded in one 
file is worth the extra space or because of the inconvenience to 
application authors of having to (securely) deal with external files, or 
the combination of both of those effects, I don't know.
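
The space argument can be checked with a little arithmetic. The sketch below ignores the small picture-header and URL overheads; the picture size and track count are made-up illustrative numbers, and "binary_embedded" stands for a hypothetical raw embedding without BASE64.

```python
def base64_len(n):
    """Length in bytes of the padded BASE64 encoding of n input bytes."""
    return 4 * ((n + 2) // 3)

def album_art_bytes(picture_size, tracks):
    """Total bytes spent on one cover image across an album, per strategy."""
    return {
        "base64_embedded": tracks * base64_len(picture_size),
        "binary_embedded": tracks * picture_size,
        "shared_external": picture_size,
    }

# Illustrative numbers only: a 300,000-byte cover on a 12-track album.
totals = album_art_bytes(300_000, 12)
```

Dropping BASE64 saves a quarter of the embedded total, while sharing one external file saves all but one copy, and that already wins with two tracks.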

>   (4) Do not address this.  Attached pictures must be base64-encoded
> and written in a comment.  If users complain, recommend use of
> Matroska or another container that does not have this issue.

We could also choose not to address this _here_, and instead continue to 
defer to the existing vorbis-comment documentation. That wouldn't block 
publication of this draft, and would give us time to get implementation 
feedback from the authors of media players and tools who would have to 
deal with this change. I think this is the approach that I personally