Re: [codec] draft-ietf-codec-oggopus: attached pictures

Basil Mohamed Gohar <basilgohar@librevideo.org> Tue, 26 August 2014 23:34 UTC

Message-ID: <53FD198E.8040103@librevideo.org>
Date: Tue, 26 Aug 2014 19:34:38 -0400
From: Basil Mohamed Gohar <basilgohar@librevideo.org>
Organization: Libre Video
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.7.0
MIME-Version: 1.0
To: codec@ietf.org
References: <CAMdZqKEOfNEXEAGjyx2+5xW7QkrA2ekNVZym+ZLRsNoA0+cSYQ@mail.gmail.com> <53FBBCA1.8060708@xiph.org> <20140826132401.GR326@hex.shelbyville.oz>
In-Reply-To: <20140826132401.GR326@hex.shelbyville.oz>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Archived-At: http://mailarchive.ietf.org/arch/msg/codec/u4XuFrYt_q35L0yCYyJBpOc__9k
Subject: Re: [codec] draft-ietf-codec-oggopus: attached pictures

On 08/26/2014 09:24 AM, Ron wrote:
> On Mon, Aug 25, 2014 at 03:45:53PM -0700, Timothy B. Terriberry wrote:
>> With my individual hat on...
>>
>> Mark Harris wrote:
>>> Attached pictures are not mentioned at all in the current Ogg Opus
>>> draft (draft-ietf-codec-oggopus-04), which simply defers to the
>>> vorbis-comment specification for the format of metadata, with a few
>>> specific differences such as new gain tags.
>>
>> The theory was that it wasn't necessary to re-invent the wheel here.
>> Applications which already support Vorbis-style album art can add support
>> for Opus album art with as little as one line of code (because they can, and
>> do, share comment-parsing routines among Vorbis/Theora/Speex/etc.). This is
>> not just a matter of engineering simplicity, but also speed of adoption.
>>
>>>  (1) Specify that comments may contain either UTF-8 or binary data,
>>> according to some rule.  For example, if the name of the tag begins
>>> with "@" then its value is binary data and not intended to be
>>> displayed as text, and is otherwise UTF-8.  There is no technical
>>
>> This has backwards-compatibility problems. Is this unique to Opus or would
>> we expect other formats to adopt the same strategy? (see above about
>> code-reuse). What would we do about existing tags that might already start
>> with an '@'? How would we represent a field name that we want to start with
>> an '@' (for whatever reason)?
>>
>> We could also simply use a more compact form of embedding binary data in
>> UTF-8, but those have higher implementation complexity (and two encodings to
>> choose from is higher still).
>>
>>>      Alternatively, ordinary tag names could be used and the value
>>> itself could be used to indicate that it is non-UTF-8 binary data, for
>>> example by including a null character prefix before the binary data,
>>> or a 0xff byte prefix (which is not valid UTF-8), or a null byte
>>> delimiter in place of "=".  However using distinct tag names seems
>>> cleaner.
>>
>> Existing, naive tools are likely to mis-handle all of those things (e.g.,
>> copying the tag as the empty string, rejecting it as invalid and removing
>> it, rejecting the whole comment packet as invalid, etc.).
>>
>>>  (2) Immediately following the comments, in the same packet, allow a
>>> picture count followed by a length and binary data for each picture,
>>
>> This is more practical, though we probably want to leave room to define
>> additional types of binary data later (e.g., use the FLAC
>> METADATA_BLOCK_HEADER and/or BLOCK_TYPE values). Again, I'm interested in
>> the issues of code re-use for other formats. Notably, this is hard to make
>> compatible with Vorbis because it encodes one non-zero "framing bit" after
>> the main comment data (and mandates it be checked).
>>
>>>  (3) Specify a way to store attached pictures in the file outside of
>>> the Opus stream.  This is the way that containers such as Quicktime
>>> and Matroska work, but to do that with Ogg would require another
>>> stream that contains the pictures, since the Ogg container itself does
>>> not provide metadata.
>>
>> This can already be done by specifying a relative URL instead of actual
>> picture data (by setting the mime type appropriately). This has the
>> advantage that a single file can be used for multiple tracks, which provides
>> considerably more space savings than avoiding BASE64 encoding as soon as you
>> have more than 1 track from the same album. Support for and usage of this
>> approach is almost non-existent. Whether that's because the convenience to
>> users of having everything embedded in one file is worth the extra space or
>> because of the inconvenience to application authors of having to (securely)
>> deal with external files, or the combination of both of those effects, I
>> don't know.
>>
>>>  (4) Do not address this.  Attached pictures must be base64-encoded
>>> and written in a comment.  If users complain, recommend use of
>>> Matroska or another container that does not have this issue.
>>
>> We could also choose not to address this _here_, and instead continue to
>> defer to the existing vorbis-comment documentation. That wouldn't block
>> publication of this draft, and give us time to get implementation feedback
>> from the authors of media players and tools who would have to deal with this
>> change. I think this is the approach that I personally prefer.
> 
> I think that sounds like the best approach to me too.
> 
> There are real things that can be improved there, but none of them are
> actually specific to the Opus Ogg mapping, and any solution to them
> that gets buy-in from implementors does seem likely to be something
> they'd also want to reuse more widely with other mappings as well.
> 
> Whether we'd eventually want to bring that work back here to formalise
> a new proposed standard for "Ogg metadata" is a separate question,
> but either way it seems like a work that could (and should) occur
> independently of OggOpus, and that could apply to many other things
> in addition to it.  The important question for this draft is have we
> adequately defined the needed codec mapping to encapsulate it in Ogg.
> 
>   Ron

Some thoughts:

1.  If the last option is preferred by most, then let's get to work on
canvassing the existing players that we think are best suited to this,
rather than letting it languish and be forgotten.  In other words,
let's make this our own itch and scratch it.

2.  Isn't there already a specification for Kate that allows usage of
images in Ogg?  Hackish, I know, but that might lend some guidance on a
way (good or bad?) to encode binary image data.

3.  Does Ogg Skeleton have any mechanism or "spot" for this kind of
binary image data?

Personally, I don't think the 33% inflation due to base64 encoding is
that big of a deal.  I'd be interested to know if there are cases where
the image data becomes large in comparison to the audio data in an
OggOpus file, since that might make it more relevant to pursue a more
efficient solution.
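
To put rough numbers on that, here is a minimal sketch in Python.  The
picture size (300 kB) and the track (4 minutes at 96 kbit/s) are
made-up figures for illustration only, not measurements, and the helper
names are my own.  It only counts the base64 expansion of the picture
bytes and ignores the small fixed fields around them.

```python
import base64


def base64_length(n_bytes: int) -> int:
    """Length of padded base64 output for n_bytes of input (4 chars per 3 bytes)."""
    return 4 * ((n_bytes + 2) // 3)


def inflation(n_bytes: int) -> float:
    """Fractional size increase caused by base64-encoding n_bytes."""
    return base64_length(n_bytes) / n_bytes - 1.0


# Hypothetical cover image: 300,000 bytes of placeholder data.
picture = bytes(300_000)
encoded = base64.b64encode(picture)

# Sanity check: the formula matches what the standard library produces.
assert len(encoded) == base64_length(len(picture))

# Hypothetical track: 240 seconds of audio at 96,000 bits per second.
audio_bytes = 240 * 96_000 // 8

extra = len(encoded) - len(picture)
print(f"base64 overhead: {extra} bytes ({inflation(len(picture)):.1%})")
print(f"overhead relative to audio: {extra / audio_bytes:.1%}")
```

Under those assumed figures the extra bytes are a small fraction of the
audio payload; the ratio only gets uncomfortable for very short or very
low-bitrate streams carrying large pictures.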


-- 
Libre Video
http://librevideo.org