Re: [codec] OggOpus: Rational for excluding replaygain tags?

Calvin Walton <calvin.walton@kepstin.ca> Thu, 08 November 2012 06:14 UTC

Message-ID: <1352355230.25560.30.camel@ayu>
From: Calvin Walton <calvin.walton@kepstin.ca>
To: Ron <ron@debian.org>
Date: Thu, 08 Nov 2012 01:13:50 -0500
In-Reply-To: <20121108035305.GE6812@audi.shelbyville.oz>
References: <1352307794.14547.30.camel@ayu> <9B8EA46C78239244B5F7A07E163D3DFE08C500@CH1PRD0511MB432.namprd05.prod.outlook.com> <1352328081.14547.82.camel@ayu> <20121108035305.GE6812@audi.shelbyville.oz>
Cc: codec@ietf.org
Subject: Re: [codec] OggOpus: Rational for excluding replaygain tags?

On Thu, 2012-11-08 at 14:23 +1030, Ron wrote:
> A player must always apply the header gain to output the signal
> level that the person who created the file intended it to have
> (as if they really had scaled it by that level prior to encoding).
> 
> A player may also optionally apply normalisation to a standard
> level on top of that, either based on information in the file's
> comment tags, its own analysis of the file, or some other
> mechanism based on input from the user of that player.
> 
> The former is about gain choices made by the creator, the latter
> is more about gain choices made by the end user.  The former
> must always be applied so that players behave consistently, the
> latter may be applied at the sole discretion of the user and
> (how they configure) the applications that they use to play it.
> 
> 
> Is there something about that which we need to explain better
> in the draft?

There is one place in the draft that does conflict with this: the
rationale for why an R128_ALBUM_GAIN comment was not included. It
notes that the 'album' gain value should be stored in the header gain
field instead. However, if the header gain is supposed to be set by the
audio creator/encoder and not the end user, then storing the album gain
there will overwrite the creator's value.

In addition, since the header gain field has no semantic meaning, there
is no way to know that the value in this field corresponds to the R128
album gain as opposed to an arbitrary creator-provided value, meaning
that a player cannot reliably let the user select 'album' vs. 'track'
gain during playback according to their preference.

(I admit that for the time being, most Opus files will be encoded by the
same person as the end user who will be listening to them, of course...)

Anyways, this is getting a bit far afield from my original topic...
I want to play back Ogg Opus files at the same loudness level as my
existing Replaygained Ogg Vorbis/FLAC/MP3/AAC/etc. files. Currently I
have 3 choices:

1. Add Vorbis-style REPLAYGAIN_{TRACK,ALBUM}_GAIN comments with
adjustments relative to the header gain (usually 0). This works
correctly in all players that I have tried, even though the spec says
that these tags should not be used. Since players generally share
comment parsing code between Vorbis and Opus, these comments are likely
to continue working for the foreseeable future.

2. Store either the track or album replaygain values in the header gain
field. This works correctly in all players, but does not allow me to
choose between album and track gain at playback time. (e.g. for
continuous playback versus random/shuffle playback)

3. Store an arbitrary value (usually 0) in the header gain field, and
use the R128_TRACK_GAIN comment with an R128 adjustment. Have the player
apply a further offset so that the R128 reference level matches the
ReplayGain one. foobar2000 gets this almost right, except that it assumes
that the header gain field corresponds to the album gain, which may not
be correct. No other player currently supports this. (On the other hand,
since comment parsing code is generally shared, this means that the
R128_TRACK_GAIN comment will likely start working in Ogg Vorbis files
soon.)
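
As a rough illustration of the arithmetic behind choice 3, here is a
minimal sketch of how a player might combine the header gain with an
R128_TRACK_GAIN comment. It assumes both values are Q7.8 fixed-point
dB (an integer equal to dB * 256), which is my reading of the Ogg Opus
draft. The function names, the dict of comments with upper-case keys,
and the +5 dB offset (R128's -23 LUFS target versus roughly -18 LUFS
for ReplayGain) are all illustrative assumptions, not something the
draft specifies.

```python
# Hypothetical offset between the R128 reference level (-23 LUFS) and the
# level ReplayGain is usually treated as targeting (about -18 LUFS).
# This value is an assumption made for the sketch.
R128_TO_REPLAYGAIN_DB = 5.0


def q7_8_to_db(value: int) -> float:
    """Convert a signed 16-bit Q7.8 fixed-point gain to dB."""
    if not -32768 <= value <= 32767:
        raise ValueError("Q7.8 gain must fit in a signed 16-bit integer")
    return value / 256.0


def playback_gain_db(header_gain_q78: int, comments: dict,
                     use_track_gain: bool = True,
                     match_replaygain: bool = True) -> float:
    """Total gain in dB that a player would apply to the decoded audio.

    The header gain is always applied. The tag is optional and is
    interpreted relative to the header gain.
    """
    gain = q7_8_to_db(header_gain_q78)
    tag = comments.get("R128_TRACK_GAIN")
    if use_track_gain and tag is not None:
        gain += q7_8_to_db(int(tag))
        if match_replaygain:
            # Shift from the R128 reference level to the ReplayGain one.
            gain += R128_TO_REPLAYGAIN_DB
    return gain


def db_to_linear(db: float) -> float:
    """Convert a gain in dB to a linear amplitude scale factor."""
    return 10.0 ** (db / 20.0)
```

With a header gain of 0 and a tag value of "-512" (that is, -2 dB), the
sketch yields -2 + 5 = 3 dB in total. Note that the offset is only
added when the tag is present, because the header gain carries no
indication of which reference level, if any, it was computed against.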

At this point, I see the replaygain comments as the most compatible /
interoperable way of reaching my goal, but it does mean that I'm using
tags that the spec says SHOULD NOT be used. I'm wondering if it
would make sense to specify the behaviour for when these tags are
present.

-- 
Calvin Walton <calvin.walton@kepstin.ca>