Re: [codec] OggOpus: Rational for excluding replaygain tags?

Ron <ron@debian.org> Tue, 27 November 2012 07:59 UTC


On Mon, Nov 26, 2012 at 11:50:37PM -0500, Calvin Walton wrote:
> If I select to use 'Album Gain' mode for ReplayGain in the playback menu
> of my iPod running Rockbox:
> http://download.rockbox.org/daily/manual/rockbox-ipodvideo/rockbox-buildch7.html#x10-1270007.9
> there is currently no way for the player to know whether or not the
> arbitrary value in the header gain field represents R128-normalized
> album gain or something completely different.

The only thing the decoder needs to know about the header gain is that
it should _always_ apply it.  Trying to assign some other meaning to it
and then applying it selectively is precisely the kind of confusion,
misbehaviour, and random difference between players that this mechanism
aims to avoid.

The draft says "virtually all players", because specialised applications
may have reasons to want the raw data level, but the simple rule for
most players is "you can't know what it means, it's the level the author
of the file wanted samples decoded at, just always apply it before doing
anything else".
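
As a rough illustration of "just always apply it", here is a minimal
Python sketch. It assumes the ID header layout from the Ogg Opus draft
(the 'output gain' field as a signed 16-bit little-endian Q7.8 dB value
at byte offset 16 of the OpusHead packet). The function names are
hypothetical and are not taken from any real player.

```python
import struct

def read_output_gain_db(opus_head: bytes) -> float:
    """Return the output gain in dB from a raw OpusHead packet."""
    if len(opus_head) < 19 or opus_head[:8] != b"OpusHead":
        raise ValueError("not an OpusHead packet")
    # Assumed layout: signed 16-bit little-endian at offset 16, Q7.8 dB.
    (gain_q8,) = struct.unpack_from("<h", opus_head, 16)
    return gain_q8 / 256.0

def gain_to_scale(gain_db: float) -> float:
    """Convert a gain in dB to a linear amplitude scale factor."""
    return 10.0 ** (gain_db / 20.0)

def apply_header_gain(samples, opus_head: bytes):
    """Scale decoded samples by the header gain, unconditionally."""
    scale = gain_to_scale(read_output_gain_db(opus_head))
    return [s * scale for s in samples]
```

Note that nothing here asks what the gain "means"; the player only
reads the field and scales the decoded output by it.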

Trying to attach an 'album gain' mode selector switch to it adds the kind
of chaos that this recommendation is meant to avoid:

 To avoid confusion with multiple normalization schemes, an Opus
 comment header SHOULD NOT contain any of the REPLAYGAIN_TRACK_GAIN,
 REPLAYGAIN_TRACK_PEAK, REPLAYGAIN_ALBUM_GAIN, or
 REPLAYGAIN_ALBUM_PEAK tags.

In the absence of an ALBUM_GAIN tag, your album gain mode switch should
simply do exactly nothing at all.  If it tries to do something that it
thinks is "smart" instead, then it's lying to you about what the album
gain switch does (unless it's actually going to analyse the whole album
for you on the fly...).


> Most players fall back to using the track gain field if the album gain
> is not present - but this can't work properly with the current
> specification of Ogg Opus. They must either assume that the user has put
> R128 album gain into the header gain field (I think foobar2000 currently
> does this), or only use track gain.

If there are players where you explicitly select "album gain" and they
arbitrarily apply "track gain" instead, then I would say that they are
already not working properly.

But since we do seem to have generated some confusion here with at least
one person, I wonder if we need to reword this clause in the spec:


 There is no Opus comment tag corresponding to REPLAYGAIN_ALBUM_GAIN.
 That information should instead be stored in the ID header's 'output
 gain' field.


You seem to have confused that with "header gain may mean album gain".
That's not what it means at all.  What it means is "album gain is useless
and most things only get it wrong".  If you want your files normalised
to that level, and didn't do that when you first encoded them, then you
can still do it 'losslessly' by adjusting the header gain field later.
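
A minimal sketch of that kind of 'lossless' adjustment, in Python. It
assumes the same ID header layout as the Ogg Opus draft (signed 16-bit
little-endian Q7.8 dB at byte offset 16), and the function name is
hypothetical. It only patches a raw OpusHead packet held in memory; a
real tool would also have to rewrite the enclosing Ogg page and
recompute its checksum, which is not shown here.

```python
import struct

def set_output_gain_db(opus_head: bytes, gain_db: float) -> bytes:
    """Return a copy of the OpusHead packet with a new output gain."""
    if len(opus_head) < 19 or opus_head[:8] != b"OpusHead":
        raise ValueError("not an OpusHead packet")
    gain_q8 = round(gain_db * 256)  # dB -> Q7.8 fixed point
    if not -32768 <= gain_q8 <= 32767:
        raise ValueError("gain out of range for a 16-bit Q7.8 field")
    patched = bytearray(opus_head)
    # Only the two gain bytes change; the audio packets are untouched.
    struct.pack_into("<h", patched, 16, gain_q8)
    return bytes(patched)
```

The encoded audio is never re-encoded; only two bytes of the header
differ between the original and the adjusted file.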

That way, _every_ player that doesn't apply some other gain of its own
will play them back with the desired "album gain".  And since "album gain"
is basically a euphemism for "play all tracks at the level that the artist
intended, not at some false-normalised level" -- that actually corresponds
fairly well to what the header gain does.

If an entire album is _supposed_ to be exceptionally quiet, or similarly
exceptionally loud, then you don't _want_ it normalised to some arbitrary
muzak sound pressure level that is the same for all albums.
Header gain lets you have all of those options, with the best guarantee
we can give that it will actually work the same in all players.


> My personal suggestion would be to have an R128_ALBUM_GAIN field, with
> the same format and similar semantics to the R128_TRACK_GAIN field. In
> the case where the album gain is stored in the header gain field, the
> R128_ALBUM_GAIN field SHOULD be present and be filled with a value of
> '0', indicating that no further adjustment is needed to get R128 album
> normalized loudness.

If you want all your albums to play at exactly the same level, regardless
of what the author intended, then you can tweak the header gain field
without re-encoding (and you become the new author whose choices all
players will respect).

Selecting track gain does not mean that a player should ignore the header
gain.  It means the player should apply that relative to the already
applied header gain _in addition_.  Likewise, unselecting album gain
should _never_ cause a player to ignore the header gain.
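
The rule above can be sketched in Python as follows. This is only an
illustration: the mode names and the function are hypothetical, and it
assumes the R128_TRACK_GAIN tag value is an ASCII integer in Q7.8 dB,
as in the Ogg Opus draft.

```python
from typing import Optional

def total_gain_db(header_gain_db: float, mode: str,
                  r128_track_gain: Optional[str] = None) -> float:
    """Gain (in dB) a player applies for a given gain mode."""
    # The header gain is applied in every mode, no exceptions.
    gain = header_gain_db
    if mode == "track" and r128_track_gain is not None:
        # Track gain is relative to the already applied header gain.
        gain += int(r128_track_gain) / 256.0
    # "album" and "off" add nothing: there is no album gain tag, and
    # the player must not guess at one.
    return gain
```

For example, with a header gain of -6 dB and a track gain tag of "512"
(+2 dB), "track" mode gives -4 dB, while "album" and "off" both give
-6 dB.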

If you are already confused with just 2 gain knobs to adjust, can you
imagine how many more ways people will find to be confused if we add
even more of them, with even more rules for when to apply them?


I don't mean to dismiss your concerns here, but I get the feeling that
there's some element of people talking past each other with different
impressions of what the same things actually mean.  And if that is the
case, then we should clarify the spec to better explain that header
gain is NOT "album gain", and should not be interpreted as such by any
'normal' player.

All the spec is intended to say is that an author _may_ use it for that,
just as they may use it to normalise to any other arbitrary level that
is appropriate for the *default* playback of that file.  But it's not
the job of a player to know what it is supposed to mean - for a player
to behave correctly, and the same as all other players, all it needs to
do is just always apply it.  We can't make a rule that is much simpler
to always get right across all player implementations than that :)

Does that make sense?

  Cheers,
  Ron