Re: [codec] OggOpus: Rational for excluding replaygain tags?

Calvin Walton <calvin.walton@kepstin.ca> Tue, 27 November 2012 09:52 UTC

Message-ID: <1354009968.10253.55.camel@ayu>
From: Calvin Walton <calvin.walton@kepstin.ca>
To: Ron <ron@debian.org>
Date: Tue, 27 Nov 2012 04:52:48 -0500
In-Reply-To: <20121127075909.GH2043@audi.shelbyville.oz>
References: <1352307794.14547.30.camel@ayu> <9B8EA46C78239244B5F7A07E163D3DFE08C500@CH1PRD0511MB432.namprd05.prod.outlook.com> <1352328081.14547.82.camel@ayu> <9B8EA46C78239244B5F7A07E163D3DFE05F88C2D@CH1PRD0511MB432.namprd05.prod.outlook.com> <1353991837.2000.41.camel@nayuki.kepstin.ca> <20121127075909.GH2043@audi.shelbyville.oz>
Cc: codec@ietf.org
Subject: Re: [codec] OggOpus: Rational for excluding replaygain tags?

On Tue, 2012-11-27 at 18:29 +1030, Ron wrote:
> On Mon, Nov 26, 2012 at 11:50:37PM -0500, Calvin Walton wrote:
> > If I select to use 'Album Gain' mode for ReplayGain in the playback menu
> > of my iPod running Rockbox:
> > http://download.rockbox.org/daily/manual/rockbox-ipodvideo/rockbox-buildch7.html#x10-1270007.9
> > there is currently no way for the player to know whether or not the
> > arbitrary value in the header gain field represents R128-normalized
> > album gain or something completely different.
> 
> The only thing the decoder needs to know about the header gain is that
> it should _always_ apply it.  Trying to assign some other meaning to it
> and then applying it selectively is precisely the kind of confusion,
> misbehaviour, and random difference between players that this mechanism
> aims to avoid.

This is very different from the ReplayGain tags. ReplayGain is
*optional* at playback time, and is only applied if the user
*explicitly* requests one or the other type of loudness normalization.

The issue arises because there are players that already support
the ReplayGain model, and they are trying to shoehorn approximately
corresponding values into Opus header and comment fields that don't
share the same semantics.

ReplayGain-supporting players typically have 3 settings:
     1. ReplayGain off: Use the original audio levels of the file (as
        set by the mastering engineer/record label/artist/file encoder)
     2. ReplayGain album mode: Normalize the loudness of an album, but
        keep track-to-track variations. This is the counter to the loudness
        wars (see below), and is useful for general listening.
     3. ReplayGain track mode: Normalize all tracks to the same
        loudness. Useful for things like party background music randomly
        selected from multiple albums, or listening to music on public
        transit from a portable player, where a quiet song could be
        drowned out.

Players based on the current Ogg Opus standard cannot support all 3
modes on a single file, because the playback gain field may correspond
to setting 1, setting 2, or something else entirely, at the discretion
of the person who last modified the audio file.
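
To make the three settings concrete, here is a minimal Python sketch of
the selection logic a ReplayGain-style player might use. The names
(Mode, replaygain_db) are made up for illustration and are not taken
from any real player. It assumes track and album gain are stored as two
separate optional values in dB, which is exactly what a single header
gain field cannot provide.

```python
from enum import Enum
from typing import Optional


class Mode(Enum):
    OFF = "off"      # setting 1: original levels
    ALBUM = "album"  # setting 2: per-album normalization
    TRACK = "track"  # setting 3: per-track normalization


def replaygain_db(mode: Mode,
                  track_gain_db: Optional[float],
                  album_gain_db: Optional[float]) -> float:
    """Return the user-selected gain in dB; 0.0 means 'leave the level alone'."""
    if mode is Mode.ALBUM and album_gain_db is not None:
        return album_gain_db
    if mode is Mode.TRACK and track_gain_db is not None:
        return track_gain_db
    # OFF, or the requested value is missing. What to do about a missing
    # value is a separate policy choice; this sketch just applies nothing.
    return 0.0
```

All three modes only work on one file when both values are present and
stored apart from the default level.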

Note that I generally want the "default" playback of the file to be at
original levels, so that if I e.g. pass a file to a friend who does not
use ReplayGain, they don't complain that "This file is too quiet!" Keep
in mind that a file of pop music with R128 normalization might play back
15 dB (or more!) quieter than the original signal!

As a result, I want my preferred playback loudness to be stored
*separately* from the default value, in a place where it is only used if
I explicitly select either album or track gain (and which one I choose
depends on what conditions I am listening to music under.)

> In the absence of an ALBUM_GAIN tag, your album gain mode switch should
> simply do exactly nothing at all.  If it tries to do something that it
> thinks is "smart" instead, then it's lying to you about what the album
> gain switch does (unless it's actually going to analyse the whole album
> for you on the fly...).

The reason for this is a kind of best-effort fallback: you asked for the
music to be loudness-normalized. If it can't be normalized in the manner
you asked, a fallback is to estimate a close level. Track gain is
usually within 1-2 dB of album gain on most releases.
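
Roughly, the fallback amounts to the following. This is a hypothetical
sketch, not Rockbox's actual code; album_mode_gain_db is a name I made
up, and it assumes both gains are optional values in dB.

```python
from typing import Optional


def album_mode_gain_db(album_gain_db: Optional[float],
                       track_gain_db: Optional[float]) -> Optional[float]:
    """Gain to use when the user selected album mode.

    Prefers the album value; if the file has none, falls back to the
    track value as a close estimate. Returns None when neither exists.
    """
    if album_gain_db is not None:
        return album_gain_db
    return track_gain_db
```

With an album gain of -7.0 and a track gain of -8.0 this returns -7.0.
With the album value missing it returns -8.0, which is close to the
level you asked for rather than no normalization at all.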

If this isn't done, you could (given modern pop mastering) suddenly get
a track that's 10 dB or more louder than everything else (add 5 dB if
you're using R128 instead of ReplayGain), simply because e.g. it was a
single-track download and you didn't notice that your scanning tool
didn't add 'album' gain to it. The fallback is there to protect your
ears and/or speakers.

> But since we do seem to have generated some confusion here with at least
> one person, I wonder if we need to reword this clause in the spec:
> 
> 
>  There is no Opus comment tag corresponding to REPLAYGAIN_ALBUM_GAIN.
>  That information should instead be stored in the ID header's 'output
>  gain' field.
> 
> 
> You seem to have confused that with "header gain may mean album gain".
> That's not what it means at all.  What it means is "album gain is useless
> and most things only get it wrong".  If you want your files normalised
> to that level, and didn't do that when you first encoded them, then you
> can still do it 'losslessly' by adjusting the header gain field later.
> 
> That way, _every_ player, that doesn't apply some other gain of its own,
> will play them back with the desired "album gain".  And since "album gain"
> is basically a euphemism for "play all tracks at the level that the artist
> intended, not at some false-normalised level" -- that actually corresponds
> fairly well to what the header gain does.
> 
> If an entire album is _supposed_ to be exceptionally quiet, or similarly
> exceptionally loud, then you don't _want_ it normalised to some arbitrary
> muzak sound pressure level that is the same for all albums.
> Header gain lets you have all of those options, with the best guarantee
> we can give that it will actually work the same in all players.

> If you want all your albums to play at exactly the same level, regardless
> of what the author intended, then you can tweak the header gain field
> without re-encoding (and you become the new author that all players will
> respect the choices of).

Unfortunately, if the record labels and mastering engineers get their
choice of selecting the playback level, you get something called
"Loudness Wars", because of the purely psychological fact that louder
music sounds better. This dates back to the days of vinyl, when e.g.
particularly "hot" vinyl masters would sound better when played in a
jukebox. Over the ~18 years that we've had CDs, the mastering volume of
pop music on CD has gone up around 10 dB. Record labels are refusing to
release quieter music, because it won't sound as good as the existing
loud recordings that someone already has. And online music sales have to
sound as good as the CD you just ripped. (The artist rarely gets a say in
the matter, really...)

The main point of ReplayGain in album mode is to counter this trend and
even out the levels between albums, because the listener doesn't trust
the mastering engineer's choice of volume.

> Selecting track gain does not mean that a player should ignore the header
> gain.  It means the player should apply that relative to the already
> applied header gain _in addition_.  Likewise, unselecting album gain
> should _never_ cause a player to ignore the header gain.

All players that I've tested handle this correctly. Players always apply
the header gain, then any other gain values on top of that.
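
In code, that stacking is just addition in dB. A minimal sketch,
assuming the Q7.8 fixed-point dB encoding (integer value / 256) that the
Ogg Opus draft uses for the header's output gain, and assuming the extra
gain is expressed in the same units. The function names are
hypothetical.

```python
def q78_to_db(value: int) -> float:
    """Convert a signed Q7.8 fixed-point value to dB (256 units per dB)."""
    return value / 256.0


def playback_scale(header_gain_q78: int, extra_gain_q78: int = 0) -> float:
    """Linear amplitude factor to multiply decoded samples by.

    The header gain is always included; extra_gain_q78 is whatever the
    user-selected mode contributes on top (0 when normalization is off).
    """
    total_db = q78_to_db(header_gain_q78) + q78_to_db(extra_gain_q78)
    return 10.0 ** (total_db / 20.0)
```

For example, a header gain of -1280 (-5 dB) with an extra gain of +1280
(+5 dB) cancels out to a scale factor of 1.0.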

> I don't mean to dismiss your concerns here, but I get the feeling that
> there's some element of people talking past each other with different
> impressions of what the same things actually mean.  And if that is the
> case, then we should clarify the spec to better explain that header
> gain is NOT "album gain", and should not be interpreted as such by any
> 'normal' player.

If this is the case, then there should be a separate album gain field so
that there is a place to store a value which a player could interpret as
such.

> All the spec is intended to say is that an author _may_ use it for that,
> just as they may use it to normalise to any other arbitrary level that
> is appropriate for the *default* playback of that file.  But it's not
> the job of a player to know what it is supposed to mean - for a player
> to behave correctly, and the same as all other players, all it needs to
> do is just always apply it.  We can't make a rule that is much simpler
> to always get right across all player implementations than that :)

Players have already screwed this up by giving the field an additional
meaning that wasn't intended, because there wasn't a field corresponding
to the exact meaning desired.

-- 
Calvin Walton <calvin.walton@kepstin.ca>