Re: [codec] draft-ietf-codec-oggopus and "album" gain

Ron <> Wed, 27 August 2014 21:21 UTC

Date: Thu, 28 Aug 2014 06:51:14 +0930
From: Ron <>
To: Ian Nartowicz <>
Message-ID: <20140827212114.GV326@hex.shelbyville.oz>
References: <20140813222201.54fe7910@crunchbang> <> <20140816040140.GA31682@hex.shelbyville.oz> <> <20140827153043.2ff5e031@crunchbang>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Disposition: inline
In-Reply-To: <20140827153043.2ff5e031@crunchbang>
User-Agent: Mutt/1.5.23 (2014-03-12)
Subject: Re: [codec] draft-ietf-codec-oggopus and "album" gain

On Wed, Aug 27, 2014 at 03:30:43PM +0100, Ian Nartowicz wrote:
> On Mon, 25 Aug 2014 16:11:21 -0700 Ralph Giles <> wrote:
> >"By default, implementations of this specification MUST respect the
> >'output gain' field, but MAY NOT respect the comments. Encoder authors
> >are advised to take this into account. For example, to produce R128
> >normalized files it's more reliable for post-processing application to
> >update the 'output gain' field and write a comment 'R128_TRACK_GAIN=0'
> >than to put the normalized value directly in the comment."
> >
> >This removes the normative suggestion of 'should' while maintaining the
> >suggestion and rationale.
> This seems like a step backwards to me.  That MUST is a requirement
> that wasn't present before.  An earlier statement is "Virtually all players
> and media frameworks should apply it by default.", which I think is the
> appropriate guidance.

It really always has been intended as a MUST (but we know there are some
VERY RARE exceptions where you won't).

No general purpose player should ever provide a button to disable this.

The floating point version of Opus can ostensibly record samples in the
range of +/- FLT_MAX, and while the reference implementation places some
arbitrary "you've got to be kidding" limit below that, it's still
several times greater than the fixed point range, and there is no reason
an alternative implementation might not allow a larger range.

There's at least one test sample that is way below the range of hearing
without the output gain applied, and is only audible after it is, and
there is no reason that real world samples might not exist in the
opposite configuration - where they play at a normal level with the
gain applied, but would blow your speakers and eardrums out without it.

If someone played such a sample, believing they knew what it contained
from having played it elsewhere, in a player that disables this, you're
potentially going to make them very, very sad.  Or at the very least,
surprised and annoyed.

The output gain is a guarantee to whoever mastered the recording that
"by default" this is the output level that will be seen.  It lets them
take the raw data of their latest rocket launch or atomic test and
'losslessly' adjust the gain to more reasonable listening levels, for
whatever they decide is 'reasonable' for that file, making it suitable
for more general use without needing to re-encode it.
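
As a rough sketch of what applying the output gain involves, assuming
the OpusHead layout from the draft (a signed 16-bit little-endian Q7.8
value in dB at byte offset 16) and decoded floating point samples.  The
function names are made up for illustration and are not part of libopus
or any real player:

```python
import struct


def read_output_gain_q8(opus_head: bytes) -> int:
    """Return the raw Q7.8 'output gain' field from an OpusHead packet."""
    if len(opus_head) < 19 or opus_head[:8] != b"OpusHead":
        raise ValueError("not an OpusHead packet")
    # Layout: magic(8) version(1) channels(1) pre-skip(2) rate(4) gain(2) ...
    (gain_q8,) = struct.unpack_from("<h", opus_head, 16)
    return gain_q8


def gain_to_scale(gain_q8: int) -> float:
    """Convert a Q7.8 gain in dB to a linear amplitude multiplier."""
    return 10.0 ** (gain_q8 / (20.0 * 256.0))


def apply_output_gain(samples, gain_q8):
    """Scale decoded samples; the encoded packets are never touched."""
    scale = gain_to_scale(gain_q8)
    return [s * scale for s in samples]
```

Only the two header bytes change when the gain is adjusted, which is
why no re-encode is needed.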

> >I'd be inclined to drop the "By default" there, unless we're going
> >to expand on what the exceptions to the 'default' might be.
> I agree with Ron that saying "By default" and "MUST" in the same sentence is
> confusing, but I think the solution is not to say MUST.

My 'objection' to the "By default" was actually that it watered down the
MUST without providing specific guidance as to why most implementers
shouldn't assume they are the special snowflakes the exceptions apply to.

Your confusion in thinking that it's not really a MUST for a general
purpose player is exactly why I think we need to strengthen and
clarify that, to avoid having inconsistent players and sad users :)

> There are exceptions to the default.  Trying to specify them all here
> would seem an impossible feat of guesswork, but since they exist then
> MUST is too strong.

I don't think we need to specify them exhaustively to be able to give
some guidance for what they might be (or perhaps more importantly,
what they definitely are NOT).

In a general sense, really the *only* exception is: "I want to analyse
the raw signal level, in order to further adjust the output_gain and/or
other gain reference tags".

As soon as you say "I want to play this to a listening human", your
exception to the MUST is null and void.  They have a volume knob
which does that job, and it adjusts the level after the output_gain
correction is made (or possibly automatically with guidance from
the comment tags).

Personally, I think that exception is obvious enough in its own right
and well within the existing language elsewhere that we can simply
say MUST with no exceptions here (ie. an encoder that is going to
set the output_gain obviously must be able to measure the raw level
it is going to apply that to -- but that's not the same as playing
it to a listener at that level).
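
To illustrate that one exception, here is a hedged sketch of a
post-processing tool that measures the raw (pre-gain) level and picks
an output gain from it.  It uses a simple peak measurement purely as a
stand-in; a real tool would more likely measure R128 loudness.  The
function name and the target_peak parameter are hypothetical:

```python
import math


def output_gain_for_peak(raw_samples, target_peak=1.0):
    """Pick a Q7.8 output gain that brings the raw peak to target_peak.

    raw_samples are decoded WITHOUT any output gain applied, which is
    the analysis case; they are measured here, not played.
    """
    peak = max((abs(s) for s in raw_samples), default=0.0)
    if peak == 0.0:
        return 0  # silence: nothing to correct
    gain_db = 20.0 * math.log10(target_peak / peak)
    gain_q8 = round(gain_db * 256.0)
    # The header field is a signed 16-bit integer, so clamp to its range.
    return max(-32768, min(32767, gain_q8))
```

The clamp also shows the limit of the field: Q7.8 can only express
roughly +/- 128 dB of correction.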

If we need some further language to clarify that though, then yes
we definitely ought to add it.  I can elaborate on the intention,
but we really do need you to tell us what vital clue is missing
that leaves it unclear in your reading of it.  If you think
we should drop the MUST, then clearly something is still not clear
enough here yet :)

> The intention of this change, after much discussion, was specifically not to
> constrain how an encoder should split an R128 gain between the output gain
> field and the tags.

That's actually never been constrained, and nothing in the language
proposed here should have changed that.  The output_gain has *never*
been constrained to any particular calibration other than "what the
person who made this file thinks is best", and any R128 gain was
*always* to be applied after the output_gain correction.
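
A minimal sketch of that ordering from the player's side, assuming the
R128 tags carry Q7.8 dB values as decimal ASCII strings and that the
comments have already been parsed into a dict.  The function and its
use_r128 switch are illustrative only, not taken from any existing
player:

```python
def playback_gain_q8(output_gain_q8, comments, use_r128=None):
    """Return the total Q7.8 gain a player would apply before the volume knob.

    use_r128 is None, "R128_TRACK_GAIN" or "R128_ALBUM_GAIN".  The user
    may switch the tag on or off; the header output gain has no switch.
    """
    total = output_gain_q8  # always applied, whatever the tags say
    if use_r128 is not None:
        value = comments.get(use_r128)
        if value is not None:
            try:
                total += int(value)  # gains in dB add
            except ValueError:
                pass  # malformed tag: ignore it, keep the output gain
    return total
```

With comments = {"R128_ALBUM_GAIN": "0"} the result is the output gain
alone whether or not the tag is honoured, which is the point: a zero
tag says nothing about the header field.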

> Hence: advice rather than normative language.

We're providing advice about how you might use R128 tags and whether
or not a player needs to respect them.  The output_gain always has
been, and MUST be, normative and mandatory for all players at all

> Part of the reason is quickly apparent in the discussion of whether to
> place the track gain or album gain in the output gain field, and the
> answer was not to specify.

I thought the only confusion there was "how do I turn off album gain
if there isn't an explicit album gain tag?" - given that you MUST NOT
disable the output_gain, which may or may not have been calibrated to
R128 album gain levels.

You still MUST NOT turn off output gain, even if R128_ALBUM_GAIN=0
is explicitly specified, since the raw file still could be at "I'm
50 meters from ground zero of an atomic test" levels.  That still
doesn't imply *anything* about what the output_gain is or means.

But we've given you an explicit album gain tag now that you can
switch on or off if it's not zero.  I'm not sure what the remaining
confusion really is, but indeed, let's find it and clear it up.