Re: [codec] Comments on draft-ietf-codec-ambisonics-01

Jean-Marc Valin <jmvalin@mozilla.com> Mon, 13 March 2017 21:55 UTC

To: Drew Allen <bitllama@google.com>, Mark Harris <mark.hsj@gmail.com>, Jan Skoglund <jks@google.com>, "codec@ietf.org" <codec@ietf.org>
References: <2f534e1b-b1af-266a-50ef-36f1739d878b@jmvalin.ca> <CAMdZqKGzdndiwpdXsYcHS7+r8Ega5LcQmAvcjiuHTHJgtTUwDg@mail.gmail.com> <CA+KMCSXhS2m4Dkous=4RkOibYWuoi+V_zBrhi1+anm-c+syQ1Q@mail.gmail.com> <CAMdZqKFDtD684HMkoO9bXi-c+g+8R+ay9kPdWSQOtHFDbC3ZLA@mail.gmail.com> <CABQ9DcuD+Et6+rBG-rCnWX-Dk-9STZMeYs-6fQWTk1kyjigRhw@mail.gmail.com>
From: Jean-Marc Valin <jmvalin@mozilla.com>
Message-ID: <52f5a570-e9f4-ea49-515e-498f0ed4f1bb@mozilla.com>
Date: Mon, 13 Mar 2017 17:55:07 -0400
Archived-At: <https://mailarchive.ietf.org/arch/msg/codec/-bOaesNfWjACrf8ScOUwPApczA4>
Subject: Re: [codec] Comments on draft-ietf-codec-ambisonics-01

On 13/03/17 05:44 PM, Drew Allen wrote:
> I think the issue is that the total number of channels rises
> quadratically with the ambisonic order: (N + 1)^2. If a user
> wants to use just the horizontal channels, it is only 2 * N + 1. If they
> wish to code very high-order (>10th order) horizontal channels, they
> would be artificially limited by all the zero channels being produced,
> no? Or can this be handled without actually creating all those empty channels?
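
To put numbers on the two channel counts quoted above, here is a minimal
Python sketch. The function names are illustrative only and do not come
from the draft; N here is the ambisonic order, not the stream count used
in the matrix dimensions below.

```python
def full_sphere_channels(order):
    """Channels in a fully periphonic signal of this order: (N + 1)^2."""
    return (order + 1) ** 2


def horizontal_channels(order):
    """Channels when only the horizontal components are kept: 2 * N + 1."""
    return 2 * order + 1


for order in (1, 3, 10):
    full = full_sphere_channels(order)
    horiz = horizontal_channels(order)
    # The difference is the number of channels that would be all zero
    # if a horizontal-only signal were carried in the full layout.
    print(order, full, horiz, full - horiz)

# At order 10 this prints: 10 121 21 100
```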

As far as I understand, the current draft already has all the
limitations you're describing. The channel mapping array is basically
equivalent to a CxC permutation matrix that multiplies the Cx(N+M)
weight matrix. The result is still a Cx(N+M) matrix, so storing that
product as the weights does everything the channel mapping does, with
no need for a separate permutation step.
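
A minimal sketch of that equivalence in plain Python, assuming made-up
sizes (C = 4, N+M = 3) and an arbitrary example mapping. The helper names
and the convention that mapping[i] names the source row for output row i
are assumptions for illustration, not taken from the draft.

```python
def matmul(a, b):
    """Plain matrix product of two lists-of-rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def permutation_matrix(mapping):
    """CxC matrix whose row i selects row mapping[i] of what it multiplies."""
    size = len(mapping)
    return [[1 if col == mapping[row] else 0 for col in range(size)]
            for row in range(size)]


# Hypothetical Cx(N+M) weight matrix: 4 output channels, 3 decoded channels.
weights = [[1.0, 0.0, 0.0],
           [0.0, 1.0, 0.0],
           [0.0, 0.0, 1.0],
           [0.5, 0.5, 0.0]]

# Hypothetical channel mapping expressed as a permutation of the 4 rows.
mapping = [2, 0, 3, 1]

# Multiplying by the permutation matrix...
permuted = matmul(permutation_matrix(mapping), weights)
# ...gives the same result as simply reordering the rows of the weights.
reordered = [weights[src] for src in mapping]

assert permuted == reordered
# The product keeps the Cx(N+M) shape, so it can be stored as the weights.
assert len(permuted) == 4 and len(permuted[0]) == 3
```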

Cheers,

	Jean-Marc

> On Mon, Mar 13, 2017 at 2:41 PM Mark Harris <mark.hsj@gmail.com> wrote:
> 
>     On Mon, Mar 13, 2017 at 10:38 AM, Jan Skoglund <jks@google.com> wrote:
>     > Hey,
>     >
>     > Thanks for your comments
>     >
>     > On Mon, Mar 13, 2017 at 10:08 AM Mark Harris <mark.hsj@gmail.com> wrote:
>     >>
>     >> On Fri, Feb 17, 2017 at 1:57 PM, Jean-Marc Valin <jmvalin@jmvalin.ca> wrote:
>     >> > 3.2.  Channel Mapping Family 3
>     >> >
>     >> > I would suggest removing the "Output Channel Numbering" field
>     >> > because it is fully equivalent to simply permuting lines of the
>     >> > matrix. Also, I believe that the size of the matrix was meant to be
>     >> > "32*(N+M)*C bits" rather than "32*N*C bits".
>     >>
>     >> To expand on this a bit, a mapping family maps M+N decoded channels
>     >> (corresponding to the actual order of the coupled and uncoupled
>     >> channels in the bitstream) to C output channels (channels with a
>     >> specific semantic meaning).  The additional "Output Channel
>     >> Numbering" table confuses things by adding an additional mapping
>     >> from the output channel numbers to a different set of numbers with
>     >> actual semantic meaning, leaving the output channel numbers with no
>     >> apparent meaning.
>     >>
>     >> This does have a potential benefit as a matrix compression technique,
>     >> to reduce the size of the matrix when it would contain rows that are
>     >> all zero.  However considering that the matrix occurs only once, and
>     >> mapping family 2 already offers a way to compress the matrix, this
>     >> alone does not seem worth the complexity of another level of
>     >> indirection.  If matrix compression is desired it would probably be
>     >> less confusing to describe it in those terms and keep the semantic
>     >> meaning tied to the output channels.
>     >>
>     >>
>     >> The description of the Output Channel Numbering also does not specify
>     >> the intended behavior if the same value appears in the table multiple
>     >> times.
>     >>
>     >> Additionally, section 4.2 describes how to perform a stereo downmix
>     >> of mapping family 3, but makes assumptions about the output channel
>     >> numbering.  This seems harmful and likely to promote implementations
>     >> that make similar assumptions.  If it is necessary to apply the
>     >> output channel numbering described in section 3.2 in order to
>     >> implement a correct stereo downmix, then it would be better to simply
>     >> use the output channels from section 3 as input to the downmix,
>     >> consolidating sections 4.1 and 4.2, rather than specify new formulas
>     >> that make assumptions about the mapping.  That would also greatly
>     >> simplify section 4.
>     >>
>     >> Eliminating the Output Channel Numbering table as Jean-Marc suggests
>     >> should resolve these concerns.
>     >
>     >
>     > The problem is that once we allow mixed orders there is no unique
>     > way for a receiver/decoder to resolve the mapping to ACNs from just
>     > a number of total output channels.
> 
> 
>     In mapping family 2, the channel count (C) is the number of channels
>     in the fully periphonic configuration, but it is not necessary to
>     encode them all.  The channel mapping table can map each ACN to a
>     specific decoded channel or to silence.  For mixed order, some of the
>     ACNs will be mapped to silence and will not be encoded.
> 
>     In mapping family 3, the matrix can do everything that the channel
>     mapping table can do and more.  Why not treat C in the same manner, as
>     the number of channels in the fully periphonic configuration, even if
>     some are silent?
> 
>      - Mark
> 
>     _______________________________________________
>     codec mailing list
>     codec@ietf.org <mailto:codec@ietf.org>
>     https://www.ietf.org/mailman/listinfo/codec
> 