Re: [Mlcodec] draft-valin-opus-extension-01

Jean-Marc Valin <jmvalin@jmvalin.ca> Mon, 07 August 2023 05:47 UTC

Message-ID: <414f7d97-288b-d8bc-0caf-1b95a572e5eb@jmvalin.ca>
Date: Mon, 07 Aug 2023 01:47:52 -0400
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.13.0
Content-Language: en-US
To: Roman Shpount <roman@telurix.com>, Greg Maxwell <gmaxwell@gmail.com>, "mlcodec@ietf.org" <mlcodec@ietf.org>
References: <CAAS2fgQ1HeWQUcTgpxTq66FFn_G6UnhToc8Rtz4Pkc-MKN7n8g@mail.gmail.com> <PH0PR17MB4908948401600486033FB66BAE0FA@PH0PR17MB4908.namprd17.prod.outlook.com> <CAD5OKxu452kOQkcP+sPQzcyLSmt8p5gxLzcK2bCy2p8dYRApUg@mail.gmail.com>
From: Jean-Marc Valin <jmvalin@jmvalin.ca>
In-Reply-To: <CAD5OKxu452kOQkcP+sPQzcyLSmt8p5gxLzcK2bCy2p8dYRApUg@mail.gmail.com>
Content-Type: text/plain; charset="UTF-8"; format="flowed"
Content-Transfer-Encoding: 8bit
Archived-At: <https://mailarchive.ietf.org/arch/msg/mlcodec/dCe2M8CPaJdP_w2I6w5viVIHTrI>
Subject: Re: [Mlcodec] draft-valin-opus-extension-01

Hi Roman,

Indeed, maybe the DRED draft should emphasize its benefits more clearly.
There are several reasons why I think the approach we're proposing would
work better than using a separate ML codec through RED. They fall into
two categories:

1) There are benefits to having the redundancy integrated within Opus.
In case of loss, we need to quickly switch from Opus to the redundancy
and then back to Opus. Since codecs are stateful, you cannot directly
switch back and forth without introducing discontinuities. Switching
cleanly would require either deep integration of the two codecs (about
as close as what we're already proposing) so that each can update the
other's state, or some form of cross-fading, which would create all
kinds of undesirable side effects. I also believe that integration
within Opus makes synchronization easier and is likely easier to deploy
in general.
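
To make the cross-fading alternative concrete, here is a minimal
sketch of a linear cross-fade over the overlap between two decoders'
outputs. The function name and the plain lists of samples are purely
illustrative assumptions; none of this is part of Opus or DRED.

```python
def crossfade(outgoing, incoming):
    """Linearly blend two equally long sample blocks (hypothetical helper).

    `outgoing` is the decoder output being faded out and `incoming` the
    one being faded in. Both decoders have to produce audio for the
    same overlap region.
    """
    n = len(outgoing)
    if n != len(incoming):
        raise ValueError("blocks must have the same length")
    blended = []
    for i in range(n):
        # Weight of the incoming signal ramps up strictly between 0 and 1.
        w = (i + 1) / (n + 1)
        blended.append((1.0 - w) * outgoing[i] + w * incoming[i])
    return blended
```

Note that this only smooths the transition in the output signal. It
does nothing to update the internal state of either codec, which is
the underlying problem.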

2) There are also benefits coming from the fact that DRED is not 
designed to be a general-purpose ML speech codec, but is rather 
specifically optimized for redundancy. For example, while each DRED
packet is independent from the others (no prediction across packets),
the many frames within a redundancy packet (up to a second right now)
are still coded with prediction to make them more efficient. Our scheme also
makes it possible to use a different bitrate as a function of how old 
each frame is within the redundancy. For example, the audio between 
t=0ms and t=40ms can be coded at 1000 b/s when part of the redundancy 
for the t=40ms packet, but it might be coded at just 400 b/s for the 
redundancy of the t=900ms packet. That can be done without having to 
re-encode the audio multiple times. You can read more about the DRED 
design principles in Section 2 of
https://arxiv.org/pdf/2212.04453.pdf
or in this blog post:
https://www.amazon.science/blog/neural-encoding-enables-more-efficient-recovery-of-lost-audio-packets
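
As a rough illustration of the age-dependent bitrate, the sketch below
uses the two figures from the example above (1000 b/s at an age of
40 ms, 400 b/s at an age of 900 ms). The linear interpolation between
those two points, the 40 ms chunk size, and all function names are
assumptions made for the illustration; the actual DRED rate schedule
is not described here.

```python
CHUNK_MS = 40  # chunk duration, taken from the 0-40 ms example


def rate_for_age(age_ms):
    """Hypothetical rate schedule in b/s as a function of audio age.

    Assumes a straight line from 1000 b/s at 40 ms to 400 b/s at
    900 ms, clamped outside that range.
    """
    young_age, young_rate = 40.0, 1000.0
    old_age, old_rate = 900.0, 400.0
    age = min(max(age_ms, young_age), old_age)
    frac = (age - young_age) / (old_age - young_age)
    return young_rate + frac * (old_rate - young_rate)


def chunk_bits(chunk_start_ms, packet_time_ms):
    """Bits spent on one chunk when carried by the packet at packet_time_ms."""
    age_ms = packet_time_ms - chunk_start_ms
    return rate_for_age(age_ms) * CHUNK_MS / 1000.0


def redundancy_bits(packet_time_ms, history_ms=1000):
    """Total bits for all chunks covered by one packet's redundancy."""
    total = 0.0
    start = packet_time_ms - CHUNK_MS
    while start >= 0 and packet_time_ms - start <= history_ms:
        total += chunk_bits(start, packet_time_ms)
        start -= CHUNK_MS
    return total
```

Under these assumptions, the chunk starting at t=0ms costs
chunk_bits(0, 40) = 40 bits in the t=40ms packet and
chunk_bits(0, 900) = 16 bits in the t=900ms packet.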

Cheers,

	Jean-Marc


On 2023-08-06 20:01, Roman Shpount wrote:
> Hi All,
> 
> One question that I have about Opus extensions is what are the benefits 
> of extending Opus vs using RED and combining Opus with other codecs, 
> such as ML-based codecs, which would provide redundant audio?
> 
> Thank You,
> _____________
> Roman Shpount
> 
> 
> On Sun, Aug 6, 2023 at 1:39 PM Stephan Wenger <stewe@stewe.org 
> <mailto:stewe@stewe.org>> wrote:
> 
>     Hi,
> 
>     The constituting meeting of an IETF WG is an unusually early point
>     for an IETF working group draft adoption.  I’m not objecting at this
>     point, but can I ask to make this a two-week call at the minimum?
>     It’s summer, and some relevant people I know of are on vacation.
> 
>     Stephan
> 
> 
>     *From: *Mlcodec <mlcodec-bounces@ietf.org
>     <mailto:mlcodec-bounces@ietf.org>> on behalf of Greg Maxwell
>     <gmaxwell@gmail.com <mailto:gmaxwell@gmail.com>>
>     *Date: *Saturday, August 5, 2023 at 11:34
>     *To: *mlcodec@ietf.org <mailto:mlcodec@ietf.org> <mlcodec@ietf.org
>     <mailto:mlcodec@ietf.org>>
>     *Subject: *[Mlcodec] draft-valin-opus-extension-01
> 
>     At IETF 117 there was consensus in the room to adopt
>     draft-valin-opus-extension-01
>     ( https://datatracker.ietf.org/doc/draft-valin-opus-extension/
>     <https://datatracker.ietf.org/doc/draft-valin-opus-extension/> )
>     as a working group document and a starting point for further
>     development.
> 
>     I'm raising this for the benefit of those who could not make the
>     meeting,
>     and to request that this consensus be confirmed on the list.
> 
>     So, if you have anything further to say about the adoption, positive or
>     negative, please speak up.
> 
>     -- 
>     Mlcodec mailing list
>     Mlcodec@ietf.org <mailto:Mlcodec@ietf.org>
>     https://www.ietf.org/mailman/listinfo/mlcodec
>     <https://www.ietf.org/mailman/listinfo/mlcodec>
> 
>