Re: [Mlcodec] draft-valin-opus-extension-01

Jean-Marc Valin <jmvalin@jmvalin.ca> Tue, 08 August 2023 07:42 UTC

Message-ID: <ec9c66d0-d1a0-b1d3-6570-b43c0ac88147@jmvalin.ca>
Date: Tue, 08 Aug 2023 03:42:47 -0400
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.13.0
Content-Language: en-US
To: Roman Shpount <roman@telurix.com>
Cc: Greg Maxwell <gmaxwell@gmail.com>, "mlcodec@ietf.org" <mlcodec@ietf.org>
References: <CAAS2fgQ1HeWQUcTgpxTq66FFn_G6UnhToc8Rtz4Pkc-MKN7n8g@mail.gmail.com> <PH0PR17MB4908948401600486033FB66BAE0FA@PH0PR17MB4908.namprd17.prod.outlook.com> <CAD5OKxu452kOQkcP+sPQzcyLSmt8p5gxLzcK2bCy2p8dYRApUg@mail.gmail.com> <414f7d97-288b-d8bc-0caf-1b95a572e5eb@jmvalin.ca> <CAD5OKxtSdMUZkTdSa7RyEssPMLqsfHwdj-GcV5hraCBUYCpGhg@mail.gmail.com>
From: Jean-Marc Valin <jmvalin@jmvalin.ca>
In-Reply-To: <CAD5OKxtSdMUZkTdSa7RyEssPMLqsfHwdj-GcV5hraCBUYCpGhg@mail.gmail.com>
Content-Type: text/plain; charset="UTF-8"; format="flowed"
Content-Transfer-Encoding: 8bit
Archived-At: <https://mailarchive.ietf.org/arch/msg/mlcodec/5OerUsa4h9k8QxSni8kYbcOlD10>
Subject: Re: [Mlcodec] draft-valin-opus-extension-01
List-Id: Machine Learning for Audio Coding <mlcodec.ietf.org>

Hi Roman,

The interactions between DRED and the rest of Opus aren't that simple. 
DRED actually does quite a bit of back-and-forth directly within the 
underlying SILK and CELT decoders. While it's probably possible to do 
all that while treating DRED as a separate codec, it would make things 
quite complicated -- and I don't actually know how I would do that.

When looking at Opus+DRED, I don't see any upside (but many downsides) 
to defining DRED as a separate codec. Now as for using DRED with other 
codecs, it may still be feasible to define a way to combine codec X with 
DRED. But even in that case, it's likely you'd need to implement the 
same kind of tight coupling with the decoder.

Regarding your comment about SDP, I'm not sure I understand what would 
be made more complicated with DRED being an Opus extension.

Cheers,

	Jean-Marc

On 2023-08-07 23:45, Roman Shpount wrote:
> Hi Jean-Marc,
> 
> I am sold on DRED (your point #2). It is extremely interesting.
> 
> What I need clarification about is whether DRED should be integrated 
> into Opus or if it should be defined as a standalone CODEC, which can be 
> used with other codecs using RED. I do see that there is some bitrate 
> saving by making it an Opus extension. On the other hand, if DRED is 
> defined as an independent CODEC, it can be used with other codecs, such 
> as AMR or Lyra. Using RED with two different CODECs is not entirely a 
> new idea. Synchronizing state and making smooth transitions is also 
> something that can be addressed. It is not that different from PLC 
> transitions.
> 
> Using RED for packaging also makes a lot of SDP negotiation issues 
> simpler. It also lets you expose parameters for DRED as SDP parameters, 
> which would be complicated with an Opus extension.
> 
> Best Regards,
> _____________
> Roman Shpount
> 
> 
> On Mon, Aug 7, 2023 at 1:47 AM Jean-Marc Valin <jmvalin@jmvalin.ca> wrote:
> 
>     Hi Roman,
> 
>     Indeed, maybe the DRED draft should more clearly emphasize its
>     benefits. There are several reasons why I think the approach we're
>     proposing would work better than using a separate ML codec through
>     RED. Those can be divided into two categories:
> 
>     1) There are benefits to having the redundancy integrated within Opus.
>     In case of loss, we need to quickly switch from Opus to the redundancy,
>     and then back to Opus. Since codecs are stateful, you cannot directly
>     switch back and forth without introducing discontinuities. Switching
>     cleanly would require either deep integration of the two codecs (about
>     as close as what we're already proposing) to update each other's
>     states, or else some form of cross-fading, which would create all kinds
>     of undesirable side effects. I also believe that integration within
>     Opus makes synchronization easier and is likely easier to deploy in
>     general.
> 
>     2) There are also benefits coming from the fact that DRED is not
>     designed to be a general-purpose ML speech codec, but is rather
>     specifically optimized for redundancy. For example, while each DRED
>     packet is independent from the others (no prediction across packets),
>     the many frames within a redundancy packet (up to a second right now)
>     are still coded with prediction to make it more efficient. Our scheme
>     also makes it possible to use a different bitrate as a function of how
>     old each frame is within the redundancy. For example, the audio between
>     t=0ms and t=40ms can be coded at 1000 b/s when part of the redundancy
>     for the t=40ms packet, but it might be coded at just 400 b/s for the
>     redundancy of the t=900ms packet. That can be done without having to
>     re-encode the audio multiple times. You can read more about the DRED
>     design principles in Section 2 of
>     https://arxiv.org/pdf/2212.04453.pdf
>     or in this blog post:
>     https://www.amazon.science/blog/neural-encoding-enables-more-efficient-recovery-of-lost-audio-packets
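
To put rough numbers on the bitrate example quoted above, here is a minimal
sketch of the arithmetic only. It is not code from the draft or from libopus,
and `bits_for_span` is a hypothetical helper name. It simply converts a rate
and a duration into a bit count for the same 40 ms of audio at the two rates
mentioned.

```python
def bits_for_span(rate_bps: float, duration_ms: float) -> float:
    """Bits spent coding `duration_ms` of audio at `rate_bps` bits per second."""
    return rate_bps * duration_ms / 1000.0


# The same 40 ms of audio (t=0ms to t=40ms), using the two rates from the
# example: 1000 b/s when it is recent, 400 b/s when it is about 900 ms old.
recent_bits = bits_for_span(1000, 40)
old_bits = bits_for_span(400, 40)
print(recent_bits, old_bits)
```

Under those two example rates, the older copy of the same audio takes less
than half the bits of the recent one.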
> 
>     Cheers,
> 
>              Jean-Marc
> 
> 
>     On 2023-08-06 20:01, Roman Shpount wrote:
>      > Hi All,
>      >
>      > One question that I have about Opus extensions is what are the
>      > benefits of extending Opus vs using RED and combining Opus with other
>      > codecs, such as ML-based codecs, which would provide redundant audio?
>      >
>      > Thank You,
>      > _____________
>      > Roman Shpount
>      >
>      >
>      > On Sun, Aug 6, 2023 at 1:39 PM Stephan Wenger <stewe@stewe.org> wrote:
>      >
>      >     Hi,
>      >
>      >     The constituting meeting of an IETF WG is an unusually early
>      >     point for an IETF working group draft adoption.  I’m not objecting
>      >     at this point, but can I ask to make this a two-week call at the
>      >     minimum? It’s summer, and some relevant people I know of are on
>      >     vacation.
>      >
>      >     Stephan
>      >
>      >     *From: *Mlcodec <mlcodec-bounces@ietf.org> on behalf of Greg Maxwell
>      >     <gmaxwell@gmail.com>
>      >     *Date: *Saturday, August 5, 2023 at 11:34
>      >     *To: *mlcodec@ietf.org <mlcodec@ietf.org>
>      >     *Subject: *[Mlcodec] draft-valin-opus-extension-01
>      >
>      >     At IETF 117 there was consensus in the room to adopt
>      >     draft-valin-opus-extension-01
>      >     ( https://datatracker.ietf.org/doc/draft-valin-opus-extension/ )
>      >     as a working group document and a starting point for further
>      >     development.
>      >
>      >     I'm raising this for the benefit of those who could not make the
>      >     meeting,
>      >     and to request that this consensus be confirmed on the list.
>      >
>      >     So, if you have anything further to say about the adoption,
>      >     positive or negative, please speak up.
>      >
>      >     --
>      >     Mlcodec mailing list
>      >     Mlcodec@ietf.org
>      >     https://www.ietf.org/mailman/listinfo/mlcodec
>      >
>      >
>      >
>