Re: [Mlcodec] draft-valin-opus-extension-01

Jean-Marc Valin <jmvalin@jmvalin.ca> Wed, 09 August 2023 06:48 UTC

Message-ID: <d0051879-f1e9-3002-1c16-a6a969889d46@jmvalin.ca>
Date: Wed, 09 Aug 2023 02:48:09 -0400
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.13.0
Content-Language: en-US
To: Roman Shpount <roman@telurix.com>
Cc: Greg Maxwell <gmaxwell@gmail.com>, "mlcodec@ietf.org" <mlcodec@ietf.org>
References: <CAAS2fgQ1HeWQUcTgpxTq66FFn_G6UnhToc8Rtz4Pkc-MKN7n8g@mail.gmail.com> <PH0PR17MB4908948401600486033FB66BAE0FA@PH0PR17MB4908.namprd17.prod.outlook.com> <CAD5OKxu452kOQkcP+sPQzcyLSmt8p5gxLzcK2bCy2p8dYRApUg@mail.gmail.com> <414f7d97-288b-d8bc-0caf-1b95a572e5eb@jmvalin.ca> <CAD5OKxtSdMUZkTdSa7RyEssPMLqsfHwdj-GcV5hraCBUYCpGhg@mail.gmail.com> <ec9c66d0-d1a0-b1d3-6570-b43c0ac88147@jmvalin.ca> <CAD5OKxv4GofCxrFC1tkhVKP_+q3Jr-D5e+Ma3YRjuTQGYMWqOA@mail.gmail.com>
From: Jean-Marc Valin <jmvalin@jmvalin.ca>
In-Reply-To: <CAD5OKxv4GofCxrFC1tkhVKP_+q3Jr-D5e+Ma3YRjuTQGYMWqOA@mail.gmail.com>
Content-Type: text/plain; charset="UTF-8"; format="flowed"
Content-Transfer-Encoding: 8bit
Archived-At: <https://mailarchive.ietf.org/arch/msg/mlcodec/JEwCBLsTTphPz2HQgXBAAYCs-vY>
Subject: Re: [Mlcodec] draft-valin-opus-extension-01
List-Id: Machine Learning for Audio Coding <mlcodec.ietf.org>

Hi Roman,

In terms of SDP for DRED, right now the idea is just for the receiver to 
tell the sender "I'm able to use up to X seconds of DRED". From there, 
the sender can figure out how to distribute its total bitrate budget 
between the "base" Opus and DRED, depending on its estimate of the 
network conditions. If there's no loss, the encoder could even stop 
sending DRED completely. Or if most of the previous frames are silence, 
the encoder will encode less redundancy and spend more on the base Opus. 
That's another reason why I don't think we'd want DRED to be 
controlled/signaled independently from the rest of Opus.
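
To make the budget split concrete, here is a minimal sketch of how a sender might divide its bitrate. Everything in it is an assumption made for illustration: the function name, the 20% loss saturation point, the 50% cap on the DRED share, and the linear scaling are all hypothetical, and none of it reflects what libopus or the draft actually does. The receiver's advertised limit is only used here to decide whether DRED may be sent at all.

```python
def split_bitrate(total_bps, loss_rate, recent_silence_ratio,
                  max_dred_seconds, max_dred_share=0.5):
    """Split a total bitrate budget between base Opus and DRED.

    Hypothetical heuristic: the thresholds and the linear mapping are
    illustrative assumptions, not the behaviour of any real encoder.
    Returns (base_bps, dred_bps).
    """
    if not 0.0 <= loss_rate <= 1.0:
        raise ValueError("loss_rate must be in [0, 1]")
    if not 0.0 <= recent_silence_ratio <= 1.0:
        raise ValueError("recent_silence_ratio must be in [0, 1]")

    # Receiver cannot use DRED, or no loss is expected: send no redundancy.
    if max_dred_seconds <= 0 or loss_rate == 0.0:
        return total_bps, 0

    # Assumed mapping: the DRED share grows linearly with the estimated
    # loss rate and saturates at 20% loss.
    share = max_dred_share * min(loss_rate / 0.2, 1.0)

    # Mostly-silent recent audio needs less protection, so scale the
    # DRED share down and leave more for the base Opus stream.
    share *= 1.0 - recent_silence_ratio

    dred_bps = int(total_bps * share)
    return total_bps - dred_bps, dred_bps


if __name__ == "__main__":
    # 32 kb/s total, 10% estimated loss, no recent silence,
    # receiver advertised up to 1 second of DRED.
    print(split_bitrate(32000, 0.10, 0.0, 1.0))
```

The point of the sketch is only that a single controller sees both the loss estimate and the signal content, which is harder to arrange if DRED is negotiated and driven as a separate stream.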

	Jean-Marc


On 2023-08-08 23:16, Roman Shpount wrote:
> Hi Jean-Marc,
> 
> I did not know how tightly DRED was integrated with Opus. What you are 
> saying essentially implies that DRED is an integral part of the Opus 
> codec, which might justify it being an extension.
> 
> That said, sending Opus and DRED payload using a RED packet does not 
> mean they should be independently decoded. All that changes is just 
> payload packaging. You can define how DRED is decoded in combination 
> with Opus. Someone else can adapt the same procedures to AMR-WB or EVS. 
> The DRED encoder, payload definition, and SDP negotiation procedures can 
> be re-used.
> 
> Finally, you need to define Opus extension negotiation in SDP. If the 
> DRED payload is negotiated as a part of RED, all the SDP negotiation 
> procedures are already defined. All you need to do is to define the 
> payload format and mime-type parameters.
> _____________
> Roman Shpount
> 
> 
> On Tue, Aug 8, 2023 at 3:42 AM Jean-Marc Valin <jmvalin@jmvalin.ca> wrote:
> 
>     Hi Roman,
> 
>     The interactions between DRED and the rest of Opus aren't that simple.
>     DRED actually does quite a bit of back-and-forth directly within the
>     underlying SILK and CELT decoders. While it's probably possible to do
>     all that while treating DRED as a separate codec, it would make things
>     quite complicated -- and I don't actually know how I would do that.
> 
>     When looking at Opus+DRED, I don't see any upside (but many downsides)
>     to defining DRED as a separate codec. Now as for using DRED with other
>     codecs, it may still be feasible to define a way to combine codec X with
>     DRED. But even in that case, it's likely you'd need to implement the
>     same kind of tight coupling with the decoder.
> 
>     Regarding your comment about SDP, I'm not sure I understand what would
>     be made more complicated with DRED being an Opus extension.
> 
>     Cheers,
> 
>              Jean-Marc
> 
>     On 2023-08-07 23:45, Roman Shpount wrote:
>      > Hi Jean-Marc,
>      >
>      > I am sold on DRED (your point #2). It is extremely interesting.
>      >
>      > What I need clarification about is whether DRED should be integrated
>      > into Opus or if it should be defined as a standalone CODEC, which can be
>      > used with other codecs using RED. I do see that there is some bitrate
>      > saving by making it an Opus extension. On the other hand, if DRED is
>      > defined as an independent CODEC, it can be used with other codecs, such
>      > as AMR or Lyra. Using RED with two different CODECs is not entirely a
>      > new idea. Synchronizing state and making smooth transitions is also
>      > something that can be addressed. It is not that different from PLC
>      > transitions.
>      >
>      > Using RED for packaging also makes a lot of SDP negotiation issues
>      > simpler. It also lets you expose parameters for DRED as SDP parameters,
>      > which would be complicated with an Opus extension.
>      >
>      > Best Regards,
>      > _____________
>      > Roman Shpount
>      >
>      >
>      > On Mon, Aug 7, 2023 at 1:47 AM Jean-Marc Valin <jmvalin@jmvalin.ca> wrote:
>      >
>      >     Hi Roman,
>      >
>      >     Indeed, maybe the DRED draft should more clearly emphasize its
>      >     benefits. There are several reasons why I think the approach we're
>      >     proposing would work better than using a separate ML codec through
>      >     RED. Those can be divided into two categories:
>      >
>      >     1) There are benefits to having the redundancy integrated within Opus.
>      >     In case of loss, we need to quickly switch from Opus, to the
>      >     redundancy, and then back to Opus. Since codecs are stateful, you cannot
>      >     directly switch back and forth without introducing discontinuities.
>      >     Switching cleanly would either require deep integration of the two
>      >     codecs (about as close as what we're already proposing) to update each
>      >     other's states, or else figure out some cross-fading, which would create
>      >     all kinds of undesirable side effects. I also believe that integration
>      >     within Opus makes synchronization easier and is likely easier to deploy
>      >     in general.
>      >
>      >     2) There are also benefits coming from the fact that DRED is not
>      >     designed to be a general-purpose ML speech codec, but is rather
>      >     specifically optimized for redundancy. For example, while each DRED
>      >     packet is independent from the others (no prediction across packets),
>      >     the many frames within a redundancy packet (up to a second right now)
>      >     are still coded with prediction to make it more efficient. Our scheme
>      >     also makes it possible to use a different bitrate as a function of how
>      >     old each frame is within the redundancy. For example, the audio between
>      >     t=0ms and t=40ms can be coded at 1000 b/s when part of the redundancy
>      >     for the t=40ms packet, but it might be coded at just 400 b/s for the
>      >     redundancy of the t=900ms packet. That can be done without having to
>      >     re-encode the audio multiple times. You can read more about the DRED
>      >     design principles in Section 2 of
>      >     https://arxiv.org/pdf/2212.04453.pdf
>      >     or in this blog post:
>      >     https://www.amazon.science/blog/neural-encoding-enables-more-efficient-recovery-of-lost-audio-packets
>      >
>      >     Cheers,
>      >
>      >              Jean-Marc
>      >
>      >
>      >     On 2023-08-06 20:01, Roman Shpount wrote:
>      >      > Hi All,
>      >      >
>      >      > One question that I have about Opus extensions is what are the
>      >      > benefits of extending Opus vs using RED and combining Opus with
>      >      > other codecs, such as ML-based codecs, which would provide redundant
>      >      > audio?
>      >      >
>      >      > Thank You,
>      >      > _____________
>      >      > Roman Shpount
>      >      >
>      >      >
>      >      > On Sun, Aug 6, 2023 at 1:39 PM Stephan Wenger <stewe@stewe.org> wrote:
>      >      >
>      >      >     Hi,
>      >      >
>      >      >     The constituting meeting of an IETF WG is an unusually early point
>      >      >     for an IETF working group draft adoption.  I’m not objecting at this
>      >      >     point, but can I ask to make this a two-week call at the minimum?
>      >      >     It’s summer, and some relevant people I know of are on vacation.
>      >      >
>      >      >     Stephan
>      >      >
>      >      >     *From: *Mlcodec <mlcodec-bounces@ietf.org> on behalf of Greg Maxwell
>      >      >     <gmaxwell@gmail.com>
>      >      >     *Date: *Saturday, August 5, 2023 at 11:34
>      >      >     *To: *mlcodec@ietf.org <mlcodec@ietf.org>
>      >      >     *Subject: *[Mlcodec] draft-valin-opus-extension-01
>      >      >
>      >      >     At IETF 117 there was consensus in the room to adopt
>      >      >     draft-valin-opus-extension-01
>      >      >     (https://datatracker.ietf.org/doc/draft-valin-opus-extension/)
>      >      >     as a working group document and a starting point for further
>      >      >     development.
>      >      >
>      >      >     I'm raising this for the benefit of those who could not make the
>      >      >     meeting, and to request that this consensus be confirmed on the list.
>      >      >
>      >      >     So, if you have anything further to say about the adoption,
>      >      >     positive or negative, please speak up.
>      >      >
>      >      >     --
>      >      >     Mlcodec mailing list
>      >      >     Mlcodec@ietf.org
>      >      >     https://www.ietf.org/mailman/listinfo/mlcodec
>      >      >
>      >      >
>      >      >
>      >
>