Re: [Mlcodec] draft-valin-opus-extension-01

Roman Shpount <roman@telurix.com> Wed, 09 August 2023 03:16 UTC

References: <CAAS2fgQ1HeWQUcTgpxTq66FFn_G6UnhToc8Rtz4Pkc-MKN7n8g@mail.gmail.com> <PH0PR17MB4908948401600486033FB66BAE0FA@PH0PR17MB4908.namprd17.prod.outlook.com> <CAD5OKxu452kOQkcP+sPQzcyLSmt8p5gxLzcK2bCy2p8dYRApUg@mail.gmail.com> <414f7d97-288b-d8bc-0caf-1b95a572e5eb@jmvalin.ca> <CAD5OKxtSdMUZkTdSa7RyEssPMLqsfHwdj-GcV5hraCBUYCpGhg@mail.gmail.com> <ec9c66d0-d1a0-b1d3-6570-b43c0ac88147@jmvalin.ca>
In-Reply-To: <ec9c66d0-d1a0-b1d3-6570-b43c0ac88147@jmvalin.ca>
From: Roman Shpount <roman@telurix.com>
Date: Tue, 08 Aug 2023 23:16:06 -0400
Message-ID: <CAD5OKxv4GofCxrFC1tkhVKP_+q3Jr-D5e+Ma3YRjuTQGYMWqOA@mail.gmail.com>
To: Jean-Marc Valin <jmvalin@jmvalin.ca>
Cc: Greg Maxwell <gmaxwell@gmail.com>, "mlcodec@ietf.org" <mlcodec@ietf.org>
Content-Type: multipart/alternative; boundary="0000000000002c52b8060274e7c2"
Archived-At: <https://mailarchive.ietf.org/arch/msg/mlcodec/BLvcdKpuuWHY2px1VwyoHaki9IU>
Subject: Re: [Mlcodec] draft-valin-opus-extension-01
List-Id: Machine Learning for Audio Coding <mlcodec.ietf.org>

Hi Jean-Marc,

I did not know how tightly DRED was integrated with Opus. What you are
saying essentially implies that DRED is an integral part of the Opus codec,
which might justify it being an extension.

That said, sending the Opus and DRED payloads in a RED packet does not mean
they have to be decoded independently. All that changes is the payload
packaging. You can define how DRED is decoded in combination with Opus.
Someone else can adapt the same procedures to AMR-WB or EVS. The DRED
encoder, payload definition, and SDP negotiation procedures can be re-used.
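
As a rough illustration of the packaging idea, here is a minimal sketch of an
RFC 2198 RED payload that carries one redundant block ahead of a primary
block. The header layout follows RFC 2198; the function name, the payload
type numbers (111 for Opus, 112 for DRED), and the notion of a standalone
DRED payload are assumptions for illustration only, since no such payload
format is defined.

```python
import struct


def pack_red_payload(redundant_blocks, primary_pt, primary_data):
    """Build an RFC 2198 RED payload (hypothetical helper, not a library API).

    redundant_blocks: list of (payload_type, timestamp_offset, data) tuples.
    primary_pt, primary_data: payload type and bytes of the primary block.
    """
    headers = bytearray()
    body = bytearray()
    for pt, ts_offset, data in redundant_blocks:
        if not 0 <= pt < 128:
            raise ValueError("payload type must fit in 7 bits")
        if not 0 <= ts_offset < (1 << 14):
            raise ValueError("timestamp offset must fit in 14 bits")
        if len(data) >= (1 << 10):
            raise ValueError("block length must fit in 10 bits")
        # 4-byte header: F=1 | 7-bit PT | 14-bit timestamp offset | 10-bit length
        word = (1 << 31) | (pt << 24) | (ts_offset << 10) | len(data)
        headers += struct.pack("!I", word)
        body += data
    if not 0 <= primary_pt < 128:
        raise ValueError("payload type must fit in 7 bits")
    # Final 1-byte header: F=0 | 7-bit PT; the primary block takes the rest.
    headers.append(primary_pt)
    body += primary_data
    return bytes(headers + body)


# Assumed payload types: 112 = hypothetical DRED block, 111 = Opus.
payload = pack_red_payload([(112, 1920, b"\xaa\xbb")], 111, b"\x01\x02\x03")
```

One detail worth keeping in mind with this layout: the timestamp offset field
is only 14 bits, so a redundant block can point back at most 16383 ticks,
which is roughly 341 ms at the 48 kHz RTP clock used by Opus.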

Finally, you need to define Opus extension negotiation in SDP. If the DRED
payload is negotiated as a part of RED, all the SDP negotiation procedures
are already defined. All you need to do is to define the payload format and
mime-type parameters.
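
To make the negotiation point concrete, the following is a minimal sketch
that prints the kind of SDP media section this would imply. The RED lines
follow the RFC 2198 convention (primary encoding first in the fmtp list); the
"dred" subtype, its "max-redundancy-ms" parameter, and all payload type
numbers are hypothetical and are not defined by any draft.

```python
def build_red_sdp(opus_pt=111, dred_pt=112, red_pt=63):
    """Return an illustrative SDP audio section (hypothetical helper)."""
    lines = [
        f"m=audio 9 UDP/TLS/RTP/SAVPF {red_pt} {opus_pt} {dred_pt}",
        f"a=rtpmap:{opus_pt} opus/48000/2",
        # Hypothetical: no "dred" media subtype is registered.
        f"a=rtpmap:{dred_pt} dred/48000",
        # Hypothetical parameter, shown only to illustrate exposing
        # DRED settings as media type parameters.
        f"a=fmtp:{dred_pt} max-redundancy-ms=1000",
        f"a=rtpmap:{red_pt} red/48000/2",
        # RFC 2198 style: primary payload type first, then redundant.
        f"a=fmtp:{red_pt} {opus_pt}/{dred_pt}",
    ]
    return "\r\n".join(lines) + "\r\n"


print(build_red_sdp())
```
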
_____________
Roman Shpount


On Tue, Aug 8, 2023 at 3:42 AM Jean-Marc Valin <jmvalin@jmvalin.ca> wrote:

> Hi Roman,
>
> The interactions between DRED and the rest of Opus aren't that simple.
> DRED actually does quite a bit of back-and-forth directly within the
> underlying SILK and CELT decoders. While it's probably possible to do
> all that while treating DRED as a separate codec, it would make things
> quite complicated -- and I don't actually know how I would do that.
>
> When looking at Opus+DRED, I don't see any upside (but many downsides)
> to defining DRED as a separate codec. Now as for using DRED with other
> codecs, it may still be feasible to define a way to combine codec X with
> DRED. But even in that case, it's likely you'd need to implement the
> same kind of tight coupling with the decoder.
>
> Regarding your comment about SDP, I'm not sure I understand what would
> be made more complicated with DRED being an Opus extension.
>
> Cheers,
>
>         Jean-Marc
>
> On 2023-08-07 23:45, Roman Shpount wrote:
> > Hi Jean-Marc,
> >
> > I am sold on DRED (your point #2). It is extremely interesting.
> >
> > What I need clarification about is whether DRED should be integrated
> > into Opus or if it should be defined as a standalone CODEC, which can be
> > used with other codecs using RED. I do see that there is some bitrate
> > saving by making it an Opus extension. On the other hand, if DRED is
> > defined as an independent CODEC, it can be used with other codecs, such
> > as AMR or Lyra. Using RED with two different CODECs is not entirely a
> > new idea. Synchronizing state and making smooth transitions is also
> > something that can be addressed. It is not that different from PLC
> > transitions.
> >
> > Using RED for packaging also makes a lot of SDP negotiation issues
> > simpler. It also lets you expose parameters for DRED as SDP parameters,
> > which would be complicated with an Opus extension.
> >
> > Best Regards,
> > _____________
> > Roman Shpount
> >
> >
> > On Mon, Aug 7, 2023 at 1:47 AM Jean-Marc Valin <jmvalin@jmvalin.ca> wrote:
> >
> >     Hi Roman,
> >
> >     Indeed, maybe the DRED draft should more clearly emphasize its
> >     benefits. There are several reasons why I think the approach we're
> >     proposing would work better than using a separate ML codec through
> >     RED. Those can be divided into two categories:
> >
> >     1) There are benefits to having the redundancy integrated within
> >     Opus. In case of loss, we need to quickly switch from Opus, to the
> >     redundancy, and then back to Opus. Since codecs are stateful, you
> >     cannot directly switch back and forth without introducing
> >     discontinuities. Switching cleanly would either require deep
> >     integration of the two codecs (about as close as what we're already
> >     proposing) to update each other's states, or else figure out some
> >     cross-fading, which would create all kinds of undesirable side
> >     effects. I also believe that integration within Opus makes
> >     synchronization easier and is likely easier to deploy in general.
> >
> >     2) There are also benefits coming from the fact that DRED is not
> >     designed to be a general-purpose ML speech codec, but is rather
> >     specifically optimized for redundancy. For example, while each DRED
> >     packet is independent from the others (no prediction across
> >     packets), the many frames within a redundancy packet (up to a second
> >     right now) are still coded with prediction to make it more
> >     efficient. Our scheme also makes it possible to use a different
> >     bitrate as a function of how old each frame is within the
> >     redundancy. For example, the audio between t=0ms and t=40ms can be
> >     coded at 1000 b/s when part of the redundancy for the t=40ms packet,
> >     but it might be coded at just 400 b/s for the redundancy of the
> >     t=900ms packet. That can be done without having to re-encode the
> >     audio multiple times. You can read more about the DRED design
> >     principles in Section 2 of
> >     https://arxiv.org/pdf/2212.04453.pdf
> >     or in this blog post:
> >     https://www.amazon.science/blog/neural-encoding-enables-more-efficient-recovery-of-lost-audio-packets
> >
> >     Cheers,
> >
> >              Jean-Marc
> >
> >
> >     On 2023-08-06 20:01, Roman Shpount wrote:
> >      > Hi All,
> >      >
> >      > One question that I have about Opus extensions is what are the
> >      > benefits of extending Opus vs using RED and combining Opus with
> >      > other codecs, such as ML-based codecs, which would provide
> >      > redundant audio?
> >      >
> >      > Thank You,
> >      > _____________
> >      > Roman Shpount
> >      >
> >      >
> >      > On Sun, Aug 6, 2023 at 1:39 PM Stephan Wenger <stewe@stewe.org> wrote:
> >      >
> >      >     Hi,
> >      >
> >      >     The constituting meeting of an IETF WG is an unusually early
> >      >     point for an IETF working group draft adoption.  I’m not
> >      >     objecting at this point, but can I ask to make this a two-week
> >      >     call at the minimum? It’s summer, and some relevant people I
> >      >     know of are on vacation.
> >      >
> >      >     Stephan
> >      >
> >      >     *From: *Mlcodec <mlcodec-bounces@ietf.org> on behalf of
> >      >     Greg Maxwell <gmaxwell@gmail.com>
> >      >     *Date: *Saturday, August 5, 2023 at 11:34
> >      >     *To: *mlcodec@ietf.org <mlcodec@ietf.org>
> >      >     *Subject: *[Mlcodec] draft-valin-opus-extension-01
> >      >
> >      >     At IETF 117 there was consensus in the room to adopt
> >      >     draft-valin-opus-extension-01
> >      >     (https://datatracker.ietf.org/doc/draft-valin-opus-extension/)
> >      >     as a working group document and a starting point for further
> >      >     development.
> >      >
> >      >     I'm raising this for the benefit of those who could not make
> >      >     the meeting, and to request that this consensus be confirmed
> >      >     on the list.
> >      >
> >      >     So, if you have anything further to say about the adoption,
> >      >     positive or negative, please speak up.
> >      >
> >      >     --
> >      >     Mlcodec mailing list
> >      >     Mlcodec@ietf.org
> >      >     https://www.ietf.org/mailman/listinfo/mlcodec
> >      >
> >      >
> >      >
> >
>