Re: [codec] #8: Sample rates?

stephen botzko <stephen.botzko@gmail.com> Wed, 14 April 2010 11:56 UTC


Good points, thanks for clarifying.

Personally I favor carrying those H.264 parameter sets on the media path,
since there are situations (switched multipoint calls for one) where the
timing matters.  With that use case, if reliable-but-too-late delivery
occurs, there are decoding errors even if there is no packet loss.

Though of course SDP transmission alone may be suitable for other
applications, and it is perfectly legal to send them both ways.

Stephen Botzko

2010/4/14 Roni Even <ron.even.tlv@gmail.com>

>  Hi,
>
> Negotiation of codec parameters is not a tradition; it is needed if there
> are optional modes that the decoder can support, so that the sender
> knows whether the receiver can receive a specific mode. If there are
> mandatory modes you may be able to provide the information in-band, but this
> is not negotiation. Also note that while the signaling may use a reliable
> channel, the media path is not reliable and may suffer packet loss that may
> cause the loss of important parameters. We have such an example in the H.264
> parameter sets, which can be carried in the SDP for reliability or
> in-band as part of the payload.
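
To illustrate the SDP option, here is a minimal Python sketch that pulls H.264 parameter sets out of an fmtp attribute. The helper name sprop_parameter_sets and the sample fmtp line (including its base64 values) are made up for illustration; only the sprop-parameter-sets parameter name comes from the H.264 RTP payload format.

```python
import base64

def sprop_parameter_sets(fmtp_line):
    """Return the decoded NAL units listed in sprop-parameter-sets, if any."""
    # Drop the "a=fmtp:<payload type>" prefix; the parameters follow the space.
    _, _, params = fmtp_line.partition(" ")
    for item in params.split(";"):
        # Split on the first '=' only, since base64 padding also uses '='.
        key, _, value = item.strip().partition("=")
        if key == "sprop-parameter-sets":
            return [base64.b64decode(v) for v in value.split(",")]
    return []

# Illustrative attribute; the base64 payloads are placeholder values.
line = ("a=fmtp:96 profile-level-id=42e01f;packetization-mode=1;"
        "sprop-parameter-sets=Z0IAH5WoFAFuQA==,aM48gA==")
for nal in sprop_parameter_sets(line):
    # The low five bits of the first byte give the NAL unit type
    # (7 = sequence parameter set, 8 = picture parameter set).
    print(nal[0] & 0x1F, len(nal))
```

A receiver that also accepts parameter sets in-band would simply treat these as the initial values and replace them when newer ones arrive on the media path.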
>
>
>
> Roni Even
>
>
>
>
>
> *From:* codec-bounces@ietf.org [mailto:codec-bounces@ietf.org] *On Behalf
> Of *Christian Hoene
> *Sent:* Wednesday, April 14, 2010 1:59 PM
> *To:* 'stephen botzko'
>
> *Cc:* codec@ietf.org
> *Subject:* Re: [codec] #8: Sample rates?
>
>
>
> Hi ,
>
>
>
> comments inline:
>
>
>
> *From:* codec-bounces@ietf.org [mailto:codec-bounces@ietf.org] *On Behalf
> Of *stephen botzko
> *Sent:* Wednesday, April 14, 2010 4:55 AM
> *To:* bens@alum.mit.edu
> *Cc:* codec@ietf.org
> *Subject:* Re: [codec] #8: Sample rates?
>
>
>
> When I said signaling I meant SDP, not anything in the bitstream itself.  I
> was not excluding audio bandwidth changes mid-call as part of network
> adaptation.  Though as we all agree this needs to be carefully designed.
>
> I agree it is best if the decoder does not require any knowledge of the SDP
> negotiation (or any other information beyond the RTP packet stream itself)
> in order to correctly decode the audio -- which I think is what you were
> concerned about.
>
> CH: Negotiating codec parameters with SDP has a long tradition. Take for
> example µLaw (RTP payload type 0): here you negotiate the sampling rate.
> Also, the number of channels is negotiated for many codecs. I think
> sampling rate and number of channels can be done with SDP. However, I would
> avoid other codec-specific parameters. In particular, in the case of AMR the
> negotiation is quite complex, and this should be avoided for the Internet CODEC.
>
> Christian
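
As a rough illustration of how little is needed to carry sampling rate and channel count in SDP, the sketch below parses rtpmap attributes. The helper name parse_rtpmap and the two sample lines are assumptions for illustration, not something proposed in this thread.

```python
import re

# a=rtpmap:<payload type> <encoding name>/<clock rate>[/<channels>]
RTPMAP = re.compile(r"^a=rtpmap:(\d+) ([^/\s]+)/(\d+)(?:/(\d+))?$")

def parse_rtpmap(line):
    """Return (payload_type, encoding, clock_rate_hz, channels)."""
    m = RTPMAP.match(line.strip())
    if m is None:
        raise ValueError("not an rtpmap attribute: %r" % line)
    pt, name, rate, channels = m.groups()
    # The channel count is optional for audio and defaults to 1.
    return int(pt), name, int(rate), int(channels) if channels else 1

# Illustrative offer lines: mono PCMU at 8 kHz and stereo L16 at 48 kHz.
for line in ["a=rtpmap:0 PCMU/8000", "a=rtpmap:96 L16/48000/2"]:
    print(parse_rtpmap(line))
```

Anything beyond these two values would have to go into codec-specific fmtp parameters, which is the part argued against above.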
>
> It would be a nice property if reducing the acoustic bandwidth also allowed
> the MIPS to be reduced, but I do not think it is a requirement;  I'd
> personally rather manage complexity with a Low Complexity profile (if that
> is really needed), since then I could keep the acoustic bandwidth (accepting
> a higher bit rate instead).
>
>
> Stephen Botzko
>
> On Tue, Apr 13, 2010 at 9:54 PM, Benjamin M. Schwartz <
> bmschwar@fas.harvard.edu> wrote:
>
> Koen Vos wrote:
> > Quoting "Benjamin M. Schwartz":
> >> 1. Why would high frequencies be unheard?  Cheap speakers and microphones
> >> have difficulties with low frequencies, but not high frequencies, and
> >> routinely go all the way up past the limit of hearing.
> >
> > Not all hardware supports arbitrary/high sampling rates.  PSTN gateways
> > don't go above 8 kHz.  Same for some mobile devices.
>
> True.
>
>
> >> 2. Why would it need to be negotiated?  For a suitably designed format,
> >> the encoder could choose not to waste bits on high frequencies without any
> >> negotiation or extra signalling.
> >
> > Without signaling, how would the encoder know that the farend decoder
> > will not take advantage of frequencies above a certain threshold?
>
> When I say signalling, I mean signalling within the codec bitstream.  The
> encoder can change its behavior based on knowledge of the receiver's
> configuration, but the bitstream does not need any extra signalling to
> indicate the change in behavior.
>
>
> >>> Signaling the bandwidth, and defining the
> >>> internal codec rate as fullband should let us lock down the RTP timestamp
> >>> rate at 48 kHz (which I think is desirable).
> >>
> >> I do agree that having "only one mode" would be ideal, to maximize
> >> interoperability.  I wonder whether we can achieve high enough
> >> computational efficiency for this to be viable.
> >
> > Changing the RTP timestamp sampling rate causes no computational
> > complexity, does it?  Perhaps an extra multiplication for each packet or
> > so?  The point was that the RTP timestamp sampling rate should be
> > disconnected from the actual audio signals.
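
The "extra multiplication" can be sketched as follows. This is a minimal Python illustration assuming a fixed 48 kHz RTP clock as suggested in the thread; the function names are hypothetical.

```python
RTP_CLOCK_HZ = 48000  # fixed timestamp rate, independent of the audio rate

def samples_to_ticks(num_samples, audio_rate_hz):
    """Convert a sample count at the audio rate into 48 kHz RTP ticks."""
    ticks, remainder = divmod(num_samples * RTP_CLOCK_HZ, audio_rate_hz)
    if remainder:
        raise ValueError("frame is not a whole number of RTP ticks")
    return ticks

def next_timestamp(timestamp, num_samples, audio_rate_hz):
    """Advance a 32-bit RTP timestamp by one frame, wrapping modulo 2**32."""
    return (timestamp + samples_to_ticks(num_samples, audio_rate_hz)) & 0xFFFFFFFF

# A 20 ms frame advances the timestamp by the same amount at any audio rate.
print(samples_to_ticks(160, 8000))   # 20 ms at 8 kHz
print(samples_to_ticks(960, 48000))  # 20 ms at 48 kHz
```

Both calls yield the same increment, so a sender can switch audio bandwidth mid-call without changing the timestamp clock.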
>
> Right, but Stephen also suggested "defining the internal codec rate as
> fullband".  From this, I imagined a scenario in which all (compliant) IWAC
> implementations MUST decode all IWAC streams, which always have a sampling
> rate of 48 kHz.  I think this is a great idea, to achieve really good
> interoperability.
>
> If the receiver is a PSTN gateway, then an "internal codec rate" of 8 kHz
> would presumably produce as good quality/bitrate with lower encoder and
> decoder complexity.  However, if we can make IWAC sufficiently
> low-complexity, operating at 48 kHz may be acceptable.  It will help if we
> can structure the codec so that operating at lower bandwidth is very
> efficient.  For example, it may be possible to structure a transform codec
> such that unneeded high frequencies can cheaply be zero'd on encode and
> ignored on decode.
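
A toy Python sketch of that last idea, assuming a frame of transform coefficients spaced uniformly from 0 Hz to half the sampling rate (as in an MDCT-style codec). The function band_limit is hypothetical and says nothing about how the actual codec would be structured.

```python
def band_limit(coeffs, sample_rate_hz, cutoff_hz):
    """Zero every transform coefficient at or above cutoff_hz.

    Assumes len(coeffs) bins spread uniformly over 0 .. sample_rate_hz / 2.
    """
    n = len(coeffs)
    # Number of bins that lie below the cutoff frequency.
    keep = min(n, (2 * n * cutoff_hz) // sample_rate_hz)
    return coeffs[:keep] + [0.0] * (n - keep)

# 480 bins at 48 kHz are 50 Hz apart; a 4 kHz cutoff keeps the first 80.
frame = [1.0] * 480
limited = band_limit(frame, 48000, 4000)
print(sum(1 for c in limited if c != 0.0))
```

The encoder then has nothing to code above the cutoff, and a decoder feeding an 8 kHz output can skip those bins, which is where the complexity saving would come from.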
>
> --Ben
>
>
>