Re: [codec] #8: Sample rates?

stephen botzko <stephen.botzko@gmail.com> Wed, 14 April 2010 13:55 UTC

In-Reply-To: <004d01cadbcf$3b43c210$b1cb4630$@de>
References: <062.89d7aa91c79b145b798b83610e45ce71@tools.ietf.org> <20100413183602.86565rmv5hve5d6q@mail.skype.net> <4BC52068.1080906@fas.harvard.edu> <x2r6e9223711004131955p91007c5byc8b0fa19c21ac3e3@mail.gmail.com> <000c01cadbc1$86a14f10$93e3ed30$@de> <4bc5a8c4.07a5660a.30ee.5382@mx.google.com> <o2u6e9223711004140451s567cff9dwf12cbda8649a3f85@mail.gmail.com> <003d01cadbcb$32653ce0$972fb6a0$@de> <j2l6e9223711004140525t3332de9cx753c41e7d6bfc158@mail.gmail.com> <004d01cadbcf$3b43c210$b1cb4630$@de>
Date: Wed, 14 Apr 2010 09:55:32 -0400
Message-ID: <k2v6e9223711004140655te3e1dea2r1d69255a0cb28fc0@mail.gmail.com>
From: stephen botzko <stephen.botzko@gmail.com>
To: Christian Hoene <hoene@uni-tuebingen.de>
Cc: codec@ietf.org
Subject: Re: [codec] #8: Sample rates?

In-band for me in this case means RTP only.  Or in some other contexts RTP
payload only.

I think there is:
(a) Negotiation - particularly defining what optional modes the receiver
*can* handle.  Sample rates, number of channels, and any optional
facilities generally do not change mid-call.  SDP is the right place for
this. (SDP SHALL be used for this)

(b) Feedback - by this I mean messages related to QoS, packet loss, etc.
This should be done in RTCP, though given the lack of RTCP support in the
VoIP infrastructure perhaps there should be a SIP INFO backup path.
Feedback should not be done with RTP.

(c) Control - Per RFC 3551, RTCP can be used for "loosely controlled"
sessions, but "may be fully or partially subsumed by a separate session
control protocol".  Given the statements on RTCP support in the VoIP
infrastructure, we should be careful about putting unique controls in RTCP.
However, control should not be done in RTP either.  Since most other audio
codecs don't require this stuff, I suspect we won't either.  Though we will
see...

(d) In-band (RTP).  Not sure how else to say this.  Ideally, RTP streams
carrying CODEC can be decoded with no out-of-band information: no RTCP and
no knowledge of the SDP parameters.  That is, I think it should be possible
to fully decode unencrypted CODEC bitstreams using only the RTP packets
extracted from a Wireshark capture, even if the operating mode changes
mid-flight.  RFC 5404 (G.719) is one example of a packetization that
accomplishes this, but there are other packetization RFCs which do not.
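
To make (a) and (d) concrete, here is a minimal Python sketch, not a
proposal.  The RTP fixed-header layout follows RFC 3550.  Everything else is
an assumption made up for illustration: the one-byte mode header at the start
of the payload, the SAMPLE_RATES_HZ table, and the names Frame, parse_packet,
and within_negotiated_limits.  None of it comes from RFC 5404 or any draft.

```python
import struct
from dataclasses import dataclass

# Hypothetical mode table: 2-bit index carried in the payload's first byte.
SAMPLE_RATES_HZ = {0: 8000, 1: 16000, 2: 32000, 3: 48000}


@dataclass
class Frame:
    sequence: int
    timestamp: int
    sample_rate_hz: int
    channels: int
    data: bytes


def parse_packet(packet: bytes) -> Frame:
    """Recover the operating mode from the RTP packet alone (no SDP, no RTCP)."""
    if len(packet) < 12:
        raise ValueError("shorter than the RTP fixed header")
    first, _marker_pt, seq, ts, _ssrc = struct.unpack("!BBHII", packet[:12])
    if first >> 6 != 2:
        raise ValueError("not RTP version 2")
    offset = 12 + 4 * (first & 0x0F)        # skip the CSRC list
    if first & 0x10:                        # header extension present
        if len(packet) < offset + 4:
            raise ValueError("truncated header extension")
        _profile, ext_words = struct.unpack("!HH", packet[offset:offset + 4])
        offset += 4 + 4 * ext_words
    end = len(packet)
    if first & 0x20:                        # padding: last octet is its length
        end -= packet[-1]
    if offset >= end:
        raise ValueError("no payload")
    mode = packet[offset]                   # hypothetical in-band mode byte
    rate = SAMPLE_RATES_HZ[(mode >> 4) & 0x03]
    channels = (mode & 0x0F) + 1
    return Frame(seq, ts, rate, channels, packet[offset + 1:end])


def within_negotiated_limits(frame: Frame, max_rate_hz: int,
                             max_channels: int) -> bool:
    """Check a frame against receiver limits that were negotiated via SDP."""
    return (frame.sample_rate_hz <= max_rate_hz
            and frame.channels <= max_channels)


# One packet: version 2, payload type 96, then mode byte 0x31 and 3 data bytes.
pkt = struct.pack("!BBHII", 0x80, 96, 1, 960, 0x1234) + bytes([0x31]) + b"abc"
frame = parse_packet(pkt)
print(frame.sample_rate_hz, frame.channels)  # 48000 2
```

The point of the split is that SDP only bounds what the sender may use, while
each packet says what it actually contains, so a mid-call mode change needs no
signaling.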

Is this clearer?

Regards
Stephen Botzko



On Wed, Apr 14, 2010 at 8:37 AM, Christian Hoene <hoene@uni-tuebingen.de> wrote:

>  Hi Stephen,
>
>
>
> I am getting confused… Do you mean that the parameter about sampling rate
> MUST be negotiated with SDP and not transmitted in-band?
>
> Or MUST NOT be negotiated inband but only transmitted inband?
>
> Inband means within RTP/RTCP/RTP extensions and/or the Internet CODEC
> payload.
>
>
>
> What do you think about my second question on sampling rate limits?
>
>
>
> With best regards,
>
>  Christian
>
>
>
>
>
>
>
> ---------------------------------------------------------------
>
> Dr.-Ing. Christian Hoene
>
> Interactive Communication Systems (ICS), University of Tübingen
>
> Sand 13, 72076 Tübingen, Germany, Phone +49 7071 2970532
> http://www.net.uni-tuebingen.de/
>
>
>
> *From:* stephen botzko [mailto:stephen.botzko@gmail.com]
> *Sent:* Wednesday, April 14, 2010 2:25 PM
> *To:* Christian Hoene
> *Cc:* Roni Even; codec@ietf.org
>
> *Subject:* Re: [codec] #8: Sample rates?
>
>
>
> All negotiation should be done with SDP (and should *never* be done
> in-band)
>
> And the RTP transport should be robust enough to permit seamless changes to
> any mode that is consistent with the negotiation (with no signaling).
>
> The first point I think is essential.  The second reflects my own view on
> how RTP packetization should be done.
>
> Stephen Botzko
>
> On Wed, Apr 14, 2010 at 8:08 AM, Christian Hoene <hoene@uni-tuebingen.de>
> wrote:
>
> Hi,
>
>
>
> I am fine with dropping any SDP negotiation on codec parameters including
> sampling rate and channels. I like the idea of splitting signaling and
> transportation issues.
>
>
>
> But one question remains. We had the question of limiting the complexity for
> some kinds of devices by choosing a lower sampling rate or a lower number of
> channels. Shall this negotiation be done with SDP or inband?
>
>
>
> Christian
>
>
>
>
>
> ---------------------------------------------------------------
>
> Dr.-Ing. Christian Hoene
>
> Interactive Communication Systems (ICS), University of Tübingen
>
> Sand 13, 72076 Tübingen, Germany, Phone +49 7071 2970532
> http://www.net.uni-tuebingen.de/
>
>
>
> *From:* stephen botzko [mailto:stephen.botzko@gmail.com]
> *Sent:* Wednesday, April 14, 2010 1:52 PM
> *To:* Roni Even
> *Cc:* Christian Hoene; codec@ietf.org
>
>
> *Subject:* Re: [codec] #8: Sample rates?
>
>
>
> Good points, thanks for clarifying.
>
> Personally I favor carrying those H.264 parameter sets on the media path,
> since there are situations (switched multipoint calls for one) where the
> timing matters.  With that use case, if reliable-but-too-late delivery
> occurs, there are decoding errors even if there is no packet loss.
>
> Though of course SDP transmission alone may be suitable for other
> applications, and it is perfectly legal to send them both ways.
>
> Stephen Botzko
>
> 2010/4/14 Roni Even <ron.even.tlv@gmail.com>
>
> Hi,
>
> Negotiation of codec parameters is not a tradition; it is needed if there
> are optional modes that the decoder can support, in order to allow the sender
> to know if the receiver can receive the specific mode. If there are
> mandatory modes you may be able to provide the information in-band but this
> is not negotiation. Also note that while the signaling may use a reliable
> channel, the media path is not reliable and may suffer packet loss that may
> cause the loss of important parameters. We have such an example in the H.264
> parameter sets, which can be carried in the SDP for reliability or
> in-band as part of the payload.
>
>
>
> Roni Even
>
>
>
>
>
> *From:* codec-bounces@ietf.org [mailto:codec-bounces@ietf.org] *On Behalf
> Of *Christian Hoene
> *Sent:* Wednesday, April 14, 2010 1:59 PM
> *To:* 'stephen botzko'
>
>
> *Cc:* codec@ietf.org
> *Subject:* Re: [codec] #8: Sample rates?
>
>
>
> Hi ,
>
>
>
> comments inline:
>
>
>
> *From:* codec-bounces@ietf.org [mailto:codec-bounces@ietf.org] *On Behalf
> Of *stephen botzko
> *Sent:* Wednesday, April 14, 2010 4:55 AM
> *To:* bens@alum.mit.edu
> *Cc:* codec@ietf.org
> *Subject:* Re: [codec] #8: Sample rates?
>
>
>
> When I said signaling I meant SDP, not anything in the bitstream itself.  I
> was not excluding audio bandwidth changes mid-call as part of network
> adaptation.  Though as we all agree this needs to be carefully designed.
>
> I agree it is best if the decoder does not require any knowledge of the SDP
> negotiation (or any other information beyond the RTP packet stream itself)
> in order to correctly decode the audio -- which I think is what you were
> concerned about.
>
> CH: Negotiating codec parameters with SDP has a long tradition. Take for
> example µLaw (RTP payload type 0): Here you negotiate the sampling rate.
> Also, the number of channels is negotiated for many codecs. I think
> sampling rate and number of channels can be done with SDP. However, I would
> avoid other codec-specific parameters. Especially in the case of AMR, the
> negotiation is quite complex and should be avoided for the Internet CODEC.
>
> Christian
>
> It would be a nice property if reducing the acoustic bandwidth also allowed
> the MIPS to be reduced, but I do not think it is a requirement;  I'd
> personally rather manage complexity with a Low Complexity profile (if that
> is really needed), since then I could keep the acoustic bandwidth (accepting
> a higher bit rate instead).
>
>
> Stephen Botzko
>
> On Tue, Apr 13, 2010 at 9:54 PM, Benjamin M. Schwartz <
> bmschwar@fas.harvard.edu> wrote:
>
> Koen Vos wrote:
> > Quoting "Benjamin M. Schwartz":
> >> 1. Why would high frequencies be unheard?  Cheap speakers and
> >> microphones have difficulties with low frequencies, but not high
> >> frequencies, and routinely go all the way up past the limit of hearing.
> >
> > Not all hardware supports arbitrary/high sampling rates.  PSTN gateways
> > don't go above 8 kHz.  Same for some mobile devices.
>
> True.
>
>
> >> 2. Why would it need to be negotiated?  For a suitably designed format,
> >> the encoder could choose not to waste bits on high frequencies without
> >> any negotiation or extra signalling.
> >
> > Without signaling, how would the encoder know that the farend decoder
> > will not take advantage of frequencies above a certain threshold?
>
> When I say signalling, I mean signalling within the codec bitstream.  The
> encoder can change its behavior based on knowledge of the receiver's
> configuration, but the bitstream does not need any extra signalling to
> indicate the change in behavior.
>
>
> >>> Signaling the bandwidth, and defining the
> >>> internal codec rate as fullband should let us lock down the RTP
> >>> timestamp rate at 48 kHz (which I think is desirable).
> >>
> >> I do agree that having "only one mode" would be ideal, to maximize
> >> interoperability.  I wonder whether we can achieve high enough
> >> computational efficiency for this to be viable.
> >
> > Changing the RTP timestamp sampling rate causes no computational
> > complexity, does it?  Perhaps an extra multiplication for each packet or
> > so?  The point was that the RTP timestamp sampling rate should be
> > disconnected from the actual audio signals.
>
> Right, but Stephen also suggested "defining the internal codec rate as
> fullband".  From this, I imagined a scenario in which all (compliant) IWAC
> implementations MUST decode all IWAC streams, which always have a sampling
> rate of 48 kHz.  I think this is a great idea, to achieve really good
> interoperability.
>
> If the receiver is a PSTN gateway, then an "internal codec rate" of 8 kHz
> would presumably produce as good quality/bitrate with lower encoder and
> decoder complexity.  However, if we can make IWAC sufficiently
> low-complexity, operating at 48 kHz may be acceptable.  It will help if we
> can structure the codec so that operating at lower bandwidth is very
> efficient.  For example, it may be possible to structure a transform codec
> such that unneeded high frequencies can cheaply be zero'd on encode and
> ignored on decode.
>
> --Ben
>
>
>
>
>
>
>