Re: [codec] #8: Sample rates?

"Schwarz Albrecht" <Albrecht.Schwarz@alcatel-lucent.com> Wed, 14 April 2010 15:30 UTC

MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="----_=_NextPart_001_01CADBE7.4C185D0E"
Date: Wed, 14 Apr 2010 17:25:33 +0200
Message-ID: <F4562D4585113D42AC08DC47FDEC49B0027D5B55@FRVELSMBS23.ad2.ad.alcatel.com>
In-Reply-To: <k2v6e9223711004140655te3e1dea2r1d69255a0cb28fc0@mail.gmail.com>
X-MS-Has-Attach:
X-MS-TNEF-Correlator:
Thread-Topic: [codec] #8: Sample rates?
Thread-Index: Acrb2jDQSN+FCik0R/ODIK3Z5HE5XQAC3rhg
References: <062.89d7aa91c79b145b798b83610e45ce71@tools.ietf.org><20100413183602.86565rmv5hve5d6q@mail.skype.net><4BC52068.1080906@fas.harvard.edu><x2r6e9223711004131955p91007c5byc8b0fa19c21ac3e3@mail.gmail.com><000c01cadbc1$86a14f10$93e3ed30$@de><4bc5a8c4.07a5660a.30ee.5382@mx.google.com><o2u6e9223711004140451s567cff9dwf12cbda8649a3f85@mail.gmail.com><003d01cadbcb$32653ce0$972fb6a0$@de><j2l6e9223711004140525t3332de9cx753c41e7d6bfc158@mail.gmail.com><004d01cadbcf$3b43c210$b1cb4630$@de> <k2v6e9223711004140655te3e1dea2r1d69255a0cb28fc0@mail.gmail.com>
From: Schwarz Albrecht <Albrecht.Schwarz@alcatel-lucent.com>
To: stephen botzko <stephen.botzko@gmail.com>, Christian Hoene <hoene@uni-tuebingen.de>
X-OriginalArrivalTime: 14 Apr 2010 15:29:39.0908 (UTC) FILETIME=[4C64C040:01CADBE7]
X-Scanned-By: MIMEDefang 2.64 on 155.132.188.80
Cc: codec@ietf.org
Subject: Re: [codec] #8: Sample rates?

You should replace "SDP" with "SDP Offer/Answer" (the protocol), in order to emphasize the requirement for a) indication and possibly b) negotiation of codec configurations.
The number of potential parameters you mention could result in many distinct codec configurations.
If this is the case, then the SDP O/A extensions would be recommended:
http://tools.ietf.org/wg/mmusic/draft-ietf-mmusic-sdp-media-capabilities/
which implies also support of
http://tools.ietf.org/wg/mmusic/draft-ietf-mmusic-sdp-capability-negotiation/
 
[Instead of the SDP Offer/Answer defined in RFC 3264, which may be sufficient for a single codec configuration but is definitely insufficient for multiple codec configurations]
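
To illustrate why several alternative codec configurations call for more than a single offered configuration, here is a minimal sketch of an answerer choosing among offered alternatives. The dictionary model, the parameter names (`rate`, `channels`) and the function `select_configuration` are assumptions made for illustration only; this is not the syntax or the procedure of the mmusic drafts linked above.

```python
from typing import Dict, List, Optional

# Hypothetical model: one codec configuration is a dict of parameters.
Config = Dict[str, int]


def select_configuration(offered: List[Config],
                         supported: List[Config]) -> Optional[Config]:
    """Return the first offered configuration that the answerer supports.

    The offer is assumed to be in the offerer's order of preference.
    """
    for cfg in offered:
        if cfg in supported:
            return cfg
    return None


# The offerer lists several alternatives, most preferred first.
offer = [
    {"rate": 48000, "channels": 2},
    {"rate": 16000, "channels": 1},
    {"rate": 8000, "channels": 1},
]
# The answerer (for example a constrained device) supports only some of them.
answerer_supports = [
    {"rate": 8000, "channels": 1},
    {"rate": 16000, "channels": 1},
]

chosen = select_configuration(offer, answerer_supports)
print(chosen)
```

With only one configuration in the offer, the answerer could merely accept or reject it; the list of alternatives is what makes an actual selection possible.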


________________________________

	From: codec-bounces@ietf.org [mailto:codec-bounces@ietf.org] On Behalf Of stephen botzko
	Sent: Wednesday, 14 April 2010 15:56
	To: Christian Hoene
	Cc: codec@ietf.org
	Subject: Re: [codec] #8: Sample rates?
	
	
	In-band for me in this case means RTP only.  Or in some other contexts RTP payload only.
	
	I think there is 
	(a) negotiation - particularly defining what optional modes the receiver can handle.  In the case of sample rates, number of channels, and any optional facilities, this is generally not changing mid-call.  SDP is the right place for this. (SDP SHALL be used for this)
	
	(b) Feedback - messages related to QoS, packet loss, etc. are what I mean by this.  This should be done in RTCP, though given the lack of RTCP support in the VoIP infrastructure perhaps there should be a SIP INFO backup path.  Feedback should not be done with RTP.
	
	(c) Control - Per RFC 3551, RTCP can be used for "loosely controlled" sessions, but "may be fully or partially subsumed by a separate session control protocol".  Given the statements on RTCP support in the VoIP infrastructure, we should be careful about putting unique controls in RTCP.  However, control should not be done in RTP.  Since most other audio codecs don't require this stuff, I suspect we won't either.  Though we will see...
	
	(d) In-band (RTP).  Not sure how else to say this.  Ideally RTP streams carrying CODEC (with no out-of-band, no RTCP, no SDP parameter knowledge) can be decoded.  That is, I think it should be possible to fully decode unencrypted CODEC bitstreams with only the RTP packets extracted from a Wireshark capture, even if the operating mode changes mid-flight.  RFC 5404 (G.719) is one example of a packetization that accomplishes this, but there are other packetization RFCs which do not.
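
As a rough illustration of point (d), the sketch below shows what a receiver can learn from a captured RTP packet alone, using the 12-byte fixed header layout of RFC 3550. The function name `parse_rtp_header` and the example packet are assumptions for illustration. Anything mode-specific (sample rate, channel count) would have to be carried inside the payload by the payload format itself, which is not modelled here.

```python
import struct


def parse_rtp_header(packet: bytes) -> dict:
    """Parse the RTP fixed header (RFC 3550) and return its fields.

    Simplification: a header extension (X bit) and padding (P bit) are
    reported but not stripped from the returned payload.
    """
    if len(packet) < 12:
        raise ValueError("RTP packet shorter than the 12-byte fixed header")
    b0, b1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    csrc_count = b0 & 0x0F
    header_len = 12 + 4 * csrc_count  # each CSRC identifier is 4 bytes
    return {
        "version": b0 >> 6,
        "padding": bool(b0 & 0x20),
        "extension": bool(b0 & 0x10),
        "csrc_count": csrc_count,
        "marker": bool(b1 & 0x80),
        "payload_type": b1 & 0x7F,
        "sequence_number": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
        "payload": packet[header_len:],
    }


# Hypothetical packet: version 2, dynamic payload type 96, two payload bytes.
example = struct.pack("!BBHII", 0x80, 96, 1, 960, 0x1234ABCD) + b"\x01\x02"
header = parse_rtp_header(example)
print(header["payload_type"], header["timestamp"], header["payload"])
```

Nothing in these fields says which operating mode the payload uses, which is why a self-describing payload is needed if decoding is to work without SDP knowledge.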
	
	Is this more clear?
	
	Regards
	Stephen Botzko
	
	
	
	
	On Wed, Apr 14, 2010 at 8:37 AM, Christian Hoene <hoene@uni-tuebingen.de> wrote:
	

		Hi Stephen,

		 

		I am getting confused... Do you mean that the parameter about sampling rate MUST be negotiated with SDP and not transmitted in-band?

		Or MUST NOT be negotiated inband but only transmitted inband?

		Inband means within RTP/RTCP/RTP extensions and/or the Internet CODEC payload.

		 

		What do you think about my second question on sampling rate limits?

		 

		With best regards,

		 Christian

		 

		 

		 

		---------------------------------------------------------------

		Dr.-Ing. Christian Hoene

		Interactive Communication Systems (ICS), University of Tübingen 

		Sand 13, 72076 Tübingen, Germany, Phone +49 7071 2970532 
		http://www.net.uni-tuebingen.de/

		 

		From: stephen botzko [mailto:stephen.botzko@gmail.com] 
		Sent: Wednesday, April 14, 2010 2:25 PM
		To: Christian Hoene
		Cc: Roni Even; codec@ietf.org 

		Subject: Re: [codec] #8: Sample rates?

		

		 

		All negotiation should be done with SDP (and should never be done in-band)
		
		And the RTP transport should be robust enough to permit seamless changes to any mode that is consistent with the negotiation (with no signaling).  
		
		The first point I think is essential.  The second reflects my own view on how RTP packetization should be done.
		
		Stephen Botzko

		On Wed, Apr 14, 2010 at 8:08 AM, Christian Hoene <hoene@uni-tuebingen.de> wrote:

		Hi,

		 

		I am fine with dropping any SDP negotiation on codec parameters including sampling rate and channels. I like the idea of splitting signaling and transportation issues.

		 

		But one question remains. We discussed limiting the complexity for some kinds of devices by choosing a lower sampling rate or a lower number of channels. Shall this negotiation be done with SDP or inband?

		 

		Christian

		 

		 

		---------------------------------------------------------------

		Dr.-Ing. Christian Hoene

		Interactive Communication Systems (ICS), University of Tübingen 

		Sand 13, 72076 Tübingen, Germany, Phone +49 7071 2970532 
		http://www.net.uni-tuebingen.de/

		 

		From: stephen botzko [mailto:stephen.botzko@gmail.com] 
		Sent: Wednesday, April 14, 2010 1:52 PM
		To: Roni Even
		Cc: Christian Hoene; codec@ietf.org

		
		Subject: Re: [codec] #8: Sample rates?

		 

		Good points, thanks for clarifying.
		
		Personally I favor carrying those H.264 parameter sets on the media path, since there are situations (switched multipoint calls for one) where the timing matters.  With that use case, if reliable-but-too-late delivery occurs, there are decoding errors even if there is no packet loss.  
		
		Though of course SDP transmission alone may be suitable for other applications, and it is perfectly legal to send them both ways.
		
		Stephen Botzko

		2010/4/14 Roni Even <ron.even.tlv@gmail.com>

		Hi,

		Negotiation of codec parameters is not a tradition; it is needed if there are optional modes that the decoder can support, so that the sender knows whether the receiver can receive a specific mode. If there are mandatory modes you may be able to provide the information in-band, but this is not negotiation. Also note that while the signaling may use a reliable channel, the media path is not reliable and may suffer packet loss that causes the loss of important parameters. We have such an example in the H.264 parameter sets, which can be carried in the SDP for reliability or in-band as part of the payload.

		 

		Roni Even

		 

		 

		From: codec-bounces@ietf.org [mailto:codec-bounces@ietf.org] On Behalf Of Christian Hoene
		Sent: Wednesday, April 14, 2010 1:59 PM
		To: 'stephen botzko'

		
		Cc: codec@ietf.org
		Subject: Re: [codec] #8: Sample rates?

		 

		Hi ,

		 

		comments inline:

		 

		From: codec-bounces@ietf.org [mailto:codec-bounces@ietf.org] On Behalf Of stephen botzko
		Sent: Wednesday, April 14, 2010 4:55 AM
		To: bens@alum.mit.edu
		Cc: codec@ietf.org
		Subject: Re: [codec] #8: Sample rates?

		 

		When I said signaling I meant SDP, not anything in the bitstream itself.  I was not excluding audio bandwidth changes mid-call as part of network adaptation.  Though as we all agree this needs to be carefully designed.
		
		I agree it is best if the decoder does not require any knowledge of the SDP negotiation (or any other information beyond the RTP packet stream itself) in order to correctly decode the audio -- which I think is what you were concerned about.

		CH: Negotiating codec parameters with SDP has a long tradition. Take for example µLaw (RTP payload type 0): here you negotiate the sampling rate. Also, the number of channels is negotiated for many codecs. I think sampling rate and number of channels can be done with SDP. However, I would avoid other codec-specific parameters. In particular, the negotiation for AMR is quite complex and should be avoided for the Internet CODEC.

		Christian
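
For reference, sampling rate and channel count are conventionally carried in the SDP `a=rtpmap` attribute, in the form `<payload type> <encoding name>/<clock rate>[/<channels>]`. The sketch below parses such a line. The function name `parse_rtpmap` is hypothetical, and the two sample lines are illustrations rather than anything taken from this thread.

```python
def parse_rtpmap(line: str) -> dict:
    """Parse an SDP 'a=rtpmap:' attribute line for an audio stream."""
    prefix = "a=rtpmap:"
    if not line.startswith(prefix):
        raise ValueError("not an rtpmap attribute")
    payload_type, encoding = line[len(prefix):].strip().split(" ", 1)
    parts = encoding.strip().split("/")
    if len(parts) < 2:
        raise ValueError("rtpmap needs at least <name>/<clock rate>")
    return {
        "payload_type": int(payload_type),
        "encoding_name": parts[0],
        "clock_rate": int(parts[1]),
        # For audio, the optional third field is the channel count (default 1).
        "channels": int(parts[2]) if len(parts) > 2 else 1,
    }


pcmu = parse_rtpmap("a=rtpmap:0 PCMU/8000")
stereo = parse_rtpmap("a=rtpmap:96 L16/48000/2")
print(pcmu)
print(stereo)
```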
		
		It would be a nice property if reducing the acoustic bandwidth also allowed the MIPS to be reduced, but I do not think it is a requirement;  I'd personally rather manage complexity with a Low Complexity profile (if that is really needed), since then I could keep the acoustic bandwidth (accepting a higher bit rate instead).

		
		Stephen Botzko

		On Tue, Apr 13, 2010 at 9:54 PM, Benjamin M. Schwartz <bmschwar@fas.harvard.edu> wrote:

		Koen Vos wrote:
		> Quoting "Benjamin M. Schwartz":
		>> 1. Why would high frequencies be unheard?  Cheap speakers and microphones
		>> have difficulties with low frequencies, but not high frequencies, and
		>> routinely go all the way up past the limit of hearing.
		>
		> Not all hardware supports arbitrary/high sampling rates.  PSTN gateways
		> don't go above 8 kHz.  Same for some mobile devices.

		True.

		
		>> 2. Why would it need to be negotiated?  For a suitably designed format,
		>> the encoder could choose not to waste bits on high frequencies without
		>> any
		>> negotiation or extra signalling.
		>
		> Without signaling, how would the encoder know that the farend decoder
		> will not take advantage of frequencies above a certain threshold?

		When I say signalling, I mean signalling within the codec bitstream.  The
		encoder can change its behavior based on knowledge of the receiver's
		configuration, but the bitstream does not need any extra signalling to
		indicate the change in behavior.

		
		>>> Signaling the bandwidth, and defining the
		>>> internal codec rate as fullband should let us lock down the RTP
		>>> timestamp
		>>> rate at 48 kHz (which I think is desirable).
		>>
		>> I do agree that having "only one mode" would be ideal, to maximize
		>> interoperability.  I wonder whether we can achieve high enough
		>> computational efficiency for this to be viable.
		>
		> Changing the RTP timestamp sampling rate causes no computational
		> complexity, does it?  Perhaps an extra multiplication for each packet or
		> so?  The point was that RTP timestamp sampling rate should be disconnected
		> from the actual audio signals.

		Right, but Stephen also suggested "defining the internal codec rate as
		fullband".  From this, I imagined a scenario in which all (compliant) IWAC
		implementations MUST decode all IWAC streams, which always have a sampling
		rate of 48 kHz.  I think this is a great idea, to achieve really good
		interoperability.
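
A minimal sketch of the "extra multiplication" mentioned in the quoted text, assuming a fixed 48 kHz RTP timestamp clock while the audio itself may be sampled at a lower rate. The names `RTP_CLOCK_HZ`, `samples_to_ticks` and `next_timestamp` are hypothetical.

```python
RTP_CLOCK_HZ = 48000  # assumed fixed RTP timestamp clock


def samples_to_ticks(n_samples: int, audio_rate_hz: int) -> int:
    """Convert a frame length in audio samples to RTP clock ticks."""
    return n_samples * RTP_CLOCK_HZ // audio_rate_hz


def next_timestamp(current: int, ticks: int) -> int:
    """Advance a 32-bit RTP timestamp, wrapping around at 2**32."""
    return (current + ticks) % 2**32


# A 20 ms frame advances the timestamp by 960 ticks at every audio rate.
narrowband = samples_to_ticks(160, 8000)
wideband = samples_to_ticks(320, 16000)
fullband = samples_to_ticks(960, 48000)
print(narrowband, wideband, fullband)
```

The per-packet cost is one multiplication and one division, independent of the audio sampling rate actually in use.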
		
		If the receiver is a PSTN gateway, then an "internal codec rate" of 8 kHz
		would presumably produce as good quality/bitrate with lower encoder and
		decoder complexity.  However, if we can make IWAC sufficiently
		low-complexity, operating at 48 kHz may be acceptable.  It will help if we
		can structure the codec so that operating at lower bandwidth is very
		efficient.  For example, it may be possible to structure a transform codec
		such that unneeded high frequencies can cheaply be zero'd on encode and
		ignored on decode.
		
		--Ben
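
The idea of zeroing unneeded high frequencies in a transform codec can be sketched as follows. This is an illustration under stated assumptions, not a proposed design: it uses a naive O(N^2) DCT-II from the standard library (a real codec would more likely use a fast MDCT), and the names `dct2` and `band_limit` are hypothetical.

```python
import math


def dct2(frame):
    """Naive (unnormalized) DCT-II of one frame of samples."""
    n = len(frame)
    return [
        sum(x * math.cos(math.pi / n * (i + 0.5) * k)
            for i, x in enumerate(frame))
        for k in range(n)
    ]


def band_limit(coeffs, sample_rate_hz, cutoff_hz):
    """Zero every coefficient whose frequency is at or above cutoff_hz.

    DCT-II bin k of an N-sample frame corresponds to k * fs / (2 * N) Hz.
    """
    n = len(coeffs)
    first_dropped = math.ceil(cutoff_hz * 2 * n / sample_rate_hz)
    return [c if k < first_dropped else 0.0 for k, c in enumerate(coeffs)]


# One 1 ms frame at 48 kHz (48 samples), limited to below 4 kHz.
frame = [math.sin(2 * math.pi * 1000 * t / 48000) for t in range(48)]
coeffs = dct2(frame)
limited = band_limit(coeffs, 48000, 4000)
kept = sum(1 for c in limited if c != 0.0)
print(kept, "of", len(limited), "coefficients kept")
```

In this sketch an encoder would only need to code the retained low-frequency coefficients, and a narrowband receiver could ignore the rest.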