Re: [codec] #24: Negotiation of codec parameters?
"codec issue tracker" <trac@tools.ietf.org> Sun, 02 May 2010 08:32 UTC
Return-Path: <trac@tools.ietf.org>
X-Original-To: codec@core3.amsl.com
Delivered-To: codec@core3.amsl.com
Received: from localhost (localhost [127.0.0.1]) by core3.amsl.com (Postfix) with ESMTP id 42D373A68C7 for <codec@core3.amsl.com>; Sun, 2 May 2010 01:32:30 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -101.183
X-Spam-Level:
X-Spam-Status: No, score=-101.183 tagged_above=-999 required=5 tests=[AWL=-1.183, BAYES_50=0.001, NO_RELAYS=-0.001, USER_IN_WHITELIST=-100]
Received: from mail.ietf.org ([64.170.98.32]) by localhost (core3.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id sHzk4fNd+FDa for <codec@core3.amsl.com>; Sun, 2 May 2010 01:32:28 -0700 (PDT)
Received: from zinfandel.tools.ietf.org (unknown [IPv6:2001:1890:1112:1::2a]) by core3.amsl.com (Postfix) with ESMTP id AAE913A67B3 for <codec@ietf.org>; Sun, 2 May 2010 01:32:28 -0700 (PDT)
Received: from localhost ([::1] helo=zinfandel.tools.ietf.org) by zinfandel.tools.ietf.org with esmtp (Exim 4.69) (envelope-from <trac@tools.ietf.org>) id 1O8Ub0-0003Z1-Fz; Sun, 02 May 2010 01:32:14 -0700
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 8bit
From: codec issue tracker <trac@tools.ietf.org>
X-Trac-Version: 0.11.6
Precedence: bulk
Auto-Submitted: auto-generated
X-Mailer: Trac 0.11.6, by Edgewall Software
To: hoene@uni-tuebingen.de
X-Trac-Project: codec
Date: Sun, 02 May 2010 08:32:14 -0000
X-URL: http://tools.ietf.org/codec/
X-Trac-Ticket-URL: http://trac.tools.ietf.org/wg/codec/trac/ticket/24#comment:1
Message-ID: <071.e3c35995edbdbaccee3438f1a110069b@tools.ietf.org>
References: <062.6a10c93c1a05ea5f21f5afc0b48f2660@tools.ietf.org>
X-Trac-Ticket-ID: 24
In-Reply-To: <062.6a10c93c1a05ea5f21f5afc0b48f2660@tools.ietf.org>
X-SA-Exim-Connect-IP: ::1
X-SA-Exim-Rcpt-To: hoene@uni-tuebingen.de, codec@ietf.org
X-SA-Exim-Mail-From: trac@tools.ietf.org
X-SA-Exim-Scanned: No (on zinfandel.tools.ietf.org); SAEximRunCond expanded to false
Cc: codec@ietf.org
Subject: Re: [codec] #24: Negotiation of codec parameters?
X-BeenThere: codec@ietf.org
X-Mailman-Version: 2.1.9
Reply-To: codec@ietf.org
List-Id: Codec WG <codec.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/listinfo/codec>, <mailto:codec-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/codec>
List-Post: <mailto:codec@ietf.org>
List-Help: <mailto:codec-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/codec>, <mailto:codec-request@ietf.org?subject=subscribe>
X-List-Received-Date: Sun, 02 May 2010 08:32:30 -0000
#24: Negotiation of codec parameters?
------------------------------------+---------------------------------------
 Reporter:  hoene@…                 |       Owner:
     Type:  enhancement             |      Status:  new
 Priority:  minor                   |   Milestone:
Component:  requirements            |     Version:
 Severity:  -                       |    Keywords:
------------------------------------+---------------------------------------

Comment(by hoene@…):

[kpfleming@]: If our goal is to use RTP AVP/SAVP/AVPF/SAVPF profiles for transport (as seems likely), then differences in sample rates between stream offers must be listed separately in the SDP. Whether they have a different codec 'name' in the SDP or not seems less important, because the combination of the codec name and sample rate is required to uniquely identify the format in any case. Note that this is *sample rate*, and not bitstream rate.

[Christian]: No, please not. Please keep the interface to the codec as simple as possible! In addition, one must consider the following requirements: 1) First, the sample rate MUST be changed dynamically to cope with varying transmission bandwidths. ...

[Stephen]: Dynamically changing sample rates on the system level adds some complexity for RTP, since the timestamp granularity is supposed to be the sample rate. BTW, dynamically changing the sample rate may be in conflict with the idea of low-complexity compressed-domain mixing (even if the conversion is done internally).

[Kevin]: If the desire is for the codec to be able to change sample rates to adjust to network conditions, then I agree with Stephen... the 'external' sample rate (input to the encoder and output from the decoder) should be fixed, and this is what would be negotiated in SDP and used for RTP timestamps. The codec can downsample in the encoder and upsample in the decoder if it has decided to transmit fewer bits across the network.

[Stephen]: Something like: CODEC MAY reduce the acoustic bandwidth at lower bit rates in order to optimize audio quality.
This is free of any technology assumption about how the acoustic bandwidth is reduced. The MAY indicates that it is permissible. But if the CODEC algorithm doesn't need to reduce the acoustic bandwidth, then we are making no statement that it SHOULD (or SHOULD NOT).

Kevin is distinguishing dynamic changes to the sample rate (for bandwidth management) from multiple fixed sample rates, and I agree that is a key distinction. I have not heard any clear application requirement for more than one fixed sampling rate. If there is such a requirement, though, IMHO we would have to negotiate the rate within SDP in the usual way, and it would affect the RTP timestamps, jitter buffers, etc. G.722.1 / G.722.1C is one precedent: it is the same core codec, but can run at two different sample rates (negotiated by SDP).

[Christian]: It still might make sense to negotiate the maximum supported sampling rate via SDP or, if possible, to select one out of multiple sampling rates, if the audio receiver can cope well with multiple rates. The internal sampling frequency of the codec need not be affected by the external sampling frequency. However, the decoder might want to signal to the encoder that decoding requires too many computational resources and that a less complex coding mode (or a lower sampling frequency) should be used.

[Stephen]: This would make the signaling more complicated - personally I am not convinced it is worth it. I think a better avenue is to bound overall complexity, and to focus on dynamically adapting to network conditions (as opposed to dynamic complexity management). You can't dynamically negotiate complexity in many scenarios anyway - for instance it makes no sense if you are using multicast.

[Christian]:
> This would make the signaling more complicated - personally I am not convinced it is worth it.

It is a difficult tradeoff. However, signaling overload is done in Skype.
Such signaling might be very useful for mobile devices, which want to save power and thus lower their CPU clock, or for wireless IP-based headphones, which do not have large batteries. I am thinking of signaling three states: overloaded, fine, and low. That should be enough for most operational cases.

> I think a better avenue is to bound overall complexity, and to focus on dynamically adapting to network conditions (as opposed to dynamic complexity management).

I would just like to point out that good old TCP supports both: congestion control to adapt to network conditions, and flow control to take into account an overloaded (=full) receiver.

[Stephen]: TCP is a different case, since for this we are using RTCP to signal our feedback, and I don't think it has the facility you are envisioning. This concept seems pretty theoretical to me. If we need to manage complexity / quality tradeoffs, why not just use profiles (as AVC/H.264 does) or create a low complexity variant (like G.729A)? I really don't see the need for dynamic complexity management. BTW, you seem to be assuming that a lower sample rate results in significantly less complexity. The savings there might not be as great as you think, especially if the receiver needs to resample anyway (to avoid those sound card limitations you were talking about before).

[Roman]: RTCP is almost universally not implemented. The biggest VoIP gateway on the market does not generate RTCP. If we rely on any RTCP functionality for bandwidth control, it will probably be ignored.

[Stephen]: Videoconferencing devices do almost always support RTCP. It is regrettable that so many VOIP devices do not. Anyway, I do not think our charter scope includes invention of a new mechanism for signaling the network quality.

[Roman]: The point of my remark about RTCP was that we should try to develop a CODEC that functions properly with RTCP absent. If we require RTCP-based mechanisms in order for the CODEC to operate properly, this can impede the adoption of this CODEC.
In no way do I propose to create new signaling mechanisms.

[Stephen]: I rather like the idea of negotiating maximum audio bandwidth. For me that is different from dynamic complexity management, and is being signaled for a different purpose (wasting coded bits on unheard spectrum degrades the quality of the heard spectrum).

[Ben]: Why would it need to be negotiated? For a suitably designed format, the encoder could choose not to waste bits on high frequencies without any negotiation or extra signalling. [...] I do agree that having "only one mode" would be ideal, to maximize interoperability. I wonder whether we can achieve high enough computational efficiency for this to be viable.

[Koen]: Not all hardware supports arbitrary/high sampling rates. PSTN gateways don't go above 8 kHz. Same for some mobile devices. Without signaling, how would the encoder know that the far-end decoder will not take advantage of frequencies above a certain threshold?

[Ben]: When I say signalling, I mean signalling within the codec bitstream. The encoder can change its behavior based on knowledge of the receiver's configuration, but the bitstream does not need any extra signalling to indicate the change in behavior. If the receiver is a PSTN gateway, then an "internal codec rate" of 8 kHz would presumably produce as good quality/bitrate with lower encoder and decoder complexity. However, if we can make IWAC sufficiently low-complexity, operating at 48 kHz may be acceptable. It will help if we can structure the codec so that operating at lower bandwidth is very efficient.

[Stephen]: When I said signaling I meant SDP, not anything in the bitstream itself. I was not excluding audio bandwidth changes mid-call as part of network adaptation. Though as we all agree this needs to be carefully designed.
I agree it is best if the decoder does not require any knowledge of the SDP negotiation (or any other information beyond the RTP packet stream itself) in order to correctly decode the audio -- which I think is what you were concerned about. It would be a nice property if reducing the acoustic bandwidth also allowed the MIPS to be reduced, but I do not think it is a requirement; I'd personally rather manage complexity with a Low Complexity profile (if that is really needed), since then I could keep the acoustic bandwidth (accepting a higher bit rate instead).

[Christian]: Negotiating codec parameters with SDP has a long tradition. Take for example µLaw (RTP payload type 0): Here you negotiate the sampling rate. Also, the number of channels is negotiated for many codecs. I think sampling rate and number of channels can be done with SDP. However, I would avoid other codec-specific parameters. In particular, the negotiation in the case of AMR is quite complex and should be avoided for the Internet CODEC.

[Stephen]: My point here was not that SDP negotiation should be avoided. I was trying to say that it is best if the RTP payload is complete (in the sense that it can be fully decoded even if you ignore the signaling, as long as you know the codec itself). For instance, if you negotiate the number of channels, it should be possible for the decoder to identify the number of channels from the RTP payload. There are codecs where this is not done. Though personally I think it is the best architectural approach, even if it costs some payload bits. One reason is that I think changing modes should be seamless, and there is a race condition between the signaling and the RTP payload. If you are adapting to network conditions, it is particularly useful to change on the fly.

[Roni]: Negotiation of codec parameters is not a tradition; it is needed if there are optional modes that the decoder may support, so that the sender knows whether the receiver can receive a specific mode.
If there are mandatory modes you may be able to provide the information in-band, but this is not negotiation. Also note that while the signaling may use a reliable channel, the media path is not reliable and may suffer packet loss that may cause the loss of important parameters. We have such an example in the H.264 parameter sets, which can be carried in the SDP for reliability or in-band as part of the payload.

[Stephen]: Personally I favor carrying those H.264 parameter sets on the media path, since there are situations (switched multipoint calls for one) where the timing matters. With that use case, if reliable-but-too-late delivery occurs, there are decoding errors even if there is no packet loss. Though of course SDP transmission alone may be suitable for other applications, and it is perfectly legal to send them both ways.

[Christian]: I am fine with dropping any SDP negotiation on codec parameters including sampling rate and channels. I like the idea of splitting signaling and transportation issues. But one question remains. We had the question of limiting the complexity for some kinds of devices by choosing a lower sampling rate or a lower number of channels. Shall this negotiation be done with SDP or in-band?

[Stephen]: All negotiation should be done with SDP (and should never be done in-band). And the RTP transport should be robust enough to permit seamless changes to any mode that is consistent with the negotiation (with no signaling). The first point I think is essential. The second reflects my own view on how RTP packetization should be done.

[Christian]: I am getting confused… Do you mean that the parameter about sampling rate MUST be negotiated with SDP and not transmitted in-band? Or MUST NOT be negotiated in-band but only transmitted in-band? In-band means within RTP/RTCP/RTP extensions and/or the Internet CODEC payload.

[Stephen]: In-band for me in this case means RTP only. Or in some other contexts RTP payload only.
I think there is:

(a) Negotiation - particularly defining what optional modes the receiver can handle. In the case of sample rates, number of channels, and any optional facilities, this is generally not changing mid-call. SDP is the right place for this. (SDP SHALL be used for this)

(b) Feedback - messages related to QOS, packet loss, etc. are what I mean by this. This should be done in RTCP, though given the lack of VOIP support perhaps there should be a SIP INFO backup path. Feedback should not be done with RTP.

(c) Control - Per RFC 3551, RTCP can be used for "loosely controlled" sessions, but "may be fully or partially subsumed by a separate session control protocol". Given the statements on RTCP support in the VOIP infrastructure, we should be careful about putting unique controls in RTCP. However, it should not be done in RTP. Since most other audio codecs don't require this stuff, I suspect we won't either. Though we will see...

(d) In-band (RTP). Not sure how else to say this. Ideally RTP streams carrying CODEC (with no out-of-band, no RTCP, no SDP parameter knowledge) can be decoded. That is, I think it should be possible to fully decode unencrypted CODEC bitstreams with only the RTP packets extracted from Wireshark, even if the operating mode changes mid-flight. RFC 5404 (G.719) is one example of a packetization that accomplishes this, but there are other packetization RFCs which do not.

[Schwarz]: You should replace "SDP" by "SDP Offer/Answer" (protocol), in order to emphasize the requirement for a) indication and possibly b) negotiation of codec configurations. The number of potential parameters you are mentioning could result in a number of different codec configurations.
If this is the case, then the SDP O/A extensions would be recommended:

http://tools.ietf.org/wg/mmusic/draft-ietf-mmusic-sdp-media-capabilities/

which also implies support of

http://tools.ietf.org/wg/mmusic/draft-ietf-mmusic-sdp-capability-negotiation/

[Instead of RFC 3264 defined SDP Offer/Answer, which may be sufficient for a single codec configuration, but definitely insufficient in case of multiple codec configurations]

--
Ticket URL: <http://trac.tools.ietf.org/wg/codec/trac/ticket/24#comment:1>
codec <http://tools.ietf.org/codec/>
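
A minimal sketch, in Python, of the split Stephen describes under (a) and (d): the external sample rate (and with it the RTP clock) stays fixed as negotiated in SDP, while every payload states its own internal bandwidth and channel count, so a decoder needs nothing beyond the RTP packets. The one-byte header layout, the rate table, the 20 ms frame size, and all names (`pack_header`, `parse_header`, `RTP_CLOCK_HZ`, ...) are assumptions made purely for illustration; nothing here has been defined by the WG.

```python
from dataclasses import dataclass

# Assumption: a fixed external rate negotiated once in SDP, also used as RTP clock.
RTP_CLOCK_HZ = 48000
# Assumption: fixed frame duration in milliseconds.
FRAME_MS = 20
# Hypothetical table of internal (coded) sample rates, indexed by a 3-bit code.
INTERNAL_RATES_HZ = [8000, 16000, 32000, 48000]


@dataclass
class FrameInfo:
    internal_rate_hz: int
    channels: int


def pack_header(internal_rate_hz: int, channels: int) -> bytes:
    """Hypothetical 1-byte header: bits 7-5 = rate index, bit 4 = stereo flag."""
    rate_index = INTERNAL_RATES_HZ.index(internal_rate_hz)  # ValueError if unknown
    if channels not in (1, 2):
        raise ValueError("this sketch supports mono or stereo only")
    return bytes([(rate_index << 5) | ((channels - 1) << 4)])


def parse_header(payload: bytes) -> FrameInfo:
    """Recover the coding mode from the payload alone, with no SDP knowledge."""
    first = payload[0]
    rate_index = first >> 5
    if rate_index >= len(INTERNAL_RATES_HZ):
        raise ValueError("unknown rate index")
    channels = ((first >> 4) & 1) + 1
    return FrameInfo(INTERNAL_RATES_HZ[rate_index], channels)


def timestamp_step() -> int:
    """RTP timestamp increment per frame; independent of the internal rate."""
    return RTP_CLOCK_HZ * FRAME_MS // 1000


if __name__ == "__main__":
    timestamp = 0
    # The encoder lowers its internal rate mid-call, e.g. to adapt to the network.
    for rate in (48000, 16000, 8000):
        payload = pack_header(rate, 1) + b"\x00" * 10  # dummy coded audio
        print(timestamp, parse_header(payload))
        timestamp += timestamp_step()
```

In this sketch the encoder can drop from 48 kHz to 8 kHz internally between two packets without any signaling, and the RTP timestamp still advances by the same amount per frame. What SDP would add on top is only the receiver's limits (for instance a maximum rate or channel count) that the encoder has to stay within.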