[MMUSIC] Telephone-event and multiple clock rates

Adam Roach <adam@nostrum.com> Thu, 23 January 2014 20:18 UTC

I've had an interesting issue brought to my attention regarding RFC 4733 
"telephone-event" handling.

Section 2.1 requires that "Named telephone events are carried as part of 
the audio stream and MUST use the same sequence number and timestamp 
base as the regular audio channel".

On its face, this would imply that offering voice codecs with varying 
clock rates would result in the need to offer several PTs with 
telephone-event, varying only in frequency.

So, for example, if we were setting up a media session that looked like this:

    m=audio 12346 RTP/AVP 0 97 98 99
    a=rtpmap:0 PCMU/8000
    a=rtpmap:97 opus/48000/2
    a=rtpmap:98 AMR-WB/16000
    a=rtpmap:99 speex/32000

but wanted to add telephone-event to it, we would need to do this:

    m=audio 12346 RTP/AVP 0 97 98 99 100 101 102 103
    a=rtpmap:0 PCMU/8000
    a=rtpmap:97 opus/48000/2
    a=rtpmap:98 AMR-WB/16000
    a=rtpmap:99 speex/32000
    a=rtpmap:100 telephone-event/8000
    a=fmtp:100 0-15
    a=rtpmap:101 telephone-event/48000
    a=fmtp:101 0-15
    a=rtpmap:102 telephone-event/16000
    a=fmtp:102 0-15
    a=rtpmap:103 telephone-event/32000
    a=fmtp:103 0-15


Rather than this:

    m=audio 12346 RTP/AVP 0 97 98 99 100
    a=rtpmap:0 PCMU/8000
    a=rtpmap:97 opus/48000/2
    a=rtpmap:98 AMR-WB/16000
    a=rtpmap:99 speex/32000
    a=rtpmap:100 telephone-event/8000
    a=fmtp:100 0-15


And with the multi-PT form, it would be incumbent on the remote end to 
ensure that it uses the *correct* DTMF PT to match the codec that it is 
using for speech.
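
To make that burden concrete, here is a rough sketch of the lookup a sender would have to perform under the strict reading: find the telephone-event PT whose clock rate equals that of the speech codec in use. The helper names (parse_rtpmap, dtmf_pt_for) are made up for illustration, and the sketch assumes every PT in play has an explicit a=rtpmap line; it is not taken from any particular stack.

```python
import re

# Matches "a=rtpmap:<pt> <encoding>/<clock rate>[/<channels>]"
RTPMAP = re.compile(r"^a=rtpmap:(\d+) ([^/\s]+)/(\d+)(?:/(\d+))?$")


def parse_rtpmap(sdp):
    """Return {payload type: (encoding name, clock rate)} for one media section."""
    table = {}
    for line in sdp.splitlines():
        m = RTPMAP.match(line.strip())
        if m:
            table[int(m.group(1))] = (m.group(2), int(m.group(3)))
    return table


def dtmf_pt_for(sdp, speech_pt):
    """Pick the telephone-event PT whose clock rate matches the speech codec's."""
    table = parse_rtpmap(sdp)
    _, rate = table[speech_pt]
    for pt, (name, clock) in table.items():
        if name.lower() == "telephone-event" and clock == rate:
            return pt
    return None  # no telephone-event PT was offered at this clock rate


# The multi-PT offer from this message.
MULTI_PT_OFFER = """\
m=audio 12346 RTP/AVP 0 97 98 99 100 101 102 103
a=rtpmap:0 PCMU/8000
a=rtpmap:97 opus/48000/2
a=rtpmap:98 AMR-WB/16000
a=rtpmap:99 speex/32000
a=rtpmap:100 telephone-event/8000
a=rtpmap:101 telephone-event/48000
a=rtpmap:102 telephone-event/16000
a=rtpmap:103 telephone-event/32000
"""

# The single-PT offer from this message.
SINGLE_PT_OFFER = """\
m=audio 12346 RTP/AVP 0 97 98 99 100
a=rtpmap:0 PCMU/8000
a=rtpmap:97 opus/48000/2
a=rtpmap:98 AMR-WB/16000
a=rtpmap:99 speex/32000
a=rtpmap:100 telephone-event/8000
"""
```

With the multi-PT offer, dtmf_pt_for(MULTI_PT_OFFER, 97) picks PT 101 for opus. With the single-PT offer the same call returns None, since only an 8000 Hz telephone-event PT exists; under the strict reading, DTMF would then have no matching PT for anything but PCMU.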

This all seems like patent nonsense, and it burns an additional PT for 
each clock rate. Is that really what was intended by this language in 
RFC 4733?

/a