Re: [secdir] [Sipping] draft-wing-sipping-srtp-key-04 (was no subject)

Hadriel Kaplan <> Sun, 15 February 2009 18:48 UTC

> -----Original Message-----
> From: [] On Behalf
> Of Eric Rescorla
> Sent: Sunday, February 15, 2009 11:59 AM
> This document describes two major use cases for this type of
> technology:
> - Monitoring (call recording)
> - Transcoding
> I don't think it's particularly useful to conflate these cases, which
> are really quite different.

I agree - it's really, really weird to think of transcoding cases doing things this draft's way. (In fact, it's hard to believe call recording apps should work this way either, but I know they want to.)

> Similarly, this document fails to distinguish adequately between
> real-time and non-real-time use cases. Many monitoring/call recording
> applications are inherently non-real-time: you record the call
> and some time in the future, the call may or may not be replayed.

I used to think that too, but I've come to find out that some call monitoring apps really do need real-time play-back, because it's not really "play-back" at all: it's like a 3-party conference with a silent listener. For example, some support call centers have managers who listen in on active calls at random, or listen in on newly hired staff for their first day or so.
Also, some vertical markets must successfully record all calls for legal reasons, and even the chance of the keys getting lost due to phone reboots or whatever is not acceptable to them. So I think getting the keys at the beginning of the call is a requirement.

> Finally, the elephant under the covers here is lawful intercept.
> the authors specifically disclaim it, but it's quite clear that
> this is usable as an LI system. Indeed, many such systems
> (e.g., FORTEZZA) involve cooperation from the endpoint being
> monitored.

I actually believe the authors that it's not applicable. Most LI systems cannot work like FORTEZZA, specifically because they cannot let the user know he/she is being tapped and cannot rely on cooperation. I don't think any self-respecting LI system would rely on the endpoints to give it keys. :)

Besides, pretty much any Call Monitoring application has unavoidable similarities with Lawful Interception.  The same could be argued for Troubleshooting mechanisms too.  So what/who cares?  We cannot and should not define a mechanism for LI in the IETF; but that doesn't mean we can't define mechanisms for other purposes, which may also happen to be usable as LI mechanisms.

> 4.3.
> If the requirement for recording is this strong, wouldn't it
> be better not to rely on the UA doing the right thing? Rather
> enforce it in a firewall or IDS.

Yeah, this is whacked. The whole mechanism is odd, imho. If you need to record calls in a call center, either use keys in signaling that the B2BUA can see, or terminate the SRTP at the B2BUA. That's not as secure as end-to-end DTLS-SRTP, obviously, but neither is this draft's mechanism. You're already trusting middleboxes with the keys in this draft's mechanism, so trust them with the keys a priori.
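
To make "keys in signaling the B2BUA can see" concrete, here is a minimal Python sketch of how a middlebox might pull SRTP keying material out of an SDP body. It assumes the keys are carried as SDP Security Descriptions (`a=crypto` lines, RFC 4568), which is one way of putting keys in signaling and is an assumption here, not something the draft or this message specifies. The names `CryptoAttr`, `parse_crypto_line` and `keys_from_sdp` are hypothetical, and only the two AES-CM suites are handled.

```python
import base64
from typing import List, NamedTuple, Optional


class CryptoAttr(NamedTuple):
    tag: int
    suite: str
    master_key: bytes
    master_salt: bytes


# (key, salt) lengths in bytes for the AES-CM suites from RFC 4568.
KEY_SALT_LEN = {
    "AES_CM_128_HMAC_SHA1_80": (16, 14),
    "AES_CM_128_HMAC_SHA1_32": (16, 14),
}


def parse_crypto_line(line: str) -> Optional[CryptoAttr]:
    """Parse one 'a=crypto:' SDP line; return None for anything else."""
    prefix = "a=crypto:"
    if not line.startswith(prefix):
        return None
    fields = line[len(prefix):].split()
    if len(fields) < 3 or not fields[0].isdigit():
        return None
    tag, suite, key_params = fields[0], fields[1], fields[2]
    if suite not in KEY_SALT_LEN or not key_params.startswith("inline:"):
        return None
    # Use the first key only, and drop the optional "|lifetime" and
    # "|MKI:length" parts that may follow the base64 key||salt blob.
    b64 = key_params[len("inline:"):].split(";")[0].split("|")[0]
    raw = base64.b64decode(b64)
    key_len, salt_len = KEY_SALT_LEN[suite]
    if len(raw) != key_len + salt_len:
        return None
    return CryptoAttr(int(tag), suite, raw[:key_len], raw[key_len:])


def keys_from_sdp(sdp: str) -> List[CryptoAttr]:
    """Collect every parsable a=crypto attribute in an SDP body."""
    parsed = (parse_crypto_line(l.strip()) for l in sdp.splitlines())
    return [attr for attr in parsed if attr is not None]
```

The point of the sketch is only that a B2BUA already on the signaling path needs nothing extra from the endpoints in this model: it reads the offer and answer as they pass through and has the keys from the start of the call.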

Also, as an aside - using SIP to and from the recording server makes no sense. You can pretend it's a 3-party conference call, but it's not true. It will really get confusing when calls get transferred with REFER, for example. I know some vendors want to do this, but it's a really bad idea. They're gonna be in a world of hurt spending all their time troubleshooting and enhancing their SIP stacks to handle this odd model for every corner case, instead of spending it on their business-specific application logic. Just because SIP is a hammer doesn't mean this application is a nail.