Re: [secdir] [MMUSIC] Review of draft-saito-mmusic-sdp-ike-06

Makoto Saito <> Mon, 21 December 2009 15:38 UTC

Date: Tue, 22 Dec 2009 00:36:49 +0900
From: Makoto Saito <>
To: Eric Rescorla <>


Thank you very much for taking the time to provide such
detailed comments.

My comments are inline.

> This document describes a mechanism whereby a SIP/SDP exchange can
> be used to kick off an IPsec association. The idea seems to be
> that I have the AOR for some machine behind a NAT or a firewall
> and I want to set up an IPsec tunnel. So, I use SIP address
> resolution and then SIP to signal to it and then set up an
> IPsec SA as if it were a media connection.
> 1. Use Cases
> When I reviewed this document back in 2007, I was sort of
> lukewarm on it. The authors list some use cases, but I don't
> find them that convincing:
>    o  Sharing media using a framework developed by Digital Living
>       Network Alliance (DLNA) or similar protocols over VPN between two
>       user devices.
>    o  Remote desktop applications over VPN initiated by SIP call.  As an
>       additional function of click-to-call, a customer service agent can
>       access a customer's PC remotely to troubleshoot the problem while
>       talking with the customer over the phone.
>    o  Accessing and controlling medical equipment (medical robotics)
>       remotely to monitor the elderly in a rural area (remote care
>       services).
>    o  Local area network (LAN)-based gaming protocol based on peer-to-
>       peer rather than via a gaming server.
> My skepticism is that setting up a VPN for applications like this
> seems like overkill. VPNs have a bunch of ancillary security
> implications that aren't really necessary for these applications.
> It's important to remember that IPsec provides not only a network
> connectivity function but also a firewalling function (RFC 4301 S 2.1),
> and I worry that we're confusing these two to some extent. Consider
> the last case, the gaming system. In this case, we don't want to
> open a generic VPN connection, we want to open a connection directly
> to the gaming app. Why is IPsec a good mechanism here? The
> other examples seem to raise the same issue.

Actually, joining the same local network makes DLNA perform very

We are particularly focusing on LAN-based applications, so setting
up a VPN for them is not necessarily overkill. For example, DLNA is
used to share private content inside the LAN, but it does not have
sufficient security mechanisms for use over the Internet. So we think
a VPN is a simple solution for that purpose. In any case, we already
have an implementation for this application and have started to
deploy it.

> 2. Coordination Of Multiple Elements
> This brings me to another issue, the tight coordination required
> between multiple elements on the home network. Again, in the
> gaming setting, we have:
>   - The IPsec Security Policy Database (SPD)
>   - The user's SIP stack (e.g., softphone)
>   - The gaming app which is consuming the traffic
> As I understand the current proposal, what has to happen here is:
> - A call comes into the SIP stack
> - The SIP stack somehow notifies the gaming app (or maybe it
>   has preexisting policy)
> - The gaming app agrees to accept the connection
> - The gaming app then generates the appropriate SPD entry(s)
> - The gaming app notifies the SIP stack that it's OK
> - The SIP call is accepted
> - ICE is run to establish connectivity
> - The IKE stack runs to set up the IKE channel.
> This seems like a heck of a lot of interlocking pieces to set up what's
> basically an app-to-app connection. Of course, you could also put
> the SIP stack into the gaming app, but that's ridiculously heavyweight
> for this purpose.
> I should also mention that in terms of implementation complexity,
> ICE seems like a real problem. The issue here is file descriptor
> and channel management. The obvious way to implement an ICE stack
> (and the way that mine works) is that the stack opens socket(s)
> locally and then presents an abstraction to the application which
> it can then use to read and write on. However, in this case, we
> have three separate pieces of code (and probably execution contexts)
> which all need to send/receive data on the same socket:
> - The ICE stack
> - The IKE stack
> - The IPsec stack
> And the demultiplexing between these is data dependent. Doesn't this
> mean that we'll need a central dispatcher process whose job it is
> to hand off the packets to each other module? I'm having trouble
> visualizing this being something people are willing to implement.

The demultiplexing process is simply a combination of existing
demultiplexing processes of ICE and IKE-NAT-T. That is,
    if bits 0..31 == 0, dispatch to IKE module
    else if bits 32..63 == magic-cookie, and parsing packet yields
       STUN fingerprint, dispatch to ICE module
    else dispatch to IPsec module
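
As a rough illustration only (not taken from the draft or from our
implementation), the three rules above could be sketched as follows.
The function names are hypothetical. The sketch assumes the RFC 3948
non-ESP marker (four zero octets) for IKE and the RFC 5389 magic
cookie and FINGERPRINT attribute for STUN; it does not handle NAT
keepalive packets or malformed input beyond basic length checks.

```python
import struct
import zlib

STUN_MAGIC_COOKIE = 0x2112A442  # RFC 5389 magic cookie (bits 32..63)
FINGERPRINT_TYPE = 0x8028       # RFC 5389 FINGERPRINT attribute type
FINGERPRINT_XOR = 0x5354554E    # value XORed into the CRC-32


def has_stun_fingerprint(packet: bytes) -> bool:
    """True if the packet ends with a valid STUN FINGERPRINT attribute."""
    if len(packet) < 28:  # 20-octet header plus 8-octet attribute
        return False
    (msg_len,) = struct.unpack("!H", packet[2:4])
    if msg_len + 20 != len(packet):
        return False
    attr_type, attr_len, value = struct.unpack("!HHI", packet[-8:])
    if attr_type != FINGERPRINT_TYPE or attr_len != 4:
        return False
    # CRC-32 covers everything before the FINGERPRINT attribute.
    expected = (zlib.crc32(packet[:-8]) & 0xFFFFFFFF) ^ FINGERPRINT_XOR
    return value == expected


def classify(packet: bytes) -> str:
    """Return 'ike', 'ice' or 'ipsec' for a datagram on the shared port."""
    # Rule 1: bits 0..31 all zero is the non-ESP marker, so this is IKE.
    if len(packet) >= 4 and packet[:4] == b"\x00\x00\x00\x00":
        return "ike"
    # Rule 2: magic cookie in bits 32..63 and a valid fingerprint is STUN.
    if (
        len(packet) >= 8
        and packet[0] & 0xC0 == 0
        and struct.unpack("!I", packet[4:8])[0] == STUN_MAGIC_COOKIE
        and has_stun_fingerprint(packet)
    ):
        return "ice"
    # Rule 3: everything else is treated as UDP-encapsulated ESP.
    return "ipsec"
```

Checking the fingerprint as well as the cookie matters because an ESP
sequence number can happen to equal the magic cookie value.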

In any case, it is true that the combination of ICE and IKE/IPsec is
complicated, because ICE is complicated by its nature. We don't think
ICE should be a MUST in this specification. In fact, in the environment
where we are deploying this, we don't actually need ICE, but we
foresee a need for ICE, despite its complexity, in other
environments where this specification may be deployed.

> 3. Security Model
> As I understand it, the way that this system is intended
> to work is that the home system has an ACL indexed by remote
> AOR. If a SIP call comes through allegedly from a permitted AOR
> (via RFC 4474) it allows the VPN connection to be established.
> That seems to place a very large amount of trust in the SIP
> proxy. In essence you're giving the SIP proxy the keys to your
> firewall. I can't really see any circumstances in which I would
> be willing to do that.
> By contrast, classic IPsec/SSH/SSL VPNs rely on credentials
> that are immediately on the remote side. That seems far
> more secure. 

The SIP proxy is not necessarily given strong authority to
establish a VPN into a home network.
This draft does not rule out other authorization or
authentication that a user or an implementation might
want to perform before bringing up the VPN.  For example,
password authentication can be used in addition to what
is described in the draft.  This is described in Section 8.

> I am also concerned about the fairly loose coupling between the
> authentication at the IKE layer and the firewall hole punching.
> As I understand it, the SIP/ICE system doesn't do any authentication
> at all: it just punches a hole and then propagates the packets to
> the IKE/IPsec system without looking at them at all. I don't
> see any immediate way to exploit this, but it's not clear to me
> that it's safe either.

I'm afraid that there is a misunderstanding here, and there is
likely text in the draft that is misleading.
In our use cases, the home router is a SIP UA, and it doesn't
open a hole to the home network until the IPsec tunnel is established.
The holes which SIP/ICE tries to open are on external routers, such
as a large-scale NAT in the ISP network. In either case, the home network
never sees a packet until both SIP and IKE negotiations have completed.
> 4. Multiplexing
> Why are you using the same channel for IKE and IPsec tunnelling?
> IKE supports multiple media channels. This seems like an architectural
> issue, which is why I have it in general comments.

RFC 3947 and RFC 3948 specify the method of IPsec NAT traversal,
which uses the same channel for IKE and IPsec. We don't try to do
anything special here.

> 5. Grammar/Writing
> This document has a lot of writing/grammatical errors. It really
> needs a copy-edit pass.
> S 3.
> I'm not sure I understand what information the remote host/app
> has. Is it going to be making calls to my ordinary SIP AOR or
> to some specialized AOR connected to the app...
>    Forking to multiple registered instances is outside the scope in this
>    use case, so there is only one registered instance for each side.
> How do you guarantee this? See above about which AOR...

Forking is not necessary in our use cases, so we placed it
outside the scope. Therefore, at least the UAs which use this
mechanism don't try to fork. If they encounter forked
answers, they should treat them as illegal.
I'm going to specify this in the next revision.
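
One possible way for a UA to notice forked answers, given only as a
sketch: in SIP, answers from different forked branches carry different
To tags for the same Call-ID, so a UA can remember the first To tag it
sees and reject any answer with a different one. The class name and
method below are hypothetical.

```python
class ForkGuard:
    """Remembers the first To tag per Call-ID and flags any other tag."""

    def __init__(self) -> None:
        self._first_tag: dict[str, str] = {}

    def accept_answer(self, call_id: str, to_tag: str) -> bool:
        # The first answer for a call fixes the expected To tag.
        first = self._first_tag.setdefault(call_id, to_tag)
        # A different To tag on the same call indicates a forked answer.
        return first == to_tag
```
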

> S 4.
> This set of definitions seems clumsy. How do I know if I should be
> establishing ike-esp or ike-esp-udpencap? Should I be establishing
> two media channels in parallel?

The definitions of ike-esp and ike-esp-udpencap may have been
awkward. In fact, whether IPsec NAT traversal is necessary or not
is decided during the IKE session. So the definition of ike-esp-udpencap
should have been "IKE supporting NAT traversal" (IPsec NAT traversal is
an optional part of IKE). Even if the peers exchange ike-esp-udpencap when
there is no NAT between them, they will start normal IKE and it will end up
with a normal IPsec tunnel. I will specify it clearly in the next version.
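
The following sketch shows how that decision can fall out of the IKE
exchange itself. It assumes IKEv2-style NAT detection (a SHA-1 digest
over the SPIs, IP address and port, as in RFC 4306); IKEv1 with
RFC 3947 uses a similar hash with the negotiated algorithm. The
function names and the returned labels are hypothetical.

```python
import hashlib
import ipaddress


def nat_detection_hash(spi_i: bytes, spi_r: bytes, ip: str, port: int) -> bytes:
    """SHA-1 over SPIs, packed IP address and port (IKEv2-style, assumed)."""
    packed_ip = ipaddress.ip_address(ip).packed
    return hashlib.sha1(spi_i + spi_r + packed_ip + port.to_bytes(2, "big")).digest()


def choose_encapsulation(peer_src_hash, peer_dst_hash, spi_i, spi_r,
                         observed_src, observed_dst) -> str:
    """Pick plain ESP unless the hashes show an address was rewritten."""
    # Compare what the peer claimed with what we actually saw on the wire.
    src_ok = peer_src_hash == nat_detection_hash(spi_i, spi_r, *observed_src)
    dst_ok = peer_dst_hash == nat_detection_hash(spi_i, spi_r, *observed_dst)
    # No mismatch means no NAT on the path, so a normal IPsec tunnel is used
    # even if both sides offered ike-esp-udpencap.
    return "esp" if src_ok and dst_ok else "esp-in-udp"
```
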

> S 8.2.
> This PSK mechanism seems to introduce a weakness not present in 
> the original IKE-PSK spec: in RFC 4306, you only do the PSK exchange
> over an encrypted channel established via a DH exchange. That means
> that an attacker must actively intercept the channel in order to
> mount a dictionary attack on a PSK which is actually a password.
> Sending a PSK hash enables an attack by any attacker who can 
> see the data on the SIP channel. 
> Why are you allowing MD2 and MD5?

MD2 and MD5 are certainly obsolete algorithms and we will delete them.

> Abstract:
>    This document specifies how to establish secure media sessions over a
>    virtual private network using Session Initiation Protocol for the
>    purpose of on-demand media/application sharing between peers.  It
> You're not establishing a secure media session over a vpn, right?
> You're establishing a media session to use a vpn over.

Right. I'm going to fix it.

> S 2.2.
>       SIP has a cross-NAT rendezvous mechanism, such as ICE
>       [I-D.ietf-mmusic-ice].  This effective function can be used for
> "such as ICE"? SIP *is* a cross-NAT rendezvous mechanism. ICE
> is a mechanism for opening ports through the NAT. Also, since
> this is the only IETF mechanism "such as" seems weird.
>    specifies the method to exchange the fingerprint of a self-signed
> specifies a method
> -Ekr

I'm going to modify them in the next revision. Thank you very much
for your helpful comments.

Best regards,