Re: [IPsec] FYI: A Novel Denial-of-Service Attack Against IKEv2 - HAL-Inria

Tero Kivinen <kivinen@iki.fi> Thu, 26 September 2019 09:27 UTC

Date: Thu, 26 Sep 2019 12:27:08 +0300
From: Tero Kivinen <kivinen@iki.fi>
To: Tristan Ninet <tristan.ninet@inria.fr>
Cc: Paul Wouters <paul@nohats.ca>, "ipsec@ietf.org WG" <ipsec@ietf.org>, Olivier Zendra <olivier.zendra@inria.fr>, romaric maillard <romaric.maillard@thalesgroup.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/ipsec/Pz-wlk-EywJbnjlYRn9E6E1FeV0>

Tristan Ninet writes:
> Dear Mr. Wouters,
> 
> Thank you for your interest in our work.
> 
> > I've read through the paper, and I believe is very much misrepresents what it
> > deems is a DoS attack against the IKEv2 protocol.
> >
> > The DoS attack described seems to think it can change the IP address and cause
> > Initiator to be authenticated by a different peer than intended (ignoring all of
> > IDi / IDr payloads it is relaying). Then the different peer is happy, but the
> > last IKE_AUTH reply to the initiator would signify a failure. Then when the
> > initiator sends an Informational message with a Delete payload and
> > AUTHENTICATION_FAILED notify, the attacker drops the message. Now the different
> > peer has "lost resources" since its IKE SA (and possibly IPsec SA) is up. A
> > proper implementation would send a Liveness probe if its IPsec SA counters
> > remain zero. It would also put an idle limit on an childless SA that resulted
> > from a TS_UNAVAILABLE (as opposed to a by design childless IKE SA)
> 
> Let us denote Initiator by A, Responder by B, Victim by C, IKE_SA_INIT request
> by m1, IKE_SA_INIT response by m2, IKE_AUTH request by m3, and IKE_AUTH response
> by m4.
> 
> I understand you are saying that authentication of A to C would fail because of
> IDi and IDr payloads.

Normally authentication from A to C will fail because the
authentication information is different. I.e., if they are using
pre-shared keys, the pre-shared key for A to B is different from the
one for A to C (unless B and C are part of the same infrastructure,
i.e., a load-sharing or high-availability pair). If A is using
certificates, then it usually has two different CA infrastructures,
one for B and one for C, so C will not accept a certificate meant for
B and authentication will fail.

You can get this to work if B and C share configuration, i.e., they
are two different gateways in the same administrative domain.

> However, we put as a requirement to the attack that C trusts A, i.e. C has some
> configuration entry with the ID of A. In this case, authentication will succeed.

You also add the restriction that A uses exactly the same
authentication information for both B and C. This is not normal in
cases where B and C are independent.

> You then say that a proper implementation would send a Liveness probe if its
> IPsec SA sequence numbers remain zero.
> 
> The RFC does say that Liveness checks are needed. In this regard, strongswan and
> libreswan do not follow the RFC since in both implementations, Dead Peer
> Detection (DPD) is disabled by default.

RFC says:

   If no cryptographically protected messages have been received on an
   IKE SA or any of its Child SAs recently, the system needs to perform
   a liveness check in order to prevent sending messages to a dead
   peer.

DPD will usually have a timeout of 10-30 seconds, i.e., if nothing
happens on the IKEv2 SA for a few tens of seconds after the final
IKE_AUTH, C will send a DPD message to A, will not get a reply to it,
and will delete the IKE SA after a few minutes.

> However, DPD does not deter the attack. A classic flooding DoS attack can only
> set up half-open SAs. An IKEv2 implementation should remove half-open SAs after
> some short time. In strongswan by default half-open SAs are removed after 30s.
> Therefore a high-rate of m1 messages is needed to achieve memory exhaustion
> using classic flooding against IKEv2.
> 
> However, the Deviation attack sets up full IKE SAs (if not Child SAs as well) in
> C. We measured that a full connection (one IKE SA + one Child SA) is 23kB in
> strongswan, whereas a half-open SA is 1kB. This divides by 23 the minimum
> throughput of m1 messages to deviate in order to exhaust the memory of C.

Note that sending classic flooding attack packets requires no effort
from the attacker. Setting up a full IKEv2 SA does require
Diffie-Hellman computations, certificate verifications, etc., so the
rate at which you can set them up is much more limited than in the
classic case. The maximum number of IKEv2 SAs that can be set up each
second is usually on the order of hundreds or thousands, so even if
each one uses 23kB for two minutes (until DPD kicks it out), that is
only a few gigabytes of memory. Older phones would run out of memory
at that point, but newer ones have more memory than that...
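
As a rough sketch of that arithmetic: memory held is roughly setup
rate times SA lifetime times size per SA. The function name and the
decimal kB are assumptions for illustration; the rates, the 23kB and
the two minutes are the figures from the paragraph above, not
measurements.

```python
def peak_memory_bytes(sa_per_second, sa_lifetime_s, bytes_per_sa):
    """Steady-state memory held by full SAs: rate * lifetime * size."""
    return sa_per_second * sa_lifetime_s * bytes_per_sa


KB = 1000  # assumption: decimal kilobytes
for rate in (100, 1000):
    # 23kB per full connection, kept for 120 s until DPD removes it
    gb = peak_memory_bytes(rate, 120, 23 * KB) / 1e9
    print(f"{rate} SAs/s -> {gb:.2f} GB")
```

At 1000 SAs per second this comes to a little under 3 GB.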

Also, as you are relying on A to initiate the connection, you can
only set up one connection every few minutes, because A will not
immediately retry the connection to B/C when it receives an IKE_AUTH
message that fails authentication. The RFC does not provide
instructions for this case, but an authentication failure in IKE_AUTH
means there is something wrong that most likely will not be fixed
automatically, so clients usually wait for some time before retrying.
I.e., if you are waiting for the administrator of either end to go
and fix the authentication information in the configuration, there is
no point in retrying every second. You can retry every 30 seconds
without any issues, and still detect almost immediately when the
other end fixes its configuration.

> In strongswan, when DPD is enabled, by default, connections with a dead peer
> (such as in the Deviation Attack) are removed after dpdtimeout + total
> retransmission timeout = 30s + 165s = 195s. It does not make sense to go much
> lower, and some implementations might want to set this timeout higher so that
> bandwidth is not overwhelmed. This longer stay of undesirable connections in C's
> memory divides by 6 the minimum throughput of m1 messages to deviate in order to
> exhaust the memory of C.

So if A tries every 30 seconds and C keeps the SA in memory for 200
seconds, that still means only 6-7 of them are active at any one
time. Not really a proper attack.

> In total the Deviation Attack thus divides the required throughput by 140. In
> consequence the DA is much harder to detect using intrusion detection systems
> than classic DoS attacks, in particular when DPD timeout is high.

You seem to be assuming that A will retry immediately when it gets
an authentication failure? Even if it did, the setup time needed for
one IKEv2 SA is usually on the order of a second or so, so getting
more than one connection per second from the same host A is difficult
(2 * RTT + Diffie-Hellman calculations + certificate calculations).
Even with a new connection attempt every second you only have a few
hundred extra IKEv2 SAs.
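
Both estimates (one attempt every 30 seconds, one attempt every
second) follow from dividing the time C keeps a dead SA by the
interval between A's attempts. A minimal sketch, where the function
name is made up for illustration and 200 seconds is the rounded
lifetime used earlier:

```python
def concurrent_sas(sa_lifetime_s, retry_interval_s):
    """Average number of stale SAs alive at once: lifetime / interval."""
    return sa_lifetime_s / retry_interval_s


# A backs off 30 s after an authentication failure: between 6 and 7 SAs
print(concurrent_sas(200, 30))
# A retries about once per second (setup-time bound): a couple of hundred
print(concurrent_sas(200, 1))
```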

> In strongswan, when DPD is disabled, connections with a dead peer are removed at
> the time of rekeying, i.e. by default 3h. A DoS with a throughput 8000 times
> lower than classic DoS techniques is then possible.

Disabling DPD is a bad idea, and if you do that, then you are asking
for it...

> On the other hand, the requirements for the attack are quite strong. Firstly,
> the attacker needs to have some control over the connection between A and B.
> Secondly, all Initiator parties authenticate themselves using signature mode and
> are trusted by Victim. Thirdly, the attacker needs to find enough m1 messages to
> deviate. In addition, each m1 message must come from a different IKEv2 peer.
> Otherwise connections will simply replace current connections with the same
> peer. In fact I did not see this behavior in the RFC, but strongswan behaves
> this way by default.

Normally each new connection carries an INITIAL_CONTACT notification,
which tells the responder that this connection replaces all previous
connections with the same authenticated peer. I assumed you had
disabled INITIAL_CONTACT notifications on A to get this attack working
at all, as I assume you forced A to make multiple connections right
after each other.
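
A toy model of that replacement behavior, to show why repeated
connections from the same peer do not pile up. The class and method
names are hypothetical and do not correspond to any real
implementation; the model only tracks SA ids per authenticated
identity.

```python
from collections import defaultdict


class ToyResponder:
    """Hypothetical responder keeping IKE SA ids per authenticated peer."""

    def __init__(self):
        self.sas = defaultdict(list)  # authenticated identity -> SA ids
        self._next_id = 0

    def ike_auth_ok(self, peer_id, initial_contact):
        # INITIAL_CONTACT: drop all earlier SAs with this authenticated peer
        if initial_contact:
            self.sas[peer_id].clear()
        self._next_id += 1
        self.sas[peer_id].append(self._next_id)
        return self._next_id

    def total_sas(self):
        return sum(len(ids) for ids in self.sas.values())


with_ic, without_ic = ToyResponder(), ToyResponder()
for _ in range(3):
    with_ic.ike_auth_ok("A", initial_contact=True)
    without_ic.ike_auth_ok("A", initial_contact=False)
print(with_ic.total_sas(), without_ic.total_sas())
```

With the notification the three connections from A leave one SA on
the responder; without it all three remain.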

On the other hand, here you do assume that we simply take normal m1
messages going to B and redirect them to C as well, which means that
the load on C can at most be the same as the load on B. Thus you do
not really perform a Denial of Service attack, you just cause some
extra load on C. I mean, if B and C are required to be part of the
same authentication infrastructure, you can assume that they have
about the same amount of resources, so C has no problem handling the
same load B is handling.

> The contribution of the paper is to point out that when the above requirements
> are satisfied, then an attacker may perform a DoS attack with a significantly
> lower throughput than expected from classic flooding techniques.

I would say the throughput is much lower than in the normal
operational case of the Monday morning problem, i.e., gateways are
assumed to be able to handle 100-1000 times the normal load every now
and then and still process operations without major delays. This
throughput seems to be lower than that.

> You also say that a proper implementation would put an idle limit on an
> childless SA that resulted from a TS_UNAVAILABLE (as opposed to a by design
> childless IKE SA). I think you mean TS_UNACCEPTABLE. I cannot find this behavior
> anywhere in the RFC. But even if it is true, the point I made above stays valid.

Most implementations use something like a 30 second timer for that.
If the initiator has other Child SAs to set up, it will do so
immediately after IKE_AUTH finishes, so you only need to wait a few
tens of seconds for them. If it does not have any other Child SAs to
create, there is no point in keeping an extra IKE SA up.

The RFC does not provide every possible implementation detail; quite
a lot of corner cases are left for implementors to solve. RFC 8019
does have some discussion of these issues, even though it mostly
concentrates on the half-open SA problem, as that is the hardest to
protect against.

> > It makes more wrong assumptions like "(TS negotiation will fail in most cases)"
> > which I guess they think would fail because the different peer's have different
> > IPsec SA configurations, but really if they are that different, they will also
> > have different IDi/IDr payloads because a peer's configuration with many other
> > peers for specific subnets would be configured with local/remote IDs as to not
> > tie these to hardcoded IP addresses. Without explicit ID, the ID used is
> > normally the ID_IPvx, and if that is used, using an IP address X with ID_IPv4 Y
> > will also cause an IKE failure of the victim peer because for IP address X it
> > would then expect ID_IPv4 X.
> 
> You say that our assumption that TS negotiation will fail in most cases is
> wrong. Even if this assumption is wrong it only makes the attack stronger, as an
> even heavier connection is installed in C.
> 
> You say that authentication using ID_IPvx will fail because A would be using
> different IP addresses in its IDi and in the source address of the m3 packet.
> However, in the attack the deviation only changes the destination address, not
> the source address. Thus no such failure would occur.

ID_IPvx and the outer IP address do not have anything in common. It
is completely possible and valid for a node to send an ID_IPv4
identity of 10.0.0.1 from a completely different IP address. From
RFC 7296 section 3.5:

   When using the ID_IPV4_ADDR/ID_IPV6_ADDR identity types in IDi/IDr
   payloads, IKEv2 does not require this address to match the address
   in the IP header of IKEv2 packets, or anything in the TSi/TSr
   payloads.

Also, the most common road warrior setup would use any<->any traffic
selectors and leave the narrowing to the responder, so C will
happily narrow that range down to something that is useful for it.

> However, as I said above, the attack does assume that each m1 message comes
> from a different IKEv2 peer. It is obvious that if the attack works with Ndemo
> Initiators and the "uniqueids=never" option set in Victim, then it works in the
> same setup but with N Initiators and without the "uniqueids=never" option in
> Victim.

If you assume that each m1 comes from a different legitimate IKEv2
peer, you only provide a work factor of 2, which is very, very low...
Especially when you compare that to the Monday morning problem, where
the work factor might suddenly rise to 100 or 1000 times the normal
load.

If you can cause C to show any kind of visible change in behavior
with a work factor of 2, then C is completely misconfigured and
under-resourced and will fail in normal use every now and then
anyway.


> > It goes on to say the attack is not possible with PSKs, which I don't
> > understand.. They also then mumbled about asymmetric authentication, which I
> > don't understand, but regardless is basically only employed with EAPTLS and
> > Remote Access VPNs, so it does not apply to this attack.
> 
> When we said that the attack is not possible with PSKs, we assumed that the PSK
> between A and B is different than the PSK between A and C. In this case,
> authentication of A to C will fail and the attack does not work because no SA in
> installed in C.

Note that it also does not work for certificates, as the CA trust
anchors configured for B and C are different. C will not accept A's
certificate because it is not signed by a proper CA for C; it is
signed by the trust anchor used for B. The only time B and C use the
same certificate authorities is when they are part of the same
authentication domain.

In IPsec we do not have a common trusted CA list like the one we have
on the web. Each administrative domain configures the CAs it trusts
separately, and quite often those are separate CAs used only for
IPsec for that specific gateway.
-- 
kivinen@iki.fi