Re: [IPsec] FYI: A Novel Denial-of-Service Attack Against IKEv2 - HAL-Inria

Tristan Ninet <tristan.ninet@inria.fr> Fri, 11 October 2019 13:59 UTC

Date: Fri, 11 Oct 2019 15:59:32 +0200
From: Tristan Ninet <tristan.ninet@inria.fr>
To: Tero Kivinen <kivinen@iki.fi>, Valery Smyslov <smyslov.ietf@gmail.com>
Cc: Paul Wouters <paul@nohats.ca>, "ipsec@ietf.org WG" <ipsec@ietf.org>, Olivier Zendra <olivier.zendra@inria.fr>, romaric maillard <romaric.maillard@thalesgroup.com>
Message-ID: <1531560408.4631274.1570802372648.JavaMail.zimbra@inria.fr>
In-Reply-To: <23948.33900.216096.521743@fireball.acr.fi>
References: <94576796.2706392.1569086539715.JavaMail.zimbra@inria.fr> <23948.33900.216096.521743@fireball.acr.fi>
Archived-At: <https://mailarchive.ietf.org/arch/msg/ipsec/OjMkUOHqFXl3kb9mc4n_2WAyrio>
Subject: Re: [IPsec] FYI: A Novel Denial-of-Service Attack Against IKEv2 - HAL-Inria

Hi,

I do not have the time to answer right now, but I will do so later.

Best regards,
Tristan Ninet

----- Original message -----
> From: "Tero Kivinen" <kivinen@iki.fi>
> To: "Tristan Ninet" <tristan.ninet@inria.fr>
> Cc: "Paul Wouters" <paul@nohats.ca>, "ipsec@ietf.org WG" <ipsec@ietf.org>, "Olivier Zendra" <olivier.zendra@inria.fr>,
> "romaric maillard" <romaric.maillard@thalesgroup.com>
> Sent: Thursday, 26 September 2019 11:27:08
> Subject: Re: [IPsec] FYI: A Novel Denial-of-Service Attack Against IKEv2 - HAL-Inria

> Tristan Ninet writes:
>> Dear Mr. Wouters,
>> 
>> Thank you for your interest in our work.
>> 
>> > I've read through the paper, and I believe it very much misrepresents what it
>> > deems is a DoS attack against the IKEv2 protocol.
>> >
>> > The DoS attack described seems to think it can change the IP address and cause
>> > Initiator to be authenticated by a different peer than intended (ignoring all of
>> > IDi / IDr payloads it is relaying). Then the different peer is happy, but the
>> > last IKE_AUTH reply to the initiator would signify a failure. Then when the
>> > initiator sends an Informational message with a Delete payload and
>> > AUTHENTICATION_FAILED notify, the attacker drops the message. Now the different
>> > peer has "lost resources" since its IKE SA (and possibly IPsec SA) is up. A
>> > proper implementation would send a Liveness probe if its IPsec SA counters
>> > remain zero. It would also put an idle limit on a childless SA that resulted
>> > from a TS_UNAVAILABLE (as opposed to a by design childless IKE SA).
>> 
>> Let us denote Initiator by A, Responder by B, Victim by C, IKE_SA_INIT request
>> by m1, IKE_SA_INIT response by m2, IKE_AUTH request by m3, and IKE_AUTH response
>> by m4.
>> 
>> I understand you are saying that authentication of A to C would fail because of
>> IDi and IDr payloads.
> 
> Normally authentication from A to C will fail because of different
> authentication information. I.e., if they are using pre-shared keys,
> the pre-shared key for A to B is different from the one for A to C
> (unless B and C are part of the same infrastructure, i.e., load-sharing
> or high-availability pairs). If A is using certificates, then it usually
> has two different CA infrastructures, one for B and one for C, so C will
> not accept a certificate meant for B and will fail authentication.
> 
> You can get this to work if B and C share configuration, i.e., they
> are two different gateways in the same administrative domain.
> 
>> However, we put as a requirement to the attack that C trusts A, i.e. C has some
>> configuration entry with the ID of A. In this case, authentication will succeed.
> 
> You also add the restriction that A uses exactly the same authentication
> information for both B and C. This is not normal for cases where B and
> C are independent.
> 
>> You then say that a proper implementation would send a Liveness probe if its
>> IPsec SA sequence numbers remain zero.
>> 
>> The RFC does say that Liveness checks are needed. In this regard, strongswan and
>> libreswan do not follow the RFC since in both implementations, Dead Peer
>> Detection (DPD) is disabled by default.
> 
> RFC says:
> 
>   If no cryptographically protected messages have been received on an
>   IKE SA or any of its Child SAs recently, the system needs to perform
>   a liveness check in order to prevent sending messages to a dead peer.
> 
> Usually DPD will have a timeout of 10-30 seconds, i.e., if nothing
> happens on the IKEv2 SA within a few tens of seconds after the final
> IKE_AUTH, C will send a DPD message to A, will not get a reply to it,
> and will delete the IKE SA after a few minutes.
> 
>> However, DPD does not deter the attack. A classic flooding DoS attack can only
>> set up half-open SAs. An IKEv2 implementation should remove half-open SAs after
>> some short time. In strongswan by default half-open SAs are removed after 30s.
>> Therefore a high-rate of m1 messages is needed to achieve memory exhaustion
>> using classic flooding against IKEv2.
>> 
>> However, the Deviation attack sets up full IKE SAs (if not Child SAs as well) in
>> C. We measured that a full connection (one IKE SA + one Child SA) is 23kB in
>> strongswan, whereas a half-open SA is 1kB. This divides by 23 the minimum
>> throughput of m1 messages to deviate in order to exhaust the memory of C.
> 
> Note that sending classic flooding attack packets requires no effort
> from the attacker. Setting up a full IKEv2 SA does require
> Diffie-Hellman computations, certificate verifications, etc., thus the
> rate at which you can set them up is much more limited than in the
> classic case. The maximum number of IKEv2 SAs that can be set up each
> second is usually on the order of hundreds or thousands, thus even if
> each uses 23kB for two minutes (until DPD kicks them out), that is only
> a few gigabytes of memory. Older phones would run out of memory at that
> point, but newer ones have more memory than that...
> 
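As a rough check of the memory estimate above, here is a minimal sketch. It assumes SAs arrive at a constant rate and each one stays in memory until DPD removes it. The function name and the rate of 1000 SAs per second are illustrative assumptions; the 23 kB size and the two-minute lifetime are the figures quoted in this thread.

```python
def peak_memory_bytes(setup_rate_per_s, lifetime_s, sa_size_bytes):
    """Steady-state memory held by SAs that each live for lifetime_s seconds."""
    # SAs alive at once = arrival rate * time each one stays in memory.
    concurrent_sas = setup_rate_per_s * lifetime_s
    return concurrent_sas * sa_size_bytes


# Assumed rate: 1000 full IKEv2 SAs per second (upper end of
# "hundreds or thousands"); 23 kB per SA, kept for two minutes.
mem = peak_memory_bytes(setup_rate_per_s=1000, lifetime_s=120, sa_size_bytes=23_000)
print(f"{mem / 1e9:.2f} GB")  # prints "2.76 GB"
```

At 100 SAs per second the same calculation gives about 0.28 GB, so the "few gigabytes" figure corresponds to the upper end of the rate range.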
> Also, as you are relying on A to initiate the connection, you can
> only set up one connection every few minutes, as A will not immediately
> retry the connection to B/C when it receives an IKE_AUTH message failing
> authentication. The RFC does not provide instructions for this case,
> but since authentication failure in IKE_AUTH means there is something
> wrong that most likely will not be fixed automatically, clients usually
> wait for some time before retrying. I.e., if you are waiting for the
> administrator of either end to go and fix the authentication
> information in the configuration, there is no point in retrying every
> second. You can try every 30 seconds without any issues, and still
> almost immediately detect when the other end fixes its configuration.
> 
>> In strongswan, when DPD is enabled, by default, connections with a dead peer
>> (such as in the Deviation Attack) are removed after dpdtimeout + total
>> retransmission timeout = 30s + 165s = 195s. It does not make sense to go much
>> lower, and some implementations might want to set this timeout higher so that
>> bandwidth is not overwhelmed. This longer stay of undesirable connections in C's
>> memory divides by 6 the minimum throughput of m1 messages to deviate in order to
>> exhaust the memory of C.
> 
> So if A tries every 30 seconds and C keeps each SA in memory for 200
> seconds, that still means we only have 6-7 of them active at one time.
> Not really a proper attack.
> 
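The 6-7 figure above is the SA lifetime divided by the retry interval. A minimal sketch, assuming a single initiator that retries at a fixed interval and a responder that keeps every resulting SA for a fixed time (the function name is illustrative):

```python
import math


def concurrent_sas(retry_interval_s, sa_lifetime_s):
    """Average number of stale SAs from one initiator alive on C at once."""
    # One new SA appears every retry_interval_s and each lives sa_lifetime_s.
    return sa_lifetime_s / retry_interval_s


n = concurrent_sas(retry_interval_s=30, sa_lifetime_s=200)
print(math.floor(n), "to", math.ceil(n), "SAs")  # prints "6 to 7 SAs"
```

With a retry every second instead of every 30 seconds, the same formula gives 200 SAs.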
>> In total the Deviation Attack thus divides the required throughput by 140. In
>> consequence the DA is much harder to detect using intrusion detection systems
>> than classic DoS attacks, in particular when DPD timeout is high.
> 
> You seem to be assuming that A will retry immediately when it gets an
> authentication failure? Even if it did that, the setup time needed for
> one IKEv2 SA is usually on the order of a second or so, so getting more
> than one connection per second from the same host A is difficult (2 *
> RTT + Diffie-Hellman calculations + certificate calculations). Even
> with a new connection attempt every second you only have a few hundred
> extra IKEv2 SAs.
> 
>> In strongswan, when DPD is disabled, connections with a dead peer are removed at
>> the time of rekeying, i.e. by default 3h. A DoS with a throughput 8000 times
>> lower than classic DoS techniques is then possible.
> 
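The reduction factors quoted in this thread can be reproduced with a short sketch. It assumes the factor is the product of the SA size ratio and the SA lifetime ratio, taking a half-open SA (1 kB, removed after 30 s) as the baseline. The constant and function names are illustrative; the values are the strongswan figures given above.

```python
HALF_OPEN_SA_BYTES = 1_000     # half-open SA size quoted for strongswan
FULL_SA_BYTES = 23_000         # one IKE SA + one Child SA
HALF_OPEN_LIFETIME_S = 30      # default half-open SA timeout
DPD_LIFETIME_S = 30 + 165      # dpdtimeout + total retransmission timeout
REKEY_LIFETIME_S = 3 * 3600    # default rekey time, used when DPD is disabled


def throughput_reduction(sa_lifetime_s):
    """Factor by which the required m1 rate drops versus classic flooding."""
    size_factor = FULL_SA_BYTES / HALF_OPEN_SA_BYTES
    lifetime_factor = sa_lifetime_s / HALF_OPEN_LIFETIME_S
    return size_factor * lifetime_factor


print(throughput_reduction(DPD_LIFETIME_S))    # 149.5
print(throughput_reduction(REKEY_LIFETIME_S))  # 8280.0
```

The thread's figure of 140 comes from rounding the lifetime ratio 195/30 = 6.5 down to 6 (23 * 6 = 138); the unrounded product is about 150. The DPD-disabled case gives 8280, quoted above as 8000.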
> Disabling DPD is a bad idea, and if you do that, then you are asking
> for it...
> 
>> On the other hand, the requirements for the attack are quite strong. Firstly,
>> the attacker needs to have some control over the connection between A and B.
>> Secondly, all Initiator parties authenticate themselves using signature mode and
>> are trusted by Victim. Thirdly, the attacker needs to find enough m1 messages to
>> deviate. In addition, each m1 message must come from a different IKEv2 peer.
>> Otherwise connections will simply replace current connections with the same
>> peer. In fact I did not see this behavior in the RFC, but strongswan behaves
>> this way by default.
> 
> Normally each new connection has an INITIAL_CONTACT notification, which
> tells the responder that this replaces all previous connections with
> the same authenticated peer. I assumed you had assumed that
> INITIAL_CONTACT notifications would have been disabled on A to get
> this attack working at all, as I assume you forced A to make multiple
> connections right after each other.
> 
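To illustrate why INITIAL_CONTACT matters here, a toy model of a responder's SA table follows. It is only a sketch: the class and method names are hypothetical and do not correspond to any real implementation. The single assumption is that INITIAL_CONTACT makes the responder drop older SAs held for the same authenticated identity.

```python
class Responder:
    """Toy responder that tracks IKE SAs by authenticated peer identity."""

    def __init__(self):
        self.sas = {}        # peer identity -> list of SA ids
        self._next_id = 0

    def establish(self, peer_id, initial_contact):
        if initial_contact:
            # INITIAL_CONTACT: the peer states this is its only SA, so any
            # older SAs with the same authenticated identity are deleted.
            self.sas[peer_id] = []
        self._next_id += 1
        self.sas.setdefault(peer_id, []).append(self._next_id)
        return self._next_id

    def sa_count(self):
        return sum(len(ids) for ids in self.sas.values())


c = Responder()
for _ in range(5):
    c.establish("A", initial_contact=True)
print(c.sa_count())  # prints 1
```

Without the notification the same five attempts would leave five SAs on the responder, which is why repeated deviations of one initiator's exchanges do not accumulate state unless each m1 comes from a different peer.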
> On the other hand, here you do assume that we simply take normal m1
> messages going to B and redirect them to C as well, which means that the
> load on C can at most be the same as on B, thus you do not really do a
> Denial of Service attack, you just cause some extra load on C. I mean, if
> B and C are required to be part of the same authentication
> infrastructure, you can assume that they have about the same amount of
> resources, thus C has no problem handling the same load B is having.
> 
>> The contribution of the paper is to point out that when the above requirements
>> are satisfied, then an attacker may perform a DoS attack with a significantly
>> lower throughput than expected from classic flooding techniques.
> 
> I would say the throughput is much lower than in the normal operational
> case of the Monday morning problem, i.e., when gateways are assumed to
> be able to handle 100-1000 times the normal load every now and then and
> still process operations without major delays. This throughput seems
> to be lower than that.
> 
>> You also say that a proper implementation would put an idle limit on a
>> childless SA that resulted from a TS_UNAVAILABLE (as opposed to a by design
>> childless IKE SA). I think you mean TS_UNACCEPTABLE. I cannot find this behavior
>> anywhere in the RFC. But even if it is true, the point I made above stays valid.
> 
> Most implementations use something like a 30 second timer for that. If
> the initiator has some other Child SAs to set up, it will do so
> immediately after the IKE_AUTH finishes, thus you just need to wait
> a few tens of seconds for them. If it does not have any other Child
> SAs to create, there is no point in keeping an extra IKE SA up.
> 
> The RFC does not provide every possible implementation detail; quite a
> lot of corner cases are left for implementors to solve. RFC8019 does
> have some discussion about these issues, even though it mostly
> concentrates on the half-open SA problem, as that is the hardest to
> protect against.
> 
>> > It makes more wrong assumptions like "(TS negotiation will fail in most cases)"
>> > which I guess they think would fail because the different peers have different
>> > IPsec SA configurations, but really if they are that different, they will also
>> > have different IDi/IDr payloads because a peer's configuration with many other
>> > peers for specific subnets would be configured with local/remote IDs as to not
>> > tie these to hardcoded IP addresses. Without explicit ID, the ID used is
>> > normally the ID_IPvx, and if that is used, using an IP address X with ID_IPv4 Y
>> > will also cause an IKE failure of the victim peer because for IP address X it
>> > would then expect ID_IPv4 X.
>> 
>> You say that our assumption that TS negotiation will fail in most cases is
>> wrong. Even if this assumption is wrong it only makes the attack stronger, as an
>> even heavier connection is installed in C.
>> 
>> You say that authentication using ID_IPvx will fail because A would be using
>> different IP addresses in its IDi and in the source address of the m3 packet.
>> However, in the attack the deviation only changes the destination address, not
>> the source address. Thus no such failure would occur.
> 
> ID_IPvx and the outer IP address do not have anything in common. It is
> completely possible and valid for a node to send an ID_IPv4 identity of
> 10.0.0.1 from a completely different IP address. From RFC7296 section
> 3.5:
> 
>   When using the ID_IPV4_ADDR/ID_IPV6_ADDR identity types in IDi/IDr
>   payloads, IKEv2 does not require this address to match the address in
>   the IP header of IKEv2 packets, or anything in the TSi/TSr payloads.
> 
> Also, the most common road warrior setup would use any<->any traffic
> selectors and leave the narrowing to the responder, so C will
> happily narrow that range down to something that is useful for it.
> 
>> However, as I said above, the attack does assume that each m1 message comes
>> from a different IKEv2 peer. It is obvious that if the attack works with Ndemo
>> Initiators and the "uniqueids=never" option set in Victim, then it works in the
>> same setup but with N Initiators and without the "uniqueids=never" option in
>> Victim.
> 
> If you assume that each m1 comes from a different legitimate IKEv2
> peer, you only provide a work factor of 2, which is very, very low...
> Especially when you compare that to the Monday morning problem, where
> the work factor might suddenly be raised to 100 or 1000 times the
> normal load.
> 
> If you can cause C to show any kind of visible change of behavior
> with a work factor of 2, then C is completely misconfigured and
> underresourced and will fail in normal use every now and then anyway.
> 
> 
>> > It goes on to say the attack is not possible with PSKs, which I don't
>> > understand.. They also then mumbled about asymmetric authentication, which I
>> > don't understand, but regardless is basically only employed with EAPTLS and
>> > Remote Access VPNs, so it does not apply to this attack.
>> 
>> When we said that the attack is not possible with PSKs, we assumed that the PSK
>> between A and B is different than the PSK between A and C. In this case,
>> authentication of A to C will fail and the attack does not work because no SA is
>> installed in C.
> 
> Note that it also does not work for certificates, as the CA trust
> anchors configured for B and C are different, thus C will not accept
> A's certificate, as it is not signed by a proper CA for C; it is signed
> by the trust anchor used for B. The only time B and C use the same
> certificate authorities is when they are part of the same
> authentication domain.
> 
> In IPsec we do not have a common trusted CA list like the one we have
> on the web. Each administrative domain configures the CAs it trusts
> separately, and quite often those CAs are separate CAs used only for
> IPsec for that specific gateway.
> --
> kivinen@iki.fi