Re: [IPsec] Review of draft-ietf-ipsecme-ddos-protection-06

Paul Wouters <paul@nohats.ca> Fri, 03 June 2016 02:06 UTC

Date: Thu, 02 Jun 2016 22:06:04 -0400
From: Paul Wouters <paul@nohats.ca>
To: Valery Smyslov <svanru@gmail.com>
In-Reply-To: <4200F5373D5542C985F3D4C51609213C@buildpc>
Message-ID: <alpine.LRH.2.20.1606022148040.23132@bofh.nohats.ca>
References: <alpine.LRH.2.20.1605311635540.16809@bofh.nohats.ca> <4200F5373D5542C985F3D4C51609213C@buildpc>
User-Agent: Alpine 2.20 (LRH 67 2015-01-07)
MIME-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"; format="flowed"
Archived-At: <http://mailarchive.ietf.org/arch/msg/ipsec/lMNB708wMMLTG2cx__2QE9xPLOo>
Cc: ipsec@ietf.org, Yoav Nir <ynir.ietf@gmail.com>
Subject: Re: [IPsec] Review of draft-ietf-ipsecme-ddos-protection-06

On Thu, 2 Jun 2016, Valery Smyslov wrote:

>>     An obvious defense, which is described in Section 4.2, is limiting
>>     the number of half-open SAs opened by a single peer.  However, since
>>     all that is required is a single packet, an attacker can use multiple
>>     spoofed source IP addresses.
>>
>>  I am not sure why this is mentioned here in this way, because the attack
>>  of spoofed source IP is already handled effectively with DOS cookies. I
>>  think it is better to state "bot-nets are large enough that they have
>>  enough unique IP addresses" and avoid talking about spoofing in this
>>  section altogether.
>
> Here are some general observations of IKEv2 vulnerabilities,
> regardless of the existing and proposed defense mechanisms, which are 
> described in subsequent sections.

But it is incomplete and out of place. The section is about The
Vulnerability. It talks about vulnerabilities, then this one solution to
one thing, then goes into detail about the work that makes it
vulnerable. That is why I suggest simply removing the paragraph.

>>     Stage #3 includes public key operations, typically more than one.
>>
>>  It seems this sentence needs to say something that these operations are
>>  very expensive, similar to describing the "effort" in the previous
>>  sentences of stage #1 and stage #2.
>
> OK. How about:
>
>    Stage #3 may include public key operations if certificates are involved.
>    These operations are often more computationally expensive than those
>    performed at stage #2.

Looks good.

>>     It seems that the first thing cannot be dealt with at the IKE level.
>>     It's probably better left to Intrusion Prevention System (IPS)
>>     technology.
>>
>>  I would rewrite this more authoritatively, and not use the word "seems"
>
> OK. How about:
>
>    If an attacker is so powerful that it is able to overwhelm
>    the Responder's CPU that deals with generating cookies,
>    then the attack cannot be dealt with at the IKE level and
>    must be handled by means of the Intrusion Prevention System (IPS)
>    technology.

Looks good.

>>     Depending on the Responder implementation, this can be repeated with
>>     the same half-open SA.
>>
>>  I don't think this "depends on the implementation". Since any on-path
>>  attacker can spoof rubbish, a Responder MUST ignore the failed packet
>>  and remain ready to accept the real one for a certain amount of time.
>
> "Depending on the Responder implementation" means here that if along with 
> discarding the failed packet the Responder also discards the computed SK_* 
> keys, then it will need to re-calculate them again
> when the next IKE_AUTH packet is received, so the attack can be
> repeated. The SK_* keys don't depend on IKE_AUTH messages,
> so in general there is no need to discard them even if the received
> IKE_AUTH packet failed to decrypt properly, and the draft advises to keep 
> them in this case. However, implementations may have good reasons to do this 
> (e.g. to free hardware resources if crypto is performed in HW).

Oh, I didn't realise you talked about re-using DH components. OK, in that
case it makes sense, but you might want to say it only applies to those
who re-use DH calculations between different IKE peers. Our software
never does that (and I think FIPS also puts additional constraints on
this).

> Please, see above.
>
> Do you think more explanations are needed here?

No, I guess it is fine.
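
The quoted point about keeping the SK_* keys can be illustrated with a
minimal sketch. Everything here is hypothetical: the HalfOpenSA class,
its method names and the SHA-256 call (a cheap stand-in for the DH
computation and key derivation) are assumptions made for illustration,
not text from the draft or from any implementation.

```python
import hashlib


class HalfOpenSA:
    """Hypothetical half-open SA that may cache its derived keys."""

    def __init__(self, shared_secret, keep_keys_on_failure=True):
        self.shared_secret = shared_secret
        self.keep_keys_on_failure = keep_keys_on_failure
        self.sk = None
        self.derivations = 0  # how many times the expensive step ran

    def _derive_keys(self):
        # Stand-in for the expensive DH computation and SK_* derivation.
        self.derivations += 1
        return hashlib.sha256(self.shared_secret).digest()

    def handle_ike_auth(self, packet_ok):
        # Derive the keys only if no cached copy exists.
        if self.sk is None:
            self.sk = self._derive_keys()
        if not packet_ok:
            # A forged or garbled IKE_AUTH packet: optionally drop the keys.
            if not self.keep_keys_on_failure:
                self.sk = None
            return False
        return True


keeping = HalfOpenSA(b"secret", keep_keys_on_failure=True)
discarding = HalfOpenSA(b"secret", keep_keys_on_failure=False)
for _ in range(3):
    keeping.handle_ike_auth(packet_ok=False)
    discarding.handle_ike_auth(packet_ok=False)
print(keeping.derivations, discarding.derivations)
```

Under these assumptions the Responder that keeps its keys pays for the
derivation once, while the one that discards them pays again for every
bogus packet, which is the repetition the draft text describes.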

>>     Retransmission policies in practice wait at least one or two seconds
>>     before retransmitting for the first time.
>>
>>  I'm not sure if this is still true. Libreswan starts at 0.5s and doubles,
>>  and I know that iOS was faster too.
>
> Well, there are different implementations and each has its own
> retransmission policy. The Responder should take into account
> the slowest sensible retransmission policy, which seems to be the one 
> described in the draft.
>
> Will the following text make you happy?
>
>    Many retransmission policies in practice wait one or two seconds
>    before retransmitting for the first time.

It would be nicer to rewrite it without mentioning any absolute times.
That way the text will also remain more relevant in the future if/when
these timings change.
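
As a rough illustration of describing this without absolute times, a
Responder could reason about how many retransmissions fit into its
half-open SA lifetime under a given backoff policy. The function name
and all parameter values below are hypothetical; the 0.5s start with
doubling is only the Libreswan example mentioned above, and the 30s
budget is an assumed value.

```python
def retransmit_schedule(initial_delay, factor, max_total):
    """Return the waits (in seconds) before each retransmission.

    Stops as soon as one more wait would exceed max_total seconds overall.
    """
    delays = []
    elapsed = 0.0
    delay = initial_delay
    while elapsed + delay <= max_total:
        delays.append(delay)
        elapsed += delay
        delay *= factor
    return delays


# Assumed policy: start at 0.5s, double each time, 30s total budget.
schedule = retransmit_schedule(0.5, 2.0, 30.0)
print(schedule)       # waits before each retransmission
print(len(schedule))  # number of retransmissions that fit in the budget
```

With a policy expressed this way, the text can talk about "enough time
for N retransmissions" and stay valid when implementations change their
timers.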

>>     When not under attack, the half-open SA timeout SHOULD be set high
>>     enough that the Initiator will have enough time to send multiple
>>     retransmissions, minimizing the chance of transient network
>>     congestion causing IKE failure.
>>
>>  I agree, but I'd like to note that this and the text just above mentioning
>>  "several minutes" is kind of archaic. We found a limit of 30 seconds on
>
> That's what RFC 7296 recommends (Section 2.4).

Okay, fair enough. I guess you mention shortening it while under attack,
so it's all okay.

>>  other implementations so common as a timeout, that we see no more value in
>>  keeping an IKE exchange around for more than 30 seconds. (we do re-start
>>  and try a new exchange from scratch for longer, in some configurations we
>>  try that forever)
>>
>>     For IPv6, ISPs assign between a /48 and a /64, so it makes sense to use
>>     a 64-bit prefix as the basis for rate limiting in IPv6.
>>
>>  Why does that make sense over using /48 ? Wouldn't you rather rate limit
>>  some innocent neighbours over not actually defending against the attack?
>>  If puzzles work as advertised, real clients on that /48 should still be
>>  able to connect.
>
> Well, I'm not an IPv6 expert. Probably Michael Richardson (who suggested this 
> change) or somebody else will comment on this.

This does not so much relate to IPv6 but to whether you would rather
overestimate or underestimate the attacker's IP space. If you
underestimate, you will take longer to punish the attacking IPs. If you
overestimate, you will needlessly slow down legitimate clients.

I don't know which of the two is better, hence my objection to "it makes
sense" because I don't see that.
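
To make the trade-off concrete, here is a minimal sketch of how the
choice of prefix length changes which clients share a rate-limiting
bucket. The function name rate_limit_key and the sample addresses (from
the 2001:db8::/32 documentation range) are assumptions for illustration
only; neither the draft nor any implementation is being quoted.

```python
import ipaddress
from collections import Counter


def rate_limit_key(addr, v6_prefix_len=64):
    """Return the bucket a source address is counted under.

    IPv4 addresses are counted individually; IPv6 addresses are grouped
    by the given prefix length.
    """
    ip = ipaddress.ip_address(addr)
    if ip.version == 4:
        return str(ip)
    # strict=False masks off the host bits instead of raising an error.
    return str(ipaddress.ip_network(f"{ip}/{v6_prefix_len}", strict=False))


sources = ["2001:db8:1:1::10", "2001:db8:1:2::10", "2001:db8:2:1::10"]
by_64 = Counter(rate_limit_key(s, 64) for s in sources)
by_48 = Counter(rate_limit_key(s, 48) for s in sources)
print(by_64)  # each source lands in its own /64 bucket
print(by_48)  # the first two sources share one /48 bucket
```

With /64 an attacker holding a whole /48 gets many separate buckets
(underestimating its address space); with /48 unrelated neighbours in
the same /48 are counted together (overestimating it).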

>>     Regardless of the type of rate-limiting used, there is a huge
>>     advantage in blocking the DoS attack using rate-limiting for
>>     legitimate clients that are away from the attacking nodes.  In such
>>     cases, adverse impacts caused by the attack or by the measures used
>>     to counteract the attack can be avoided.
>>
>>  I don't understand this paragraph at all. I guess "rate-limiting for
>>  legitimate clients" just confuses me. I think it might attempt to be
>>  saying "not blocking ranges with no attackers helps real clients", but
>>  it is very unclear.
>
> Yoav?
>
>>     to calculate the PRF
>>
>>  One does not "calculate" a PRF. One uses a PRF to calculate something.
>
> OK.

You didn't provide text but I assume you changed it somehow.

>>  The section that starts with "Upon receiving this challenge," seems to
>>  be discussing the pros and cons of this method before it has explained
>>  the method. The reader is forced to skip this or forward to section 7
>>  and getting back to this part. I suggest to re-order some text to avoid
>>  this, or to give a better short summary of the puzzle nature just before
>>  this paragraph.
>
> It describes the puzzles mechanism in general, while Sections 7 & 8
> describe the particular instantiation of puzzles in IKEv2.
> I'd rather keep some background about puzzles here,
> so that all possible defenses are described in one place.

Then I think it still requires a one-line introduction to puzzles.

>>     When the Responder is under attack, it MAY choose to prefer
>>     previously authenticated peers who present a Session Resumption
>>     ticket (see [RFC5723] for details).
>>
>>  Why is this only a MAY? Why is it not a SHOULD or MUST?
>
> A good question. I think the idea was not to force the Responder
> to serve only resumed clients and to let him(her) prioritize
> clients according to its own policy. In my opinion MUST is too strong, but 
> SHOULD is probably OK.

In the famous words of Steve Kent, if you say SHOULD instead of MUST,
explain when the Responder should not.

>>     The Responder MAY require such
>>     Initiators to pass a return routability check by including the COOKIE
>>     notification in the IKE_SESSION_RESUME response message, as allowed
>>     by Section 4.3.2. of [RFC5723].
>>
>>  Perhaps this should say the responder SHOULD require COOKIEs for resumed
>>  sessions if it also requires COOKIEs for IKE_INIT requests. That is, it
>>  should not give preference to resumed sessions as those could be equally
>>  forged as IKE_INIT requests.
>
> A good point. I tend to agree. Yoav?
>
>>     With a typical setup and typical Child SA lifetimes, there
>>     are typically no more than a few such exchanges, often less.
>>
>>  (ignoring the language) I do not believe this is true. This goes back to
>>  the discussion on how often people deploy liveness probes. Implementors
>>  seem to think 30s, while endusers want and do configure things like 1s.
>>  I don't think the text about how many IKE exchanges are typical
>>  is needed because the text below talks about specific abuse anyway,
>>  and not in terms of just number of exchanges.
>
> Are you suggesting to remove it?

Yes. You can just talk about something like "If an abusive amount of
(otherwise) valid IKE messages are received, ....." and let the
implementor decide how many IKE messages count as abusive? That also
avoids the question of what to do when rekeys happen, because a rekey
would likely reset the counter since it is a new state?

>>        If the peer creates too many Child SA with the same or overlapping
>>        Traffic Selectors, implementations can respond with the
>>        NO_ADDITIONAL_SAS notification.
>>
>>  I think this requires normative language, eg: implementations MUST respond
>>  with a NO_ADDITIONAL_SAS notification. The same for the next bullet item
>>  where it says "implementations can introduce an artificial delay", which
>>  should be like: "MAY introduce an artificial delay" (or even SHOULD, or
>>  rewrite "too many" to "many" and use MAY)
>
> I'd use MAY and keep "too many". "Too many" means here that a peer is at 
> least misbehaved, while just "many" doesn't imply this
> (in my reading).

You cannot say "too many" and "MAY". If it is too many, it is abusive.
So you MUST take action. On the other hand if you say "many", then you
leave it open to interpretation whether it is abuse or not, and you can
use "MAY".

>>  Section 5 switches from talking about "the Responder" to "the
>>  implementation".
>>  I think it should be "the Responder" throughout the document.
>
> OK.
>
>>      the retransmitted messages should be silently discarded.
>>
>>  That should be normative too, MUST be discarded.
>
> Agree.

Paul