Re: [IPsec] Discussion of draft-pwouters-ipsecme-multi-sa-performance

Paul Wouters <paul@nohats.ca> Tue, 18 October 2022 16:06 UTC

Date: Tue, 18 Oct 2022 12:06:46 -0400
From: Paul Wouters <paul@nohats.ca>
To: Valery Smyslov <smyslov.ietf@gmail.com>
cc: 'Steffen Klassert' <steffen.klassert@secunet.com>, 'IPsecME WG' <ipsec@ietf.org>
In-Reply-To: <048a01d8e2f6$cedc83e0$6c958ba0$@gmail.com>
Message-ID: <50fbd76f-6aad-e422-9d95-afbcd6db87ba@nohats.ca>
References: <15eb01d8dd7e$fdf158e0$f9d40aa0$@gmail.com> <20221014111548.GJ2602992@gauss3.secunet.de> <03ca01d8e232$3bbace10$b3306a30$@gmail.com> <3c58c2e0-f022-15fb-2ebb-658fa51275a6@nohats.ca> <048a01d8e2f6$cedc83e0$6c958ba0$@gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/ipsec/qbYP8o79xc2VHA4GMfI9AoxIEC8>

On Tue, 18 Oct 2022, Valery Smyslov wrote:

>>> implementation with say 10 CPUs. Does it make any difference for this implementation
>>> If it receives CPU_QUEUES with 100 or with 1000? It seems to me that in both cases
>>> it will follow its own local policy for limiting the number of per-CPU SAs,
>>> most probably capping it to 10.
>>
>> That would be a mistake. You always want to allow a few more than the
>> CPUs you have. The maximum is mostly to protect against DoS attacks.
>
> How it protects against DoS attacks, can you elaborate?

Requesting the installation of 1 million Child SAs until the remote
server falls over. Or, less extremely, to contain the amount of resources
a sysadmin allocates to a specific "multi CPU" tunnel.
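To illustrate the DoS angle, here is a minimal, purely hypothetical sketch of a responder enforcing its advertised CPU_QUEUES value as a hard cap; the function and parameter names are assumptions for illustration, not from any real IKEv2 implementation:

```python
def accept_per_cpu_child_sa(installed_count: int, cpu_queues_limit: int) -> bool:
    """Decide whether another per-CPU Child SA may be installed.

    `cpu_queues_limit` is the cap the local admin advertised in
    CPU_QUEUES. Rejecting anything beyond it means a peer cannot
    exhaust SA state by requesting Child SAs indefinitely.
    """
    return installed_count < cpu_queues_limit
```

With a limit of 12, the 13th request would be rejected (e.g. with TS_MAX_QUEUE) rather than consuming further resources.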

>> If you only have 10 CPUs, but the other end has 50, there shouldn't
>> be much issue to install 50 SA's. Not sure if we said so in the draft,
>
> I'm not so sure. For example, if you use HSM, you are limited by its capabilities.

Sure. Maybe put the HSM on the fallback SA and not on the per-CPU SAs if
you don't have an option to use it for all.

>> but you could even omit installing 40 of the outgoing SA's since you
>> would never be expected to use them anyway. but you have to install all
>> 50 incoming ones because the peer might use them.
>
> And what to do in situations you are unable to install all 50 (for any reason)?
> And how it is expected to deal with situations when the number of CPUs
> is changed over the lifetime of IKE SA? As far as I understand some modern
> systems allows adding CPUs or removing them on the fly.

Right. All of these are reasons not to build in limitations too tightly.
The notify conveys roughly what each end is willing to put into this
connection, although there might be slight changes, temporary or not. We
feel it is still better to convey what both sides consider "ideal" than
to just send CREATE_CHILD_SAs and have them fail.

> The use case is clear. And the idea to have per-CPU SAs is clear too.
> The problem (my problem) is the way how it is achieved.

I understand.

>>> I don't think this logic is credible in real life, but even in this case
>>> there is already a mechanism that allows to limit the number of
>>> per-CPU SAs - it is the TS_MAX_QUEUE notify.
>>
>>> So why we need CPU_QUEUES?
>>
>> TS_MAX_QUEUE is conveying an irrecoverable error condition. It should
>> never happen.
>
> That's not what the draft says:
>
>   The responder may at any time reject
>   additional Child SAs by returning TS_MAX_QUEUE.
>
> So, my reading is that this notify can be sent at any time if peer
> is not willing to create more per-CPU SAs. And sending this notify
> doesn't cause deletion of IKE SA and all its Child SAs (it is my guess).

Right. It is still something expected. Say there were 4 CPUs and we
committed to 4, but now one CPU was removed. So TS_MAX_QUEUE tells the
peer not to try for the 4th one; for whatever reason we cannot do it
anymore. This prevents the peer from retrying every second.
Perhaps it will still try every hour. Or perhaps we will let the peer
run an additional one once we get that 4th CPU back.
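The intended initiator behaviour described above could be sketched roughly as follows; this is an illustrative assumption about one reasonable policy (names and the one-hour retry interval are hypothetical, not mandated by the draft):

```python
import time

class PerCpuSaInitiator:
    """Back off after TS_MAX_QUEUE instead of retrying every second."""

    RETRY_AFTER = 3600.0  # e.g. retry hourly, not per-second (assumed value)

    def __init__(self):
        self.blocked_until = 0.0

    def on_ts_max_queue(self, now=None):
        # Peer said: no more per-CPU Child SAs for now. Stop pounding it
        # with CREATE_CHILD_SA requests until the backoff expires.
        now = time.time() if now is None else now
        self.blocked_until = now + self.RETRY_AFTER

    def may_initiate(self, now=None):
        now = time.time() if now is None else now
        return now >= self.blocked_until
```

The key point is that one TS_MAX_QUEUE suppresses further attempts for all remaining per-CPU SAs, not just the one that failed.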

The common case is that both peers present what they want and (within
reason) it will establish. No failed CREATE_CHILD_SAs happen unless there
was an unexpected change on one of the peers. Whereas in your proposal,
a failed CREATE_CHILD_SA is part of the normal process.

> If my reading is wrong and this is a fatal error (or what do you mean by " irrecoverable "?),
> then the protocol is worse than I thought for devices that for any reason cannot afford
> installing unlimited number of SAs (e.g. if they use HSM with
> limited memory). In this case they cannot even tell the peer
> that they have limited resources.

I meant fatal as "do not attempt to do this again, I am out of
resources". Maybe you would call that more of a temporary error. What I
was trying to convey is that the error means "resources are all in use,
don't keep trying this for now". If you feel that is a "temporary" error,
that's fine with me. As long as the peer doesn't keep trying this
for other CPUs but is smart enough to realize this one failure means
not to keep pounding the peer with more CREATE_CHILD_SA requests.

>> Where as CPU_QUEUES tells you how many per-CPU child SAs
>> you can do. This is meant to reduce the number of in-flight CREATE_CHILD_SA's
>> that will never become successful.
>
> It seems to me that it's enough to have one CREATE_CHILD_SA with the proper
> error notify to indicate that the peer is unwilling to create more SAs.
> I'm not sure this is a big saving.

Note that you don't have to bring up per-CPU SAs on-demand via ACQUIREs.
You can also fire off all of them at once (after the initial Child SA is
established). With CPU_QUEUES, you know whether to send 5 or 10 of
these. With your method, you have to bring them up one by one to see
whether you can bring up more or not.
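As a minimal sketch of the "fire off all at once" calculation (all names and the slack value are illustrative assumptions, not from the draft):

```python
def per_cpu_sas_to_initiate(local_cpus: int, peer_cpu_queues: int,
                            slack: int = 2) -> int:
    """How many per-CPU Child SAs to request in one burst.

    Ask for a few more than the local CPU count (see earlier in the
    thread), but never more than the peer advertised via CPU_QUEUES,
    so none of the requests is doomed to fail.
    """
    wanted = local_cpus + slack
    return min(wanted, peer_cpu_queues)
```

Without CPU_QUEUES, the initiator would instead probe one SA at a time until it hits a failure.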

>>>>> I'm also not convinced that CPU_QUEUE_INFO is really needed, it mostly exists
>>>>> for debugging purposes (again if we get rid of Fallback SA). And I don't think we need
>>>>> a new error notify TS_MAX_QUEUE, I believe TS_UNACCEPTABLE can be used instead.
>>
>> We did it to distinquish between "too many of the same child sa" versus
>> other errors in cases of multiple subnets / child SAs under the same IKE
>> peer. Rethinking it, I am no longer able to reproduce why we think it
>> was required :)
>
> I believe TS_UNACCEPTABLE is well suited for this purpose. You know for sure that TS itself is OK,
> since you have already installed SA(s) with the same TS, and it's not fatal error notify and
> is standardized in RFC 7296 and it does not prevent creating SAs with other TS.

If the peers have two connections:

conn one
 	[stuff]
 	leftsubnet=10.0.1.0/24
 	rightsubnet=10.0.2.0/24

conn two
 	[same stuff so shared IKE SA with conn one]
 	leftsubnet=192.0.1.0/24
 	rightsubnet=192.0.2.0/24

If you put up 10 copies of conn one, and then start doing the first
(so fallback SA) of conn two and get TS_UNACCEPTABLE, what error
condition did you hit? Does the peer only accept 10 Child SAs per
IKE peer, or did conn two have subnet parameters that didn't match the
peer's?
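The ambiguity can be shown as a small, hypothetical initiator-side dispatch on the notify type (strings and reactions are illustrative assumptions):

```python
def react_to_child_sa_failure(notify: str) -> str:
    """What an initiator can conclude from the error notify.

    With only TS_UNACCEPTABLE, "too many identical Child SAs" and
    "your subnets don't match my policy" are indistinguishable; a
    distinct TS_MAX_QUEUE removes that ambiguity.
    """
    if notify == "TS_MAX_QUEUE":
        return "per-CPU SA limit reached; keep existing SAs and back off"
    if notify == "TS_UNACCEPTABLE":
        return "traffic selectors rejected; check subnet configuration"
    return "unrelated error"
```

The two reactions differ materially: one keeps the configuration and waits, the other flags a configuration mismatch to the operator.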


>> The idea of the fallback SA is that you always have at least one child
>> SA guaranteed to be up that can encrypt and send a packet. It can be
>> installed to not be per-CPU. It's a guarantee that you will never need
>> to wait (and cache?) 1 RTT's time worth of packets, which can be a lot
>> of packets. You don't want dynamic resteering. Just have the fallback
>> SA "be ready" in case there is no per-cpu SA.
>
> The drawback of the Fallback SA is that it needs a special processing.
> Normally we delete SAs when they are idle for a long time
> to conserve resources, but the draft says it must not be done with the Fallback SA.

Yes. But honestly that is a pretty tiny code change in IKEv2. Again, I
will let Steffen and Antony talk about performance, but I think
"re-steering" packets is hugely expensive and slow, especially if it
needs to be done dynamically: first you have to determine which CPU
to steer the packet to, and then steer it. Then maybe remember this
choice for a while, because you cannot do this lookup for each packet.
Then if that SA dies and you need to find another one, that's a whole
other error path you need to traverse.
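The contrast with the fallback SA can be sketched in a few lines; this is a conceptual illustration only (the data structures are assumed, not how any kernel actually implements it):

```python
def select_outbound_sa(cpu_id: int, per_cpu_sas: dict, fallback_sa: object):
    """Pick the SA for an outgoing packet on a given CPU.

    With a fallback SA that is always up, there is no dynamic
    re-steering and no packet ever has to be cached for ~1 RTT
    while a per-CPU Child SA is negotiated: the lookup either hits
    this CPU's SA or falls through to the fallback in O(1).
    """
    sa = per_cpu_sas.get(cpu_id)
    return sa if sa is not None else fallback_sa
```

Without the fallback, the `None` branch would become the expensive path: pick another CPU's SA, remember that choice, and handle its death as a separate error case.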

>>> I think it depends. I'd like to see optimization efforts to influence
>>> the protocol as less as possible. Ideally this should be local matter
>>> for implementations. This would allow them to interoperate
>>> with unsupporting implementations (and even to benefit from
>>> multi-SAs even in these situations).
>>
>> Those that don't support this don't see notifies ? Or do you mean to
>> somehow install multiple SA's for the same thing on "unsupported"
>> systems?
>
> Yes. The idea is that If one peer supports per-CPU SAs and the
> other doesn't, they still be able to communicate and have multiple SAs.

There is no way for you to know in advance whether the peer will send a
delete for the older Child SA when establishing the new Child SA, thus
defeating your purpose. I know RFC 7296 says you can have multiple
identical Child SAs, but in practice a bunch of software just assumes
these are the same client reconnecting after the previous Child SA state
was lost. This proposed mechanism therefore wants to explicitly state
"we are going to do multiple identical SAs, can you support that?".

> For example, if the supporting system has several weak CPUs,
> while the unsupporting one has much more powerful CPU,
> then multiple SAs will help to improve performance -
> the supporting system will distribute load on its weak CPUs,
> while for unsupporting the load will be small enough even for a single CPU.

But you simply don't know if the duplicate SA is going to lead to a
deletion of the older SA.

>> The problem currently is that when an identical child SA
>> is successfully negotiated, implementations differ on what they do.
>> Some allow this, some delete the older one. The goal of this draft
>> is to make the desire for multple idential child SAs very explicit.
>
> RFC 7296 explicitly allows multiple Child SAs with identical selectors,
> so if implementations immediately delete them, then they are either broken
> or have reasons to do it (e.g. have no resources).

Broken or not, they are not going to get "fixed". It was a design
choice.

Paul