Re: [IPsec] Discussion of draft-pwouters-ipsecme-multi-sa-performance

Tero Kivinen <kivinen@iki.fi> Wed, 26 October 2022 22:01 UTC

Message-ID: <25433.44569.44812.537584@fireball.acr.fi>
Date: Thu, 27 Oct 2022 01:00:57 +0300
From: Tero Kivinen <kivinen@iki.fi>
To: Paul Wouters <paul@nohats.ca>
Cc: Steffen Klassert <steffen.klassert@secunet.com>, Valery Smyslov <smyslov.ietf@gmail.com>, Michael Richardson <mcr+ietf@sandelman.ca>, IPsecME WG <ipsec@ietf.org>
In-Reply-To: <F84D65B2-9A68-420D-BC55-2A6BD2542246@nohats.ca>
References: <20221021073714.GP3294086@gauss3.secunet.de> <F84D65B2-9A68-420D-BC55-2A6BD2542246@nohats.ca>
Archived-At: <https://mailarchive.ietf.org/arch/msg/ipsec/N1YtWQziyhedSaxHxx85VI0rF_g>
Subject: Re: [IPsec] Discussion of draft-pwouters-ipsecme-multi-sa-performance

[Replying to this email, but commenting about the others also]

Paul Wouters writes:
> On Oct 21, 2022, at 03:37, Steffen Klassert <steffen.klassert@secunet.com> wrote:
> > Another possibility would be to use the same keymat on all
> > percpu SAs
> 
> You cannot do that. You need to ensure unique IVs for AEAD so you
> would need to subdivide the IV space. You would also still reach max
> operations on these SAs on different times AND things like FIPS puts
> an operational max count on the key usage which you can’t do if the
> key is used by multiple different states.
> 
> Using different real child SA’s was needed to ensure the
> cryptographic security properties.

This is really important. The keymat can't be the same across the
CPUs. We could in theory create a new key hierarchy that generates
keys for a sub Child SA for each CPU, but I think that would just
complicate things more, and having real Child SAs for each CPU is the
correct solution.
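
As a toy illustration of the IV problem (not taken from the draft; the
function names and the simple per-CPU counter used as the IV are
assumptions made for this sketch), the following shows that CPUs which
count IVs independently reuse (key, IV) pairs when they share keymat,
and do not when each CPU has its own Child SA key:

```python
def emitted_pairs(key_by_cpu, packets_per_cpu):
    """Return every (key, IV) pair used when each CPU runs its own IV counter.

    key_by_cpu maps a CPU index to the key that CPU encrypts with.
    The IV is modelled as a plain per-CPU packet counter starting at 0.
    """
    pairs = []
    for cpu, key in key_by_cpu.items():
        for iv in range(packets_per_cpu):
            pairs.append((key, iv))
    return pairs


def has_reuse(pairs):
    """True if any (key, IV) pair occurs more than once."""
    return len(set(pairs)) != len(pairs)


# Same keymat on both CPUs: both emit (K, 0), (K, 1), ... so pairs repeat.
shared = emitted_pairs({0: b"K", 1: b"K"}, packets_per_cpu=3)

# One real Child SA (own key) per CPU: every pair is distinct.
separate = emitted_pairs({0: b"K0", 1: b"K1"}, packets_per_cpu=3)

print(has_reuse(shared), has_reuse(separate))
```

Avoiding the reuse with shared keymat would require coordinating or
subdividing the IV space between the CPUs, which is the complication
that real per-CPU Child SAs sidestep.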

In your discussion you were talking about cases where one device has
hundreds of CPUs and the other has a few. The only case where such a
configuration would be useful is when one end has lots of really
low-powered CPUs and the other has a few very fast ones. My
understanding is that this is not really happening. Usually the end
that has more CPUs has CPUs which are about the same speed as those of
the end having fewer CPUs.

There is no point in one end having, for example, 10 fast CPUs sending
traffic over 10 Child SAs when the receiving end only has two CPUs
which are about the same speed as the other end's CPUs. The receiving
end will not be able to keep up with the traffic coming in, so it
will drop packets as it can't decrypt them fast enough.

So I think we should concentrate on the cases where the number of
CPUs at each end is in the same ballpark. We can have one host with 2
CPUs, each twice as fast as the CPUs of the other host with 4, so
creating 4 Child SAs is OK in that case, but I do not think there are
ever cases where we generate more than 2-4 SAs per CPU, i.e., if one
end has 2 CPUs then the practical limit is 8 Child SAs. Any more than
that will not help.

Also, a host having hundreds of CPUs will most likely talk to hundreds
of other hosts too, so using 10 CPUs to talk to one host, 10 to talk
to another host, etc. is also a way of splitting up the work. I.e.,
that "gateway" would most likely advertise having fewer CPUs than it
actually has. The other host having two CPUs will most likely be the
"client" end and only talk to that one "gateway" (or someone spent way
too much money on a device that does not need to be that big)...
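
The rule of thumb above can be written down as a small sketch. The
function name and the default factor of 4 (the upper end of the "2-4
SAs per CPU" range) are assumptions for illustration only, not
anything the draft specifies:

```python
def practical_child_sa_limit(local_cpus, peer_cpus, sas_per_cpu=4):
    """Upper bound on useful Child SAs between two peers.

    The end with fewer CPUs is the bottleneck, so the bound is a small
    multiple (here 2-4, default 4) of the smaller CPU count.
    """
    if local_cpus < 1 or peer_cpus < 1:
        raise ValueError("each end needs at least one CPU")
    return sas_per_cpu * min(local_cpus, peer_cpus)


# A 2-CPU client talking to a gateway with hundreds of CPUs: limit is 8.
print(practical_child_sa_limit(2, 200))

# The 2-fast-CPUs vs 4-slower-CPUs case: 4 Child SAs is within the limit.
print(4 <= practical_child_sa_limit(2, 4))
```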

And I do agree with Valery that there is no point in trying to guess
what kind of broken implementations are out there. We should assume
that implementations follow RFC 7296, and if there are so many broken
implementations that we need to take them into account, then we might
want to update RFC 7296....

Talking about locking and such things is a bit distracting, as you can
do lots of things without locking depending on the data structures,
who writes them, and so on. This goes so low level that I am not sure
it is that beneficial to talk about it here. For example, there are
ways of updating the per-CPU SAD without locks provided there is only
one entity that can update it...
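
One such single-writer approach is copy-and-publish: the writer builds
a new table and swaps a single reference, while readers only ever look
at whichever table they fetched. The sketch below is only an
illustration under stated assumptions: the class and method names are
hypothetical, and it relies on rebinding a reference being atomic in
CPython. A kernel implementation would use an atomic pointer store
(RCU style) instead.

```python
class PerCpuSad:
    """Per-CPU SA table with many readers and exactly one writer, no lock."""

    def __init__(self):
        # The published table is never modified in place once readers
        # may see it; updates replace it wholesale.
        self._snapshot = {}

    def lookup(self, spi):
        # Reader path: one read of the current reference, then a lookup
        # in that table only. A concurrent update cannot change it.
        return self._snapshot.get(spi)

    def install(self, spi, sa):
        # Writer path (single writer only): copy, modify the copy,
        # then publish it with one reference assignment.
        new_table = dict(self._snapshot)
        new_table[spi] = sa
        self._snapshot = new_table

    def remove(self, spi):
        # Same copy-and-publish pattern for deletions.
        new_table = dict(self._snapshot)
        new_table.pop(spi, None)
        self._snapshot = new_table


sad = PerCpuSad()
sad.install(0x1001, "child-sa-for-cpu-0")
print(sad.lookup(0x1001))
```

With two writers this would lose updates, which is why the "only one
entity that can update" condition matters.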

We should make sure that all the steady-state processing can be done
efficiently, i.e., without locking etc., but IKE SA creation and the
like happen only every few hours, and trying to optimize their locking
behavior is not that useful in the big picture.

Also I think it is just better to create all Child SAs at the
beginning, i.e., there is no point in doing that much per-CPU
acquiring etc. I mean you have some way of distributing outgoing
packets to CPUs before that. If that is round robin, then you will
create all per-CPU SAs very quickly. If it is something else (like
"this TCP stream is locked to this CPU"), then you mostly keep using
only that one CPU (in which case per-CPU acquire will be useful). But
all of this depends so much on the implementation, which we are not
talking about here, that I think it should be left to implementations
to decide.
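
The difference between the two distribution strategies can be seen in
a small simulation. This is a sketch only: the function names, the
example flow tuple, and the use of CRC32 as the flow hash are
assumptions, not how any particular stack does it.

```python
import zlib


def round_robin_cpus(num_packets, num_cpus):
    """CPU chosen for each outgoing packet when packets are spread round robin."""
    return [i % num_cpus for i in range(num_packets)]


def flow_pinned_cpu(flow, num_cpus):
    """CPU chosen for a flow when every packet of the flow goes to one CPU.

    flow is a (src, sport, dst, dport) tuple; CRC32 stands in for
    whatever stable flow hash the implementation uses.
    """
    return zlib.crc32(repr(flow).encode()) % num_cpus


def per_cpu_sas_triggered(cpu_sequence):
    """Number of distinct CPUs hit, i.e. per-CPU SAs an acquire would create."""
    return len(set(cpu_sequence))


# Round robin over 8 CPUs: the first 8 packets already touch every CPU.
rr = per_cpu_sas_triggered(round_robin_cpus(8, 8))

# One TCP stream pinned to a CPU: 1000 packets still touch only one CPU.
flow = ("192.0.2.1", 40000, "198.51.100.7", 443)
pinned = per_cpu_sas_triggered([flow_pinned_cpu(flow, 8) for _ in range(1000)])

print(rr, pinned)
```

In the round robin case on-demand acquiring buys nothing over creating
all SAs up front, while in the pinned case it avoids creating SAs that
are never used.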

If we use per-CPU acquiring, then the other end might need to create
Child SAs too, just in case the one initiating the connection only
sent out one packet and created one SA. The other end would like to
have 8 SAs for its 8 CPUs, but only one was created, so would it now
create the 7 missing ones, or wait for the other end to create them,
etc.
-- 
kivinen@iki.fi