Re: [IPsec] Discussion of draft-pwouters-ipsecme-multi-sa-performance

Valery Smyslov <smyslov.ietf@gmail.com> Thu, 20 October 2022 07:51 UTC

From: Valery Smyslov <smyslov.ietf@gmail.com>
To: 'Paul Wouters' <paul@nohats.ca>
Cc: 'Steffen Klassert' <steffen.klassert@secunet.com>, 'IPsecME WG' <ipsec@ietf.org>
References: <15eb01d8dd7e$fdf158e0$f9d40aa0$@gmail.com> <20221014111548.GJ2602992@gauss3.secunet.de> <03ca01d8e232$3bbace10$b3306a30$@gmail.com> <3c58c2e0-f022-15fb-2ebb-658fa51275a6@nohats.ca> <048a01d8e2f6$cedc83e0$6c958ba0$@gmail.com> <50fbd76f-6aad-e422-9d95-afbcd6db87ba@nohats.ca> <057901d8e395$2822ef40$7868cdc0$@gmail.com> <a1efaa10-71d9-a6df-4354-63d92e861b4e@nohats.ca>
In-Reply-To: <a1efaa10-71d9-a6df-4354-63d92e861b4e@nohats.ca>
Date: Thu, 20 Oct 2022 10:51:45 +0300
Message-ID: <06dd01d8e458$cd429f20$67c7dd60$@gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/ipsec/HNzqat87lY3KJ5Q7Xo7E-8SppDQ>

Hi Paul,

> On Wed, 19 Oct 2022, Valery Smyslov wrote:
> 
> >> Requesting to install 1 million Child SA's until the remote server falls over.
> >> Perhaps less extremely, to contain the number of resources a sysadmin
> >> allocates to a specific "multi CPU" tunnel.
> >
> > I still fail to understand how this protection mechanism works.
> > One side suggests 10, the other suggests 1 million, according to your draft
> > the negotiated value is the largest one, i.e. 1 million. How the peer with
> > limited capabilities can protect itself from installing 1 million SAs?
> 
> By not doing more than its local policy says to do.

Then you don't need CPU_QUEUES, because both sides only
follow their local policy. They can just ignore the value the other side sent in this notify.
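To illustrate the point, here is a hypothetical sketch (function and parameter names are my own, not from the draft): if each peer always clamps the number of per-CPU Child SAs to its own local policy, then the value the other side announced in CPU_QUEUES never changes the outcome.

```python
# Hypothetical model of the argument above: a peer that enforces its
# local policy ends up at the same number of SAs regardless of what the
# other side announced in CPU_QUEUES.

def sas_to_install(local_policy_max: int, peer_cpu_queues: int) -> int:
    """Number of per-CPU Child SAs a peer will actually install."""
    # Per the draft, the negotiated value is the larger of the two...
    negotiated = max(local_policy_max, peer_cpu_queues)
    # ...but a constrained peer still enforces its own local policy.
    return min(negotiated, local_policy_max)

# The peer's announcement is irrelevant: the result equals local policy.
assert sas_to_install(10, 1_000_000) == 10
assert sas_to_install(10, 2) == 10
```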

> > Fail negotiation of IKE SA? Thus it is a very bad protection mechanism,
> > because it completely prevents these peers from communicating.
> > I would have understood if the smallest value is selected,
> > but this is not what the draft says.
> 
> Earlier drafts agreed on the smaller value, but that causes problems
> both in terms of acquires and for rekeys and simultaneous new child SA
> establishment that happens at once in flight.
> 
> Note that before you get 1M child SA's, you need to have authenticated,
> so the abuse is easily tracked to an entity and can simply be prevented
> by returning TS_MAX_QUEUE at the local policy max (e.g. 40 instead of 1M)

Exactly. And this makes the information from CPU_QUEUES useless and the 
notify itself redundant. This is my point.

> > I think that the real protection mechanism in your draft against installing too many
> > SAs is the TS_MAX_QUEUE notify and it makes CPU_QUEUES really redundant and useless.
> 
> I feel we keep repeating our arguments. Without CPU_QUEUES both peers
> keep guessing at what the best solution is.

Actually, each peer has its own opinion of what "the best" is.
Generally, for a peer "the best solution" would be to have
exactly one per-CPU SA for each CPU it has. If the peers have an equal
number of CPUs, then both will be happy; if unequal, one of the
peers will have to install more SAs than it has CPUs.
How many more depends on the number of CPUs the other side has
and on the local policy of this peer.

So, I see no difference whether CPU_QUEUES is used or not:
in both cases the peers will try to install as many SAs as it takes
to keep all CPUs of the peer with more CPUs busy. And in both cases
the resulting number of installed SAs will be limited by the local policies.
So, there is no extra guessing compared to your draft.
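The convergence described above can be sketched as follows (a simplified hypothetical model, not code from any implementation or the draft):

```python
# Sketch of the argument above: with or without CPU_QUEUES, the peers
# converge on enough SAs to keep the larger peer's CPUs busy, capped by
# the local policy on each side.

def stable_sa_count(cpus_a: int, cpus_b: int,
                    policy_a: int, policy_b: int) -> int:
    wanted = max(cpus_a, cpus_b)             # keep every CPU of the bigger peer busy
    return min(wanted, policy_a, policy_b)   # local policies still cap the result

# 4-CPU and 10-CPU peers, neither policy-constrained: 10 SAs.
assert stable_sa_count(4, 10, 64, 64) == 10
# Same peers, but one side's policy (e.g. an HSM) allows only 2: 2 SAs.
assert stable_sa_count(4, 10, 64, 2) == 2
```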

> > This doesn't work this way. The presence of HSM in the system usually
> > means that there are some reasons (e.g. certification requirements) to use it for
> > all traffic and not for part of it. So, the system with HSM is usually
> > unable to install more SAs than the HSM can handle.
> 
> If you have an HSM, either on cpu or on nic, there will be constraints.
> If the number of ESP keys is limited, you have to limit the number of
> child SA's if you are forced for compliance to go through the HSM.
> Nothing that this draft can change on that.

My point with HSMs was that peers cannot indicate the maximum number
of SAs they can handle, so in many cases using TS_MAX_QUEUE is inevitable;
thus I see no reason to use CPU_QUEUES, since it provides no useful information.

> > So, we disagree here :-) I think this information is really hard to use
> > (you admitted that it is unreliable) and the real mechanism to
> > limit the number of SAs is the TS_MAX_QUEUE notify.
> 
> That's the hard fail you should never reach. Without CPU_QUEUES, when my
> machine has 20 CPUs and the peer has 2, and I want to start all my child
> SAs at once (without using ACQUIREs) I would end up setting up the first
> child, then start sending out 19 more CREATE_CHILD_SA's (if window size > 1)
> and they will all come back failing. If using PFS, that wouldn't be
> great.

Paul, you contradict yourself :-) A few paragraphs above you confirmed
that this can happen with your draft as well (e.g. if the peer saying it has
2 CPUs actually has 2 HSMs and cannot install more than 2 SAs). I see absolutely 
no difference here whether you use CPU_QUEUES or not.

> >> The common case will be both peers present what they want and (within
> >> reason) will establish. No failed CREATE_CHILD_SAs happen unless there
> >> was an unexpected change on one of the peers. Where in your proposal,
> >> there will be a failed CREATE_CHILD_SAs as part of the normal process.
> >
> > This is not true. In my proposal a failed CREATE_CHILD_SA will happen only in
> > the situation when one side requests more SAs than the other can handle.
> 
> What happens when the other peer does not support this at all? Or do you
> still propose some notify to indicate support? One without a number ?

RFC 7296 requires this to be supported. I see no need for an additional notification.
Note that having multiple identical SAs installed is needed not only for distributing
them over CPUs, but also, for example, for handling traffic with different QoS 
marks (diffserv) - it just doesn't work if you put this traffic into a single SA.

> With no CPU_QUEUES at all, we have the same situation as we have now.
> There is no indication the peer won't delete our first child SA when it
> received an identical second child SA. If you send the notify but

True. But this behavior is broken. The IETF generally should not
take broken implementations into account while developing standards (IMHO).

> without number indication, this issue goes away, but why not indicate
> (for free basically) what the ideal number is?
>
> > So, if one peer has 4 CPUs and the other has 10, and the first peer has no problems
> > installing 10 SAs, then they will end up with 10 SAs without any failed CREATE_CHILD_SAs:
> > they will just request creating new SAs until both have all their CPU in work.
> 
> Yes.
> 
> Note both sides will need to specify 12 (for window size 1) to keep
> 10 working ones, unless you are going to not count rekeying ones in your
> maximum.  Eg what if peer A starts a rekey for CPU 3, and peer B starts
> a rekey for CPU 1 at the same time. You end up briefly with 12 child SAs.

I didn't count transient SAs. There may be up to 30 SAs temporarily
(if both peers simultaneously start rekeying all 10). But in the
steady state there will be 10.
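The arithmetic behind the "up to 30" figure can be made explicit (a worked example under the assumption that old SAs are not deleted until both colliding rekeys complete):

```python
# Worked example for the transient-SA peak mentioned above: with N
# stable per-CPU Child SAs, a simultaneous rekey of all of them by both
# peers briefly yields N old + N new (from peer A) + N new (from peer B)
# before the old SAs are deleted and the collisions are resolved.

def peak_transient_sas(stable: int) -> int:
    old        = stable   # SAs being rekeyed, not yet deleted
    new_from_a = stable   # replacements initiated by peer A
    new_from_b = stable   # colliding replacements initiated by peer B
    return old + new_from_a + new_from_b

assert peak_transient_sas(10) == 30   # up to 30 SAs temporarily
```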

> > Failed CREATE_CHILD_SAs only happen if one of the peers cannot handle
> > more SAs than it has already installed.
> 
> installed or committed to in flight, yes.
> 
> >> Note that you don't have to bring up CPUs on-demand via ACQUIREs. You
> >> can also fire off all of them at once (after the initial child sa is
> >> established). With CPU_QUEUES, you know whether to send 5 or 10 of
> >> these. With your method, you have to bring them up one by one to see
> >> if you can bring up more or not.
> >
> > Not exactly. In your proposal when peers exchanged CPU_QUEUES notifies,
> > they select the maximum of two values as the expected number of SAs.
> > With this logic the peer that proposed the smaller value (because
> > it cannot handle more) would still reject the excessive requests.
> 
> It is not about cannot handle. The 4 CPU node can (should!) still handle
> 8 if it wants to optimize talking to 8 CPU peers. I don't think in
> general, there will be cases for "cannot handle". 

See the HSM example above.

> Child SA's are not
> that expensive in kernel memory. And if you want to talk to 8 CPUS
> nodes, you better support 8 (or 10 see above) even if you have 4 CPUs
> only. You basically negotiate "how many do we need to cover all our
> CPUs" by picking the maximum value (within reason).

I see no negotiation in your draft. In a negotiation process the peers
should be able to come to a value that satisfies both; in your
draft it is more an announcement than a negotiation.
The peer indicating the smaller value has no means to say
that this is its "hard limit" (apart from using TS_MAX_QUEUE).

> > So, actually with your proposal the peer that proposed the larger value
> > doesn't know for sure if all its requests will be successful.
> 
> It does. Basically, you will do the maximum of the two, based on the
> fact that if you send the notify to allow per-CPU child SAs, you surely
> can afford to install a few dozen of them if needed.

OK, do you want to say that your draft is not appropriate for the HSM use case?
This would be very unfortunate...

> >> conn one
> >>  	[stuff]
> >>  	leftsunet=10.0.1.0/24
> >>  	rightsunet=10.0.2.0/24
> >>
> >> conn two
> >>  	[same stuff so shared IKE SA with conn one]
> >>  	leftsunet=192.0.1.0/24
> >>  	rightsunet=192.0.2.0/24
> >>
> >> If you put up 10 copies of conn one, and then start doing the first
> >> (so fallback sa) of conn two and get TS_UNACCEPTABLE, what error
> >> condition did you hit? Does the peer only accept 10 connections per
> >> IKE peer, or did conn two have subnet parameters that didn't match the
> >> peer ?
> >
> > The latter. Otherwise NO_ADDITIONAL_SAS would be returned.
> 
> It would be good to know if the error is "no more of this one child"
> versus "no more child sa's at all". I don't think returning the same
> error for all these failure cases is good. the peer will not know the
> reason of the failure and might end up retrying things it should not,
> or not retrying things it should (eg a conn three)

If no more SAs at all (for this IKE SA), then NO_ADDITIONAL_SAS.
If TS is unacceptable and this is the first SA with this TS, then TS_UNACCEPTABLE.
If there are SAs with this TS, but no more such SAs can be installed, then TS_UNACCEPTABLE.
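Those three rules can be written out as a small decision function (an illustrative sketch; the booleans and structure are mine, only the notify names come from RFC 7296):

```python
# Sketch of the responder's error selection described above.

def reject_reason(ike_sa_full: bool, ts_acceptable: bool,
                  per_cpu_limit_reached: bool) -> str:
    if ike_sa_full:
        # No more Child SAs of any kind under this IKE SA.
        return "NO_ADDITIONAL_SAS"
    if not ts_acceptable:
        # First SA with this TS, and the TS itself does not match policy.
        return "TS_UNACCEPTABLE"
    if per_cpu_limit_reached:
        # SAs with this TS exist, but no more duplicates are allowed.
        return "TS_UNACCEPTABLE"
    return "OK"

assert reject_reason(True, True, False) == "NO_ADDITIONAL_SAS"
assert reject_reason(False, False, False) == "TS_UNACCEPTABLE"
assert reject_reason(False, True, True) == "TS_UNACCEPTABLE"
```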

> > But if the peer starts creating 11th copy of the first connection
> > and get back TS_UNACCEPTABLE, this would mean that no more
> > per-CPU SAs are allowed.
> 
> Is this a temporary or permanent error? Should the peer retry after a
> while? Without knowing the desired numbers, it won't know?

The same question I asked you earlier about TS_MAX_QUEUE.
You answered, that this is implementation dependent. Same here.
For example, you may retry in, say, an hour.

> >> There is no way for you to know in advance if the peer will send you a
> >> delete for the older child SA when establishing the new child SA, thus
> >> defeating your purpose. I know RFC 7296 says you can have multiple
> >> identical child SAs but in practise a bunch of software just assumes
> >> these are the same client reconnecting and the previous chuld sa state
> >> was lost. This proposed mechanism therefor wants to explicitely state
> >> "we are going to do multiple identical SAs, can you support me".
> >
> > That behavior is really broken, but in the worst case
> > the peers will end up with a single Child SA, as well as
> > with your proposal. I assume not all implementations
> > out there are broken :-)
> 
> So now the peer has to keep track of whether its duplicate SA's lead to
> removal of its initial SA. What if the remote peer supports this, but
> it idled out the initial one and send a delete.

In the latter case the SA will eventually be restored.
The logic for detecting this broken behavior could be simple - 
just count the number of identical SAs. If your preferred
value is >1 and most of the time only 1 such SA exists,
then this may be an indication that the peer is broken.
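A minimal sketch of that heuristic (the sampling approach and the 90% threshold are made up for illustration):

```python
# Detection heuristic described above: periodically sample how many
# identical Child SAs actually coexist; if we prefer more than one but
# almost always observe exactly one, the peer is probably deleting
# duplicates (i.e. exhibits the broken behavior).

def peer_looks_broken(preferred: int, observed_counts: list[int],
                      threshold: float = 0.9) -> bool:
    if preferred <= 1 or not observed_counts:
        return False
    singles = sum(1 for n in observed_counts if n == 1)
    return singles / len(observed_counts) >= threshold

# We wanted 4 identical SAs, but nearly every sample saw just one.
assert peer_looks_broken(4, [1, 1, 1, 1, 1, 2, 1, 1, 1, 1]) is True
assert peer_looks_broken(4, [4, 4, 3, 4]) is False
```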

> The draft started with a concept of "we must clearly signal this
> support because of how people implemented 7296".

Oh, should we define a notification GENUINE_RFC7296_IMPLEMENTATION? :-)

> >> But you simply don't know if the duplicate SA is going to lead to a
> >> deletion of the older SA.
> >
> > Ok, but once I see this broken behavior I can stop creating more SAs.
> 
> See above. It might be a bit hairy to properly detect this.

A little bit.

> > Do you know for sure that they are not going to get "fixed"?
> > If so, then probably there is a way to tell their authors that
> > this behavior is broken, and probably they can change their mind?
> 
> uniqueids=yes started in freeswan a long long time ago before the Second
> Age of Openswan and strongSwan, and the Third Age of libreswan :-)

Doesn't this archaeological finding belong to prehistoric times? :-)
I'm not sure RFC 7296 existed in those dark ages :-)

> For example apple clients pretty aggressively reconnect on network
> failures (eg wifi switching, person entering elevator) and if you don't
> start deleting duplicates you end up with a LOT of unused SAs. There
> are also VPN services that are pay per user and want to avoid the same
> user connecting multiple times. An explicit signal would be helpful.

I'm a bit confused here. In the case of Wi-Fi switching the IP
changes and the whole IKE SA should be re-established
(unless you use MOBIKE). In this case all Child SAs are deleted,
so what are we talking about? The same with a user
connecting multiple times - I assume it starts connecting
by creating a new IKE SA, so it's not concerned with multiple
Child SAs either.

> We could rephrase the payload so it can be completely optional if you
> would be okay with that so we could both implement to our own desire?

The question is - will you install per-CPU SAs without this notify?
So, if my implementation tries to do it without explicit negotiation,
will you ignore such requests, install additional SAs on the same CPU,
or create per-CPU SAs?

> And then we would need to write out a bit of your explanations in the
> draft for clarification?

No problem (actually I'm interested in working on this draft).

Regards,
Valery.

> Paul