Re: [IPsec] Additional charter items 1/4: Responder MOBIKE

Tero Kivinen <kivinen@iki.fi> Mon, 12 March 2018 15:10 UTC

Message-ID: <23206.39029.998845.849191@fireball.acr.fi>
Date: Mon, 12 Mar 2018 17:10:45 +0200
From: Tero Kivinen <kivinen@iki.fi>
To: Valery Smyslov <smyslov.ietf@gmail.com>
Cc: ipsec@ietf.org
In-Reply-To: <0a2601d3b9db$58583af0$0908b0d0$@gmail.com>
References: <23175.7252.256625.885691@fireball.acr.fi> <02c501d3a95e$a5d73200$f1859600$@gmail.com> <23179.8656.330909.562547@fireball.acr.fi> <038e01d3aa5e$ec66bc80$c5343580$@gmail.com> <23180.40426.204224.108279@fireball.acr.fi> <043901d3ab29$8c980c20$a5c82460$@gmail.com> <23200.22423.300448.949259@fireball.acr.fi> <0a2601d3b9db$58583af0$0908b0d0$@gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/ipsec/M0v_79kSVZvkEInLRgUoMrwsqJ8>

Valery Smyslov writes:
> > No. The switch will be triggered immediately when the cluster / server
> > sends MOBIKE update using the 2nd ip address and does NOT include the
> > currently used IP address in the address list. If that message goes
> > through the client will start switch immediately, and probing the that
> > it works is just one round trip, so the delay should be less than
> > second.
> 
> I believe this interpretation is wrong. The IP address from the IP
> is always implicitly included into the list of host's IP addresses (Section 3.6):
> 
>    If the exchange initiator has only a single IP address, it is placed
>    in the IP header, and the message contains the
>    NO_ADDITIONAL_ADDRESSES notification.  If the exchange initiator has
>    several addresses, one of them is placed in the IP header, and the
>    rest in ADDITIONAL_IP4_ADDRESS/ADDITIONAL_IP6_ADDRESS notifications.

Yes, that's why the notify needs to be sent from an address pair NOT in
use. In my example the header said IPr2, not IPr1. IPr1 is the
currently used address; IPr2 was the additional address before.
Sending this notify indicates that IPr1 is no longer in use, and
because it is no longer in use the notify needs to be sent from IPr2 as
the source address.
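
A minimal sketch of the address-list rule discussed here, assuming a
simplified message model. The function name and the (type, value) tuple
format are hypothetical and not taken from RFC4555. The sender's address
set is the IP header source address plus any ADDITIONAL_IP*_ADDRESS
notifies, so a NO_ADDITIONAL_ADDRESSES notify sent from IPr2 leaves IPr2
as the only address:

```python
def peer_address_set(header_src, notifies):
    """Return the sender's address set implied by one MOBIKE message.

    header_src: source address from the IP header (implicitly included).
    notifies:   list of (notify_type, value) tuples; a hypothetical,
                simplified stand-in for parsed notify payloads.
    """
    addresses = {header_src}
    for notify_type, value in notifies:
        if notify_type in ("ADDITIONAL_IP4_ADDRESS", "ADDITIONAL_IP6_ADDRESS"):
            addresses.add(value)
        # NO_ADDITIONAL_ADDRESSES carries no address, so nothing is added.
    return addresses


# Before: responder at IPr1 advertises IPr2 as an additional address.
before = peer_address_set("IPr1", [("ADDITIONAL_IP4_ADDRESS", "IPr2")])
# After: the notify is sent from IPr2 with NO_ADDITIONAL_ADDRESSES.
after = peer_address_set("IPr2", [("NO_ADDITIONAL_ADDRESSES", None)])

current = "IPr1"
must_switch = current not in after  # IPr1 is gone, so the initiator moves
```

Had the same notify been sent from IPr1 instead, IPr1 would stay in the
set and the initiator would have no reason to move.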

> So, according to the RFC4555 the currently used IP address must not
> be included into the ADDITIONAL_IP*_ADDRESS notification. So, the
> client won't switch to a new address immediately, it's first test an
> old path and if it works it'll most probably do nothing.

Not true. 

> >     Initiator                              Responder
> >     ---------                              ---------
> >                  <-- HDR(IPr2, IPi), SK { N(NO_ADDITIONAL_ADDRESSES) }
> >     HDR(IPi, IPr2), SK { } -->
> > 
> > This will immediately trigger the initiator to switch to IPr2, as IPr1
> > is no longer available. Note, that IPsec traffic can be switched to
> > new address at this point already.
> 
> With NO_ADDITIONAL_ADDRESSES it will work, but only in case the mapping 
> for IPr2 exists. That's the issue.

Yes, the mapping for IPr2 needs to work. Note that if the mapping for
IPr2 does not work, this message will get thrown away by the NAT, and
as there is no response to the notify, the responder will at some point
assume that the path it switched to is broken and try to switch to
another IP. As it does not have any other IP, it needs to trigger
some special code to add IPr1 back to the list of allowed
addresses, and then it will retransmit the notify using (IPr1,
IPi) as addresses. This will then reach the initiator, and from its
point of view this was just removing IPr2 from the address list.

On the other hand, after that the responder will know that this method
will not work, as the initiator does not keep mappings up. It can give up
on load balancing for this client and instead move clients
which support faster redirect.

Or, if it thinks load balancing is more important than a minute of
breakage, it can simply keep retransmitting on IPr2 and stop
responding to any packets sent to IPr1. After a while the initiator will
notice this and try to probe on IPr2 too. That will create
the NAT mapping, and then the responder can retransmit its notify to that
port, and it will reach the initiator.
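
The responder's choices described in the last three paragraphs can be
sketched roughly as follows. This is only an illustration: the function,
its parameters, and the returned strings are hypothetical, and a real
implementation would drive this from its retransmission timers.

```python
def responder_next_step(got_reply, retries_left, prefer_load_balancing):
    """Pick the responder's next action after sending the notify from IPr2.

    All three parameters are hypothetical inputs for this sketch:
    got_reply             - the initiator answered the notify on IPr2
    retries_left          - retransmissions remaining before giving up
    prefer_load_balancing - policy: accept breakage to force the move
    """
    if got_reply:
        return "done: initiator now uses IPr2"
    if prefer_load_balancing:
        # Stay silent on IPr1 until the initiator probes IPr2 itself,
        # which creates the NAT mapping the notify needs.
        return "keep retransmitting from IPr2, ignore packets to IPr1"
    if retries_left > 0:
        return "retransmit from IPr2"
    # No NAT mapping for IPr2: fall back and remember this client
    # does not keep mappings up.
    return "re-add IPr1, resend notify using (IPr1, IPi)"


fallback = responder_next_step(False, 0, False)
forced = responder_next_step(False, 0, True)
```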

> > and after that move is finished. So if client is keeping NAT mappings
> > alive address change can be done without any lost packets. And the
> > fact that it keeps mappings alive can either be tested, or you can use
> > the fact that you get keepalive packets to the 2nd address as
> > indication of that.
> 
> You are oversimplifying the problem. To make periodic tests by responder you 
> must have a copy of IKE SA on all the cluster nodes and continuously sync them.
> And the client won't send any NAT keepalives until it gets reply from the cluster, 
> so the IKE SA again needs to be present (and synced!) on all cluster nodes all the time.
> While this is possible, it is a headache and to some extent it makes the 
> whole idea of the load-sharing cluster meaningless.

Load balancing issues are really hard, and I assumed that those have
been taken care of by the cluster, as otherwise there is no point in
any of this. As the client can at any point decide to use whatever address
the cluster gives it, the cluster MUST always be able to process packets
arriving at any of its addresses.

So load balancing issues inside the cluster are outside the scope of
this discussion.

> Another problem is that with MOBIKE the initiator is free to switch to any path
> it wants in any moment. So, to encourage the client to send NAT KA to all cluster's
> addresses you must include all of them into ADDITIONAL_IP*_ADDRESS
> notification, so that the client knows all of them.  You must also reply
> to requests sent to any of these addresses, otherwise the client won't
> start sending NAT KA. But it means that now the client can switch 
> to any of these addresses on its own discretion - that's completely 
> kill the idea of load sharing cluster: it is the cluster that must control 
> when and where to move client.

Yes. If you want to have load balancing, you need to do it properly. 

> > I do not think it says that directly, but that is only way to get
> > MOBIKE working. It does that explictly for the initial IKE exchange:
> > 
> > 3.1. Initial IKE Exchange
> > ...
> >    If either or both of the peers have multiple addresses, some
> >    combinations may not work. Thus, the initiator SHOULD try various
> >    source and destination address combinations when retransmitting the
> >    IKE_SA_INIT request.
> 
> These must be different requests (with different SPIs), at least if the initiator 
> changes a destination IP, since the NAT_DETECTION_DESTINATION_IP would
> be different. So it is not a retransmission, it's a new request.
> Section 2.1 of RFC7296:
> 
>    A retransmission from the initiator MUST be
>    bitwise identical to the original request.  That is, everything
>    starting from the IKE header (the IKE SA initiator's SPI onwards)
>    must be bitwise identical; items before it (such as the IP and UDP
>    headers) do not have to be identical.

Only if the destination address changes. Usually a client using
MOBIKE has only one destination address and multiple source
addresses, so it will include all of its source addresses in the
NAT_DETECTION_SOURCE_IP notifies, and the one destination address in
the NAT_DETECTION_DESTINATION_IP. In that case there is no need to
change the packet after it is sent, and it will be the same request
transmitted from different source IP addresses.

If the client does have multiple destination addresses, then it must assume
each of those is different, and create separate IKE_SA_INIT messages
for each of them.
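
To illustrate why the destination address matters here, the following is
a small sketch of the NAT detection hash as described in RFC7296 Section
2.23 (SHA-1 over the SPIs, the IP address, and the port). The helper
name, the SPI value, and the documentation-range addresses are made up
for the example.

```python
import hashlib
import ipaddress
import struct


def nat_detection_hash(spi_i, spi_r, ip, port):
    """SHA-1 over SPIi | SPIr | IP address | port (RFC 7296, Section 2.23)."""
    packed_ip = ipaddress.ip_address(ip).packed   # 4 or 16 bytes
    packed_port = struct.pack("!H", port)         # network byte order
    return hashlib.sha1(spi_i + spi_r + packed_ip + packed_port).digest()


# Illustrative values only: documentation-range addresses, made-up SPI.
spi_i = bytes.fromhex("0102030405060708")
spi_r = bytes(8)  # responder SPI is zero in the IKE_SA_INIT request
sources = ["192.0.2.10", "198.51.100.10"]
destinations = ["203.0.113.1", "203.0.113.2"]

# One NAT_DETECTION_SOURCE_IP per source address: the same request can
# then be sent unchanged from either source.
source_hashes = [nat_detection_hash(spi_i, spi_r, s, 500) for s in sources]

# The destination hash depends on the destination address, so each
# destination needs its own IKE_SA_INIT request.
dest_hashes = {d: nat_detection_hash(spi_i, spi_r, d, 500) for d in destinations}
```

Because every source hash is carried in the one request, changing the
source address between retransmissions leaves the IKE message bitwise
identical; changing the destination does not.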

> > I.e., you start sending them out with new IP address. If you have
> > space in your window you might also send address update, but quite
> > often implementations only support window size of 1, so you send your
> > original packet for 2nd IP address pair for some time, and if it still
> > does not work, you go back to beginning, and notice this address pair
> > does not work, lets pick next one. And you continue doing that until
> > you time out the whole exchange after several minutes. Note, that you
> > might run out of ip-address pairs during the process so you might end
> > up going back to beginning again.
> 
> There are additional complications not mentioned in the RFC.
> If the exchange is a MOBIKE probe and NAT is supported, then 
> it will include NAT_DETECTION_DESTINATION_IP. When the exchange
> initiator tries  different destination IP addresses it must re-calculate
> this notification, so that NAT is detected properly. This leads to a direct 
> violation of Section 2.1 of RFC7296 (see above).

It will never re-calculate the packet. It will note this fact and
mark this request as failed. It will still keep sending it to
the other end without modification, but after the exchange finishes it
will throw away the result and immediately start over with the correct
destination address and a correctly calculated
NAT_DETECTION_DESTINATION_IP field.

I.e., the following point in RFC4555:

   o  If a new address change occurs while waiting for the response,
      starts again from the first step (and ignores responses to this
      UPDATE_SA_ADDRESSES request).

says that if any change to the addresses happened while it was
sending the UPDATE_SA_ADDRESSES notify, then it will start over
immediately when the reply is received. If, while we were sending
UPDATE_SA_ADDRESSES, we needed to switch to the next destination address,
that means a "new address change" has occurred, thus we go back to the
beginning.
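
A rough sketch of that restart rule, assuming a hypothetical trace of
exchanges rather than real network I/O (the function name and the trace
format are invented for illustration):

```python
def update_sa_addresses(exchanges):
    """Walk through UPDATE_SA_ADDRESSES exchanges until one can be used.

    exchanges: iterable of (dst, changed_while_waiting) tuples, a
               hypothetical trace of what happened during each exchange.
    Returns (dst, discarded): the destination whose response was
    accepted (or None), and how many responses were ignored.
    """
    discarded = 0
    for dst, changed_while_waiting in exchanges:
        if changed_while_waiting:
            # The request is never modified; its response is simply
            # ignored and the procedure starts again from the first step.
            discarded += 1
            continue
        return dst, discarded
    return None, discarded


# The first request was built for IPr1 but had to move to the next
# destination while being retransmitted, so its response is discarded.
# The second request is built from scratch for IPr2.
result = update_sa_addresses([("IPr1", True), ("IPr2", False)])
```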

> > I am not sure about that. Note, that we did work quite long to get the
> > rules in section 3.5 correct, and there are things there that will
> > make things work correctly if you follow the rules, even if not
> > everything is explained there (i.e., it does not explain why you need
> > to do things exactly like it says, it just assumes you do).
> > 
> > RFC4621 explains the design rational behind the MOBIKE, and it
> > explains why we did some things in RFC4555. For example the section
> > 6.2 of the RFC4621 explains that we need to use any existing IKE
> > exchange as path testing message and explains why we did it.
> 
> Sorry, but there are still some unclear places.
> And I don't think in its current form it suites well for building
> load-sharing cluster. You are trying to convince me otherwise, 
> and I agree that it would probably _somehow_ work in _some_ 
> situations, but the quality of such a solution leaves much to be desired, IMHO.

Load balancing was explicitly mentioned as out of scope for MOBIKE.
We did say that the method we are using in MOBIKE should try to
support load balancing, but we never meant it to be a generic solution
for load balancing. There are many more issues in load
balancing that need to be solved than just this.

I.e. RFC4621 says:

   Note that MOBIKE does not aim to support load balancing between
   multiple IP addresses. That is, each peer uses only one of the
   available address pairs at a given point in time.
...
   Load balancing is currently outside the scope of MOBIKE; however,
   future work might include support for it. The selected format needs
   to be flexible enough to include additional information in future
   versions of the protocol (e.g., to enable load balancing). This may
   be realized with an reserved field, which can later be used to
   store additional information. As other information may arise that
   may have to be tied to an address in the future, a reserved field
   seems like a prudent design in any case.

and RFC4555 says:

   MOBIKE allows both parties to be multihomed; however, only one pair
   of addresses is used for an SA at a time. In particular, load
   balancing is beyond the scope of this specification.

For example, MOBIKE always assumes that each peer is a single entity,
i.e., even when there are multiple IP addresses, they all reach the
same entity at the other end, thus either end can process any packet
sent to any of its IP addresses.
-- 
kivinen@iki.fi
