Re: [IPsec] Additional charter items 1/4: Responder MOBIKE

Tero Kivinen <kivinen@iki.fi> Wed, 07 March 2018 21:20 UTC

From: Tero Kivinen <kivinen@iki.fi>
To: Valery Smyslov <smyslov.ietf@gmail.com>
Cc: ipsec@ietf.org
In-Reply-To: <043901d3ab29$8c980c20$a5c82460$@gmail.com>
References: <23175.7252.256625.885691@fireball.acr.fi> <02c501d3a95e$a5d73200$f1859600$@gmail.com> <23179.8656.330909.562547@fireball.acr.fi> <038e01d3aa5e$ec66bc80$c5343580$@gmail.com> <23180.40426.204224.108279@fireball.acr.fi> <043901d3ab29$8c980c20$a5c82460$@gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/ipsec/YW9LyIuOOatlTTjBT4enx3fYM44>

Valery Smyslov writes:
> > There is no timers in the RFC specified for any of these operations,
> > all of them are implementation details. This is something that will
> > NOT affect interoperability, but will affect how well your
> > implementation works. If this is important matter for your
> > implementation you should research and study the problem and tune the
> > parameters suitable for your normal use case.
> 
> We are talking about different things. You are trying to convince 
> me that the cluster functionality _can_ be achieved with the current
> MOBIKE. I don't disagree, but I'm pointing out, that in this case
> it is relied on completely _optional_ features specified in RFC
> and there is no indication whether and how peer supports
> these features. These features don't affect MOBIKE interoperability, 
> but they do affect whether unmodified MOBIKE can be used
> for cluster scenario.

You are using lots of optional features too, things like NAT traversal
and Configuration payloads. Implementations implement the features
they think are useful, and if you think the features that allow MOBIKE
movement are useful, then you will implement them.

> Again, I'm not insisting that my proposal is the only and the best
> solution. Probably we can use the approach you suggested,
> but in this case some extension to the MOBIKE should be added
> that will make these features non optional and will add 
> a negotiation (or announcement) mechanism, so that 
> implementations can rely on peer's behavior.

Actually your cluster will see keepalive packets for both IP
addresses, and can use this as an indication of whether the client
keeps both addresses active through NAT. Your cluster can also test
it by sending an empty INFORMATIONAL exchange to the 2nd address the
first few times, and then falling back to the 1st address if there is
no reply (which would indicate that the 2nd IP address does not work).

So those features can be detected if needed.
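
A rough sketch of how a cluster might combine those two signals is
below. It is only an illustration: the class, the function and the
60-second keepalive age are hypothetical and do not come from RFC 4555.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SecondAddressObservations:
    """Hypothetical per-client bookkeeping kept by the cluster."""
    # Time (seconds) at which a keepalive last arrived on the 2nd address.
    last_keepalive_at: Optional[float] = None
    # One entry per empty INFORMATIONAL probe via the 2nd address;
    # True means the client answered it.
    probe_results: List[bool] = field(default_factory=list)


def second_address_usable(obs, now, keepalive_max_age=60.0):
    """Guess whether the client keeps the 2nd address reachable.

    keepalive_max_age is an assumed tuning knob, not a protocol value.
    """
    # Signal 1: a keepalive was seen on the 2nd address recently.
    if obs.last_keepalive_at is not None:
        if now - obs.last_keepalive_at <= keepalive_max_age:
            return True
    # Signal 2: at least one empty INFORMATIONAL probe was answered.
    return any(obs.probe_results)
```

If neither signal is present, the cluster would keep using the 1st
address for that client.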

> > It should detect it in few tens of seconds, and probes should find the
> > working path in tens of seconds more, or so. I.e., this should happen
> > in well under a minute.
> 
> That was my conclusion too. And I don't think that about a minute is
> a "short delay".

I have much longer delays in my network connections several times a
day. Especially when using mobile networks, or roaming between mobile
networks and wifi.

Yes, if you have streaming video or audio ongoing that will be
annoying, but even those usually buffer data for a minute or so.

For conference calls or similar, a minute is a really noticeable break.

> > If you care about real-time traffic, then you should keep your NAT
> > mappings up for all peers, so you will get the address update message,
> > and can move to new address immediately when you receive it.
> 
> It won't help much. The MOBIKE client doesn't know that it must 
> switch to a just received additional address once he receives it. The 
> event that usually triggers this movement is an availability of current
> path. So you still will have a the delay while the client detects that
> the current path is not working and the new path works.
> You'll still have about 10-20 seconds delay at best.

No. The switch will be triggered immediately when the cluster / server
sends a MOBIKE update using the 2nd IP address and does NOT include the
currently used IP address in the address list. If that message goes
through, the client will start the switch immediately, and probing that
it works is just one round trip, so the delay should be less than a
second.

I.e.


    Initiator                              Responder
    ---------                              ---------
                 <-- HDR(IPr2, IPi), SK { N(NO_ADDITIONAL_ADDRESSES) }
    HDR(IPi, IPr2), SK { } -->

This will immediately trigger the initiator to switch to IPr2, as IPr1
is no longer available. Note that IPsec traffic can already be switched
to the new address at this point.

    HDR(IPi, IPr2), SK { N(UPDATE_SA_ADDRESSES),
                         [N(NAT_DETECTION_SOURCE_IP),
                          N(NAT_DETECTION_DESTINATION_IP)],
                         [N(COOKIE2)] } -->

                 <-- HDR(IPr2, IPi), SK { [N(NAT_DETECTION_SOURCE_IP),
                                           N(NAT_DETECTION_DESTINATION_IP)],
                                          [N(COOKIE2)] }


and after that the move is finished. So if the client is keeping NAT
mappings alive, the address change can be done without any lost
packets. And the fact that it keeps mappings alive can either be
tested, or you can use the fact that you get keepalive packets to the
2nd address as an indication of that.

> > > Then the IKE SA state must be transferred to a new node and the
> > > current node must stop responding to this client.
> > 
> > You need to move the IKE SA anyways, and you need to move it before
> > you send any update which says there is another IP address to be used.
> > And both ends MUST be able to process IKE and IPsec SA packets at the
> > same time if we want to make sure no packets are lost during the
> > transition (if you really care about real-time traffic).
> 
> No, in your case the first node must stop responding to the client,
> so that client understands that it should switch to the other address.
> With my proposal the client is explicitly asked to do that,
> so the whole procedure should take less time and be more reliable.
> so that it is easier to make the event atomic from cluster point of view.

Not true. Section 3.5 of RFC4555 says:

   Changing addresses can also be triggered by events within IKEv2. At
   least the following events can cause the initiator to re-evaluate
   its local address selection policy, possibly leading to changing
   the addresses.
...
   o  An INFORMATIONAL request containing an ADDITIONAL_IP4_ADDRESS,
      ADDITIONAL_IP6_ADDRESS, or NO_ADDITIONAL_ADDRESSES notification
      is received. This means the peer's addresses may have changed.
      This is particularly important if the announced set of addresses
      no longer contains the currently used address.

> > > Then the cluster would wait until the client detects the failure and
> > > switches itself to a new node. And there is also a chance that there
> > > are some IKE exchanges in progress, so if the node stops responding
> > > the exchanges could time out the and the IKE SA would be deleted
> > > before the movement takes place (in my proposal the MOBIKE is
> > > combined with RFC6311 exchange to make this working)...
> > 
> > If you are using MOBIKE, then the IKE exchange should not time out
> > before it has tried all possible addresses, thus there is no issue in
> > there.
> 
> Can you please point me where RFC4555 requires that 
> _any_ exchange must try all possible addresses before 
> timing out? Path testing is described in Sections 3.10 and 3.12
> and these sections only tell about using INFORMATIONAL 
> DPD exchanges for this purpose.

I do not think it says that directly, but that is the only way to get
MOBIKE working. It does say that explicitly for the initial IKE exchange:

3.1. Initial IKE Exchange
...
   If either or both of the peers have multiple addresses, some
   combinations may not work. Thus, the initiator SHOULD try various
   source and destination address combinations when retransmitting the
   IKE_SA_INIT request.

For the other exchanges this can be seen in Section 3.5.

I.e. one of the triggers for the address change is:

   o  An IKEv2 request has been re-transmitted several times, but no
      valid reply has been received. This suggests the current path is
      no longer working.

Note, that this is done BEFORE the exchange times out.

So in the next step you pick the address you want to try next, go
forward and update the IKE SA with the new addresses, and then:

   o  If there are outstanding IKEv2 requests (requests for which the
      initiator has not yet received a reply), continues
      retransmitting them using the addresses in the IKE_SA (the new
      addresses).

I.e., you start sending them out with the new IP address. If you have
space in your window you might also send an address update, but quite
often implementations only support a window size of 1, so you send
your original packet to the 2nd IP address pair for some time, and if
it still does not work, you go back to the beginning, notice that this
address pair does not work, and pick the next one. You continue doing
that until you time out the whole exchange after several minutes. Note
that you might run out of IP address pairs during the process, so you
might end up going back to the beginning again.
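
As a rough illustration of that loop, the sketch below lists which
address pair each retransmission of an outstanding request would use.
The function name, the fixed number of tries per pair, the simple
round-robin ordering and the cap on total tries are all assumptions
made for the example; RFC4555 leaves these choices to the
implementation.

```python
from itertools import cycle, product


def retransmit_schedule(local_addrs, peer_addrs, tries_per_pair=3,
                        max_total_tries=12):
    """Return the (local, peer) pair used for each retransmission.

    The request is retried on one pair a few times, then on the next
    pair, wrapping around to the first pair when the pairs run out,
    until the overall budget (standing in for the exchange timeout)
    is used up.
    """
    pairs = list(product(local_addrs, peer_addrs))
    if not pairs:
        return []
    schedule = []
    for pair in cycle(pairs):
        for _ in range(tries_per_pair):
            if len(schedule) >= max_total_tries:
                return schedule
            schedule.append(pair)
    return schedule  # not reached: cycle() never ends for non-empty pairs
```

For example, with one local address, two peer addresses, two tries per
pair and a budget of five tries, the request goes twice to the first
pair, twice to the second, and then once more to the first.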

Once you do get a reply to your IKEv2 request, you have space in your
window again, and you send the UPDATE_SA_ADDRESSES packet to the
other end...

> Broken implementations is not an issue here (although it is always big issue :-().
> The issue is that RFC4555 in its current form is too vague to be used
> for cluster use case. This use case requires some additions to RFC4555
> or some clarifications in any case (if the cluster use case is solved
> using MOBIKE, that is not the only possible way).

I am not sure about that. Note, that we did work quite long to get the
rules in section 3.5 correct, and there are things there that will
make things work correctly if you follow the rules, even if not
everything is explained there (i.e., it does not explain why you need
to do things exactly like it says, it just assumes you do).

RFC4621 explains the design rationale behind MOBIKE, and it explains
why we did some things in RFC4555. For example, Section 6.2 of RFC4621
explains that we need to use any existing IKE exchange as a path
testing message and explains why we did it.

> OK. How about the following?
> 
> MOBIKE protocol [RFC4555] is used to move existing IKE/IPsec SA from
> one IP address to another. However, in MOBIKE it is the initiator of
> the IKE SA (i.e. remote access client) that controls this process. 
> While the responder can try to instruct the initiator to switch to a different
> IP address, the whole process is not reliable enough, especially 
> in presence of NAT or firewalls. If there are several responders each having 
> own IP address and acting together as a load sharing cluster, then it is desirable 
> for them to have ability to request initiator to switch to a particular member.
> The working group will analyze the possibility to extend MOBIKE
> protocol or to develop new IKE extension that will allow to build load
> sharing clusters in an interoperable way.
> 
> Is it better?

Much better, but I still think that this can be done without
modification to MOBIKE itself, and you can already implement it at the
cluster end without any need to modify the client end. It will work
better if the client sends keepalives for all IP addresses instead of
just the one in use, or if the client every now and then probes paths
other than the one in use.
-- 
kivinen@iki.fi