Re: [Ila] [DMM] Questions about SRv6 mobile user-plane

Tom Herbert <tom@quantonium.net> Mon, 29 January 2018 17:35 UTC

In-Reply-To: <D69477C8.106CC%sgundave@cisco.com>
References: <CAPDqMerEUMEpKWSu3nC+rxcNpOj_LckvQwPga9bzkDdAYpSwwQ@mail.gmail.com> <D69114C4.2A206E%sgundave@cisco.com> <CAPDqMepFiUPBNbidHokJYPMovGYRaxbtqHbuo-d4qXrjsh=jXw@mail.gmail.com> <D6947782.2A29C4%sgundave@cisco.com> <D69477C8.106CC%sgundave@cisco.com>
From: Tom Herbert <tom@quantonium.net>
Date: Mon, 29 Jan 2018 09:35:06 -0800
Message-ID: <CAPDqMeoTStA8cHD5=2V1_bu-Ovp2Whsi4JBrkqfZ2X2VKAGvmA@mail.gmail.com>
To: "Sri Gundavelli (sgundave)" <sgundave@cisco.com>
Cc: "dmm@ietf.org" <dmm@ietf.org>, ila@ietf.org
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Archived-At: <https://mailarchive.ietf.org/arch/msg/ila/eWbBsVha9g45on8yIWhQfa202Lw>
Subject: Re: [Ila] [DMM] Questions about SRv6 mobile user-plane

Hi Sri,

My comments are inline.

> https://tools.ietf.org/html/draft-herbert-ila-motivation-00 provides some comparisons between ILA and ILNP, encapsulations, SR, and transport layer mechanisms that can achieve some effects in mobility.
>
> The choice of mapping system is critical. The mapping of identifier, or equivalently virtual to physical address mapping, seems to be a common problem in mobility and networking virtualization. As you mentioned, LISP defines a query method to populate a mapping cache. I assume this problem needs to be tackled in SR for mobile user-plane but I'm not sure what solution is preferred after reading the draft.
>
> [Sri] There are multiple approaches to managing this mapping state. Obviously, ILA is one approach, but there are a few other approaches as well that we need to review.

It's a good discussion to have.

> ILA partitions the problem into a two-level hierarchy: ILA routers and ILA forwarding nodes. This is somewhat analogous to core IP routers and nodes running neighbor discovery. ILA routers contain all the (possibly sharded) mappings. They are authoritative. Forwarding nodes are located close to user devices and maintain a working-set cache of entries driven by user activity. If a packet doesn't hit the cache, it's forwarded to a router that will do the ILA transformation. If the cache is hit, the packet can be transformed at the forwarding node to eliminate triangular routing. Caches can be populated by pull or push models. ILAMP (the ILA mapping protocol) supports both of these, but my current preference for scalability and mitigating DoS attacks on the cache is to use secure redirects sent by ILA routers (analogous to ICMP redirects).
>
>
> [Sri] When I last reviewed the ILA I-D, I do not remember reading about the cache state, ILAMP, or about how the mapping gets to the ILA routers. Looks like the spec is evolving as we speak. With an ILAMP-type control protocol for cache management, I see more similarities to LISP.
>
>
We separate data plane from the control plane. The ILA draft describes
the data plane, other drafts (ILAMP, BGP/ILA, ILA in the datacenter)
describe control plane aspects. We'll post a draft shortly with
details specific to the mobile user plane. There are similarities to
LISP, but also differences.

>
>
>
>> On a different note, just curious if SID prefix can ever have topological relevance and can be used for routing. In other words, can you ever route a packet without translating  the SIR prefix of the destination address with the locator? Can SID prefix be used as a locator in some special cases?
>
>
> Yes, the SIR prefix is routable to forward to an ILA router. This is necessary for the redirect mechanism I describe above. I suppose this could be contorted to make the SIR address be a home address like in MobileIP, with locators as COAs (if my use of MobileIP terminology is correct). There also might be nodes in the network, as well as external nodes, that don't go through a cache, so their packets need to hit an ILA router to get forwarded to the location of mobile nodes. An upshot of that is that edge routers might need to perform transformations (SIR to ILA) at high rates, so the mechanism needs to be very efficient and amenable to HW implementation.
>
>
> [Sri] This is precisely what I was thinking.
>
> I get that the SIR prefix takes the packet towards the ILA domain and some ILA router in the path can apply the mapping. I was thinking there may not be a good reason to have more than one or two SIR prefixes for each ILA domain. As long as the SIR prefix can take the packet from a non-ILA domain (Internet) to the ILA domain, the edge router can apply the mapping. But that also implies the edge routers will have to hold too much mapping state. Now, if we have many SIR prefixes and associate a SIR prefix with each PGW/UPF, that state can be distributed and the edge routers kept stateless, but it also brings anchoring back into the picture. In the simplest mode, as you say, the HNP (home network prefix) can be a SIR prefix and the PGW/SGW (or LMA/MAG) can do the SIR to ILA translation without the need for tunneling.
>
> So, in your mind how many SIR prefixes will be used in a typical T1 operator domain?

One SIR prefix is the simplest way. This allows 64-bit identifier
lookups instead of 128-bit. Also, there's no ambiguity in ILA to SIR
address translation since all locators map back to the same SIR
prefix. However, there's nothing in the architecture that prevents
multiple SIR prefixes as long as the mapping from ILA to SIR address
is unambiguous. Non-local address identifiers do this.
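
As a rough illustration of the single-prefix case, here is a minimal
sketch of the two translations. The prefix, the locator, and the
mapping table are made-up example values, and the function names are
hypothetical. The sketch also ignores identifier types and
checksum-neutral mapping, so it shows only why one SIR prefix reduces
the lookup key to the low 64 bits and makes the reverse direction
lookup-free.

```python
import ipaddress

MASK64 = (1 << 64) - 1

# Hypothetical values, chosen only for illustration.
SIR_PREFIX = ipaddress.IPv6Network("2001:db8:0:1::/64")
# identifier (low 64 bits) -> locator (a /64 that fills the high 64 bits)
MAPPINGS = {
    0x1: ipaddress.IPv6Network("2001:db8:aaaa:1::/64"),
}


def sir_to_ila(addr, mappings=MAPPINGS):
    """Swap the SIR prefix for the locator mapped to this identifier.

    Returns None if the address is not in the SIR prefix or the
    identifier has no mapping.
    """
    a = ipaddress.IPv6Address(addr)
    if a not in SIR_PREFIX:
        return None
    identifier = int(a) & MASK64          # 64-bit lookup key
    locator = mappings.get(identifier)
    if locator is None:
        return None
    return ipaddress.IPv6Address(int(locator.network_address) | identifier)


def ila_to_sir(addr):
    """With a single SIR prefix the reverse direction needs no lookup."""
    identifier = int(ipaddress.IPv6Address(addr)) & MASK64
    return ipaddress.IPv6Address(int(SIR_PREFIX.network_address) | identifier)
```

With more than one SIR prefix, ila_to_sir would need some way to pick
the right prefix, which is the unambiguity requirement above.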

> Also, how can we quantify the state that ILA introduces in different parts of the network?

Please look at the topology in section 2 of
https://tools.ietf.org/html/draft-herbert-ila-ilamp-00. ILA routers
collectively contain all the mappings for the domain. The mappings
can be sharded, and the routers serving a shard can be replicated.
There are two cases where ILA transformation is needed: at network
ingress (e.g. from the Internet) and for intra-domain traffic. The
first case is served by edge routers, which as I mentioned would have
considerable load. For intra-domain communications routers would also
be used, but they can be augmented by mapping caches in forwarding
nodes. Forwarding nodes perform the ILA to SIR transformation before
delivery; this does not need a lot of state.
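
The intra-domain case can be sketched as a cache lookup with a
fallback to a router. This is only an illustration under stated
assumptions: the function name and return values are hypothetical,
plain dicts stand in for the forwarding node's cache and the router's
authoritative table, and a direct cache write stands in for the
redirect a router would send.

```python
def forward(identifier, cache, router_mappings):
    """Return (node that performs the transformation, locator).

    cache           -- forwarding node's working-set cache (dict)
    router_mappings -- authoritative mappings held by an ILA router (dict)
    """
    locator = cache.get(identifier)
    if locator is not None:
        # Cache hit: transform at the forwarding node, no detour.
        return ("forwarding-node", locator)

    # Cache miss: the packet goes to a router, which transforms it.
    locator = router_mappings.get(identifier)
    if locator is None:
        return ("router", None)  # no mapping known for this identifier

    # Assumption: the router's redirect populates the cache for next time.
    cache[identifier] = locator
    return ("router", locator)
```

The first packet to a destination takes the path through the router;
later packets are transformed at the forwarding node.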

Presumably every mobile node in the network has an identifier to
locator mapping, so the number of mappings in the domain equals the
number of mobile nodes. This number is expected to reach into the
billions. At that scale a single device won't have the memory for the
full mapping table, so it would be sharded. Each shard would also be
replicated N ways, so the number of routers needed is num_shards *
num_replicas.
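
To make the arithmetic concrete, here is a small sketch. The
per-router capacity and the replica count in the example are
assumptions picked for illustration, not measured or proposed values,
and the function name is hypothetical.

```python
def routers_needed(num_mappings, mappings_per_router, num_replicas):
    """Return (num_shards, total routers) for a sharded, replicated table."""
    # Ceiling division: a partly filled shard still needs its own routers.
    num_shards = -(-num_mappings // mappings_per_router)
    return num_shards, num_shards * num_replicas


# Assumed inputs: 1 billion mobile nodes, 50 million mappings per
# router, each shard replicated 3 ways.
shards, routers = routers_needed(1_000_000_000, 50_000_000, 3)
print(shards, routers)  # 20 shards, 60 routers
```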

Another major consideration for scaling is changes to the mapping
system. This is a little harder to quantify since the load on the
system depends on the rate at which mobile nodes are moving. I'm not
sure what the rate of device movement in a cellular network is
(someone might have good insights on that), but for scaling estimates
I'm using 1% (that is, at any given time 1% of devices are moving to a
different eNB).
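
One way to turn the 1% figure into an update load is sketched below.
It needs an extra assumption that is not stated above: the time window
over which that 1% of devices completes a move (interval_seconds). The
function name, the window, and the shard count are all hypothetical.

```python
def mapping_update_rate(num_nodes, moving_fraction, interval_seconds,
                        num_shards):
    """Return (domain-wide updates/sec, updates/sec per shard).

    Assumes moving_fraction of the nodes each change location once per
    interval_seconds, and that moves spread evenly across shards.
    """
    moves_per_interval = num_nodes * moving_fraction
    total_per_second = moves_per_interval / interval_seconds
    return total_per_second, total_per_second / num_shards


# Assumed inputs: 1 billion nodes, 1% moving, a 60 second window,
# 20 shards.
total, per_shard = mapping_update_rate(1_000_000_000, 0.01, 60, 20)
print(round(total), round(per_shard))
```

Each replica of a shard would have to absorb the per-shard rate, so
replication adds read capacity but does not reduce the update load.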

Tom