Re: [Ila] [5gangip] Scaling mapping systems (was Re: BOF Description)

"FIGURELLE, TERRY F" <tf2934@att.com> Thu, 31 May 2018 00:38 UTC

From: "FIGURELLE, TERRY F" <tf2934@att.com>
To: Tom Herbert <tom@quantonium.net>, Lorenzo Colitti <lorenzo@google.com>
CC: Tom Herbert <tom@herbertland.com>, "ila@ietf.org" <ila@ietf.org>, Alexandre Petrescu <alexandre.petrescu@gmail.com>, Behcet Sarikaya <sarikaya@ieee.org>, Dino Farinacci <farinacci@gmail.com>, 5GANGIP <5gangip@ietf.org>
Date: Thu, 31 May 2018 00:38:05 +0000
Archived-At: <https://mailarchive.ietf.org/arch/msg/ila/JcTnLT3Z6i1PF-VFAU_qBi7KJ_g>
Subject: Re: [Ila] [5gangip] Scaling mapping systems (was Re: BOF Description)

You have a scaling problem there, since you can handle that traffic example with far fewer than 530 UPFs (e.g. 20+20 is enough from a capacity point of view). To cover all of the layer-1 fiber paths of importance in the continental US you need at most 180 locations, and I doubt anyone but AT&T would go to that extreme on fiber plant connectivity. Adding more locations actually increases latency and is not desirable.

From: 5gangip <5gangip-bounces@ietf.org> On Behalf Of Tom Herbert
Sent: Tuesday, February 06, 2018 12:28 PM
To: Lorenzo Colitti <lorenzo@google.com>
Cc: Tom Herbert <tom@herbertland.com>; ila@ietf.org; Alexandre Petrescu <alexandre.petrescu@gmail.com>; Behcet Sarikaya <sarikaya@ieee.org>; Dino Farinacci <farinacci@gmail.com>; 5GANGIP <5gangip@ietf.org>
Subject: Re: [5gangip] [Ila] Scaling mapping systems (was Re: BOF Description)



On Mon, Feb 5, 2018 at 10:34 PM, Lorenzo Colitti <lorenzo@google.com> wrote:
On Tue, Feb 6, 2018 at 12:37 AM, Tom Herbert <tom@quantonium.net> wrote:
As deployment experience has accumulated, forwarding to another node that has a large shared cache, like an RTR, seems to deal with the issue, albeit as a compromise.

In ILAMP (https://tools.ietf.org/html/draft-herbert-ila-ilamp-00), secure redirects are defined as the primary means of populating a cache. On a cache miss, the packet is forwarded into the network, and routing should deliver it to an ILA router that holds the complete set of mappings for the associated shard. That router performs the transformation and can send a redirect back to the caching node. The redirect can be cached so that subsequent packets take a direct route and avoid the triangular routing. Redirects must be secure so that they cannot be spoofed; for that reason (and some others) the protocol runs over TCP. The mapping cache is only an optimization: packets are never dropped or queued pending cache resolution. If the cache weren't present, communications would still be viable but sub-optimal; that characteristic bounds the worst-case DoS attack.
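
A minimal sketch, in Python, of the caching-node behavior described above; the names (mapping_cache, ila_translate, on_redirect) and the simplified packet/redirect types are illustrative, not from the draft:

    # Sketch of the ILA-N forwarding path: translate on a cache hit,
    # forward unmodified on a miss, and populate the cache only from
    # redirects received over the authenticated TCP session to the ILA-R.
    from dataclasses import dataclass

    @dataclass
    class Packet:
        dst_identifier: int      # 64-bit ILA identifier
        dst_locator: int = 0     # filled in by translation

    @dataclass
    class Redirect:
        identifier: int
        locator: int
        authenticated: bool      # arrived on the secured TCP session

    mapping_cache: dict[int, int] = {}   # identifier -> locator (optimization only)

    def ila_translate(pkt: Packet, locator: int) -> Packet:
        pkt.dst_locator = locator
        return pkt

    def forward(pkt: Packet, send) -> None:
        locator = mapping_cache.get(pkt.dst_identifier)
        if locator is not None:
            send(ila_translate(pkt, locator))   # cache hit: direct path
        else:
            # Cache miss: never drop or queue; routing carries the packet
            # to an ILA-R that holds this identifier's shard.
            send(pkt)

    def on_redirect(rd: Redirect) -> None:
        if rd.authenticated:                    # spoofed redirects are never cached
            mapping_cache[rd.identifier] = rd.locator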

Any thoughts on this approach?

Hi Lorenzo,

Thanks for the feedback!

The obvious ones would seem to be:

  *   Will the working set of cache mappings fit in whatever memory is available in, say, a large peering router?
The cache needs to be sized to fit the working set. Please look at the reference topology in section N. The intent is that caches would not be used at edge or peering routers; ILA routers are used there, and they shard the mapping database across some number of nodes. Mapping caches are at ILA-Ns, which are deployed deeper in the network toward the end hosts (e.g. toward the gNB). There are more of these than ILA-Rs, but each serves a small set of users and can be implemented on lower-end devices. The purpose of the cache is to optimize latency and throughput of performance-critical intra-network communications (UE to UE, or UE to an application server in the network).
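
As a rough sketch of the sharding idea (the shard count and the use of low-order identifier bits are assumptions for illustration; the shard-selection function is deployment-specific):

    # Hypothetical shard selection: each ILA-R holds the full mapping
    # table for some set of shards and advertises reachability for them,
    # so a cache miss forwarded into the network reaches a router that
    # can translate the packet.
    NUM_SHARDS = 256                     # assumed; deployment-specific

    def shard_of(identifier: int) -> int:
        # Use low-order bits of the 64-bit identifier; a deployment could
        # equally hash the full identifier.
        return identifier & (NUM_SHARDS - 1)

    # Example assignment of shards to ILA-Rs (32 routers, round-robin).
    ila_r_for_shard = {s: f"ila-r-{s % 32}" for s in range(NUM_SHARDS)}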


  *   Remember that memory that is capable of sustaining routing lookups at tens of gigabits per second is extremely expensive.
  *   How do you defend against state exhaustion attacks whereby a large number of spoofed sources send packets to nonexistent destinations?
No cache state is created for nonexistent destinations; cache entries are only created from authenticated redirects, and an ILA-R only sends a redirect for a mapping it actually holds.


  *   How much bandwidth will be used for the redirects? How much CPU time will be used to process them?
  *   How many servers are needed to store the mappings?
Suggest gaming this out for a large mobile operator with, say, 100M clients, 100 Internet connection points and 10T of traffic.

I did an analysis of ILA scalability for a couple of scenarios (see attached). For the 100M-UE scenario, the number of ILA routers came out to 530.
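
For a rough sense of how the exercise plays out, here is a back-of-envelope sizing sketch; the per-entry size and replication factor are assumptions, not figures from the attached analysis:

    ues = 100_000_000        # 100M UEs
    entry_bytes = 64         # assumed size of an identifier->locator entry
    replication = 3          # assumed copies of each shard for redundancy
    ila_routers = 530        # figure from the analysis above

    total_bytes = ues * entry_bytes * replication
    per_router = total_bytes / ila_routers
    print(f"total mapping state ~{total_bytes / 1e9:.1f} GB, "
          f"~{per_router / 1e6:.0f} MB per ILA-R")
    # -> total mapping state ~19.2 GB, ~36 MB per ILA-R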

Tom