Re: [Ila] [5gangip] Scaling mapping systems (was Re: BOF Description)

Dino Farinacci <farinacci@gmail.com> Thu, 08 February 2018 10:40 UTC

Content-Type: text/plain; charset="utf-8"
Mime-Version: 1.0 (Mac OS X Mail 11.2 \(3445.5.20\))
From: Dino Farinacci <farinacci@gmail.com>
In-Reply-To: <CAPDqMeoHxRCB2mr5Oc4XVG+RjKF+3mBx7APVj6y8uCzpOfn0BA@mail.gmail.com>
Date: Thu, 08 Feb 2018 02:40:36 -0800
Cc: Tom Herbert <tom@herbertland.com>, 5GANGIP <5gangip@ietf.org>, Behcet Sarikaya <sarikaya@ieee.org>, ila@ietf.org, Alexandre Petrescu <alexandre.petrescu@gmail.com>, Lorenzo Colitti <lorenzo@google.com>
Content-Transfer-Encoding: quoted-printable
Message-Id: <57A8FA1C-7920-4F27-83E3-E673242FACE2@gmail.com>
References: <CALx6S37+1PK3ET7g+XsCHt6-CJrABLko0YbWgE12xFX=vL5OPA@mail.gmail.com> <5067FA81-B416-4A19-9F11-A08B35BB8B6D@gmail.com> <CAPDqMeqNkiOWHOVU0AsUzFfPH60pTdS2x9CePhvZDhVLGJoGmw@mail.gmail.com> <9C425F56-738C-4600-9DFF-9D30FC3872DC@gmail.com> <CAPDqMeoLPSGbFg_H-_yOPBguXhOmx8fXjzYd_ax1Qds56KibZQ@mail.gmail.com> <EF6D1220-510C-4A4A-A15E-CA7029B300F7@gmail.com> <CAPDqMeq9yWoEY7WvtX7v-p0UN01-BjERVUF0HNFTEqwD=P1X=g@mail.gmail.com> <D518818A-B2DF-40EE-8880-1D1B8B67FADC@gmail.com> <CAPDqMeoHxRCB2mr5Oc4XVG+RjKF+3mBx7APVj6y8uCzpOfn0BA@mail.gmail.com>
To: Tom Herbert <tom@quantonium.net>
X-Mailer: Apple Mail (2.3445.5.20)
Archived-At: <https://mailarchive.ietf.org/arch/msg/ila/d5artq4ORmj4OVTP7rkxMlmEkio>
Subject: Re: [Ila] [5gangip] Scaling mapping systems (was Re: BOF Description)

> Hi Dino,
> 
> I'm not sure I understand. OOO packets most often happen when packets for the same flow take two different paths through the network at the same time so that packets sent later beat earlier ones to the destination. W.r.t. the cache, the window for OOO delivery is when a cache entry is

Right.

> instantiated, but there are already packets in flight that presumably are taking the non-cached scenic route. Once the non-cached packets have been received, all the packets for the flow go through the cache, taking the same path, so there is no more OOO. It's a small window. Also, if multiple connections

For the same destination yes.

> simultaneously suffer OOO from this, I doubt it would have a material impact; congestion window collapse is unlikely without packet loss. Of course, a stateful firewall in the path might not be so tolerant of OOO packets…

In ILA there is likely a one-to-one association between locator and ID, so the out-of-order situation can happen to each new flow. In LISP, the locator can be a prefix used for many EIDs, so the out-of-order or drop situation on the ITR only happens for one destination and not for all the others that start a flow for the same prefix.
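
The reordering window described in the quoted text above can be illustrated with a small model. This is only a sketch under assumed numbers: the function name, the delays, and the send times are hypothetical and do not come from any ILA or LISP specification. Packets sent before the cache entry exists take the longer scenic path; later packets take the direct path and can overtake them.

```python
def arrival_order(send_times, cache_ready_at, scenic_delay, direct_delay):
    """Return packet sequence numbers in the order they arrive.

    Packets sent before cache_ready_at take the scenic (non-cached) path;
    packets sent at or after it take the direct (cached) path.
    """
    arrivals = []
    for seq, sent in enumerate(send_times):
        delay = scenic_delay if sent < cache_ready_at else direct_delay
        arrivals.append((sent + delay, seq))
    # Sort by arrival time; the sequence number breaks any ties.
    return [seq for _, seq in sorted(arrivals)]


# Hypothetical numbers: one packet per ms, cache entry installed at t = 3 ms,
# scenic path 9.5 ms, direct path 4 ms.
order = arrival_order(list(range(10)), cache_ready_at=3,
                      scenic_delay=9.5, direct_delay=4)
print(order)
```

In this model the reordering stops once the last scenic-path packet has arrived, which matches the "small window" argument: the window lasts roughly the difference between the two path delays.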

> > > triangular routing. Redirects must be secure so that they cannot be spoofed, so for that reason (and some others) the protocol is over TCP. The mapping cache is only an optimization and packets are never dropped or queued for pending cache resolution. If the cache weren't present, communications would
> >
> > But the ILA router in the network needs to have the mapping. And if it does not, what happens then?
> >
> > The ILA router is considered authoritative for its shard. If it doesn't have a mapping then the packet is silently dropped as having an unreachable destination.
> 
> Does the ILA router for the shard only route to destinations inside of the shard? 
> 
> Yes.
>  
> And if so, it is kept small so you can push the mappings and make it scale at that level?
> 
> Small is relative. The ILA router first needs to be able to forward the load, so provisioning would need to consider the worst case where no entries hit the cache. Redirects would likely be rate limited (probably capped by throughput). In a network design, I would probably target a cache hit rate of at least 95%.

In any loc/ID design, push can be used and can scale. But is it practical (and necessary) for 1B entries?
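
The forwarding behavior discussed in the quoted text (the cache as an optimization, with misses forwarded to the ILA router rather than dropped or queued) might be sketched as below. This is a minimal illustration, not the actual ILA implementation: the class name `IlaNode`, its methods, and the string locators are all hypothetical.

```python
class IlaNode:
    """Hypothetical ILA-N: a mapping cache in front of a default path to the ILA-R."""

    def __init__(self, ila_router_locator):
        self.cache = {}                      # identifier -> locator, filled by redirects
        self.ila_router = ila_router_locator
        self.hits = 0
        self.misses = 0

    def handle_redirect(self, identifier, locator):
        """Install a mapping learned from a redirect message."""
        self.cache[identifier] = locator

    def next_hop(self, identifier):
        """Pick where to send a packet; never drop or queue on a cache miss."""
        locator = self.cache.get(identifier)
        if locator is not None:
            self.hits += 1
            return locator
        self.misses += 1
        return self.ila_router               # longer path via the ILA router

    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


node = IlaNode("ila-r")
first = node.next_hop("id-1")                # miss: goes via the ILA router
node.handle_redirect("id-1", "loc-A")
second = node.next_hop("id-1")               # hit: goes straight to the locator
print(first, second, node.hit_rate())
```

With a counter like `hit_rate()`, the 95% target mentioned above means roughly 5% of packets still take the path through the ILA router, and the worst case to provision for is a hit rate of zero.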

> > > still be viable but sub-optimal; that characteristic establishes a bound for the worst case DOS attack.
> > >
> > > Any thoughts on this approach?
> >
> > You’ll really need to put signatures in the Redirect messages if you want robust authentication.
> >
> > The redirects are sent over an established TCP connection to an ILA-N so that deters message spoofing. For stronger security, the TCP Authentication Option or TLS can be used.
> 
> Is it worth the state to hold millions of TCP connections only for the occasional redirect message?
> 
> It might be millions of TCP connections spread across a large network, but for ILA-Ns it's at most a couple of hundred connections and for a mapping resolution ILA-R it's probably no more than 50K. Assuming 2K of memory per TCP connection context (and TLS if in use), that's about 100M on the ILA-Rs. In addition to security, the other reasons to use TCP are that it's a reliable transport and provides congestion control and avoidance. There's also the option of using TFO if the amount of state were to become an issue.

Okay. Agreed, it's not a memory problem I was referring to. It was how to manage the connections, and how much is exposed to a network admin or engineer to debug. Managing 100s of BGP connections has been a challenge for the customers I have worked with in the past.
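
For reference, the memory estimate in the quoted text can be checked with a few lines of arithmetic. The variable names are only illustrative, and "2K" is read here as 2,000 bytes; reading it as 2 KiB (2,048 bytes) changes the result only slightly.

```python
# Figures taken from the quoted estimate: up to 50K connections per ILA-R,
# about 2K of memory per TCP connection context.
CONNECTIONS_PER_ILA_R = 50_000
BYTES_PER_CONNECTION = 2_000      # assumption: "2K" read as 2,000 bytes

total_bytes = CONNECTIONS_PER_ILA_R * BYTES_PER_CONNECTION
total_mb = total_bytes / 1_000_000
print(f"{total_mb:.0f} MB of connection state per ILA-R")
```

So the state itself is on the order of 100 MB per ILA-R, which supports the point that memory is not the concern here; operating and debugging that many connections is.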

Dino