Re: [Ila] [5gangip] Scaling mapping systems (was Re: BOF Description)

Tom Herbert <tom@quantonium.net> Thu, 08 February 2018 17:05 UTC

In-Reply-To: <57A8FA1C-7920-4F27-83E3-E673242FACE2@gmail.com>
References: <CALx6S37+1PK3ET7g+XsCHt6-CJrABLko0YbWgE12xFX=vL5OPA@mail.gmail.com> <5067FA81-B416-4A19-9F11-A08B35BB8B6D@gmail.com> <CAPDqMeqNkiOWHOVU0AsUzFfPH60pTdS2x9CePhvZDhVLGJoGmw@mail.gmail.com> <9C425F56-738C-4600-9DFF-9D30FC3872DC@gmail.com> <CAPDqMeoLPSGbFg_H-_yOPBguXhOmx8fXjzYd_ax1Qds56KibZQ@mail.gmail.com> <EF6D1220-510C-4A4A-A15E-CA7029B300F7@gmail.com> <CAPDqMeq9yWoEY7WvtX7v-p0UN01-BjERVUF0HNFTEqwD=P1X=g@mail.gmail.com> <D518818A-B2DF-40EE-8880-1D1B8B67FADC@gmail.com> <CAPDqMeoHxRCB2mr5Oc4XVG+RjKF+3mBx7APVj6y8uCzpOfn0BA@mail.gmail.com> <57A8FA1C-7920-4F27-83E3-E673242FACE2@gmail.com>
From: Tom Herbert <tom@quantonium.net>
Date: Thu, 08 Feb 2018 09:05:35 -0800
Message-ID: <CAPDqMeopH7hm9TaOe9QnFSxgd8fJAkybKxtL0qX0=Y20bKs5Vg@mail.gmail.com>
To: Dino Farinacci <farinacci@gmail.com>
Cc: Tom Herbert <tom@herbertland.com>, 5GANGIP <5gangip@ietf.org>, Behcet Sarikaya <sarikaya@ieee.org>, ila@ietf.org, Alexandre Petrescu <alexandre.petrescu@gmail.com>, Lorenzo Colitti <lorenzo@google.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/ila/TlzUR5BlwQc2I9-lR63NHU8K_dg>
Subject: Re: [Ila] [5gangip] Scaling mapping systems (was Re: BOF Description)

>
>
> In ILA there is likely a one-to-one association between locator and ID. So
> the out of order situation can happen to each new specific flow. In LISP,
> the locator can be a prefix used for many EIDs so the out of order or drop
> situation on the ITR only happens for one destination and not to all the
> others that start a flow for the same prefix.
>

That would also be true of ILA: flows going to the same destination
share the same cache entry.

At the start of a flow (or on resumption from idle) a connection should be
in slow start, so there wouldn't be many packets in flight to arrive out of
order (OOO); there might be only a single SYN in flight. The RTT to get the
redirect is likely shorter than the application response time, so the cache
would be populated before the sender sends its second round of packets.

As you pointed out, the alternative to redirect is to do a query and either
drop packets or queue them until there is a response from the mapping
server. Dropping is problematic because it can introduce latency: if an
initial SYN is dropped, the sender waits for its initial RTO, which is a
delay of at least 1 second. Queuing is a resource management problem, and
the queue becomes a possible DoS target.
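
To make the comparison concrete, here is a minimal sketch of the delay each
strategy adds to the first packet of a flow. It is an illustration only, not
part of any ILA specification. It assumes the initial RTO is 1 second (the
RFC 6298 value), that the RTO doubles on each retransmission, and that a
forwarded packet waits for nothing (any extra path length through the ILA
router is not modeled). The function names and the 10 ms mapping RTT are
hypothetical.

```python
INIT_RTO_S = 1.0  # assumed initial retransmission timeout (RFC 6298 value)


def drop_delay(mapping_rtt_s, init_rto_s=INIT_RTO_S):
    """Delay until the first retransmitted SYN that finds the mapping resolved.

    The original SYN is dropped at t=0 while the mapping query is outstanding;
    the sender retransmits with exponential backoff (RTO doubles each time).
    """
    delay, rto = 0.0, init_rto_s
    while True:
        delay += rto
        if delay >= mapping_rtt_s:
            return delay
        rto *= 2


def added_delay(strategy, mapping_rtt_s):
    """Added first-packet delay in seconds for 'drop', 'queue' or 'forward'."""
    if strategy == "drop":
        return drop_delay(mapping_rtt_s)
    if strategy == "queue":
        return mapping_rtt_s  # packet is held until the mapping reply arrives
    if strategy == "forward":
        return 0.0  # packet is sent on immediately; the redirect arrives later
    raise ValueError(f"unknown strategy: {strategy}")


if __name__ == "__main__":
    for strategy in ("drop", "queue", "forward"):
        print(strategy, added_delay(strategy, mapping_rtt_s=0.010))
```

Under these assumptions, dropping costs a full second even when the mapping
server answers in 10 ms, whereas queuing costs the mapping RTT plus the
buffer space that holds the packet.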

>
> > > > triangular routing. Redirects must be secure so that they cannot be
> spoofed, so for that reason (and some others) the protocol is over TCP. The
> mapping cache is only an optimization and packets are never dropped or
> queued for pending cache resolution. If the cache weren't present,
> communications would
> > >
> > > But the ILA router in the network needs to have the mapping. And if it
> does not, what happens then?
> > >
> > > The ILA router is considered authoritative for its shard. If it
> doesn't have a mapping then the packet is silently dropped as having an
> unreachable destination.
> >
> > Does the ILA router for the shard only route to destinations inside of
> the shard?
> >
> > Yes.
> >
> > And if so, it is kept small so you can push the mappings and make it
> scale at that level?
> >
> > Small is relative. The ILA router first needs to be able to forward
> the load, so provisioning would need to consider the worst case where no
> entries hit the cache. Redirects would likely be rate limited (probably
> capped by throughput). In a network design, I would probably target a cache
> hit rate of at least 95%.
>
> In any loc/ID design push can be used and scale. But is it practical (and
> necessary) for 1B entries?
>
This isn't doing push in the pub/sub sense, which I agree won't scale to 1B
entries. Redirects are reactions to forwarded packets, so the packets
effectively are the mapping requests.
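
A minimal sketch of that behavior, under simplifying assumptions: the class
and method names are hypothetical, the redirect is delivered synchronously
rather than over a separate TCP connection, and redirect rate limiting and
cache eviction are left out. It only illustrates that a cache miss results
in forwarding through the ILA router (never a drop or a queue at the ILA-N),
and that the forwarded packet itself triggers the redirect.

```python
from dataclasses import dataclass, field


@dataclass
class IlaRouter:
    """Hypothetical ILA-R: authoritative for the identifiers in its shard."""
    mappings: dict  # identifier -> locator

    def forward(self, identifier):
        """Return (locator, redirect); both are None for an unknown identifier."""
        locator = self.mappings.get(identifier)
        if locator is None:
            return None, None  # no mapping in the shard: packet is dropped
        return locator, (identifier, locator)  # forward and send a redirect


@dataclass
class IlaNode:
    """Hypothetical ILA-N whose mapping cache is purely an optimization."""
    router: IlaRouter
    cache: dict = field(default_factory=dict)

    def send(self, identifier):
        """Return the path taken: 'direct', 'via-router' or 'unreachable'."""
        if identifier in self.cache:
            return "direct"  # cache hit: translate locally, no detour
        locator, redirect = self.router.forward(identifier)
        if locator is None:
            return "unreachable"
        ident, loc = redirect
        self.cache[ident] = loc  # the redirect populates the cache
        return "via-router"


if __name__ == "__main__":
    node = IlaNode(IlaRouter({"id-a": "loc-1"}))
    print(node.send("id-a"))  # via-router
    print(node.send("id-a"))  # direct
    print(node.send("id-x"))  # unreachable
```

The first packet to "id-a" takes the path through the router, and every
later packet goes direct, which is the sense in which the packet acts as the
mapping request.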


> > > > still be viable but sub-optimal; that characteristic establishes a
> bound for the worst case DOS attack.
> > > >
> > > > Any thoughts on this approach?
> > >
> > > You’ll really need to put signatures in the Redirect messages if you
> want robust authentication.
> > >
> > > The redirects are sent over an established TCP connection to an ILA-N
> so that deters message spoofing. For stronger security, TCP authentication
> option or TLS can be used.
> >
> > Is it worth the state to hold millions of TCP connections only for the
> occasional redirect message?
> >
> > It might be millions of TCP connections spread across a large network,
> but for ILA-Ns it's at most a couple of hundred connections and for a
> mapping resolution ILA-R it's probably no more than 50K. Assuming 2K of
> memory per TCP connection context (and TLS if in use), that's about 100 MB
> on an ILA-R. In addition to security, the other reasons to use TCP are that
> it is a reliable transport and provides congestion control and avoidance.
> There's also the option of using TFO if the amount of state were to become
> an issue.
>
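
For reference, the arithmetic behind that estimate can be written out as
follows. This is only a back-of-the-envelope sketch: "2K" is read as 2,000
bytes (reading it as 2 KiB gives a slightly larger figure), "a couple of
hundred" ILA-N connections is taken as 200, and the function name is
hypothetical.

```python
def connection_state_bytes(connections, bytes_per_connection):
    """Total memory held for per-connection context."""
    return connections * bytes_per_connection


BYTES_PER_CONNECTION = 2_000  # "2K" per TCP (and TLS) context, read as bytes
ILA_R_CONNECTIONS = 50_000    # "no more than 50K" on a mapping-resolution ILA-R
ILA_N_CONNECTIONS = 200       # assumed value for "a couple of hundred"

ila_r_bytes = connection_state_bytes(ILA_R_CONNECTIONS, BYTES_PER_CONNECTION)
ila_n_bytes = connection_state_bytes(ILA_N_CONNECTIONS, BYTES_PER_CONNECTION)

if __name__ == "__main__":
    print(f"ILA-R: {ila_r_bytes / 1e6} MB")  # ILA-R: 100.0 MB
    print(f"ILA-N: {ila_n_bytes / 1e6} MB")  # ILA-N: 0.4 MB
```
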
> Okay. Agreed, it's not a memory problem I was referring to. It was how
> to manage the connections, and how much is exposed to a network admin
> engineer to debug. Managing 100s of BGP connections has been a challenge
> for the customers I have worked with in the past.
>
My sense is that network control protocols are trending toward TCP with an
RPC layer. For instance, in the datacenter use case, an operator like Google
or Facebook is going to be most comfortable with a protocol over TCP that
uses their common RPC and authentication infrastructure (i.e., Google would
probably want to use gRPC, and Facebook probably Thrift, as they seem to be
using in Open/R). For 3GPP 5G, the service-based interfaces are HTTP/2
(REST) over TCP.

Tom