Re: [Int-area] New Version Notification for draft-li-int-aggregation-00.txt

Toerless Eckert <tte@cs.fau.de> Fri, 25 February 2022 17:38 UTC

Date: Fri, 25 Feb 2022 18:38:02 +0100
From: Toerless Eckert <tte@cs.fau.de>
To: Tony Li <tony.li@tony.li>
Cc: int-area@ietf.org
Message-ID: <YhkT+iYy/VVVoZNp@faui48e.informatik.uni-erlangen.de>
References: <164367925561.21687.13323438769934745511@ietfa.amsl.com> <A5236BE8-2499-4E45-8B06-C131C4324611@tony.li> <YhiNEDhMoo2HRVPz@faui48e.informatik.uni-erlangen.de> <45325980-F4EC-483E-9D02-CBB208A3EDA4@tony.li>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <45325980-F4EC-483E-9D02-CBB208A3EDA4@tony.li>
Archived-At: <https://mailarchive.ietf.org/arch/msg/int-area/6wbTCwSTAu00Ob24Uhjv6ID1luE>
Subject: Re: [Int-area] New Version Notification for draft-li-int-aggregation-00.txt

On Fri, Feb 25, 2022 at 08:34:29AM -0800, Tony Li wrote:
> > Aka: In one possible universe (where less-stupid router vendors finally
> > start putting powerful enough control plane CPU/memory into routers),
> > i may not predominantly have a scalability issue of the routing subsystem,
> > but only of the forwarding subsystem (hardware forwarding entries).
> 
> I’ll try not to be offended. But I’m failing.

Sorry, that was totally not my intention. I am only trying to be helpful
and strengthen the document.

Over the decades i just ran into control plane resource limitations in
products way more often than i felt was necessary, knowing what control plane
performance would be possible with appropriately scaled CPU/memory.

> Router vendors already put in very high end CPUs and gobs of memory.
> And it’s never enough. The routing table continues to grow open loop.
> And carriers complain about the costs of the extra memory.

Maybe it's not as bad now as it was in the recent past, given how
Moore's law is changing, but my past experience was that dedicated
route processor boards could not compete in price and life-cycle agility
with general purpose data-center servers. And for Internet BGP there
is IMHO no reason (other than desire for revenue) NOT to use general
purpose data center servers for the routing tables.

So at least we should take the most price/performance-optimized
opex/capex model into account as the reference against which to
vet the cost/benefit of change.

> > In this
> > universe i could solve the problem through a combination of two
> > mechanisms alone:
> > 
> > a) geographically aggregatable allocation of address space by registries
> 
> As pointed out, this is already happening.
> 
> > b) auto-aggregation within routers from routing-plane to forwarding
> >   plane. Aka: Just don't populate the poor HW tables with all those
> >   non-aggregated prefixes, but calculate the minimum number of
> >   sufficient shorter prefixes.
> 
> 
> And some of this is happening, but it is NOT a friendly and rational response to the problem. It has numerous setbacks. Oh, look, a non-aggregatable more-specific showed up. Now I need to expand some of my aggregates. And guess what: it only achieves so much. There are still customers who run out of resources even with this enabled.

If some of this is happening but is in your opinion not a rational response,
then IMHO that makes it important to discuss in the document, or
at least to add references to wherever it may already be discussed. Without
published numbers that show how much this does or does not help,
we can only believe or not believe your assessment.

Also it sounds as if this mechanism is reducing the number of customers
who still have a problem. That too sounds relevant for the community
to make judgements about what you propose.
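
For illustration, auto-aggregation from routing plane to forwarding plane
could look roughly like the Python sketch below. It is not any vendor's
implementation and not an optimal algorithm. The names aggregate_fib and
lookup, the next-hop labels and the example prefixes are all made up for
this example. It only drops more-specifics that share the next hop of their
covering prefix and merges sibling prefixes that share a next hop.

```python
import ipaddress


def aggregate_fib(routes):
    """Return a smaller prefix -> next-hop dict with the same forwarding result.

    routes: dict mapping prefix strings to next-hop labels (hypothetical input).
    """
    table = {ipaddress.ip_network(p): nh for p, nh in routes.items()}

    def covering_next_hop(net):
        # Next hop of the longest less-specific prefix in the table, if any.
        while net.prefixlen > 0:
            net = net.supernet()
            if net in table:
                return table[net]
        return None

    # Step 1: drop more-specifics that resolve like their covering prefix.
    fib = {n: nh for n, nh in table.items() if covering_next_hop(n) != nh}

    # Step 2: merge sibling prefixes with equal next hops into their parent,
    # as long as the parent is not itself an installed route.
    changed = True
    while changed:
        changed = False
        for net in sorted(fib, key=lambda n: n.prefixlen, reverse=True):
            if net not in fib or net.prefixlen == 0:
                continue
            parent = net.supernet()
            sibling = next(s for s in parent.subnets() if s != net)
            if sibling in fib and fib[sibling] == fib[net] and parent not in fib:
                next_hop = fib.pop(net)
                fib.pop(sibling)
                fib[parent] = next_hop
                changed = True
    return {str(n): nh for n, nh in fib.items()}


def lookup(fib, address):
    """Longest-prefix match; returns the next hop or None."""
    addr = ipaddress.ip_address(address)
    best = None
    for prefix, next_hop in fib.items():
        net = ipaddress.ip_network(prefix)
        if net.version == addr.version and addr in net:
            if best is None or net.prefixlen > best[0]:
                best = (net.prefixlen, next_hop)
    return best[1] if best else None


rib = {
    "10.0.0.0/8": "A",
    "10.1.0.0/16": "A",   # redundant: same next hop as the covering /8
    "10.2.0.0/24": "B",   # these two siblings merge into 10.2.0.0/23
    "10.2.1.0/24": "B",
    "10.3.0.0/16": "C",
}
fib = aggregate_fib(rib)
print(len(rib), "->", len(fib))  # 5 -> 3
```

If one of the two /24 routes later changes its next hop, the /23 has to be
split again. That is the kind of churn the quoted reply describes, and how
often it happens in practice is exactly what published numbers would show.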

> Actually doing the aggregation in the control plane is far preferable: it reduces processing and memory requirements for all upstream systems.

The more aggregation you want to support, the more geographic structure
you introduce into the addressing space, and the more you also
run the risk of creating disincentives against cross-geographic
shortcut links (oh no, a peering with you would raise the cost of
my routing table undesirably...). Aka: we're trying to compare the
capex cost of potentially overpriced CPU/RAM in routers with the
cost of operational processes in registries and operators. That's
a difficult comparison.

> > 3. In general, i would suggest to split the asks between that
> > geographical aggregateable allocation and the technologies that
> > could then potentially benefit from it. Because: I think we want
> > the ask for better aggregateable addressing to be pushed into
> > the registries faster and independent from the individual technical
> > measures operators will employ to benefit from them. E.g.
> > 
> > - aggregateable addresses first
> > - auto-aggregation maybe as first optimization vendors could
> >  deliver, creating minimal operational overhead to operators
> > - actual aggregation configuration on edge-routers for 
> >  the control plane whenever/wherever operators can be persuaded
> >  to do this.
> 
> The first two are not asks. They are already complete.  The ask is thus the last point.

Hmm. Then i misread your draft. It did sound as if there is an ask
in your document for registries to improve how they assign address
blocks, so that better geographic aggregation than today is enabled.

> > 4. It would be great to have a research group in IRTF around addressing,
> > for various reasons, but in this case of course specifically to have
> > a community of researchers where one might raise the ask to do more
> > numerical analysis of possible benefits. Aka: Any of the possible
> > points going forward (except with the forwarding-plane auto-aggregation
> > maybe) creates significant operational work for registries / operators,
> > so there should be some hard number evidence of how much this could give
> > in return.
> 
> That’s your agenda, not mine. I have no interest in ‘research’ in this document.
> This is purely pragmatic and operational.

My text wasn't asking for research in this document. I was just saying that
research might be one option to create insights in support of this document.

Aka: any numbers, either extrapolated from actual deployment/product data
and/or from simulation data good enough to stand in for real-world data.
How many routes do customers that have memory issues have today, and how many
would they have with your proposal in place? That type of evidence.

> >>   Instead, we seek to define groups of hosts and treat them together as
> >>   a single abstraction, commonly known as a 'prefix'.  We call the
> >>   process of combining addresses together into a prefix 'aggregation'.
> >>   Under some circumstances, prefixes themselves may also be aggregated
> >>   to form another prefix, resulting in a recursive structure.  If
> >>   prefix A is a proper subset of prefix B, we say that A is 'more
> >>   specific' than B and that B is 'less specific' than A.
> >> 
> >>   We can then define the routing efficiency of a specific prefix as the
> >>   cost of carrying that prefix, plus all of its more specifics,
> >>   integrated across the entire network, and divided by the number of
> >>   host addresses subsumed by the prefix.
> > 
> > Nit: this looks like a too complex formula. Just say a prefix has an efficiency
> > of N if it covers N hosts or the like.
> 
> That would not cover the costs of people advertising more-specifics.

Maybe start by putting in an example of calculating the routing efficiency.
I have a hard time trying to figure out what formula the text means.
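
One possible reading of the quoted definition, written as a small Python
sketch: add up the cost of every route at or below the prefix, weighted by
how many routers carry it, and divide by the number of addresses the prefix
covers. The function name routing_efficiency, the uniform cost_per_entry and
the router counts are assumptions made up for this example. The draft does
not give them, and it may well intend a different cost model.

```python
import ipaddress


def routing_efficiency(prefix, carried_by, cost_per_entry=1.0):
    """Cost per covered address for a prefix and all of its more-specifics.

    carried_by: dict mapping prefix strings to the number of routers that
    carry that route (hypothetical input standing in for "integrated across
    the entire network").
    """
    aggregate = ipaddress.ip_network(prefix)
    total_cost = 0.0
    for route, routers in carried_by.items():
        net = ipaddress.ip_network(route)
        # subnet_of() is also true for the aggregate itself.
        if net.version == aggregate.version and net.subnet_of(aggregate):
            total_cost += routers * cost_per_entry
    return total_cost / aggregate.num_addresses


# A /24 carried by 1000 routers, without and with two /25 more-specifics.
alone = routing_efficiency("192.0.2.0/24", {"192.0.2.0/24": 1000})
with_specifics = routing_efficiency(
    "192.0.2.0/24",
    {"192.0.2.0/24": 1000, "192.0.2.0/25": 1000, "192.0.2.128/25": 1000},
)
print(alone, with_specifics)  # 3.90625 11.71875
```

Under this reading a smaller value is better, and advertising the two
more-specifics triples the per-address cost of the /24.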

> >>   It is well known that abstraction obscures important detail and that
> >>   abstraction in routing can cause sub-optimal paths, resulting in
> >>   extra hops, wasted bandwidth, and managerial difficulties.  As a
> >>   result, there will always be a trade-off between scalability and
> >>   optimality when introducing abstractions.
> > 
> > Nit: i'd prefer to stick to the term aggregation instead of bringing
> > in abstraction as an equivalent. But the text also used prefixes already,
> > so why not just make the statement about routing for prefixes and maybe
> > add that the shorter the prefix the greater the risk of such sub-optimalities ?
> 
> That is simply not true. /8 is not somehow less optimal than /16. It depends on the topology, not the prefix length.

The shorter the prefix length of an aggregate, the more longer-prefix
aggregates it could contain, right?
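
As a rough illustration of that count, the following Python sketch computes
how many distinct longer prefixes fit inside an aggregate of a given length.
The function name possible_more_specifics and the cut-off max_len=24 (taken
here as the longest IPv4 prefix usually accepted between networks) are
assumptions for this example. The result is only an upper bound on how many
more-specifics could exist. It says nothing about whether any of them would
give a better path, which depends on the topology.

```python
def possible_more_specifics(prefix_len, max_len=24):
    """Number of distinct prefixes longer than prefix_len, up to max_len,
    that fit inside one prefix of length prefix_len."""
    return sum(2 ** (length - prefix_len)
               for length in range(prefix_len + 1, max_len + 1))


print(possible_more_specifics(8))   # 131070
print(possible_more_specifics(16))  # 510
```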

> >>   When optimality is paramount and simple reachability is insufficient,
> >>   the routing subsystem has additional mechanisms that allow network
> >>   operators to make different path selection choices, sometimes
> >>   intentionally ignoring or explicitly working against abstraction.  We
> >>   call this broad set of mechanisms 'traffic engineering'.
> > 
> > Nit: Wouldn't the most simple example be a prefix route and then to get
> > a more optimum path for some important hosts in that prefix be another
> > longer-prefix route just for those hosts ? I don't think we would call that
> > setup traffic-engineering. Aka: Not sure why we need to bring in yet another
> > loaded term (traffic engineering). Also: Did i mention LISP (is that
> > traffic engineering ?) But it would equally provide tools for more optimality.
> 
> That would definitely be traffic engineering.

Ok, i wouldn't have guessed that you would consider a longer prefix
traffic engineering.

Let me see if i understand it correctly:

In general we would not want a longer prefix because we want the aggregation
(with your proposal). So now _if_ someone wants a longer prefix to go
on a path where it would violate the aggregation, we call it traffic engineering.
But what's the operational mechanism by which one would decide whether this is
a permitted instance of traffic engineering or a bug / misconfiguration
in the aggregation scheme?

Cheers
    Toerless