Re: [Int-area] Where to aggregate, where to drop

Dino Farinacci <farinacci@gmail.com> Sat, 02 April 2022 22:59 UTC

From: Dino Farinacci <farinacci@gmail.com>
In-Reply-To: <316508E1-2AC9-4974-8C47-1351088445D2@comcast.net>
Date: Sat, 02 Apr 2022 15:59:34 -0700
Cc: int-area <int-area@ietf.org>
Content-Transfer-Encoding: quoted-printable
Message-Id: <A0A71F59-1AB0-4905-8203-7A50918207FA@gmail.com>
References: <D9CC0098-7D52-4E91-B154-41BCA59DC82A@gmail.com> <316508E1-2AC9-4974-8C47-1351088445D2@comcast.net>
To: Tony Li <li.tony@comcast.net>
X-Mailer: Apple Mail (2.3693.60.0.1.1)
Archived-At: <https://mailarchive.ietf.org/arch/msg/int-area/uTPV-4TGZLt_t2JlwVrTJyb2wcE>
Subject: Re: [Int-area] Where to aggregate, where to drop

> Dino,
> 
> Thanks for the question.
> 
>> When a provider proxy aggregates, it means they will summarize more specific routes they have stored in their routing table. Like ISP-A above has routes P.1, P.2, and P.3. When ISP-A advertises a P prefix, it is indicating it can reach all more specifics, even though it may not have a full set of those more specifics that are covered by P.
> 
> 
> In the figure, ISP-A is the administrator for P, so when they aggregate it’s not proxy aggregation, it’s just simple aggregation.

Right, but it doesn't matter what you call it; the point is where to aggregate.

> 
>> 
>> So here are the options:
>> 
>> (1) ISP-A advertises P to ISP-B (and may also advertise more specifics to other peers for policy reasons).
>> (2) ISP-A advertises P.1, P.2, and P.3 to ISP-B and ISP-B advertises P to its peers.
>> 
>> The question is *where is the best place to aggregate*. It's a tradeoff between routing table savings and how far a packet can travel to either (1) get delivered to the destination or (2) get dropped by a router that doesn't have a more-specific for the destination (thereby wasting resources from the source to the drop point).
>> 
>> (1) If ISP-B aggregates P to the links you see above to stub peers, then the stubs can load-split on P and ISP-B can drop packets for say P.4 and deliver packets for P.3. The drop can happen at the PE router relatively soon.
> 
> 
> If they are stub peers, then sending them anything more than default is inefficient. Yes, ISP B will drop anything for P.4.  There’s no way to deliver that any way.

Well, the stub AS could be multi-homed and want to load-spread traffic for different prefixes. The set of advertised prefixes are then the exceptions that do not follow the default path.
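
As a rough illustration of default-plus-exceptions at a multi-homed stub, here is a minimal Python sketch. The addressing (10.0.0.0/8 standing in for P) and the second upstream name ISP-C are made up for the example; nothing in this thread specifies them.

```python
import ipaddress

# Hypothetical stub routing table: a default route toward one upstream,
# plus one advertised prefix (standing in for P) learned from the other.
STUB_TABLE = {
    ipaddress.ip_network("0.0.0.0/0"): "ISP-C",   # default path (assumed name)
    ipaddress.ip_network("10.0.0.0/8"): "ISP-B",  # exception: P via ISP-B
}

def next_hop(table, dst):
    """Return the next hop chosen by longest-prefix match, or None."""
    addr = ipaddress.ip_address(dst)
    matches = [net for net in table if addr in net]
    if not matches:
        return None
    return table[max(matches, key=lambda net: net.prefixlen)]

print(next_hop(STUB_TABLE, "10.3.0.1"))   # covered by P, so it leaves via ISP-B
print(next_hop(STUB_TABLE, "192.0.2.1"))  # no exception, so it follows the default
```

With only a default from both upstreams, the stub would have no basis for splitting traffic by destination; the advertised prefix is what lets it send P one way and everything else the other.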

> If the peers are not stubs and ISP B advertises P, then yes, it will attract P.4 traffic.  Presumably, the peers are not learning alternate paths for this traffic, so there’s also no way of delivering this traffic and ISP B would get to drop it.  
> 
> I’m not sure why we’re trying to optimize drop traffic. Hopefully, there isn’t a large volume of it. :)

If you can, you want to drop traffic as early as possible so you optimize network resources.

> 
>> (2) If ISP-B gets P from ISP-A and advertises to the links you see above to stub peers, then P.3 and P.4 packets move through ISP-B where the edge router in ISP-A delivers to P.3 and drops for P.4.
> 
> 
> And in that case, ISP B has now paid for transit for P.4 drop traffic. That seems painful.

Right.

> As you point out, there’s a trade-off here: if you move the abstraction action boundary outward, you carry more prefixes but you drop unreachable traffic sooner.  If you move the abstraction action boundary inwards, then you carry fewer prefixes, possibly damage traffic engineering, and cause unreachable traffic to take a longer path before it’s dropped.

Yep, well said.
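
To make the tradeoff concrete, here is a toy Python model of the stub -> ISP-B -> ISP-A path. It is only a sketch under assumed addressing (10.0.0.0/8 for P, 10.1/16 through 10.3/16 for P.1 through P.3, and 10.4.0.1 as a P.4 destination nobody can reach); the router names follow the thread, but the tables are invented.

```python
import ipaddress

# Hypothetical addressing; the thread does not assign real prefixes to P.
P = ipaddress.ip_network("10.0.0.0/8")
SPECIFICS = [ipaddress.ip_network(f"10.{i}.0.0/16") for i in (1, 2, 3)]

def lookup(table, addr):
    """Longest-prefix match; None means no route (drop)."""
    matches = [net for net in table if addr in net]
    if not matches:
        return None
    return table[max(matches, key=lambda net: net.prefixlen)]

def trace(tables, dst, start="stub"):
    """Follow next hops; return (routers visited, delivered?)."""
    addr = ipaddress.ip_address(dst)
    path = [start]
    while True:
        hop = lookup(tables[path[-1]], addr)
        if hop is None:
            return path, False
        if hop == "deliver":
            return path, True
        path.append(hop)

isp_a = {net: "deliver" for net in SPECIFICS}

# Boundary moved outward: ISP-B carries the more-specifics and advertises P.
outer = {"stub": {P: "ISP-B"},
         "ISP-B": {net: "ISP-A" for net in SPECIFICS},
         "ISP-A": isp_a}

# Boundary moved inward: ISP-A aggregates, ISP-B carries only P.
inner = {"stub": {P: "ISP-B"},
         "ISP-B": {P: "ISP-A"},
         "ISP-A": isp_a}

for name, tables in (("aggregate at ISP-B", outer), ("aggregate at ISP-A", inner)):
    path, _ = trace(tables, "10.4.0.1")  # P.4 traffic, undeliverable
    print(name, "| ISP-B routes:", len(tables["ISP-B"]),
          "| P.4 dropped at:", path[-1])
```

In this model ISP-B holds three routes and drops P.4 itself when it is the aggregation point, versus one route and a drop one hop later at ISP-A when it is not.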

> 
>> All things not considered and looking at this specific topology, I would vote for (1).
> 
> 
> Of course, but what we’ve found is that prefix owners are not willing to just aggregate. They want traffic engineering so they also advertise more specifics.  Thus, the question is (2) or (3) let more specifics propagate throughout the network.

Right, but you are trying to fix this, right? You are trying to do better.

> 
>> Another issue, which I think Tony brought up is, if P gets sent to peers, hijackers have a better opportunity to hijack routes by injecting more-specifics to direct traffic to a bad acting honey pot. This has been known for a long time though so we are not bringing up a new issue.
> 
> 
> Actually, the point that I was bringing up was not one of security.  If more specifics are also present when ISP B performs remote aggregation, then the more specifics will tend to draw traffic away from ISP-B.  ISP B might like that.

Yes.
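
A small Python sketch of that effect, from the point of view of a peer that hears both the aggregate and one more-specific. The prefixes and the alternate path name ISP-C are assumptions for illustration only.

```python
import ipaddress

# Routes as seen by a hypothetical peer of ISP-B.
PEER_TABLE = {
    ipaddress.ip_network("10.0.0.0/8"): "ISP-B",   # aggregate P from ISP-B
    ipaddress.ip_network("10.1.0.0/16"): "ISP-C",  # more-specific P.1 via another path
}

def next_hop(table, dst):
    """Return the next hop chosen by longest-prefix match, or None."""
    addr = ipaddress.ip_address(dst)
    matches = [net for net in table if addr in net]
    if not matches:
        return None
    return table[max(matches, key=lambda net: net.prefixlen)]

print(next_hop(PEER_TABLE, "10.1.0.1"))  # P.1: the more-specific wins, bypassing ISP-B
print(next_hop(PEER_TABLE, "10.2.0.1"))  # P.2: only the aggregate matches, via ISP-B
```

Longest match means the aggregate only attracts traffic for the parts of P that have no more-specific visible at that peer.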

> 
>> Another question is, if all more-specifics are not stored (due to major link failure), is the aggregate withdrawn from the routing system? That is, if you want less route flapping, you may just want to keep P advertised. That optimizes FIB add/delete entropy everywhere that wants to store P. I would rather have hardware routers drop packets fast than have route oscillation.
> 
> 
> Yes, if ISP B isn’t getting all of the more specifics and continues to aggregate, it could attract traffic that it can’t deliver. Presumably this is a transient until it can get the more specifics.
> 
> Tony

So what is your conclusion? Is your draft saying to optimize for traffic engineering or for routing table savings? Or is it simply documenting the tradeoffs?

Cheers,
Dino