Re: [Idr] I-D Action: draft-ietf-idr-bgp-fwd-rr-02.txt

Igor Malyushkin <gmalyushkin@gmail.com> Mon, 18 March 2024 12:02 UTC

MIME-Version: 1.0
References: <171065415177.59997.7631576612994148063@ietfa.amsl.com> <CAOj+MMEsp_UfuiHdc4U_Bv5o7xsYYK_RryusUZ88u+SH9xifSA@mail.gmail.com> <SJ0PR05MB86322B34D635E7F221C04C0FA22D2@SJ0PR05MB8632.namprd05.prod.outlook.com> <CAEfhRrxDVi_Yw2wtTWiGzjw4pQ-8TF-48UCY5AUxMpKdrbexZw@mail.gmail.com> <CAOj+MMHNAz741WP9Pf2UCSOQRj6YepFh=Q4tzedmwBCm6e289g@mail.gmail.com> <CAEfhRryMW9nyWfnDQdi+R5g-nypg5ppwFy_Gdf71pRMFmZysHA@mail.gmail.com> <CAOj+MMF2-oqZ29hSBgaO+gzYXXyvCRgJ0m-zW2K7CWattCpgrQ@mail.gmail.com>
In-Reply-To: <CAOj+MMF2-oqZ29hSBgaO+gzYXXyvCRgJ0m-zW2K7CWattCpgrQ@mail.gmail.com>
From: Igor Malyushkin <gmalyushkin@gmail.com>
Date: Mon, 18 Mar 2024 16:01:52 +0400
Message-ID: <CAEfhRrw=acXDgVtzUEhqxZcOPYbJwT0Ha36k-ADgZaiy863erg@mail.gmail.com>
To: Robert Raszuk <robert@raszuk.net>
Cc: Kaliraj Vairavakkalai <kaliraj=40juniper.net@dmarc.ietf.org>, "idr@ietf. org" <idr@ietf.org>
Content-Type: multipart/alternative; boundary="0000000000003940fb0613ee2068"
Archived-At: <https://mailarchive.ietf.org/arch/msg/idr/8TiGBxd7NOrpf7CLu2Oed-PjpKo>
Subject: Re: [Idr] I-D Action: draft-ietf-idr-bgp-fwd-rr-02.txt

I see this differently. ABR1 has two sets of the same infrastructure paths.
One set comes from the original sources (outside its left area), and the other
comes from ABR2. Suppose the first set is the best one. ABR1 then allocates
a label for every prefix (and its label) in that set and distributes them
as transport prefixes toward its right area (and to ABR2 too). Effectively,
this makes ABR1 an LSR, because it performs a SWAP for any incoming label.
With the per-prefix label allocation mode, these SWAPs can be programmed
with more than one outgoing label. Since we also have the second set of the
same paths from ABR2, we can use its labels as a backup. So there is an
egress PIC for such labels.
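
To make the idea concrete, here is a minimal sketch of such a SWAP entry with a pre-installed backup. It is only an illustration, not taken from the draft or from any implementation: the class names, the label values, and the next-hop name "upstream" are all hypothetical, and it assumes ABR1 has some per-next-hop liveness signal it can act on.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set, Tuple


@dataclass
class SwapEntry:
    """Rewrite for one incoming label: (outgoing label, next hop) pairs."""
    primary: Tuple[int, str]
    backup: Optional[Tuple[int, str]] = None


class Lfib:
    """Toy label FIB on ABR1 (hypothetical model, not a vendor API)."""

    def __init__(self) -> None:
        self.entries: Dict[int, SwapEntry] = {}
        self.down: Set[str] = set()  # next hops currently marked unreachable

    def install(self, in_label: int, entry: SwapEntry) -> None:
        self.entries[in_label] = entry

    def next_hop_down(self, next_hop: str) -> None:
        # One event for the shared next hop; no per-label entry is rewritten.
        self.down.add(next_hop)

    def forward(self, in_label: int) -> Optional[Tuple[int, str]]:
        entry = self.entries.get(in_label)
        if entry is None:
            return None
        candidates = [entry.primary]
        if entry.backup is not None:
            candidates.append(entry.backup)
        # Use the first rewrite whose next hop is still considered alive.
        for out_label, next_hop in candidates:
            if next_hop not in self.down:
                return out_label, next_hop
        return None


lfib = Lfib()
# Labels ABR1 allocated per prefix; the backup rewrites use ABR2's labels.
lfib.install(1001, SwapEntry(primary=(2001, "upstream"), backup=(3001, "ABR2")))
lfib.install(1002, SwapEntry(primary=(2002, "upstream"), backup=(3002, "ABR2")))

print(lfib.forward(1001))       # (2001, 'upstream')
lfib.next_hop_down("upstream")  # single failure event
print(lfib.forward(1001))       # (3001, 'ABR2')
print(lfib.forward(1002))       # (3002, 'ABR2')
```

The switchover here costs the same whether there are two labels or two million, which is the property I call PIC. Whether ABR1 really gets such a single next-hop event in this topology is a separate question.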

Maybe I confused you because I talked about routes rather than labels. My
bad.

To the authors,

AS2 is further divided into two regions. There are three tunnel domains in
provider's network: The two regions in AS1 use RSVP intra-domain tunnel.
AS2 also uses RSVP-TE intra-domain tunnels. MPLS forwarding is used within
these domains and on inter-domain links. BGP LU (AFI/SAFI: 1/4) is the
transport family providing reachability between PE loopbacks PE25 and
PE11.

I see a subtle mistake here. AS1 does not have two regions that could use
RSVP LSPs; it should probably be AS2.

On Mon, 18 Mar 2024 at 15:21, Robert Raszuk <robert@raszuk.net> wrote:

> Hi Igor,
>
> On Mon, Mar 18, 2024 at 12:13 PM Igor Malyushkin <gmalyushkin@gmail.com>
> wrote:
>
>> Well, maybe there is some gap in terminology. I always considered this
>> behavior as a PIC, because we can switch between the next hops without any
>> dependency on the number of prefixes above. An egress characteristic here
>> is that it happens on a failed next-hop node (an ingress is not aware
>> at the moment or is just starting to react). But we can find a better name
>> for this to avoid confusion.
>>
>
> I disagree.
>
> If you zoom into this specific scenario, the situation described is that,
> say, ABR1 loses (all or some) IBGP sessions outside his left area. Over
> those session(s) he may have received lots of infrastructure routes with lots
> of next hops.
>
> So here it needs to run best path and install all routes one by one into
> RIB and FIB, now pointing towards a peer ABR2.
>
> There is no prefix independence here at all. There is no signalling in
> either IGP or BGP that one next hop is lost and we need to use the other
> one. That would be possible only on PEs, not on ABRs.
>
> So while it is some sort of local protection it is not PIC.
>
> Regards,
> R.
>
>
>
>> Regarding the propagation of withdraws: as I've previously mentioned,
>> traffic may be sent slightly before (a few milliseconds) or right at the
>> time of a failure. Without "protection" at egress, it will be lost if
>> ABRs do not exchange their routes (e.g., because of the same CLUSTER ID).
>> Another point to consider is that fast propagation depends not only
>> on the diameter of the BGP network (the number of BGP hops from a source of
>> the event to all its potential receivers) but also on the situation at
>> every such hop (e.g., CPU spikes). In other words, it is not constant.
>>
>> On Mon, 18 Mar 2024 at 14:54, Robert Raszuk <robert@raszuk.net> wrote:
>>
>>> > the egress PIC
>>>
>>> Except this is not real egress PIC.
>>>
>>> In egress PIC ASBRs or PEs receive EBGP paths and rarely act as RRs.
>>>
>>> Here we seem to have a case of option C and IBGP domain where ABRs are
>>> usually redundantly connected and they learn routes over IBGP from each
>>> site.
>>>
>>> I must admit that I have never seen a real practical analysis if in such
>>> cases we should be doing PIC between ABRs acting as RRs. Especially for
>>> infrastructure routes.
>>>
>>> And btw, propagating withdraws via good RRs was taking single-digit
>>> milliseconds at most the last time I measured.
>>>
>>> Cheers,
>>> R.
>>>
>>>
>>> On Mon, Mar 18, 2024 at 11:24 AM Igor Malyushkin <gmalyushkin@gmail.com>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> AFAIK, the egress PIC is a widely deployed feature with labeled paths.
>>>> One of its characteristics is to preserve in-flight traffic that was sent
>>>> right at the time of a failure event or slightly after it. Traffic is almost
>>>> always faster than any control plane stuff. The significant problem with
>>>> PIC in this case is a possible temporary loop if a destination node fails,
>>>> but that is a separate topic.
>>>>
>>>> My 2 cents.
>>>>
>>>> On Mon, 18 Mar 2024 at 08:40, Kaliraj Vairavakkalai
>>>> <kaliraj=40juniper.net@dmarc.ietf.org> wrote:
>>>>
>>>>> Hi Robert, please see inline. KV>
>>>>>
>>>>>
>>>>>
>>>>> Thanks
>>>>>
>>>>> Kaliraj
>>>>>
>>>>>
>>>>>
>>>>> *From: *Robert Raszuk <robert@raszuk.net>
>>>>> *Date: *Sunday, March 17, 2024 at 11:28 PM
>>>>> *To: *Kaliraj Vairavakkalai <kaliraj@juniper.net>
>>>>> *Cc: *idr@ietf. org <idr@ietf.org>
>>>>> *Subject: *Fwd: I-D Action: draft-ietf-idr-bgp-fwd-rr-02.txt
>>>>>
>>>>>
>>>>>
>>>>> Hi Kaliraj,
>>>>>
>>>>>
>>>>>
>>>>> Thx for posting the new version.
>>>>>
>>>>>
>>>>>
>>>>> I have one observation or clarification to be made in respect to text
>>>>> you added in section 4.1:
>>>>>
>>>>>
>>>>>
>>>>> > However this approach does not allow the ABR-ABR tunnels to be
>>>>>
>>>>> > used as backup path, in the event where an ABR looses all tunnels
>>>>>
>>>>> > to upstream ASBR.
>>>>>
>>>>>
>>>>>
>>>>> So you are talking about the delta time it takes for an ABR which
>>>>> loses all tunnels to upstream ASBRs to send BGP withdraws for those
>>>>> learned infrastructure routes, correct?
>>>>>
>>>>>
>>>>>
>>>>> KV> Yes. Those withdrawals need to anyway happen, and reach both the
>>>>> ingress PEs and adjoining/redundant ABR.
>>>>>
>>>>> KV> So that they can do BGP PIC repair based on that event.
>>>>>
>>>>> KV> Here I am saying that such BGP PIC repair can happen only at
>>>>> ingress PE
>>>>>
>>>>> KV> (which may be multiple BGP hops away), and not at the adjoining
>>>>> ABR.
>>>>>
>>>>>
>>>>>
>>>>> So we are talking 10s of milliseconds here from the moment all such
>>>>> paths are invalidated (the detection and invalidation are needed in
>>>>> any scenario).
>>>>>
>>>>>
>>>>>
>>>>> KV> The BGP update propagation can take longer, based on load on the
>>>>> BGP propagation path. But BGP PIC itself can’t always
>>>>>
>>>>> KV> guarantee 10s of ms restoration. It only guarantees restoring the
>>>>> traffic without depending on service-prefix scale
>>>>>
>>>>> KV> once the unreachability is detected (in this case: BGP withdrawal
>>>>> is received).
>>>>>
>>>>>
>>>>>
>>>>> As you have established, each ABR will set next hop self and advertise
>>>>> routes to local PEs (directly or via yet one more pair of RRs (RR26 here)).
>>>>> So each PE will already have backup paths; all that you are observing here
>>>>> is the time before PEs invalidate paths advertised by the ASBR which
>>>>> loses upstream tunnels.
>>>>>
>>>>>
>>>>>
>>>>> KV> Agreed.
>>>>>
>>>>>
>>>>>
>>>>> So if such failure models are really likely to happen (in spite of
>>>>> redundant ABR connectivity in each area), I would rather focus on fast
>>>>> removal of broken paths from the network with one next hop invalidation
>>>>> (single BGP or IGP message, single RIB to FIB switchover on PEs), etc.
>>>>>
>>>>>
>>>>>
>>>>> KV> As explained above, that would also happen. But it may take longer
>>>>> than if the repair happened at the ABR, which is closer to the failure
>>>>> event.
>>>>>
>>>>> KV> Just a tradeoff to be aware of. Thx.
>>>>>
>>>>>
>>>>>
>>>>> Thx,
>>>>> Robert
>>>>>
>>>>>
>>>>>
>>>>> ---------- Forwarded message ---------
>>>>> From: <internet-drafts@ietf.org>
>>>>> Date: Sun, Mar 17, 2024 at 6:42 AM
>>>>> Subject: I-D Action: draft-ietf-idr-bgp-fwd-rr-02.txt
>>>>> To: <i-d-announce@ietf.org>
>>>>> Cc: <idr@ietf.org>
>>>>>
>>>>>
>>>>>
>>>>> Internet-Draft draft-ietf-idr-bgp-fwd-rr-02.txt is now available. It is
>>>>> a work item of the Inter-Domain Routing (IDR) WG of the IETF.
>>>>>
>>>>>    Title:   BGP Route Reflector with Next Hop Self
>>>>>    Authors: Kaliraj Vairavakkalai
>>>>>             Natrajan Venkataraman
>>>>>    Name:    draft-ietf-idr-bgp-fwd-rr-02.txt
>>>>>    Pages:   9
>>>>>    Dates:   2024-03-16
>>>>>
>>>>> Abstract:
>>>>>
>>>>>    The procedures in BGP Route Reflection (RR) spec RFC4456 primarily
>>>>>    deal with scenarios where the RR is reflecting BGP routes with next
>>>>>    hop unchanged.  In some deployments like Inter-AS Option C
>>>>>    (Section 10, RFC4364), the ABRs may perform RR functionality with
>>>>>    nexthop set to self.  If adequate precautions are not taken, the
>>>>>    RFC4456 procedures can result in traffic forwarding loop in such
>>>>>    deployments.
>>>>>
>>>>>    This document illustrates one such looping scenario, and specifies
>>>>>    approaches to minimize possibility of traffic forwarding loop in such
>>>>>    deployments.  An example with Inter-AS Option C (Section 10, RFC4364)
>>>>>    deployment is used, where RR with next hop self is used at redundant
>>>>>    ABRs when they re-advertise BGP transport family routes between
>>>>>    multiple IGP domains.
>>>>>
>>>>> The IETF datatracker status page for this Internet-Draft is:
>>>>> https://datatracker.ietf.org/doc/draft-ietf-idr-bgp-fwd-rr/
>>>>>
>>>>> There is also an HTMLized version available at:
>>>>> https://datatracker.ietf.org/doc/html/draft-ietf-idr-bgp-fwd-rr-02
>>>>>
>>>>> A diff from the previous version is available at:
>>>>> https://author-tools.ietf.org/iddiff?url2=draft-ietf-idr-bgp-fwd-rr-02
>>>>>
>>>>> Internet-Drafts are also available by rsync at:
>>>>> rsync.ietf.org::internet-drafts
>>>>>
>>>>>