Re: [Idr] I-D Action: draft-ietf-idr-bgp-fwd-rr-02.txt

Igor Malyushkin <gmalyushkin@gmail.com> Mon, 18 March 2024 12:10 UTC

To: Robert Raszuk <robert@raszuk.net>
Cc: Kaliraj Vairavakkalai <kaliraj=40juniper.net@dmarc.ietf.org>, "idr@ietf. org" <idr@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/idr/GGN7_-ARbyT0MCVKCPipfKsY83g>

" Imagine, that the first set is the best one, then ABR1 allocates a label
for every prefix (and its label) from the set and distributes them as
transport prefixes toward his right area (and to ABR2 too). "

I meant the left area here.
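
To make the label picture concrete, here is a rough sketch of a per-prefix SWAP entry that carries both a primary and a backup outgoing label. Everything in it is an assumption made for illustration: the names `SwapEntry` and `build_lfib`, the prefixes, and the label values are hypothetical and are not taken from the draft or from any implementation.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class SwapEntry:
    """One per-prefix SWAP entry on a hypothetical ABR1."""
    prefix: str
    primary_out: int           # label learned from the original source
    backup_out: Optional[int]  # label learned from ABR2 for the same prefix
    primary_up: bool = True

    def out_label(self) -> Optional[int]:
        # Use the primary label while it is usable, otherwise the backup.
        return self.primary_out if self.primary_up else self.backup_out


def build_lfib(primary: Dict[str, int], backup: Dict[str, int],
               first_local: int = 1000) -> Dict[int, SwapEntry]:
    """Allocate one local label per prefix and attach both outgoing labels."""
    lfib = {}
    for offset, prefix in enumerate(sorted(primary)):
        lfib[first_local + offset] = SwapEntry(
            prefix, primary[prefix], backup.get(prefix))
    return lfib


# Hypothetical label bindings: first set from the original sources,
# second set for the same prefixes from ABR2.
primary_set = {"PE11/32": 300, "PE12/32": 301}
backup_set = {"PE11/32": 700, "PE12/32": 701}
lfib = build_lfib(primary_set, backup_set)
```

Marking `primary_up = False` on an entry makes `out_label()` return the label learned from ABR2. The sketch only shows the data structure; it says nothing about how quickly all entries can be switched over.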

On Mon, 18 Mar 2024 at 16:01, Igor Malyushkin <gmalyushkin@gmail.com> wrote:

> I see this differently. ABR1 has two sets of the same
> infrastructure paths. One set is from the original sources (outside his left
> area), and the other is from ABR2. Suppose the first set is the best
> one; then ABR1 allocates a label for every prefix (and its label) from that
> set and distributes them as transport prefixes toward his right area (and
> to ABR2 too). Effectively, this makes ABR1 an LSR, because it performs a
> SWAP for every incoming label. With the per-prefix label allocation mode, it
> is possible to program these SWAPs with more than one outgoing label.
> Since we also have the second set of the same paths from ABR2, we can use
> his labels as a backup. So there is egress PIC for such labels.
>
> Maybe I confused you because I talked about routes rather than labels. My
> bad.
>
> To the authors,
>
> AS2 is further divided into two regions. There are three tunnel domains in
> provider's network: The two regions in AS1 use RSVP intra-domain tunnel.
> AS2 also uses RSVP-TE intra-domain tunnels. MPLS forwarding is used within
> these domains and on inter-domain links. BGP LU (AFI/SAFI: 1/4) is the
> transport family providing reachability between PE loopbacks PE25 and
> PE11.
>
> I see a subtle mistake here: AS1 does not have two regions that could use
> RSVP LSPs; this should probably read AS2.
>
> On Mon, 18 Mar 2024 at 15:21, Robert Raszuk <robert@raszuk.net> wrote:
>
>> Hi Igor,
>>
>> On Mon, Mar 18, 2024 at 12:13 PM Igor Malyushkin <gmalyushkin@gmail.com>
>> wrote:
>>
>>> Well, maybe there is some gap in terminology. I have always considered this
>>> behavior to be PIC, because we can switch between the next hops without any
>>> dependency on the number of prefixes above. An egress characteristic here
>>> is that it happens on a failed next-hop node (the ingress is not yet aware
>>> of the failure or is just starting to react). But we can find a better name
>>> for this to avoid confusion.
>>>
>>
>> I disagree.
>>
>> If you zoom into this specific scenario, the situation described is that,
>> say, ABR1 loses (all or some) IBGP sessions outside his left area. Over
>> those session(s) he may have received lots of infrastructure routes with lots
>> of next hops.
>>
>> So here it needs to run best path and install all routes one by one into
>> the RIB and FIB, now pointing towards the peer ABR2.
>>
>> There is no prefix independence here at all. There is no signalling in
>> either IGP or BGP that one next hop is lost and we need to use the other
>> one. That would be possible only on PEs, not on ABRs.
>>
>> So while it is some sort of local protection it is not PIC.
>>
>> Regards,
>> R.
>>
>>
>>
>>> Speaking about the propagation of withdraws: as I've previously
>>> mentioned, traffic may be sent slightly before (a few milliseconds) or right
>>> at the time of a failure. Without "protection" at egress, it will be lost if
>>> the ABRs do not exchange their routes (e.g., because of the same CLUSTER ID).
>>> Another point to consider is that fast propagation depends not only
>>> on the diameter of the BGP network (the number of BGP hops from the source of
>>> the event to all its potential receivers) but also on the situation at
>>> every such hop (e.g., CPU spikes). In other words, it is not constant.
>>>
>>> On Mon, 18 Mar 2024 at 14:54, Robert Raszuk <robert@raszuk.net> wrote:
>>>
>>>> > the egress PIC
>>>>
>>>> Except this is not real egress PIC.
>>>>
>>>> In egress PIC ASBRs or PEs receive EBGP paths and rarely act as RRs.
>>>>
>>>> Here we seem to have a case of option C and IBGP domain where ABRs are
>>>> usually redundantly connected and they learn routes over IBGP from each
>>>> site.
>>>>
>>>> I must admit that I have never seen a real practical analysis of whether, in
>>>> such cases, we should be doing PIC between ABRs acting as RRs, especially
>>>> for infrastructure routes.
>>>>
>>>> And btw, propagating withdraws via good RRs, last time I measured, was
>>>> taking single-digit milliseconds at most.
>>>>
>>>> Cheers,
>>>> R.
>>>>
>>>>
>>>> On Mon, Mar 18, 2024 at 11:24 AM Igor Malyushkin <gmalyushkin@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> AFAIK, egress PIC is a widely deployed feature with labeled paths.
>>>>> One of its characteristics is that it preserves in-flight traffic that was
>>>>> sent right at the time of a failure event or slightly after it. Traffic is
>>>>> almost always faster than any control plane stuff. The significant problem
>>>>> with PIC in this case is a possible temporary loop if a destination node
>>>>> fails, but that is a separate topic.
>>>>>
>>>>> My 2 cents.
>>>>>
>>>>> On Mon, 18 Mar 2024 at 08:40, Kaliraj Vairavakkalai
>>>>> <kaliraj=40juniper.net@dmarc.ietf.org> wrote:
>>>>>
>>>>>> Hi Robert, please see inline. KV>
>>>>>>
>>>>>>
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>> Kaliraj
>>>>>>
>>>>>> *From: *Robert Raszuk <robert@raszuk.net>
>>>>>> *Date: *Sunday, March 17, 2024 at 11:28 PM
>>>>>> *To: *Kaliraj Vairavakkalai <kaliraj@juniper.net>
>>>>>> *Cc: *idr@ietf. org <idr@ietf.org>
>>>>>> *Subject: *Fwd: I-D Action: draft-ietf-idr-bgp-fwd-rr-02.txt
>>>>>>
>>>>>> Hi Kaliraj,
>>>>>>
>>>>>>
>>>>>>
>>>>>> Thx for posting the new version.
>>>>>>
>>>>>>
>>>>>>
>>>>>> I have one observation, or request for clarification, with respect to the
>>>>>> text you added in section 4.1:
>>>>>>
>>>>>>
>>>>>>
>>>>>> > However this approach does not allow the ABR-ABR tunnels to be
>>>>>> > used as backup path, in the event where an ABR looses all tunnels
>>>>>> > to upstream ASBR.
>>>>>>
>>>>>>
>>>>>>
>>>>>> So you are talking about the delta time it takes for an ABR which
>>>>>> loses all tunnels to upstream ASBRs to send BGP withdraws for those
>>>>>> learned infrastructure routes, correct?
>>>>>>
>>>>>>
>>>>>>
>>>>>> KV> Yes. Those withdrawals need to happen anyway, and reach both the
>>>>>> KV> ingress PEs and the adjoining/redundant ABR, so that they can do BGP
>>>>>> KV> PIC repair based on that event. Here I am saying that such BGP PIC
>>>>>> KV> repair can happen only at the ingress PE (which may be multiple BGP
>>>>>> KV> hops away), and not at the adjoining ABR.
>>>>>>
>>>>>>
>>>>>>
>>>>>> So we are talking 10s of milliseconds here from the moment all such
>>>>>> paths are invalidated (the detection and invalidation are needed in
>>>>>> any scenario).
>>>>>>
>>>>>>
>>>>>>
>>>>>> KV> The BGP update propagation can take longer, depending on the load on
>>>>>> KV> the BGP propagation path. But BGP PIC itself can’t always guarantee
>>>>>> KV> 10s of ms restoration. It only guarantees restoring the traffic
>>>>>> KV> without depending on service-prefix scale once the unreachability is
>>>>>> KV> detected (in this case: once the BGP withdrawal is received).
>>>>>>
>>>>>>
>>>>>>
>>>>>> As you have established, each ABR will set next hop self and advertise
>>>>>> routes to local PEs (directly or via yet one more pair of RRs (RR26 here)).
>>>>>> So each PE will already have backup paths. All you are observing here
>>>>>> is the time before PEs invalidate the paths advertised by the ASBR which
>>>>>> loses its upstream tunnels.
>>>>>>
>>>>>>
>>>>>>
>>>>>> KV> Agreed.
>>>>>>
>>>>>>
>>>>>>
>>>>>> So if such failure modes are really likely to happen (in spite of
>>>>>> redundant ABR connectivity in each area), I would rather focus on fast
>>>>>> removal of broken paths from the network with one next hop invalidation
>>>>>> (single BGP or IGP message, single RIB to FIB switchover on PEs), etc.
>>>>>>
>>>>>>
>>>>>>
>>>>>> KV> As explained above, that would also happen. But it may take
>>>>>> longer than if the repair happened at the ABR, which is closer to the
>>>>>> failure event.
>>>>>>
>>>>>> KV> Just a tradeoff to be aware of. Thx.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Thx,
>>>>>> Robert
>>>>>>
>>>>>>
>>>>>>
>>>>>> ---------- Forwarded message ---------
>>>>>> From: <internet-drafts@ietf.org>
>>>>>> Date: Sun, Mar 17, 2024 at 6:42 AM
>>>>>> Subject: I-D Action: draft-ietf-idr-bgp-fwd-rr-02.txt
>>>>>> To: <i-d-announce@ietf.org>
>>>>>> Cc: <idr@ietf.org>
>>>>>>
>>>>>>
>>>>>>
>>>>>> Internet-Draft draft-ietf-idr-bgp-fwd-rr-02.txt is now available. It
>>>>>> is a work item of the Inter-Domain Routing (IDR) WG of the IETF.
>>>>>>
>>>>>>    Title:   BGP Route Reflector with Next Hop Self
>>>>>>    Authors: Kaliraj Vairavakkalai
>>>>>>             Natrajan Venkataraman
>>>>>>    Name:    draft-ietf-idr-bgp-fwd-rr-02.txt
>>>>>>    Pages:   9
>>>>>>    Dates:   2024-03-16
>>>>>>
>>>>>> Abstract:
>>>>>>
>>>>>>    The procedures in BGP Route Reflection (RR) spec RFC4456 primarily
>>>>>>    deal with scenarios where the RR is reflecting BGP routes with next
>>>>>>    hop unchanged.  In some deployments like Inter-AS Option C
>>>>>>    (Section 10, RFC4364), the ABRs may perform RR functionality with
>>>>>>    nexthop set to self.  If adequate precautions are not taken, the
>>>>>>    RFC4456 procedures can result in traffic forwarding loop in such
>>>>>>    deployments.
>>>>>>
>>>>>>    This document illustrates one such looping scenario, and specifies
>>>>>>    approaches to minimize possibility of traffic forwarding loop in
>>>>>>    such deployments.  An example with Inter-AS Option C (Section 10,
>>>>>>    RFC4364) deployment is used, where RR with next hop self is used at
>>>>>>    redundant ABRs when they re-advertise BGP transport family routes
>>>>>>    between multiple IGP domains.
>>>>>>
>>>>>> The IETF datatracker status page for this Internet-Draft is:
>>>>>> https://datatracker.ietf.org/doc/draft-ietf-idr-bgp-fwd-rr/
>>>>>>
>>>>>> There is also an HTMLized version available at:
>>>>>> https://datatracker.ietf.org/doc/html/draft-ietf-idr-bgp-fwd-rr-02
>>>>>>
>>>>>> A diff from the previous version is available at:
>>>>>> https://author-tools.ietf.org/iddiff?url2=draft-ietf-idr-bgp-fwd-rr-02
>>>>>>
>>>>>> Internet-Drafts are also available by rsync at:
>>>>>> rsync.ietf.org::internet-drafts
>>>>>>
>>>>>>