Re: [Idr] [spring] Error Handling for BGP-LS with Segment Routing

<bruno.decraene@orange.com> Tue, 08 January 2019 14:33 UTC

From: bruno.decraene@orange.com
To: Rob Shakir <robjs=40google.com@dmarc.ietf.org>, Alvaro Retana <aretana.ietf@gmail.com>
CC: "idr@ietf. org" <idr@ietf.org>, SPRING WG <spring@ietf.org>, Robert Raszuk <rraszuk@gmail.com>
Date: Tue, 08 Jan 2019 14:33:48 +0000
Archived-At: <https://mailarchive.ietf.org/arch/msg/idr/GorSPuprpf5DiVOR5PjVzWBrshE>

Hi Alvaro, all

Also speaking as a WG participant

For what it’s worth, +1 to Rob’s email.

> From: spring [mailto:spring-bounces@ietf.org] On Behalf Of Alvaro Retana
> Sent: Thursday, January 03, 2019 10:40 PM
[…]
> I fully realize that I may be the only one who thinks there’s an issue…

I don’t think so. But at least the specification does define the behavior in this error condition. And I don’t think that there is a perfect solution: RFC 7606 behavior seems good enough to me and IMHO, going further would require discussing alternative technical proposals (solutions) rather than just raising the problem.

But I’m wondering why error handling is that specific to BGP-LS. Why has that point not been raised on, let’s say, draft-ietf-ospf-ospfv3-segment-routing-extensions, which is currently under IESG review? I can see that the specifics are (a bit) different, but the big picture seems the same: the information is incomplete, so how do we handle this?
Then, I’m not sure that the problem is specific/limited to SR/SID information.

Thanks,
Cheers,
--Bruno

From: spring [mailto:spring-bounces@ietf.org] On Behalf Of Rob Shakir
Sent: Thursday, January 03, 2019 11:40 PM
To: Alvaro Retana
Cc: idr@ietf. org; SPRING WG; Robert Raszuk
Subject: Re: [spring] Error Handling for BGP-LS with Segment Routing

Hi Alvaro,

Also speaking as a WG participant :-)

On Thu, Jan 3, 2019 at 1:40 PM Alvaro Retana <aretana.ietf@gmail.com> wrote:
BGP-LS only defines a mechanism through which it may miss information, but not how to handle it -- or maybe it does (?): by using attribute discard it just accepts that the information might be missing going forward…and doesn’t attempt to do anything.  Maybe this quote is true: “Doing Nothing Often Leads to the Very Best Something” -- Winnie the Pooh

I think that it defines *something*, albeit not explicitly. Essentially, as I read it, we're saying "when an attribute encoded by the advertising BGP-LS source is incorrect, then BGP-LS as a system will prefer to use partial information" (partial information, since we assume that some information does get through, since the NLRI could be parsed).

That action may be ok in the general case…but I think that doing nothing may not be enough/appropriate for an application like SR, because it is explicitly calculating paths….

The point I’m trying to bring up is not necessarily treat-as-withdraw vs. attribute discard…. But, first, is attribute discard enough/appropriate/good for a BGP-LS application such as SR?  If it isn’t, second, is there a different approach that would be better?  Maybe we then come to a point where something can change…or accept the limitations of the system and be clear about them.  I fully realize that I may be the only one who thinks there’s an issue…

My point was really the same... The question I was trying to raise is "what is the alternative that you would suggest?". Other technologies that fulfill the same role as BGP-LS (those that I described) don't take a very different approach.

Clearly, it's bad to calculate paths with incomplete information about the topology of the network. It's also bad to calculate zero paths because you discarded the entire topology based on an error. We're between a rock and a hard place in terms of maintaining system functionality here -- all systems that do the same job as BGP-LS have to make some form of compromise about which constraint (correctness, or connectivity) they violate.

This is why I was arguing for leaving things unchanged -- the correctness constraint seems OK to violate by default. If there are deployments where connectivity is the desirable constraint to violate, then reacting to the fact that attribute-discard did occur is possible (or not configuring 7606 error handling if the implementation supports this).
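
To make the trade-off concrete, here is a minimal sketch, not an implementation of RFC 7606 or of any BGP-LS stack. It assumes a toy model in which each link carries a single SR attribute (an adjacency SID) and in which a malformed attribute affects exactly one link; the names `Link`, `apply_policy` and `find_path` are hypothetical. Under "attribute-discard" the link stays in the topology without its SID (correctness is violated), while under "treat-as-withdraw" the link disappears (connectivity is violated). The `discard_seen` flag illustrates the "reacting to the fact that attribute-discard did occur" option.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Link:
    src: str
    dst: str
    adj_sid: Optional[int]   # toy stand-in for the SR data in the attribute
    malformed: bool = False  # True if the attribute failed to parse


def apply_policy(links, policy):
    """Return (topology, discard_seen) under a toy error-handling policy."""
    kept, discard_seen = [], False
    for link in links:
        if not link.malformed:
            kept.append(link)
        elif policy == "attribute-discard":
            # Keep the link itself, but lose the attribute content.
            kept.append(Link(link.src, link.dst, adj_sid=None))
            discard_seen = True
        elif policy == "treat-as-withdraw":
            # Drop the link entirely.
            continue
        else:
            raise ValueError(f"unknown policy: {policy}")
    return kept, discard_seen


def find_path(links, start, goal):
    """Breadth-first search; returns the list of Links used, or None."""
    adjacency = {}
    for link in links:
        adjacency.setdefault(link.src, []).append(link)
    queue, visited = deque([(start, [])]), {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for link in adjacency.get(node, []):
            if link.dst not in visited:
                visited.add(link.dst)
                queue.append((link.dst, path + [link]))
    return None


# Hypothetical two-hop topology; the B->C attribute is malformed.
LINKS = [
    Link("A", "B", adj_sid=24001),
    Link("B", "C", adj_sid=24002, malformed=True),
]

for policy in ("attribute-discard", "treat-as-withdraw"):
    topology, discard_seen = apply_policy(LINKS, policy)
    path = find_path(topology, "A", "C")
    sids = None if path is None else [link.adj_sid for link in path]
    print(policy, "path SIDs:", sids, "discard seen:", discard_seen)
```

A consumer that prefers to violate connectivity rather than correctness could, in this sketch, refuse to use any path containing a link whose `adj_sid` is `None` whenever `discard_seen` is set.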

Describing these compromises is, of course, a good idea. However, it's not clear where this description would go -- we don't really have a document that describes this overall system and how it might be implemented today.

Cheers and HNY!
r.



Thanks!!

Alvaro.



On December 21, 2018 at 11:23:16 AM, Rob Shakir (robjs@google.com) wrote:
Alvaro,

I think this is one of the difficulties of overloading a protocol like BGP with different datasets -- it's not simple to say how particular attributes are actually going to be used within a protocol deployment. This was one of the things that was noted in 7606 -- i.e., I can make *any* attribute really affect forwarding if I write a policy that accepts/rejects some UPDATE based on the presence of that attribute.

In general, any topology discovery mechanism (whether used in real-time or not) needs to define how it handles cases where it might end up with missing information. Let's consider the different discovery mechanisms we have today:

  *   IGP listening -- in this case, if we have some malformed IS-IS TLV, then we might end up discarding this information (whether it be at the listening node, or a device that didn't flood it earlier in the chain) -- meaning that we know that we have some potential gap in the topology.
  *   Streaming telemetry -- speaking particularly to gNMI for LSDB streaming encoded using the OpenConfig model, here, we are tolerant to getting as much information as can be parsed, and have a way to carry unknown TLVs (which might include those that cannot be successfully parsed) as binary data to the external consumer. This means that the approach is "as complete data as possible", but it shares the characteristic that we can still end up losing data.
  *   BGP-LS with attribute discard -- this has some information loss, since we'll have some attributes that could be malformed in the input data, and we discard them at the receiver.
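
The "as complete data as possible" idea from the telemetry bullet can be sketched as follows. This is an illustration only, not the gNMI or OpenConfig encoding: it assumes a generic TLV layout with a 1-byte type and a 1-byte length, and the decoder table (`DECODERS`, with a made-up type 1 holding a 32-bit value) is hypothetical. TLVs that are unknown or fail to decode are kept as raw bytes instead of being dropped.

```python
import struct

# Hypothetical decoder table: type 1 is assumed to carry one 32-bit integer.
DECODERS = {1: lambda value: struct.unpack("!I", value)[0]}


def parse_tlvs(data: bytes):
    """Parse type/length/value records, keeping what cannot be decoded.

    Returns (decoded, opaque): decoded maps TLV type to its parsed value,
    opaque lists (type, raw_bytes) for unknown or malformed TLVs.
    """
    decoded, opaque = {}, []
    offset = 0
    # A trailing fragment shorter than a TLV header is ignored.
    while offset + 2 <= len(data):
        tlv_type, length = data[offset], data[offset + 1]
        value = data[offset + 2 : offset + 2 + length]
        if len(value) < length:
            # Truncated TLV: keep the remaining bytes as opaque data and stop.
            opaque.append((tlv_type, value))
            break
        decoder = DECODERS.get(tlv_type)
        try:
            if decoder is None:
                raise ValueError("unknown TLV type")
            decoded[tlv_type] = decoder(value)
        except (ValueError, struct.error):
            # Unknown or malformed: pass through as binary, do not discard.
            opaque.append((tlv_type, value))
        offset += 2 + length
    return decoded, opaque


# One valid type-1 TLV, one malformed type-1 TLV, one unknown type-99 TLV.
sample = b"\x01\x04\x00\x00\x00\x0a" + b"\x01\x02\xff\xff" + b"\x63\x01\x07"
decoded, opaque = parse_tlvs(sample)
print(decoded, opaque)
```

The consumer still cannot interpret the opaque entries, so the data can be incomplete for path computation even though nothing was thrown away.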
Given that the source of the data is the IGP, and that we might have information discarded there, it doesn't seem to me that we can really guarantee strong consistency of an off-box view of the network, since we can't guarantee strong consistency across the IGP domain itself.

Thus, I'm not sure that the issue that is being highlighted here actually makes a difference when we're considering the overall system design -- we always need to deal with the fact that the view of the network at the path computing node might not match exactly the network's current state in the presence of malformed protocol messages. One motivation for having the LSDB via streaming telemetry is the ability to provide such validation ("do all nodes within my IGP domain, including listeners, have a consistent view of the state of the network?").
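
The validation question quoted above ("do all nodes ... have a consistent view?") can be sketched as a comparison of per-observer views. This is a toy illustration under stated assumptions: each view is reduced to a mapping from a link to one attribute value, and the observer names and the `inconsistent_links` helper are hypothetical rather than part of any telemetry or BGP-LS tooling.

```python
def inconsistent_links(views):
    """Report links on which observers disagree.

    views maps an observer name to {(src, dst): adj_sid_or_None}.
    Returns {link: {observer: value}} for every link whose value differs
    between observers; a link an observer lacks is reported as "absent".
    """
    all_links = set()
    for view in views.values():
        all_links.update(view)
    report = {}
    for link in all_links:
        seen = {name: view.get(link, "absent") for name, view in views.items()}
        if len(set(seen.values())) > 1:
            report[link] = seen
    return report


# Hypothetical views: the BGP-LS consumer lost the SID on B->C.
VIEWS = {
    "router-A": {("A", "B"): 24001, ("B", "C"): 24002},
    "igp-listener": {("A", "B"): 24001, ("B", "C"): 24002},
    "bgp-ls-consumer": {("A", "B"): 24001, ("B", "C"): None},
}

for link, seen in inconsistent_links(VIEWS).items():
    print(link, seen)
```

A check like this only detects that the views differ; it does not say which observer is right.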

If the discussion is "should we adopt treat-as-withdraw vs. attribute discard?" -- I don't think that from the system perspective there is really any difference between the two in this situation. We still have the same potentially inconsistent view of the network.

For these reasons, I'd err on the side of leaving this unchanged in the current specification(s).

Cheers,
r.
