[Idr] Error Handling for BGP-LS with Segment Routing

Alvaro Retana <aretana.ietf@gmail.com> Tue, 18 December 2018 21:09 UTC

From: Alvaro Retana <aretana.ietf@gmail.com>
Date: Tue, 18 Dec 2018 13:09:07 -0800
Message-ID: <CAMMESsz8Z_B1aH-4wYL-V9cV=5Xse+tpKqXFish6+V+td7KKzw@mail.gmail.com>
To: "idr@ietf. org" <idr@ietf.org>, SPRING WG <spring@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/idr/0qsjuVaEhroKInHZMohP6HnsWd8>

Dear idr and spring WGs:

tl;dr  I don't think that BGP-LS, with error handling as specified
("attribute discard"), can provide the robustness that an application (like
SR), with direct impact on the forwarding in the network, needs.  [Jump to
the bottom for discussion.]


The BGP-LS extensions for SR (e.g.
draft-ietf-idr-bgp-ls-segment-routing-ext) are, as explained in that draft,
used so that "an external component (e.g., a controller) then can collect
SR information from across an SR domain and construct the end-to-end path
(with its associated SIDs) that need to be applied to an incoming packet to
achieve the desired end-to-end forwarding."

To me, that obviously implies that the use of BGP-LS for SR has a direct
effect on how traffic is forwarded in the network.  Does anyone see it
differently?


The error handling mechanism specified in rfc7752 is "attribute discard"
[rfc7606].  If an error is detected, then the information in the controller
may be, at best, incomplete, but it could also be out of date...resulting
in "segment routes" that don't follow the best available path or that may
even end in a black hole.

It seems clear to me that this is one of the cases that rfc7606 warned
about:

   o  Attribute discard: In this approach, the malformed attribute MUST
      be discarded and the UPDATE message continues to be processed.
      This approach MUST NOT be used except in the case of an attribute
      that has no effect on route selection or installation.

      ...
   For any malformed attribute that is handled by the "attribute
   discard" instead of the "treat-as-withdraw" approach, it is critical
   to consider the potential impact of doing so.  In particular, if the
   attribute in question has or may have an effect on route selection or
   installation, the presumption is that discarding it is unsafe unless
   careful analysis proves otherwise.  The analysis should take into
   account the tradeoff between preserving connectivity and potential
   side effects.
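
To make the difference between the two approaches concrete, here is a
minimal sketch of a toy controller database.  Everything in it is an
assumption made for illustration (the `Store` type, `process_update`,
`usable_nodes`, the policy strings and the SID values); it is not taken from
rfc7606, rfc7752 or any implementation.  It only shows that "attribute
discard" leaves an entry behind with no usable SR information, while
"treat-as-withdraw" removes the entry altogether.

```python
from typing import Dict, List, Optional

# Hypothetical toy model: NLRI key -> contents of the BGP-LS attribute.
# None means the NLRI is present but its attribute was discarded.
Store = Dict[str, Optional[dict]]


def process_update(store: Store, nlri: str, attr: Optional[dict],
                   malformed: bool, policy: str) -> None:
    """Apply one UPDATE for `nlri` under the given error-handling policy."""
    if not malformed:
        store[nlri] = attr
    elif policy == "attribute-discard":
        # The malformed attribute is dropped and processing continues:
        # the NLRI stays in the store, but without its attribute.
        store[nlri] = None
    elif policy == "treat-as-withdraw":
        # The UPDATE is handled as if it had withdrawn the NLRI.
        store.pop(nlri, None)
    else:
        raise ValueError(f"unknown policy: {policy}")


def usable_nodes(store: Store) -> List[str]:
    """NLRIs for which the controller still holds attribute information."""
    return sorted(n for n, a in store.items() if a is not None)


# Same two UPDATEs, processed under each policy.
discard: Store = {}
withdraw: Store = {}
for store, policy in ((discard, "attribute-discard"),
                      (withdraw, "treat-as-withdraw")):
    process_update(store, "node-A", {"sid": 16001}, False, policy)
    process_update(store, "node-A", {"sid": 16002}, True, policy)
```

In this sketch `discard` ends up holding "node-A" with no attribute, and
`withdraw` ends up empty.  In both cases the controller has lost the SR
information for the node; the difference is whether it can tell that
something is missing.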


There was a related discussion as a result of my AD review of
draft-ietf-idr-ls-distribution (= rfc7752) [1][2].  At that time (2015),
the consensus on the list was (paraphrasing): if there's a malformed
attribute we won't be able to recover, but that's ok because BGP-LS is
"purely application-level data that has no immediate corresponding
forwarding state impact", and there won't be an impact on critical AFI/SAFI
for network operations.   No one else argued against that...so I ended up
in the rough...

I think the situation has now changed because BGP-LS is carrying SR
information that is used to define paths in the network -- even if
isolation exists, as described in rfc7752:

                 ...    Furthermore, it is anticipated that
   distribution of this NLRI will be handled by dedicated route
   reflectors providing a level of isolation and fault containment
   between different NLRI types.

...the BGP-LS information could still be incomplete, stale, etc.


After all that...  I don't think that BGP-LS, with error handling as
specified ("attribute discard"), can provide the robustness that an
application (like SR), with direct impact on the forwarding in the network,
needs.

What now?  I see several potential paths forward (there are probably more):

(1) "fix" BGP-LS to mandate (MUST) isolation and change the error handling
approach

(2) change the error handling approach...maybe just when used with SR

(3) the controller should only use the SR information received from routing
protocols (IGP/BGP, e.g. draft-ietf-idr-bgp-prefix-sid)

(4) ..??


I didn't find a specific discussion about this topic in the archive...but I
may have missed it in between other related ones.  If I did, please point
me to it.

Thoughts/ideas/comments?

Thanks!

Alvaro.

[1] https://mailarchive.ietf.org/arch/msg/idr/FomvQV2DqjaaRiAcLYLn3LcIdYM
[2] https://mailarchive.ietf.org/arch/msg/idr/wbPNQ-HM2NeR75gR2Or948J9o1I