Re: [Idr] [spring] Error Handling for BGP-LS with Segment Routing

Robert Raszuk <robert@raszuk.net> Thu, 03 January 2019 22:22 UTC

From: Robert Raszuk <robert@raszuk.net>
Date: Thu, 3 Jan 2019 23:22:41 +0100
Message-ID: <CAOj+MMHCSZEn2St-vx69SzwuiSZVR2wX_s3dgWNuPF+QpHtGCw@mail.gmail.com>
To: Alvaro Retana <aretana.ietf@gmail.com>
Cc: Rob Shakir <robjs@google.com>, "idr@ietf. org" <idr@ietf.org>, SPRING WG <spring@ietf.org>, Robert Raszuk <rraszuk@gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/idr/7JtzPXRuDaHQps8lRQtMSbU3JO8>
Subject: Re: [Idr] [spring] Error Handling for BGP-LS with Segment Routing

Hi Alvaro,

> BGP-LS only defines a mechanism through which it may miss information,
> but not how to handle it

I think the point at least some of us are trying to make is that the
overall application is responsible for building in proper redundancy.

BGP-LS error handling is no worse than that of BGP IPv4 or IPv6 routing - do
you agree? If so, unreachability should be handled end to end, say by dual
homing, falling back from SR paths to the "native IP path", etc.

In my view the SR controller is mainly used as an optimization, not as a
critical element - well, at least in the deployment models I would
personally recommend.
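
As a rough illustration of the fallback idea above, here is a minimal Python
sketch. It is not taken from any implementation: Path, select_path and the
sr_usable signal are hypothetical names invented for this example, and how
an ingress node would actually learn that an SR path is unusable is left
open.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Path:
    kind: str        # "sr" or "native-ip"
    hops: List[str]  # node names, for illustration only


def select_path(sr_path: Optional[Path], native_path: Path,
                sr_usable: bool) -> Path:
    """Prefer the controller-computed SR path, else use the native IP path.

    sr_usable is a hypothetical health signal, e.g. False when the
    controller's topology view is known to be incomplete or the SR path
    failed a liveness check.
    """
    if sr_path is not None and sr_usable:
        return sr_path
    # The SR path is treated as an optimization only: without it,
    # traffic still follows the path the IGP computed.
    return native_path


native = Path("native-ip", ["PE1", "P1", "PE2"])
optimized = Path("sr", ["PE1", "P3", "P4", "PE2"])

print(select_path(optimized, native, sr_usable=True).kind)
print(select_path(optimized, native, sr_usable=False).kind)
print(select_path(None, native, sr_usable=True).kind)
```

The only design point is that losing the controller's input degrades
optimization, not reachability.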

Regards,
R.


On Thu, Jan 3, 2019 at 10:40 PM Alvaro Retana <aretana.ietf@gmail.com>
wrote:

> Rob:
>
> Hi!
>
> I don’t think I said it before:  speaking as a WG participant...
>
> I want to pick up on a point you make below, which I agree with: "any
> topology discovery mechanism (whether used in real-time or not) needs to
> define how it handles cases where it might end up with missing
> information”….
>
> BGP-LS only defines a mechanism through which it may miss information, but
> not how to handle it — or maybe it does (?): by using attribute discard it
> just accepts that the information might be missing going forward…and
> doesn’t attempt to do anything. Maybe this quote is true: "Doing Nothing
> Often Leads to the Very Best Something" - Winnie the Pooh
>
> That action may be ok in the general case…but I think that doing nothing
> may not be enough/appropriate for an application like SR, because it is
> explicitly calculating paths….
>
>
> The point I’m trying to bring up is not necessarily treat-as-withdraw vs.
> attribute discard…. But, first, is attribute discard
> enough/appropriate/good for a BGP-LS application such as SR?  If it isn’t,
> second, is there a different approach that would be better?  Maybe we then
> come to a point where something can change…or accept the limitations of the
> system and be clear about them.  I fully realize that I may be the only one
> who thinks there’s an issue…
>
> Thanks!!
>
> Alvaro.
>
>
> On December 21, 2018 at 11:23:16 AM, Rob Shakir (robjs@google.com) wrote:
>
> Alvaro,
>
> I think this is one of the difficulties of overloading a protocol like BGP
> with different datasets -- it's not simple to say how particular attributes
> are actually going to be used within a protocol deployment. This was one of
> the things that was noted in 7606 -- i.e., I can make *any* attribute
> really affect forwarding if I write a policy that accepts/rejects some
> UPDATE based on the presence of that attribute.
>
> In general, any topology discovery mechanism (whether used in real-time or
> not) needs to define how it handles cases where it might end up with
> missing information. Let's consider the different discovery mechanisms we
> have today:
>
>    - IGP listening -- in this case, if we have some malformed IS-IS TLV,
>    then we might end up discarding this information (whether it be at the
>    listening node, or a device that didn't flood it earlier in the chain) --
>    meaning that we know that we have some potential gap in the topology.
>    - Streaming telemetry -- speaking particularly to gNMI for LSDB
>    streaming encoded using the OpenConfig model, here, we are tolerant to
>    getting as much information as can be parsed, and have a way to carry
>    unknown TLVs (which might include those that cannot be successfully parsed)
>    as binary data to the external consumer. This means that the approach is
>    "as complete data as possible", but it shares the characteristic that we
>    can still end up losing data.
>    - BGP-LS with attribute discard -- this has some information loss,
>    since we'll have some attributes that could be malformed in the input data,
>    and we discard them at the receiver.
>
> It doesn't seem to me that, given the source of the data is the IGP, and
> we might have information discarded there -- that we can really guarantee
> strong consistency of an off-box view of the network, since we can't
> guarantee strong consistency across the IGP domain itself.
>
> Thus, I'm not sure that the issue that is being highlighted here actually
> makes a difference when we're considering the overall system design -- we
> always need to deal with the fact that the view of the network at the path
> computing node might not match exactly the network's current state in the
> presence of malformed protocol messages. One motivation for having the LSDB
> via streaming telemetry is the ability to provide such validation ("do all
> nodes within my IGP domain, including listeners, have a consistent view of
> the state of the network?").
>
> If the discussion is "should we adopt treat-as-withdraw vs. attribute
> discard?" -- I don't think that from the system perspective there is really
> any difference between the two in this situation. We still have the same
> potentially inconsistent view of the network.
>
> For these reasons, I'd err on the side of leaving this unchanged in the
> current specification(s).
>
> Cheers,
> r.
>