Re: [Idr] I-D Action: draft-ietf-idr-error-handling-03.txt

Jeff Wheeler <jsw@inconcepts.biz> Mon, 10 December 2012 16:13 UTC

Date: Mon, 10 Dec 2012 11:13:26 -0500
From: Jeff Wheeler <jsw@inconcepts.biz>
To: Chris Hall <chris.hall@highwayman.com>
Cc: idr@ietf.org
Subject: Re: [Idr] I-D Action: draft-ietf-idr-error-handling-03.txt

On Mon, Dec 10, 2012 at 8:26 AM, Chris Hall <chris.hall@highwayman.com>
wrote:
> However, given some way of determining the (likely ?) impact of
> "unsafe" "treat-as-withdraw", then one could assess whether that is
> better or worse than session-reset (under some circumstances ?) -- in

Why is the question not answerable by one of three options?
1# ignore
2# treat-as-withdraw
3# session-reset

1# IGNORE is hazardous but probably only to the prefixes in that update (or
withdraw) which means the scope of malfunction is relatively small.  If
someone announces a bad route to the DFZ, or a buggy switch announces wrong
L3VPN information, this will not "spill over" to any other prefixes, as
long as you honor any withdraw that happened to be in the same Message but
before the damaged UPDATE.

Yes, perhaps reachability to the affected prefixes will be gone.  There
could be a loop created and packets will have to expire due to TTL until
this loop is resolved.  But either way, I think the scope of the
malfunction is only prefixes that were in the bad UPDATE.  Even if they
weren't packed into the same UPDATE they will share the same Attributes and
so the sender is likely to make the same mistake.  The exception here is if
an MP_UNREACH_NLRI Attribute is corrupt but the native NLRI in the outside
part of the Message are not damaged.  You would be ignoring them and
causing a little bit more RIB inconsistency.

2# TREAT-AS-WITHDRAW is hazardous.  Loops could be created in a native
forwarding path (IPv4/IPv6 with no labels) but in MPLS VPNs or when using
labeled-unicast, there will not be any loops.  Reachability may be lost but
if any undamaged path is available then it will be selected as best and
installed.

The great risk is, how do you guess what the prefixes are, if the framing
is wrong?  In my view, here is one way that is rather thorough for finding
MP NLRIs:

Beginning with or following the damaged Attribute (which one?), scan for
MP_REACH_NLRI:
*AttrFlags AttrType AttrLen* AFI SAFI NextHopLen NH *0x00* PfxLen Pfx
(PfxLen Pfx){0,}

You know what AFI, SAFI you are willing to support on this BGP session
because this was determined when the session established.  If you find 0x0e
AFI SAFI sequences you will hope the next octet is NextHopLen and then look
forward that many octets, hoping to find a 0x00 which is a handy reserved
field.*  If you found that 0x00 you can see if a PfxLen follows that is a
sane length for this AFI SAFI -- you know it won't be greater than 32 for
IPv4 or greater than 128 for IPv6, for example.  After that you expect to
skip the Pfx and keep looking for more sequences like this until you run
out of AttrLen.

* for information on this reserved field, see RFC4760 Pg4; for history,
RFC2283 Pg3 "Number of SNPAs"
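
To make the pattern match above concrete, here is a rough sketch in
Python.  It is only an illustration under assumptions: the names
scan_mp_reach and parse_nlri are made up, it handles only plain unicast
NLRI (labeled or VPN SAFIs encode their prefixes differently), and it
tries every octet offset instead of starting at the damaged Attribute.

```python
import struct

MP_REACH_NLRI = 0x0E          # attribute type code 14 (RFC 4760)
EXT_LEN_FLAG = 0x10           # Extended Length bit in AttrFlags
MAX_PFXLEN = {1: 32, 2: 128}  # AFI 1 = IPv4, AFI 2 = IPv6 (unicast only)


def parse_nlri(data, max_len):
    """Return [(PfxLen, Pfx), ...] if data is exactly a run of sane
    prefixes, otherwise None."""
    prefixes, i = [], 0
    while i < len(data):
        plen = data[i]
        nbytes = (plen + 7) // 8
        if plen > max_len or i + 1 + nbytes > len(data):
            return None
        prefixes.append((plen, bytes(data[i + 1:i + 1 + nbytes])))
        i += 1 + nbytes
    return prefixes


def scan_mp_reach(buf, negotiated):
    """Yield (offset, afi, safi, prefixes) for every offset in buf that
    looks like a well-formed MP_REACH_NLRI for a negotiated (AFI, SAFI)."""
    for off in range(len(buf) - 2):
        flags, atype = buf[off], buf[off + 1]
        if atype != MP_REACH_NLRI:
            continue
        if flags & EXT_LEN_FLAG:          # two-octet AttrLen
            if off + 4 > len(buf):
                continue
            alen, body = struct.unpack_from("!H", buf, off + 2)[0], off + 4
        else:                             # one-octet AttrLen
            alen, body = buf[off + 2], off + 3
        end = body + alen
        # AFI(2) + SAFI(1) + NextHopLen(1) + reserved(1) is the minimum.
        if alen < 5 or end > len(buf):
            continue
        afi, safi, nhlen = struct.unpack_from("!HBB", buf, body)
        max_len = MAX_PFXLEN.get(afi)
        if (afi, safi) not in negotiated or max_len is None:
            continue
        rsvd = body + 4 + nhlen           # the reserved octet after NH
        if rsvd >= end or buf[rsvd] != 0x00:
            continue
        prefixes = parse_nlri(buf[rsvd + 1:end], max_len)
        if prefixes:                      # need at least one PfxLen Pfx
            yield off, afi, safi, prefixes
```

Every check here (type code, negotiated AFI SAFI, NextHopLen landing on a
0x00, sane PfxLen, prefixes ending exactly at AttrLen) removes some
false positives, but none of them proves the match is real.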

So for updates to prefixes you might be able to find the MP NLRIs with a
good degree of confidence even if another Attribute is damaged.

With MP_UNREACH_NLRI you can similarly search for a known pattern and hope
to find the MP NLRIs.
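
A sketch of the MP_UNREACH_NLRI case, again hypothetical (scan_mp_unreach
and walk_prefixes are invented names, unicast NLRI only).  Note there is
less to check here: no NextHopLen and no reserved 0x00, so only the type
code, the negotiated AFI SAFI, and the prefix framing constrain the match.

```python
import struct

MP_UNREACH_NLRI = 0x0F        # attribute type code 15 (RFC 4760)
EXT_LEN_FLAG = 0x10           # Extended Length bit in AttrFlags
MAX_PFXLEN = {1: 32, 2: 128}  # AFI 1 = IPv4, AFI 2 = IPv6 (unicast only)


def walk_prefixes(data, max_len):
    """Return [(PfxLen, Pfx), ...] if data is exactly a run of sane
    prefixes, otherwise None."""
    out, i = [], 0
    while i < len(data):
        plen = data[i]
        nbytes = (plen + 7) // 8
        if plen > max_len or i + 1 + nbytes > len(data):
            return None
        out.append((plen, bytes(data[i + 1:i + 1 + nbytes])))
        i += 1 + nbytes
    return out


def scan_mp_unreach(buf, negotiated):
    """Return [(offset, afi, safi, withdrawn), ...] for every offset that
    looks like a well-formed MP_UNREACH_NLRI."""
    hits = []
    for off in range(len(buf) - 2):
        if buf[off + 1] != MP_UNREACH_NLRI:
            continue
        if buf[off] & EXT_LEN_FLAG:       # two-octet AttrLen
            if off + 4 > len(buf):
                continue
            alen, body = struct.unpack_from("!H", buf, off + 2)[0], off + 4
        else:                             # one-octet AttrLen
            alen, body = buf[off + 2], off + 3
        end = body + alen
        if alen < 3 or end > len(buf):    # need at least AFI(2) + SAFI(1)
            continue
        afi, safi = struct.unpack_from("!HB", buf, body)
        if (afi, safi) not in negotiated or afi not in MAX_PFXLEN:
            continue
        withdrawn = walk_prefixes(buf[body + 3:end], MAX_PFXLEN[afi])
        if withdrawn:
            hits.append((off, afi, safi, withdrawn))
    return hits
```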

If you DO get tricked by the damaged packet, you will cause yourself to
withdraw routes that had nothing to do with the malfunction.  This is
unfortunate just to avoid the chance of loops on what is hopefully a
limited number of prefixes.

If the MP_REACH_NLRI or MP_UNREACH_NLRI itself is damaged then you will
have a hard time finding the prefixes.  Maybe they won't even be there at
all, so no clever pattern match to look for them would be helpful.
Whatever the sending side did to send this bad message is unknown.

For problems with native, non-MP Messages, you really do not have as much
context while you are looking for the NLRI.  It will be hard to find them
without false-positives.  Fortunately you still have the opportunity for a
fairly good sanity check if you simply look from the end of the Message
backwards, and do a bit of consistency checking.  This sounds
computationally expensive but I think it actually isn't.  It would not be
very hard to write an implementation of this search and test its speed.

In any case, this code will not be executed except when damaged updates are
encountered, and at that point, you can compare the computational expense
of guessing around the problem, to the expense of resetting sessions over
and over and potentially creating work all through your network.

3# SESSION-RESET: everyone understands what this behavior does.

If my analysis above is correct, you could easily make an argument for
un-safe ignore.  I think un-safe withdraw is a good OPTION because I
usually believe in giving vendors lots of flexibility to offer knobs to the
operator.  The vendor can certainly decide how robust his un-safe withdraw
checks will be.  Operators can turn the knob whatever way they want, but
probably IGNORE will be smart in almost all cases.  If I could only have
one of those two things I would rather have IGNORE.

-- 
Jeff S Wheeler <jsw@inconcepts.biz>
Sr Network Operator  /  Innovative Network Concepts
