Adrian Farrel's Comments on draft-ietf-rtgwg-remote-lfa

Stewart Bryant <stbryant@cisco.com> Fri, 30 January 2015 18:10 UTC

Message-ID: <54CBC915.8010702@cisco.com>
Date: Fri, 30 Jan 2015 18:10:29 +0000
From: Stewart Bryant <stbryant@cisco.com>
To: adrian Farrel <adrian@olddog.co.uk>
Subject: Adrian Farrel's Comments on draft-ietf-rtgwg-remote-lfa
Archived-At: <http://mailarchive.ietf.org/arch/msg/rtgwg/FKZnv30y6ilYRMmtevvISQufkWE>
Cc: "rtgwg-chairs@tools.ietf.org" <rtgwg-chairs@tools.ietf.org>, "iesg@ietf.org" <iesg@ietf.org>, "rtgwg@ietf.org" <rtgwg@ietf.org>
Precedence: list
Reply-To: stbryant@cisco.com

Comment (2015-01-06 for -10)

SB> Adrian, here is the commentary on the resolution
of your comments:

Although it causes some pain with abbreviations and a little more care
in explanation, you need to put the Introduction as the first section in
the document. The RFC editor will insist on this, so it is better if you
do it now and get the wording exactly how you would like it.

SB> I have put the following introduction before the definition of terms:

"RFC 5714 [RFC5714] describes a framework for IP Fast Re-route (IPFRR)
and provides a summary of various proposed IPFRR solutions. A basic
mechanism using loop-free alternates (LFAs) is described in  that
provides good repair coverage in many topologies, especially those
that are highly meshed. However, some topologies, notably ring based
topologies are not well protected by LFAs alone because there is no
neighbor of the point of local repair (PLR) that has a cost to the
destination without traversing the failure that is cheaper than the
cost to the destination via the failure."

"The method described in this document extends LFA approach
described in to cover many of these cases by tunneling the packets
that require IPFRR to a node that is both reachable from the
PLR and can reach the destination."

SB> Then the old introduction is renamed Overview of Solution and
starts with

"The problem of LFA IPFRR reachability in some networks is illustrated by
the network fragment shown in Figure 1."

SB> It then continues as before with the Figure

---

You are using RFC 5714 as a Normative Reference by making me go there
for the definition of terms. Please move it to the correct section.

SB> Done

---

IMHO your definition of FIB is rather loose.  Fortunately (?) "FIB" is
barely used in this document, so it might not be important, but if you
wanted to fix it:
- you are talking about IP packets in this document
- the actions are, I think, limited to forwarding actions

SB> As Alia said, it is mostly MPLS, but the solution applies to both IP and MPLS.
SB> However I think I can get rid of the definition.

SB> I have modified the P-space definition slightly to

"The P-space of a router with respect to a protected link is
the set of routers reachable from that specific router using
the pre-convergence shortest paths, without any of those
paths (including equal cost path splits) transiting that
protected link."
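
As an illustration of the definition above (not text or pseudocode from the draft), here is a minimal Python sketch of a P-space computation. It assumes symmetric, strictly positive link costs and uses a hypothetical six-node ring S-A-B-C-D-E-S with unit costs; the names p_space and RING are invented for this example. A node is excluded as soon as any one of its equal-cost shortest paths transits the protected link.

```python
import heapq

# Hypothetical unit-cost ring S-A-B-C-D-E-S (an assumption for illustration).
RING = {
    "S": {"A": 1, "E": 1},
    "A": {"S": 1, "B": 1},
    "B": {"A": 1, "C": 1},
    "C": {"B": 1, "D": 1},
    "D": {"C": 1, "E": 1},
    "E": {"D": 1, "S": 1},
}

def p_space(graph, source, protected):
    """Return the nodes reachable from `source` on pre-convergence shortest
    paths such that no shortest path (including ECMP splits) transits the
    undirected link `protected`. Assumes strictly positive link costs."""
    link = frozenset(protected)
    dist = {source: 0}
    tainted = {source: False}  # True if ANY shortest path uses the link
    heap = [(0, source)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue  # stale heap entry
        done.add(u)
        for v, cost in graph[u].items():
            nd = d + cost
            uses = tainted[u] or frozenset((u, v)) == link
            if v not in dist or nd < dist[v]:
                # Strictly better path: replace distance and taint flag.
                dist[v] = nd
                tainted[v] = uses
                heapq.heappush(heap, (nd, v))
            elif nd == dist[v]:
                # Equal-cost split: one tainted branch taints the node.
                tainted[v] = tainted[v] or uses
    return {n for n in dist if n != source and not tainted[n]}
```

On this assumed ring, p_space(RING, "S", ("S", "E")) contains only A and B: node C is excluded because one of its two equal-cost paths from S (S-E-D-C) transits the protected link.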

SB> Then in section 10

"When the network re-converges, microloops [RFC5715]
can form due to transient inconsistencies in the forwarding
tables of different routers."

SB> Then I have deleted the definition of FIB.

---

Throughout the text, the terms P-space, Q-space, and extended P-space
are used rather loosely. For example, when discussing Figure 1 you say
"S's extended P-space", but this is really "S's extended P-space with
respect to S-E". Someone familiar with the work might say that it is
obvious from the context that we are discussing the link S-E, and it is,
but the terminology needs to be tight.

SB> I have gone through the uses and cleaned them up.

SB> It seems a bit wordy in places, but it is more precise.

---

Section 2

    B
    has equal-cost paths via B-A-S-E and B-C-D-E and so may go through
    S-E.

I don't think B is going anywhere. Maybe...

    B
    has equal-cost paths to E via B-A-S-E and B-C-D-E and so may reach E
    through S-E.

SB> This now says:

"B has equal-cost paths to E via B-A-S-E and B-C-D-E and so the forwarder
at S might choose to send a packet to E via link S-E. Hence B is not in the
Q-space of E with respect to link S-E."
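
The equal-cost argument above can be checked with a short sketch. This is an illustration only, not the draft's procedure: it assumes symmetric, strictly positive costs and a connected, hypothetical unit-cost ring S-A-B-C-D-E-S (the paths B-A-S-E and B-C-D-E come from the text; the unit costs and the names spf, q_space and RING are assumptions). A node X is kept out of the Q-space when dist(X, a) + cost(a, b) + dist(b, E) equals dist(X, E) for either direction of the protected link a-b, which also catches equal-cost splits.

```python
import heapq

# Hypothetical unit-cost ring S-A-B-C-D-E-S (an assumption for illustration).
RING = {
    "S": {"A": 1, "E": 1},
    "A": {"S": 1, "B": 1},
    "B": {"A": 1, "C": 1},
    "C": {"B": 1, "D": 1},
    "D": {"C": 1, "E": 1},
    "E": {"D": 1, "S": 1},
}

def spf(graph, root):
    """Plain Dijkstra: shortest-path cost from `root` to every node."""
    dist = {root: 0}
    heap = [(0, root)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, cost in graph[u].items():
            if v not in dist or d + cost < dist[v]:
                dist[v] = d + cost
                heapq.heappush(heap, (d + cost, v))
    return dist

def q_space(graph, dest, protected):
    """Nodes that reach `dest` with no shortest path (ECMP included)
    transiting the undirected link `protected`. Costs are assumed
    symmetric, so dist(x, n) equals dist(n, x)."""
    a, b = protected
    d_dest, d_a, d_b = spf(graph, dest), spf(graph, a), spf(graph, b)
    c = graph[a][b]

    def transits(x):
        return (d_a[x] + c + d_dest[b] == d_dest[x]
                or d_b[x] + c + d_dest[a] == d_dest[x])

    return {x for x in graph if x != dest and not transits(x)}
```

With these assumed costs, q_space(RING, "E", ("S", "E")) contains C and D but not B, because B's path B-A-S-E ties with B-C-D-E.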

---

Section 2

    In MPLS networks the targeted LDP
    protocol needed to learn the label binding at the repair tunnel
    endpoint is a well understood and widely deployed technology.

But it would still benefit from a citation or a forward reference to
section 7.

SB> Done

---

I enjoyed 3.2

    relatively rare as is the incidence of failure in a well managed
    network.

So, managing my network well is protection against back-hoes. Nice.

SB> It's not a well managed network if it does not include a back-hoe
exclusion force field along the fiber tracks.

SB> I have left the text as it was, since I think the meaning is clear.

---

In 3.2

    Multiple
    repairs MAY share a tunnel end point.

1. s/repairs/repair tunnels/
2. s/MAY/may/ since this is not an implementation or operational choice,
    but a fact of life.

SB> Done

---

In 4.2 you have truncated...

    The repair tunnel endpoint needs to be a node in the network
    reachable from S without traversing S-E.

...and...

    o  The repair tunneled point MUST be reachable from the tunnel source
       without traversing the failed link; and

You mean "reachable using the normal FIB", I think.

SB> No, the first hop may not be in the normal FIB, that is a property of
LFAs. All of the other hops are in the normal FIB.

SB> I am concerned that fixing this would add confusion so have
left the text.

---

Section 4.3

    The preceding text has mostly described the computation of the remote
    LFA repair target (PQ) in terms of the intersection of two
    reachability graphs computed using SPFs.

"mostly"?

SB> "mostly" deleted

"reachability graphs"? Were they? Or were they reachability sets?

SB> I think graphs since you take into account how you get there
when you compute the SPT.
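
For readers who want to see the intersection concretely, the following is a rough sketch and not the pseudocode in section 4.3 of the draft. It swaps the per-root SPFs for an all-pairs Floyd-Warshall computation to keep the example short, assumes symmetric positive costs, and takes the extended P-space to be the union of the P-spaces of S's neighbors other than E. The unit-cost ring and the names all_pairs, avoids, pq_nodes and RING are assumptions made for illustration.

```python
from itertools import product

# Hypothetical unit-cost ring S-A-B-C-D-E-S (an assumption for illustration).
RING = {
    "S": {"A": 1, "E": 1},
    "A": {"S": 1, "B": 1},
    "B": {"A": 1, "C": 1},
    "C": {"B": 1, "D": 1},
    "D": {"C": 1, "E": 1},
    "E": {"D": 1, "S": 1},
}

def all_pairs(graph):
    """Floyd-Warshall all-pairs shortest-path costs."""
    inf = float("inf")
    nodes = list(graph)
    d = {(u, v): 0 if u == v else graph[u].get(v, inf)
         for u, v in product(nodes, repeat=2)}
    # product() varies its first element slowest, so k is the outer loop.
    for k, i, j in product(nodes, repeat=3):
        if d[i, k] + d[k, j] < d[i, j]:
            d[i, j] = d[i, k] + d[k, j]
    return d

def avoids(d, src, dst, link, cost):
    """True if no shortest src->dst path transits `link` in either direction."""
    a, b = link
    return (d[src, a] + cost + d[b, dst] != d[src, dst]
            and d[src, b] + cost + d[a, dst] != d[src, dst])

def pq_nodes(graph, s, e):
    """Candidate repair tunnel endpoints for link s-e: the intersection of
    the extended P-space of s and the Q-space of e."""
    d = all_pairs(graph)
    link, cost = (s, e), graph[s][e]
    neighbors = [n for n in graph[s] if n != e]
    ext_p = {x for n in neighbors for x in graph if avoids(d, n, x, link, cost)}
    q = {x for x in graph if avoids(d, x, e, link, cost)}
    return (ext_p & q) - {s, e}
```

On the assumed ring, pq_nodes(RING, "S", "E") yields only C: it lies in A's P-space and in E's Q-space with respect to S-E, whereas the plain P-space of S (A and B) does not intersect the Q-space at all.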

---

Your pseudocode in 4.3 invokes an unresolved (and undescribed) function
Compute_Forward_SPF().

Actually, I think this is a bogus line that can be deleted.

SB> I would argue that the computation is well known, and it is
needed for the edge case that S-E has ECMPs that were
discarded by normal routing, for example if normal
routing ignored ECMP.

---

I think the introduction of "pseudonode" in 4.3 may be a little without
context.

SB> I have added a pointer to RFC1195. The reader can get from there
to ISO-10589, but that is less accessible as a first reference.

---

Section 7
    If for any reason the TLDP session cannot
    not be established

s/cannot not/cannot/

SB> Done

---

I think [RFC5424] and [RFC3411] are pretty poor references to give in
section 7. You appear to be saying that an implementation that cannot
establish a TLDP session should write a MIB module, standardise it, and
then report an error.

Can't you find an existing LDP MIB module that reports Session-up
failures?

Or maybe just delete "using any well known mechanism such as Syslog
[RFC5424] or SNMP [RFC3411]."

SB> It now says:

"... SHOULD advise the operator about the protection setup issue
through the network management system."

---

I think you can strengthen the security considerations. You have:

    To prevent their use as an attack vector IP repair tunnel endpoints
    (where used) SHOULD be assigned from a set of addresses that are not
    reachable from outside the routing domain.

1. "To prevent their use" is surely consistent with a "MUST".
    The fact that you want to say "SHOULD" means that you need to turn
    the text around...

    IP repair tunnel endpoints (where used) SHOULD be assigned from a set
    of addresses that are not reachable from outside the routing domain.
    This would prevent their use as an attack vector.

SB> Done.

2. You can add a note about what traffic can be placed into a repair
    tunnel. You already have this earlier in the document, and it is
    worth restating.

SB> See below

3. I think you should also make note of whether the repair tunnel is
    advertised by the routing protocol as an available link.

SB> I have added:

"Other than OAM traffic, used to verify the correct operation of a repair
tunnel, only traffic that is being protected as a result of a link failure
is placed in a repair tunnel. The repair tunnel MUST NOT be advertised
by the routing protocol as a link that may be used to carry normal
user traffic, or routing protocol traffic."

Will be in version 11 shortly.

Stewart
