Re: draft-palanivelan-bfd-v2-gr-08

Dave Katz <dkatz@juniper.net> Wed, 27 October 2010 17:18 UTC

From: Dave Katz <dkatz@juniper.net>
In-Reply-To: <FB649DA20153634794BEBBAB504DA1AD4506130D74@EMBX02-BNG.jnpr.net>
Date: Wed, 27 Oct 2010 11:18:49 -0600
Message-ID: <C2E157D9-DB69-43D8-BB86-E148A93BA9EE@juniper.net>
References: <FB649DA20153634794BEBBAB504DA1AD4506130D74@EMBX02-BNG.jnpr.net>
To: apvelan@cisco.com
Cc: Santosh P K <santoshpk@juniper.net>, rtg-bfd WG <rtg-bfd@ietf.org>
Precedence: list

I somehow seem to have missed the ongoing revision of this draft.

To underscore what Santosh says, I believe that GR interactions can be dealt with in an informational BCP, without any modification of the base spec.  RFC 5882 has a bunch of verbiage to try to address exactly this scenario.

But to the details:

I don't see how *any* mechanism can deal with unplanned restart if the BFD session can't stay up long enough for the restarting system to start sending BFD packets again. The non-restarting system fundamentally cannot differentiate between this and a crashed neighbor; it must assume that the path has failed and do whatever it needs to (take down a routing protocol adjacency, for example). Unless I'm missing something, this is an unavoidable side effect of fast failure detection, and the only way to deal with it is to make the dependent control protocol implementations react less onerously when the topology changes (which is frankly what folks should have done instead of inventing GR, which in my opinion is an awful kludge).

If the BFD session can stay up long enough for the restarting system to send BFD packets again, and the control protocol has GR capabilities, the second paragraph of section 3.3 of RFC 5882 describes in general terms what to do.  In particular, the mythical adaptation layer described in the RFC can detect the fact that the session is actively being reestablished (by virtue of the receipt of BFD packets from the restarting system) and apply hysteresis to the BFD session flap.  This will give the restarted control protocol sufficient time to signal the GR and avoid perturbation in that layer.
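
As a rough illustration of that hysteresis, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the FlapDamper class, its method names, the hold_s parameter, and the neighbor_is_sending flag are hypothetical and do not come from RFC 5882, which describes the adaptation layer only in general terms.

```python
from typing import Callable, Optional


class FlapDamper:
    """Hypothetical adaptation layer between a BFD session and its client."""

    def __init__(self, hold_s: float, notify_down: Callable[[], None]) -> None:
        self.hold_s = hold_s            # assumed hysteresis window, in seconds
        self.notify_down = notify_down  # callback into the control protocol
        self.deadline: Optional[float] = None

    def session_down(self, now: float, neighbor_is_sending: bool) -> None:
        if not neighbor_is_sending:
            # No packets from the neighbor: indistinguishable from a crash,
            # so report the failure to the client immediately.
            self.deadline = None
            self.notify_down()
        elif self.deadline is None:
            # Neighbor is actively reestablishing: hold the report back.
            self.deadline = now + self.hold_s

    def session_up(self, now: float) -> None:
        # Session came back inside the window: the client never sees the flap.
        self.deadline = None

    def tick(self, now: float) -> None:
        # Called periodically; reports the failure if the window has expired.
        if self.deadline is not None and now >= self.deadline:
            self.deadline = None
            self.notify_down()
```

The point of the design is that the client hears nothing if the session returns within the window, while a silent neighbor is still reported as failed without delay.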

The Diag field is intended to be *informational* only, that is, a write-only field as far as the BFD state machine is concerned.  There is currently *no* place in the spec where a receiving system uses this field as part of the BFD state machine.  Using it to signal within the protocol is outside of the architectural thinking (which, admittedly, is not explicitly documented).  But beyond that, the Diag field is "fragile" within the protocol;  it is easily overwritten under various conditions based on the existing spec, and attempting to preserve its value for signaling would require constraining the operation of the sender.  This is hinted at in section 6.8.17 in RFC 5880 when describing concatenated paths, for example.
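
To make the "write-only" point concrete, here is a small Python sketch of the receive-side state transitions as I read them in section 6.8.6 of RFC 5880. The function name and the string state names are my own choices for illustration, and timer expiry is left out. The received Diag value is passed in but deliberately never consulted.

```python
def next_state(local_state: str, received_state: str, received_diag: int) -> str:
    """Sketch of BFD state transitions on packet receipt (timeouts omitted).

    received_diag is accepted only to show that it plays no part in the result.
    """
    if local_state == "AdminDown":
        return "AdminDown"  # packets are discarded in this state
    if received_state == "AdminDown":
        return "Down"
    if local_state == "Down":
        if received_state == "Down":
            return "Init"
        if received_state == "Init":
            return "Up"
        return "Down"
    if local_state == "Init":
        if received_state in ("Init", "Up"):
            return "Up"
        return "Init"
    # local_state == "Up"
    if received_state == "Down":
        return "Down"
    return "Up"
```

Any scheme that keys behavior off Diag 9 would have to add a dependency that this function does not have today.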

I see no value in signaling the GR timers in the protocol; if a restarting system can trigger this mechanism by sending Diag 9, it could also do so by simply cranking up the transmit and receive intervals to whatever values it desires. For planned restart, this is all that is necessary (and this can be done cleanly at that time). For unplanned restart, at some level it doesn't matter; by the time you can signal anything from the restarted system, you can run BFD at full speed, or at whatever rate you would like to signal at that time.
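
A quick sketch of the arithmetic, assuming the asynchronous-mode Detection Time calculation in section 6.8.4 of RFC 5880 (the remote Detect Mult times the greater of the local Required Min RX Interval and the remote Desired Min TX Interval). The function name and the interval values are made up for illustration.

```python
def detection_time_us(remote_detect_mult: int,
                      local_required_min_rx_us: int,
                      remote_desired_min_tx_us: int) -> int:
    """Detection Time on the local system, asynchronous mode, in microseconds."""
    return remote_detect_mult * max(local_required_min_rx_us,
                                    remote_desired_min_tx_us)


# Illustrative values only: 50 ms intervals with a multiplier of 3.
normal = detection_time_us(3, 50_000, 50_000)

# The restarting system raises its Desired Min TX Interval to 10 s, which
# stretches the neighbor's detection time without any new protocol fields.
relaxed = detection_time_us(3, 50_000, 10_000_000)
```

With these numbers the neighbor's detection time goes from 150 ms to 30 s, which is the same effect the proposed restart interval fields are after.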

All the stuff about broadband and large fan-out seems to be beside the point, an excuse for poor system design, and doesn't seem to add anything to the argument. A system that is fielding DHCP requests but is ignoring fundamental connectivity is broken in any case (and as Santosh points out, no mechanism in the protocol is going to help you if the packets can't be sent and received anyhow). Doing successful unplanned GR has certain requirements, among them ensuring that the protocols involved can actually talk, and the example chosen is not a particularly good one, IMHO.

What am I missing?

--Dave

On Oct 27, 2010, at 9:10 AM, Santosh P K wrote:

> Hello Palanivelan,
>      I have a couple of doubts about this draft. Section 6.2, Remote Neighbor Restart and Recovery, states:
>  
>    “When the set of systems had their BFD sessions established, with GR
>    support as described in this document, when the remote neighbor
>    restarts it will set the BFD diagnostics field to a value of 9
>    (Neighbor Restarting) in the control packet to its neighbor (local
>    system).”
>  
>   This draft is trying to address the unplanned restart of protocols using BFD, as the planned outage is handled today anyway. 
>  
> 1.    How would BFD determine that the protocol using BFD has gone into graceful restart (in the case of an unplanned outage), so that it can send a BFD packet with the diagnostics field set to 9?
> 2.    If BFD can determine that the protocol using BFD has restarted with GR enabled and that it was not a planned outage, can’t we simply increase the BFD session timers instead of adding the MyRestartInterval and yourRestartInterval fields?
> 3.    If the first couple of BFD packets with the diagnostics field set to 9 are missed because BFD does not get enough CPU time on the restarting router, won’t the local router (helper) bring down the session anyway?
>  
> ....................................
> Thanks and regards
> Santosh P K
