Re: OSPF WG Charter Proposal

Rohit Dube <rohit@XEBEO.COM> Thu, 07 November 2002 15:38 UTC

Message-ID: <200211071541.KAA22539@bigbird.xebeo.com>
Date: Thu, 07 Nov 2002 10:41:03 -0500
Reply-To: Mailing List <OSPF@DISCUSS.MICROSOFT.COM>
Sender: Mailing List <OSPF@DISCUSS.MICROSOFT.COM>
From: Rohit Dube <rohit@XEBEO.COM>
Subject: Re: OSPF WG Charter Proposal
To: OSPF@DISCUSS.MICROSOFT.COM
In-Reply-To: Message from "Ash, Gerald R (Jerry), ALASO" <gash@ATT.COM> of "Thu, 07 Nov 2002 08:27:38 EST." <28F05913385EAC43AF019413F674A0170167B229@OCCLUST04EVS1.ugd.att.com>
Precedence: list

Jerry,

I looked at one of the sections you pointed out.

On Thu, 7 Nov 2002 08:27:38 -0500 "Ash, Gerald R (Jerry), ALASO" writes:
[snip]
>I think you should review the ample evidence presented in
>http://www.ietf.org/internet-drafts/draft-ash-manral-ospf-congestion-control-00.txt
>that the protocols need to be enhanced to better respond to congestion collapse:
>
>- Section 2: documented failures and their root-cause analysis, across
>  multiple service provider networks (also review the cited references)

The cited references [att, cholewka, jander, pappalardo*] are from trade
rags. Unless there is some other peer-reviewed paper or standards document,
these incidents can hardly be used to point to the root cause. No, I am
not saying that something wasn't wrong - just that these references don't
carry much weight.

In one of these reported incidents, I was physically present and happen
to know the root cause. It was an "implementation mistake" which was
triggered by two different versions of the software being present in
the network simultaneously during a network-wide s/w upgrade - a flooding
storm resulted. The problem here is _not_ in the flooding. While one
can certainly provide more knobs to limit flooding, one can also solve
the problem equally well or better by fixing the base implementation or
doing better tests before upgrading a network.

>- Appendix B: vendor analysis of a realistic failure scenario similar to one
>  experienced as discussed in Section 2 (perhaps you would like to provide
>  your own analysis of this scenario based on your OSPF implementation)
>- Appendix C: simulation analysis of protocol performance (other I-D's being
>  discussed provide analysis of proposed protocol extensions)
>
>To say that network collapse in *every* case is due to *naive design choices*
>ignores the evidence/analysis presented.  Based on the evidence/analysis,
>there is clearly room for the protocols to be improved to the point where
>networks *never* go down for hours or days at a time (drawing unwanted
>headlines & business impact).

I don't think this is what Dave is saying - he is simply referring to all
the cases that _he_ has seen. This matches my experience, with the caveat that
in some cases (a) there was faulty hardware or (b) network/OSPF parameters were
inconsistently or incorrectly applied.

--rohit.