Re: OSPF WG Charter Proposal

"Manral, Vishwas" <VishwasM@NETPLANE.COM> Thu, 07 November 2002 17:27 UTC

Message-ID: <E7E13AAF2F3ED41197C100508BD6A328791984@india_exch.hyderabad.mindspeed.com>
Date: Thu, 07 Nov 2002 12:31:24 -0500
Reply-To: Mailing List <OSPF@DISCUSS.MICROSOFT.COM>
Sender: Mailing List <OSPF@DISCUSS.MICROSOFT.COM>
From: "Manral, Vishwas" <VishwasM@NETPLANE.COM>
Subject: Re: OSPF WG Charter Proposal
To: OSPF@DISCUSS.MICROSOFT.COM
Precedence: list

Hi Dave/Tony/folks,

Thanks a lot for the comments.

I agree that a large part of the draft covers implementation-specific
material, such as the need for pacing LS updates and reducing the rate of
flooding. An implementation that does not do this could cause problems, as
Jerry mentioned in his earlier mail, and, as Tony mentioned, this is not
learned the easy way. We did get a comment from Dave (on the prepublished
version of the draft) to reduce the implementation-specific material, which
we have cut to the minimum. I do, however, think we can give implementation
recommendations without constraining a good implementation (e.g. I do not
think saying that pacing needs to be done hampers a good implementation,
though prescribing the mechanism for it does; the former, I think, is no
longer implementation specific but a necessary requirement).

Besides, in the case of extreme congestion, we did find that the only way to
bring adjacencies up is to signal the neighbor to slow the rate of flooding
further (for which we are using signalling), apart from the other approach of
selectively bringing up only some adjacencies at a time. This, I think, is
the only significant change that requires the bits on the wire to change.

Our aim is to minimize changes to normal processing/code, while also getting
over the problem gracefully when it occurs and not having the network go
down for days at a time. We also need to reduce the number of configurables,
which we do intend to minimize further.

Besides, from my reading I do think these problems are more common than I
had initially guessed, though they do not always cause a network collapse.

Though, Tony, I'm not sure how the ISIS 0-63 bits would add to
implementation recommendations ;-)

Thanks again,
Vishwas

-----Original Message-----
From: Tony Przygienda [mailto:prz@XEBEO.COM]
Sent: Thursday, November 07, 2002 9:47 PM
To: OSPF@DISCUSS.MICROSOFT.COM
Subject: Re: OSPF WG Charter Proposal


Ash, Gerald R (Jerry), ALASO wrote:

>Dave,
>
>>>Such failures are not the fault of the service provider
>>>operation or the vendor/equipment implementation.  They are
>>>due to shortcomings in the link-state protocols themselves --
>>>thus the need for the enhancements proposed in the draft.
>>>
>
>>I strongly disagree with this statement.  While the design of the
>>protocols can make it challenging, there is ample room in
>>implementation to provide stable and scalable networks.
>>
>>When a network collapses, the fault lies at the feet of the
>>implementers.  In every case I've seen (too many), the collapse was
>>inevitable sooner or later, due to naive design choices in software,
>>but at the same time was quite nonlinear in its onset (making any
>>predictive or self-monitoring approach pretty hopeless.)
>>
>>There are some things that would make the job easier, at the cost
>>of additional complexity, but pointing at network collapses
>>and blaming the protocols is disingenuous.
>>
>
>I think you should review the ample evidence presented in
>http://www.ietf.org/internet-drafts/draft-ash-manral-ospf-congestion-control-00.txt
>that the protocols need to be enhanced to better respond to
>congestion collapse:
>
>- Section 2: documented failures and their root-cause analysis, across
>  multiple service provider networks (also review the cited references)
>- Appendix B: vendor analysis of a realistic failure scenario similar to
>  one experienced as discussed in Section 2 (perhaps you would like to
>  provide your own analysis of this scenario based on your OSPF
>  implementation)
>- Appendix C: simulation analysis of protocol performance (other I-D's
>  being discussed provide analysis of proposed protocol extensions)
>
>To say that network collapse in *every* case is due to *naive design
>choices* ignores the evidence/analysis presented.  Based on the
>evidence/analysis, there is clearly room for the protocols to be improved
>to the point where networks *never* go down for hours or days at a time
>(drawing unwanted headlines & business impact).
>
>Jerry
>
Jerry, most of the things you say in your document (which is actually
pretty good) have been known to people like Dave and other old-time
implementors for years, and avoiding exactly those things by smart
implementation techniques was what differentiated the haves from the
have-nots. I remember learning some of those things myself by hard
experience and some by looking at old hands' code ;-) [Albeit I also
remember picking up a lot of smart control protocol ideas from your RTNR
work]. I do not think that Dave is putting down what you say, rather (and
I commit the stupidity of interpreting his words through my own beliefs)
that what your document describes are mostly _implementation_ issues, not
_standardization_ issues, and therefore it is not a very wise idea to add
them to the charter of a _standards_ group.  Good protocol specs are _not_
implementation cookbooks; they are documents governing bits on the wire in
such a way that two people implementing things in vastly different ways
can still talk to each other. Recommendations of implementation techniques
prove inherently dangerous in the long term (as Joel pointed out, at a
certain point adding more code to an implementation introduces more bugs
than the performance gain is worth) or utterly ridiculous (look at the
ISIS 0-63 metric meant to make SPF really fast; it led to quite bad
contortions).

    thanks

    -- tony