Re: 7.3 MIB

dsiinc! Thu, 04 March 1993 22:12 UTC

Received: from by IETF.CNRI.Reston.VA.US id aa19802; 4 Mar 93 17:12 EST
Received: from CNRI.RESTON.VA.US by IETF.CNRI.Reston.VA.US id aa19798; 4 Mar 93 17:12 EST
Received: from CS.UTK.EDU by CNRI.Reston.VA.US id aa27592; 4 Mar 93 17:12 EST
Received: from localhost by CS.UTK.EDU with SMTP (5.61++/2.8s-UTK) id AA21512; Thu, 4 Mar 93 16:43:10 -0500
X-Resent-To: fddi-mib@CS.UTK.EDU ; Thu, 4 Mar 1993 16:43:09 EST
Errors-To: owner-fddi-mib@CS.UTK.EDU
Received: from relay1.UU.NET by CS.UTK.EDU with SMTP (5.61++/2.8s-UTK) id AA21493; Thu, 4 Mar 93 16:43:03 -0500
Received: from (via LOCALHOST.UU.NET) by relay1.UU.NET with SMTP (5.61/UUNET-internet-primary) id AA05808; Thu, 4 Mar 93 16:43:00 -0500
Date: Thu, 4 Mar 93 16:43:00 -0500
Sender: ietf-archive-request@IETF.CNRI.Reston.VA.US
From: dsiinc!
Message-Id: <9303042143.AA05808@relay1.UU.NET>
Received: from dsiinc.UUCP by with UUCP/RMAIL (queueing-rmail) id 164157.26880; Thu, 4 Mar 1993 16:41:57 EST
To: uunet!CS.UTK.EDU!!
Subject: Re: 7.3 MIB
Content-Type: text
Content-Length: 8544

>    ... Something like "smt73" would work.  Any other
>    suggestions are welcome.

This suggestion looks just fine to me.

>    The only other item remaining to be discussed is traps.  I
>    believe that would probably be the primary discussion topic at Ohio.
>    Although it was me (I think) that had first (re-)initiated this
>    discussion a while back, it seems worthwhile at this point to
>    submit a trap-less MIB for publication as the 7.3 RFC, so that some
>    progress can be made.  They could be added in the next incremental
>    standardization step if necessary.  On the other hand, if popular opinion
>    is that traps are needed immediately, we must spend additional
>    time on their definition before submitting this MIB.  Other WG members'
>    input on this would be appreciated.

I don't remember if you were around when we did the 6.2 MIB, but as
I recall, we intentionally left traps out of the main document because
every time we brought them up, we got locked in endless political
discussions about what was really appropriate for SNMP and what was 
SMT only, potential bandwidth problems, and SRF/trap retransmission
algorithms.  In light of this (and the fact that I would like to see
the FDDI MIB completed), I propose that the trap document be kept 
separate from the FDDI MIB document.  This way, if we get caught up
in the politics again, we won't delay the management ability for those
already waiting for it.  We've had SMT 7.2 customers since last August 
and they are getting tired of waiting for the SNMP management ability.

What I'd like to propose as a starting point for a traps document is 
that we take every event and condition reported by SMT SRFs and map 
them into corresponding traps.

Since Jeff mentioned that he is currently consumed with other activities,
I'll volunteer to serve as editor for the document, if that's acceptable
to everyone.  [All I require is the traps template we used before and for
someone to tell me what the preferred document format is, and I can get 
started.  I'm guessing that the raw document is probably not in ASCII.]

Once we've created the basic trap document from the SMT SRFs, we can 
systematically eliminate anything we classify as SMT specific (thus 
not relevant to SNMP managers) and then add anything else we conclude 
is missing.  After all of this is done, we can consider any changes 
to the frequency of trap generation.  
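
To make the first two steps concrete, here is a minimal Python sketch
that starts from every SRF status and drops what cannot or should not
become a trap.  The status names are the ones that appear later in this
message (the list is not claimed to be complete).  The function name
candidate_traps and the smt_specific set are hypothetical, and
smt_specific is left empty because deciding its contents is exactly
what the list would need to discuss.

```python
# Status names taken from this message; not claimed to be the full SMT list.
SRF_STATUSES = [
    "SMTHoldCondition",
    "SMTPeerWrapCondition",
    "MACFrameErrorCondition",
    "MACDuplicateAddressCondition",
    "MACNotCopiedCondition",
    "PORTLerCondition",
    "PORTEBErrorCondition",
    "MACNeighborChangeEvent",
    "MACPathChangeEvent",
    "PORTPathChangeEvent",
]

# Optional in SMT and not included in our SNMP MIB, so no trap can refer to them.
NOT_IN_SNMP_MIB = frozenset({"SMTHoldCondition", "PORTEBErrorCondition"})


def candidate_traps(statuses, not_in_mib=NOT_IN_SNMP_MIB, smt_specific=frozenset()):
    """Return the statuses that would map one-to-one onto traps.

    smt_specific is a hypothetical placeholder for whatever the working
    group classifies as irrelevant to SNMP managers (empty by default).
    """
    return [s for s in statuses if s not in not_in_mib and s not in smt_specific]


traps = candidate_traps(SRF_STATUSES)
conditions = [t for t in traps if t.endswith("Condition")]
events = [t for t in traps if t.endswith("Event")]
```

Step three (adding anything we conclude is missing) would simply
append names to the starting list before the filter is applied.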

For now, I'd like to start by reporting a trap every time an SRF is 
generated.  If you look at the realistic frequency of trap generation 
mirroring SRFs, I don't believe there will be the bandwidth problem 
we thought might occur when we went through this before.  Not only has
SRF transmission gone through considerable updates, but I think we also
have a lot more experience in estimating SRF traffic patterns.

I believe that, with active participation from the mailing list and 
a few basic goals established (listed above), we should be able to 
finish the traps document by the July IETF.

I'll get the ball rolling:

BACKGROUND INFORMATION (Most of this is directly from the SMT document):

    SMT uses the Status Report Protocol to periodically announce Station
    Status that is useful in managing an FDDI ring.  This status
    information is carried in Status Report Frames (SRFs).

    In reporting Status, the protocol considers two different types of
    Status:  Conditions and Events.

	1.  Conditions are reported periodically as long as the condition
	    exists.  The reporting interval is never less than 2
	    seconds and increases to 32 seconds in the absence of any
	    new conditions or events.

	    When an existing condition is deasserted, it is reported once.

	    If a condition is asserted and deasserted before an SRF
	    frame can be sent, the reported behavior is implementation
	    specific.  Currently, at least one version of SMT on the
	    market reports nothing, while another reports the condition 
	    clear event without ever asserting the condition.

	2.  Events are reported once.  In no case can an SRF frame be
	    generated more than once every two seconds.

	    In the case where the same event occurs before an SRF frame
	    is generated, a flag is set to show that multiple
	    events occurred.
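
    The pacing rules above can be sketched as a small Python model.
    This is only an illustration under stated assumptions, not the SMT
    algorithm itself: the class and method names are hypothetical, the
    interval is assumed to double after each SRF until it reaches 32
    seconds (the text only says it "increases"), and a condition that is
    asserted and deasserted between two SRFs is silently dropped, which
    is just one of the two behaviors mentioned.

```python
class SrfReporter:
    """Toy model of one station's SRF pacing (hypothetical names)."""

    MIN_INTERVAL = 2.0   # seconds, from the text above
    MAX_INTERVAL = 32.0  # seconds, from the text above

    def __init__(self):
        self.interval = self.MIN_INTERVAL
        self.last_sent = float("-inf")
        self.next_send = 0.0
        self.active = set()    # conditions currently asserted
        self.reported = set()  # conditions already carried in some SRF
        self.cleared = []      # deasserted conditions, reported once
        self.events = {}       # event name -> "multiple occurrences" flag

    def _new_status(self, now):
        # New status restarts the backoff, but an SRF is never sent
        # sooner than 2 seconds after the previous one.
        self.interval = self.MIN_INTERVAL
        self.next_send = max(now, self.last_sent + self.MIN_INTERVAL)

    def assert_condition(self, name, now):
        if name not in self.active:
            self.active.add(name)
            self._new_status(now)

    def deassert_condition(self, name, now):
        if name not in self.active:
            return
        self.active.discard(name)
        if name in self.reported:
            self.reported.discard(name)
            self.cleared.append(name)
            self._new_status(now)
        # Otherwise the condition came and went between SRFs; this
        # sketch reports nothing (implementation specific, see above).

    def event(self, name, now):
        # The flag becomes True when the same event repeats before an SRF goes out.
        self.events[name] = name in self.events
        self._new_status(now)

    def poll(self, now):
        """Return the SRF contents if one is due at time now, else None."""
        if now < self.next_send or not (self.active or self.cleared or self.events):
            return None
        frame = {
            "conditions": sorted(self.active),
            "cleared": self.cleared,
            "events": self.events,
        }
        self.reported |= self.active
        self.cleared, self.events = [], {}
        self.last_sent = now
        self.next_send = now + self.interval
        self.interval = min(self.interval * 2, self.MAX_INTERVAL)
        return frame


r = SrfReporter()
r.assert_condition("MACNotCopiedCondition", 0.0)
sent = [t for t in range(100) if r.poll(float(t)) is not None]
print(sent)  # [0, 2, 6, 14, 30, 62, 94]
```

    With a single standing condition the gaps between SRFs grow as
    2, 4, 8, 16, 32, 32 seconds, which is the quiet-ring behavior a
    trap generator mirroring SRFs would inherit under these assumptions.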


 All of the conditions have existing "Flags" in the current SNMP MIB.
 All of the events are active when their corresponding flag is "true".
 Look at the existing flags to find the definitions for the conditions.

Note that SMTHoldCondition and PORTEBErrorCondition can't exist in
the traps document since these were optional and not included in
our SNMP MIB.  I also want to estimate the frequency of the various 
conditions and events so everyone can judge whether the events and 
conditions are important enough to warrant the expected bandwidth they
will consume.

Thus, the only SMT conditions that need to be reported are:

    SMTPeerWrapCondition - There should never be more than 2 stations 
	reporting this condition at any one time on a given FDDI ring.

    MACFrameErrorCondition - This should rarely occur.  Something is
	really wrong with the hardware or cabling to get many of these.

    MACDuplicateAddressCondition - This may be reported by every station
	on the ring, but only if they all had the same address.  The
	typical case would be only 2 or 3 stations reporting this at 
	any given time.

    MACNotCopiedCondition - Although each station may generate these, 
	an SNMP manager should be able to throttle the amount of 
	information received from this condition by resetting each
	MACNotCopiedThreshold to acceptable limits (or turning it off
	altogether).  Therefore, while this theoretically could generate 
	up to 250 frames/second for a maximum size FDDI ring, the 
	manager has the ability to throttle the amount of information.

    PORTLerCondition - While this could be generated by every station
	simultaneously, this usually occurs when individual links
	experience abnormal problems.  In a normal FDDI ring, the 
	bandwidth consumed by this condition should be minimal.
	On any ring if there are more than a couple of stations
	reporting LERConditions at any time, the network installer 
	should be shot.
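
    As a sanity check on the 250 frames/second ceiling quoted for
    MACNotCopiedCondition, the arithmetic can be written out.  This is
    a rough sketch: the 500-station ring size is my assumption (the
    commonly cited FDDI limit, not a number stated in this message),
    and the function name is hypothetical.  The 2 and 32 second
    intervals come from the background section.

```python
MIN_SRF_INTERVAL_S = 2.0    # at most one SRF per station every 2 seconds
MAX_SRF_INTERVAL_S = 32.0   # reporting interval once the ring is quiet
ASSUMED_MAX_STATIONS = 500  # assumption: commonly cited FDDI ring limit


def srf_frames_per_second(stations, interval_s):
    """Aggregate SRF rate if every station reports once per interval."""
    return stations / interval_s


worst = srf_frames_per_second(ASSUMED_MAX_STATIONS, MIN_SRF_INTERVAL_S)
settled = srf_frames_per_second(ASSUMED_MAX_STATIONS, MAX_SRF_INTERVAL_S)
print(worst, settled)  # 250.0 15.625
```

    Under the same assumption, a persistent condition on every station
    would settle to roughly 16 frames/second once the reporting
    interval has grown to 32 seconds.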


	Of the events, most occur only when a station enters 
	or leaves the FDDI ring.  
	For a station entering or leaving a ring, expect three 
	MACNeighborChangeEvents (1 for station, 1 for station's UNA, 
	and 1 for station's DNA), potentially one MACPathChangeEvent 
	(MAC going from isolated to primary/secondary), and one for 
	each port entering the ring.

	As a result of a ring-wrap, expect two (or three) 
	MACNeighborChangeEvents, and possibly PORTPathChangeEvents 
	for the ports at the wrap point.

	The only other case I recall seeing in real rings is the
	occasional MACNeighborChangeEvents when stations are being
	swamped with traffic.  In these cases, stations quite often
	can't copy incoming SMT frames, or don't have time to generate
	NIFs.  In either case, both the station being swamped and its
	neighbors will time out their UNA and DNA causing each station
	to report the loss of neighbor information.  It takes about
	4 minutes to time out neighbor information, so the event 
	usually doesn't occur with any greater frequency than this.
	In this case, you'll see 3 MACNeighborChangeEvents.
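
	The counts above can be tallied in a few lines of Python.  This
	is only a back-of-the-envelope sketch with hypothetical function
	names; the 240 second figure is the "about 4 minutes" neighbor
	timeout mentioned above.

```python
def join_or_leave_events(ports, mac_changes_path=True):
    """Upper-bound event count for one station entering or leaving the ring."""
    neighbor_changes = 3                        # station, its UNA, its DNA
    path_changes = 1 if mac_changes_path else 0 # potential MACPathChangeEvent
    return neighbor_changes + path_changes + ports  # plus one per port


def swamped_station_event_rate(timeout_s=240.0):
    """Events per second when one overloaded station keeps losing neighbor info."""
    return 3 / timeout_s


print(join_or_leave_events(ports=2))  # 6
print(swamped_station_event_rate())   # 0.0125
```

	Even a dual-attached station joining the ring would produce only
	a handful of traps under this tally, and the overload case stays
	near one trap every 80 seconds.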

As I browse through the past discussions of traps on this mailing list,
I think these five conditions and four events are both necessary and
sufficient to handle the types of information needed by SNMP managers.

The PeerWrapCondition coupled with the MAC and PORT PATH change events 
and the Neighbor Change event should give any manager the ability to 
isolate and diagnose ring wraps, link up and link down events, and 
other information necessary for mapping both physical ring topology 
and logical path topology.

The rest of the events and conditions help not only to discover
bad hardware and links but also to diagnose "soft" problems
like people attempting to connect incompatible station types together,
stations with duplicate addresses, and abnormal or overloaded traffic 
flows to individual stations.

Ron Mackey			Distributed Systems International, Inc.
				531 W. Roosevelt Road, Suite 2
708-665-4639			Wheaton, IL 60187-5057