Re: Building an Agenda for the TEWG meeting at UBC

tom@nisca.ircc.ohio-state.edu Tue, 10 July 1990 15:41 UTC

Message-Id: <9007101532.AA29725@nisca.ircc.ohio-state.edu>
To: tewg@devvax.tn.cornell.edu
Cc: almes@rice.edu
Subject: Re: Building an Agenda for the TEWG meeting at UBC
Date: Tue, 10 Jul 1990 11:32:53 -0400
From: tom@nisca.ircc.ohio-state.edu

Thus spake Mr Almes:

>------- Forwarded Message
>Mike,
>  There seem to be two parts to your message:
><1> A general attitude that our Internet infrastructure should be regarded
>as part of a global computer-communications infrastructure that must be up
>all the time, just as the phone system stays up even during power outages.
>
><2> Some specific possible steps regarding UPS etc.
>
>I approve of <1>, though it could be viewed as a motherhood that everyone
>could agree to in principle, with no action in practice.
>
>One positive specific step being taken is a growing insistence that NSFnet
>backbone sites provide UPS.  In my view, the cost of doing proper UPS and
>24-by-7 operator support at *every* Internet site would be astronomical.
>There tend, however, to be `hub' sites within most networks so that a small
>subset of the sites really have to be up all the time and outages at other
>sites only affect those sites.
>
>One possible TEWG project would be to survey our networks and, for each net,
>count how many sites are `leaves' (their outages don't harm anyone else) and
>how many are `hubs' (their outages harm others).
>  A second element of the survey could be `diameter'.
>  A third could be number of link outages among hubs that can be tolerated.
>  A fourth could be number of router outages among hubs that can be tolerated.
>
>The output of this survey could guide consideration of the cost-effectiveness
>of making the net more robust in the sense you suggest.
>
>We could also ask what steps people take to make their hub sites robust.  To
>go further into the fault-tolerance area would be to depart our `topology
>engineering' agenda.
>
>Would this be a constructive response to your points?  Will *you* be at UBC?
>	-- Guy
>
>------- End of Forwarded Message


  One should also consider the time and manpower spent on power-related
  problems, at leaves and hubs alike.  We recently spent a man-day (not to
  mention several days of outage) fixing a problem at a small "leaf" site
  that could have been prevented with at least a surge protector, or better
  yet an inexpensive UPS.  As Mike mentioned, all of our hubs are on UPS, and
  it has been more than worth the initial outlay of money.  As for 24/7
  operations support, I have to agree that it would be very costly and that
  not all sites need it.  That is not totally divorced from the issue of
  power protection, though, in that UPS will reduce the number of incidents
  that require operations staff to intervene.

  I think that sites acting as hubs have a responsibility to provide good
  connectivity.  I like the idea of doing a survey such as Guy suggests.
  Would it truly be a departure from "topology engineering" to engineer a
  site or network connection to use a more fault-tolerant location over
  another?  But then "fault-tolerant" needs to be defined (or recommended)...
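
  As a rough illustration of the first two survey items Guy suggests
  (hub/leaf counts and diameter), here is a minimal sketch.  It rests on
  assumptions that are not in the proposal itself: a net is modelled as an
  undirected graph of sites, a "hub" is taken to be any site whose outage
  cuts some other sites off from one another, and diameter is measured in
  hops.  The site names and the `net` table are hypothetical.  Under this
  definition a site in a redundant ring counts as a leaf even if it carries
  transit traffic, so a real survey might want a looser test.

```python
from collections import deque

def reachable(adj, start, removed=frozenset()):
    """Return the set of sites reachable from start, skipping removed sites."""
    seen = {start}
    queue = deque([start])
    while queue:
        site = queue.popleft()
        for peer in adj[site]:
            if peer not in seen and peer not in removed:
                seen.add(peer)
                queue.append(peer)
    return seen

def classify(adj):
    """Split sites into hubs (outage cuts off others) and leaves (it does not)."""
    hubs, leaves = set(), set()
    for site in adj:
        others = [s for s in adj if s != site]
        # With this site down, can the remaining sites still all reach each other?
        if others and len(reachable(adj, others[0], removed={site})) < len(others):
            hubs.add(site)
        else:
            leaves.add(site)
    return hubs, leaves

def diameter(adj):
    """Longest shortest path, in hops, between any two sites (connected net assumed)."""
    longest = 0
    for start in adj:
        dist = {start: 0}
        queue = deque([start])
        while queue:
            site = queue.popleft()
            for peer in adj[site]:
                if peer not in dist:
                    dist[peer] = dist[site] + 1
                    queue.append(peer)
        longest = max(longest, max(dist.values()))
    return longest

# Hypothetical five-site net: two hubs joined by one link, three leaf sites.
net = {
    "hub_a": {"hub_b", "leaf_1", "leaf_2"},
    "hub_b": {"hub_a", "leaf_3"},
    "leaf_1": {"hub_a"},
    "leaf_2": {"hub_a"},
    "leaf_3": {"hub_b"},
}

hubs, leaves = classify(net)
print(len(hubs), "hubs,", len(leaves), "leaves, diameter", diameter(net))
```

  The third and fourth items (how many link or router outages the hubs can
  tolerate) are not covered here; they would need a connectivity measure
  over the hub-to-hub links.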

             Tom