Re: Zone name changing via SNMP
John Norstad <j-norstad@nwu.edu> Wed, 31 March 1993 21:29 UTC
Received: from ietf.nri.reston.va.us by IETF.CNRI.Reston.VA.US id aa09807; 31 Mar 93 16:29 EST
Received: from CNRI.RESTON.VA.US by IETF.CNRI.Reston.VA.US id aa09803; 31 Mar 93 16:29 EST
Received: from cayman.cayman.com by CNRI.Reston.VA.US id aa21887; 31 Mar 93 16:28 EST
Received: by cayman.Cayman.COM (4.1/SMI-4.0) id AA19098; Wed, 31 Mar 93 14:43:26 EST
Return-Path: <j-norstad@nwu.edu>
Received: from merle.acns.nwu.edu by cayman.Cayman.COM (4.1/SMI-4.0) id AA19094; Wed, 31 Mar 93 14:43:21 EST
Received: from jlntoy.acns.nwu.edu by merle.acns.nwu.edu with SMTP (16.6/16.2) id AA08994; Wed, 31 Mar 93 13:39:48 -0600
Message-Id: <9303311939.AA08994@merle.acns.nwu.edu>
Date: Wed, 31 Mar 1993 13:42:29 -0600
To: APPLE-IP@cayman.com
Sender: ietf-archive-request@IETF.CNRI.Reston.VA.US
From: John Norstad <j-norstad@nwu.edu>
X-Sender: jln@merle.acns.nwu.edu (Unverified)
Subject: Re: Zone name changing via SNMP
I was happy to see Fidelia Kuang's note that some of the Apple folks are at least going to take a stab at zone changing via SNMP. I wish you the best of luck! I have some more comments.

My main concern is to register my strongest possible vote for some kind of reasonably quick solution to the zone changing problem, even if it is imperfect. Today, even a simple operation like adding a new zone to the existing zone list for a network is often a herculean task. Thank God that at least our backbone Ciscos here at NU permit remote out-of-band management. We also have FastPath 4 and 5 boxes, Gatorboxes, Compat. Sys. boxes, Novell, AIR, Liaison, and even Banyan Vines routers. It gets complicated. Anything which would make this easier would be much appreciated.

Tom Evans did his usual excellent job of exploring all the horrible complexities of trying to do this "perfectly". Yes, I did read his long note from many months ago, and I had it in mind when I wrote my note. I also realize that Tom was not seriously proposing such an implementation, but was rather trying to debunk the whole idea of zone changing via SNMP. I'm also well aware that my "idea" is not original. (I call it an "idea" because I hesitate to dignify it with the status of "proposal".)

Tom addresses a number of big problems in that old note. Permit me to briefly summarize them in light of my note.

1. Router discovery. This is a management software issue, not a MIB or router software issue. If I had a dumb console which couldn't even find all the routers on a net, I'd set the necessary MIB variables in each router by hand (or more likely, write a UNIX script to do it for me). As the internet manager, I do know where all the routers are. (If I don't, I have bigger problems than trying to change zone lists!) A smart console would of course be nice, but it's not required.

2. Non-SNMP routers on the net being reconfigured. Yes, of course these still must be done by hand.
But again, this is not a MIB or router software issue. Sure, it would be nice if my console could discover these for me and alert me that they must be done by hand. But again, I'm willing to take care of this myself if my console is not smart. I'd much rather have a solution which did not address this problem than no solution at all.

3. Real-time reconfiguration. Much of the complexity in Tom's note is a direct consequence of the fact that he is trying to reconfigure the boxes in real time. He has to deal with all the ugly problems of retaining connectivity with the boxes in question during the reconfiguration. Timed suicide avoids all of these issues. That's the reason I used timed suicide in my note. The ability to schedule odd-hour reconfigurations is just an extra benefit.

4. End-node recovery on the net being reconfigured. My idea does not accomplish this worthwhile goal. As Tom outlines, achieving this requires a significantly more sophisticated algorithm in the router code. Again, if we could get this, I'd be happy, but I'm more than willing to require restarts of all AppleTalk nodes on the net if that's all I can get. Also note that adding a new non-default zone name to a zone list does not require restarts. This is the kind of reconfiguration I most often need to do anyway. (Once a zone name has been assigned and starts being used by users, I find it impossible in practice to change it anyway, because so many user Macs have aliases and other kinds of "net object pointers" using the old zone name. I can't imagine any reasonable proposal which would address this issue. This is not a problem when adding a new zone name or changing network numbers.)

5. Minimizing the reconfiguration period. Tom's note outlines an algorithm where the console monitors the internet to discover when the network being reconfigured has aged out of all the routing tables in all the routers on the internet.
In my idea, I have the person doing the reconfiguring supply the reconfiguration delay time. Again, in my idea I have sacrificed functionality for the sake of implementation simplicity.

Taking all this together, we end up with something which I feel is imperfect but adequate, much better than what we have today (nothing), and above all something which could be implemented, tested, and formally specified as part of a MIB in a reasonable amount of time.

More comments:

Yes, I guess we would have to make use of a timer which counts down to the reconfiguration date/time, rather than an absolute date/time. Too bad, but still workable. It is kind of ugly without the help of software in the management console to do the date/time subtraction, however. If I had to, I'd probably write a simple UNIX script to do this work for me.

I think we should keep the pending network number range start and range end per-port variables. As Tom pointed out in his note, changing network numbers in real time presents some of the same problems as changing zone lists (although not as many), and that's why I included network number changing in my note. Again, using timed suicide to change network numbers avoids the problems.

The seed vs. non-seed variable can be nuked. Indeed, according to Inside AT, configuring a port with a net range of 0-0 indicates non-seed anyway. (Do routers actually implement it this way?)

To summarize the very simplest version of my idea once again, from the router's point of view:

At the reconfiguration time (when the counter reaches zero), the router deletes the network from its routing table and marks the port "on-hold". In this simplest version, "on-hold" means "disabled". The only change in the routing algorithm is that all routing through the port is disabled. After the reconfiguration delay, the port is reconfigured and the port initialization process is restarted.
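
As a rough illustration only, the simplest version summarized above could be sketched as a small per-port state machine. Everything named here (PortState, PendingConfig, Port, tick, seconds_until) is hypothetical and invented for the sketch; none of it comes from any MIB or router implementation, and adding the new range back to the routing table merely stands in for restarting port initialization. The seconds_until helper shows the date/time subtraction a console or script would do to turn an absolute time into a countdown value.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum
from typing import List, Optional, Set, Tuple

NetRange = Tuple[int, int]


class PortState(Enum):
    ACTIVE = "active"
    ON_HOLD = "on-hold"  # in the simplest version this just means "disabled"


@dataclass
class PendingConfig:
    """Hypothetical per-port pending settings (not taken from any real MIB)."""
    net_range: NetRange
    zones: List[str]
    countdown: int  # seconds left until the reconfiguration time
    delay: int      # reconfiguration delay: seconds the port stays on hold


@dataclass
class Port:
    net_range: NetRange
    zones: List[str]
    state: PortState = PortState.ACTIVE
    pending: Optional[PendingConfig] = None
    hold_left: int = 0

    def tick(self, routing_table: Set[NetRange], seconds: int = 1) -> None:
        """Advance the timers by `seconds` and apply any state change."""
        if self.pending is None:
            return
        if self.state is PortState.ACTIVE:
            self.pending.countdown -= seconds
            if self.pending.countdown <= 0:
                # Timed suicide: drop the network and disable the port.
                routing_table.discard(self.net_range)
                self.state = PortState.ON_HOLD
                self.hold_left = self.pending.delay
        else:
            self.hold_left -= seconds
            if self.hold_left <= 0:
                # Delay is over: apply the pending settings and bring the
                # port back up (stand-in for port initialization).
                self.net_range = self.pending.net_range
                self.zones = list(self.pending.zones)
                self.pending = None
                self.state = PortState.ACTIVE
                routing_table.add(self.net_range)


def seconds_until(when: datetime, now: datetime) -> int:
    """Convert an absolute date/time into a countdown value in seconds."""
    return max(0, int((when - now).total_seconds()))
```

A console or script would compute the countdown with seconds_until, store it together with the pending range, zone list, and delay for each port, and the router would then do the rest on its own. Note that the sketch keeps the pending settings only in memory, so a router restart before the countdown expires loses them.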
This simplest version does nothing more than mimic what we have to do now by hand: First shut down all the routers on the net (or shut down ports, if the routers permit this), then wait for the network to age out across the entire internet, then reconfigure and restart the routers (or ports) one at a time.

After rereading the relevant part of Inside AT, I see that there's no need to be concerned about seed vs. non-seed ports. If a non-seed port on the net happens to come back up before any seed port, it will just wait for a seed port to start broadcasting RTMP packets.

This scheme is absolutely minimal. It's the very simplest possible solution to the problem. Tom's note outlines a very complex solution. There is a whole range of possible solutions between these two extremes. For example, using some of Tom's ideas we could perhaps avoid confusing end nodes on the net being reconfigured. I also suggested a possible modification (and increase in complexity) which would permit routing to continue "through" the network during the reconfiguration period. (This isn't as simple as I originally thought, though.)

As an internet manager, I would much rather see the minimal solution or a weak solution implemented soon than have no solution at all, or have to wait a long time for a strong solution. As a customer of several router vendors, I'd be willing to see a delay of a few months in getting my hands on the new MIB if it meant I could get some sort of solution to the zone changing problem as part of the new MIB.

In any case, it seems reasonable to attack a problem this complex by beginning with as many simplifying assumptions as possible, then adding refinements based on experience. That's how I'd start, anyway.

Tom Evans told me that he worries that this scheme is just too complicated and error-prone for dumb customers to understand and use, too hard for the vendors to document, and too hard for the vendors to support. Perhaps. I do understand his concern.
I'd just hate to be denied access to a powerful and very useful tool just because other people have trouble using it properly.

Even with my approach, which avoids all the really tough problems, I think decent management software could go a long way to minimize user complexity. I can easily imagine a relatively simple Mac program which would do router discovery, warn the user about non-SNMP routers on the net, and present a very simple interface for making changes to the net's configuration. The net configuration window would display the current network number range and zone list and let the user edit them. It would also let the user specify the date and time at which the changes should occur and the reconfiguration delay time. That's about it in terms of the human interface. This could be a standalone program or part of a more sophisticated SNMP management console software package.

Tom also mentioned the problem of a router going down and being restarted between the time the user configures the timed change and the time when the change is scheduled to take place. In this case, the router has lost all the reconfiguration information, and big problems result. The timed suicide will work properly only if all the routers on the net participate on schedule.

One possibility Tom explored in his note would be to make one router a seed (the most reliable one) and the others non-seed just for the duration of the reconfiguration.

Another possibility would be to have the management software monitor the configured routers and make certain they retain their settings right up to the scheduled reconfiguration time. This approach would also help keep the timers synchronized. Indeed, the entire collection of reconfiguration information could be kept in the management software, and not actually sent to the routers until 10 minutes or so before the scheduled reconfiguration time.
If any of the routers are down at this point, the management software could easily back out of the whole operation. I like this solution to the problem the best. It doesn't eliminate the "catastrophe window", but it makes it much smaller.

I attended a presentation by Gary Hornbuckle where he talked about the "law of the conservation of complexity" (he was quoting someone else). This is a good example. Changing the zone list for a network on an active internet is very complex. Different solutions distribute the complexity differently between the routing protocol, the MIB, the router software, the management software, and the user. Currently, the user is burdened with almost all of the complexity. Under my minimal approach, we accomplish a great deal (if not everything we'd like) by adding no complexity at all to the routing protocol and very little complexity to the MIB and the router software. Much of the remaining complexity can be handled by good management software.

One way of looking at Tom's note is that it attempts to remove the greatest possible amount of complexity from the user, in the grand tradition of the Mac and AppleTalk. As Tom demonstrated quite convincingly, however, this is only possible at the expense of enormously increased complexity in the other components of the system. My much more modest scheme distributes the complexity differently, with a bit more complexity for the user, but I think it's a much more balanced approach.

And remember that "user" in this context means "internet manager", not "end user" or even "LAN manager". With decent management software as outlined above, our "user" needs to be aware of the kind of connectivity loss which his internet will experience during the reconfiguration period, he needs guidance in specifying the reconfiguration delay time, he needs to know under what circumstances end nodes will need to be restarted after the reconfiguration, and he needs to be aware of what kinds of things can go wrong.
Is this really too much to expect, or too difficult to document and support? Is it even all that much better under Tom's scenario? It's certainly much better than the current situation, where he has to do all the reconfiguration by hand in addition to dealing with the problems mentioned above.

I apologize for the length of this note. I think I've made the points I wanted to make, and I thank you for listening. I look forward to hearing about what happens with the Apple experiment.

John Norstad
Academic Computing and Network Services
Northwestern University
j-norstad@nwu.edu