Lessons Learned
Echoes of Northridge & I.R.A. Bombings
UNPLEASANT surprises can be dangerous and expensive, especially when they reach the point where they are called "emergencies" or "disasters." We cannot eliminate these events, but we can take steps to guard against them, and we can make preparations that will help return daily activities and business to normal.
Experience is a great teacher, and the January 17, 1994 Northridge earthquake in the Los Angeles area offered extensive experience. The I.T. operations of many different companies were disrupted by the earthquake. Those that had prepared for some kind of disaster were the ones with the fastest and most effective recoveries.
In all cases, successful recovery plans had similar elements, including:
* A documented response plan in the hands of clearly identified emergency response teams.
* Readily accessible employee contact information.
* Current customer and client contact lists in the possession of accountable employees.
* Current contact information for vendors, suppliers and service providers.
* A documented disaster recovery plan that:
  * Empowered the business's recovery teams to act as quickly as possible.
  * Described step-by-step procedures for recovering critical I.T. operations.
  * Had been periodically exercised prior to the earthquake.
  * Had been kept current with changes in the business.
  * Made available alternate locations for conducting business, such as a computer hot site and alternate office space.
For those organizations that used an alternate computing facility or service, such as a vendor-provided hot site, several critical design points emerged. These included:
* The availability of system hardware configurations that were current with those in use at the home locations.
* Skilled vendor staff who could fulfill the necessary I.T. operations tasks during the absence of the firm's own staff.
* Current backup media that could be delivered to the alternate site in a timely and complete fashion.
* Availability of a well-designed backup telecommunications network that could be activated on short notice to connect the organization's surviving business locations with the alternate computing facility.
* Prior arrangements with software vendors that would enable the software to run on alternate computing hardware.
* Documented computer operations scripts that could be made available at the alternate computing site.
* Operations scripts clear enough to be used successfully by any skilled staff to restore the critical I.T. systems for use by the operating units.
Rehearsals
These elements were most successful for those whose plans had included prior opportunities for rehearsals, tests and exercises. Strong support from senior management made the difference: if it was there, then the funding and the impetus to accomplish realistic planning were there too.
Businesses that were affected by the earthquake and survived generally learned several things about the importance of I.T. to the ongoing vitality of the organization:
1. I.T. provides services and information that are critical to the functioning of the business.
2. If these services and information are unavailable for an extended period of time, it becomes increasingly difficult to contact customers.
3. If you cannot contact your customers, they start to think that you are out of business.
4. If your customers think that you are out of business, then you probably are out of business.
5. I.T. operations can be disrupted not only by local events that might directly damage the equipment, but also by external events that prevent access to the I.T. location by critical support staff.
6. I.T. functions can be protected in an economical way that can help the business to survive a disaster.
7. Scrambling after a disaster to repair the effects of an I.T. disruption is more expensive than investing in contingency preparations beforehand.
8. The tolerance for computing downtime is always decreasing.
There is an active I.T. industry that provides professional guidance for making disaster recovery and other contingency preparations.
In addition to providing subscription services to hot sites and other alternate computing arrangements, these firms can conduct a business impact analysis, develop recovery and continuous operations strategies, deliver on-site crisis management experts and provide guidance and project management in implementing all phases of contingency preparations.
Garry C. Herron is president and founder of Advanced Technology Solutions. As Western region manager for IBM Business Recovery Services, he led IBM's recovery efforts for clients suffering damage from the 1994 Northridge (Los Angeles) earthquake.