Gartner Blog Network

Attend Your Local ACP Tabletop Exercise! You Will Learn A Lot (and in the safety of a non-workplace environment)

by Roberta J. Witty  |  September 28, 2011  |  5 Comments

On Tuesday, September 20, 2011, I attended my local CT ACP chapter’s annual tabletop exercise. Hosted at Northeast Utilities (NU), the two-hour exercise was run by the ACP management team in conjunction with NU’s BCM team, and it was filled with lots of changing conditions, misinformation, and even an Elvis impersonator! Also in attendance were an EMNS vendor providing real-time emergency notification services, two recovery services firms for the data center and workforce, two physical security services firms, and one BCM advisory firm that provided overall exercise planning support.

The Scenario: The Can-Do organization is a heavy construction equipment leasing company headquartered in Madison, CT, with additional operating facilities in Virginia and Nevada. Its data centers are located in Madison, CT and Richmond, VA. To support the leasing business, it also has a credit company that has evolved into a full-service bank and insurance underwriter and broker located in the firm’s 600 retail centers. Banking applications are housed in the Richmond, VA data center.

The Plan: The plan we were given was called the “Emergency Response Plan” dated July 15, 1998 and prepared by the VP of Training, Cafeteria Services and Vehicle Maintenance. HA-HA.

Recovery teams included Operations, HR, Finance, IT and physical security. I was on the IT recovery team.

Although the plan stated that Can-Do subscribes to the U.S. National Incident Management System (NIMS), no one knew who was in charge of the incident or how to communicate with the incident management team. Some even questioned why a command center was stood up on the first day of the hurricane warning; as it turned out, that foresight proved fortunate. The first thing we all needed to do was elect an incident commander (assigned to the Operations recovery team) and a leader for each recovery team. All recovery teams also assigned coordinators to work with their sister recovery teams. The IT recovery team was smart (wink wink) and assigned a supply chain/vendor management coordinator and a scribe – guess who took that role (me).

We started the exercise by getting a hurricane warning for Madison, CT along with a warning that a mild flu pandemic was possible. The first step that the IT team performed was topping off the fuel tanks for the CT data center generators. We then sent out a message to the IT staff instructing them to test their work-at-home capability. We also checked with the VPN service provider on our ability to add bandwidth if needed – 60% of the workforce was able to work from home, but looking ahead, we wanted to ensure that more bandwidth would be available if additional staff needed it.

The scenario intensified throughout the exercise and we received additional information including:

  • flu shot availability in the cafeteria (a red herring);
  • a cat 4 Hurricane Mary was off the coast of Bermuda – Important!;
  • a fuel tanker accident at the Richmond, VA data center which closed down the facility – major crisis and also on the evening news in VA;
  • the CT data center lost all power because a security guard had an adverse reaction to some over-the-counter flu medication, passed out and hit his head on the emergency power shutoff button. Physical security was automatically notified and the police were called to secure the disabled data center;
  • Richmond, VA schools were closing at 1 pm due to the hurricane warning.

The IT recovery team notified the Richmond, VA data center recovery service provider that recovery was required; the turnkey arrangement ensured that the data center was up and running within a few hours. The Madison, CT data center had an active-active arrangement with a data center service provider (which the 1998 plan did not identify), so operations were automatically cut over to it once the untimely shutoff cut the power. Before the power went out at the CT facility, the incident commander had asked IT to update the employee and franchisee crisis portal with information about how to communicate with Can-Do if the hurricane hit either location. This activity was not completed because of the power outage in CT.

The power outage resulted in an IT recovery ETA of two days. That estimate was based on the very old emergency response plan, which contained neither the current IT configuration at either data center nor recovery procedures for the current configurations. In reality, there was no loss of IT, even through the power outage, thanks to the prior arrangements made with the DR service providers.

Finance issued a communication requesting that they be notified of any needed hotel and travel arrangements and that all expenses incurred needed to be justified correctly as storm-related or fuel spill-related. Due to the escalating conditions, they subsequently upped the credit limits on all corporate credit cards.

An interesting twist to the exercise was that somehow someone from the FDIC got into the facility and was snooping around asking about Can-Do’s recovery ability. As it turned out, the FDIC person had faked her identity to the IT recovery team. Though the team members were smart enough to ask who she was, the team was a bit disjointed and she was given the information she asked for.

Multiple emergency messages were being sent to employees from the various recovery teams throughout the exercise but not everyone was getting the messages. As it turned out, some cell phone towers were down in Richmond, VA due to Hurricane Mary hitting land.

Finally, staff in CT were sent home at 7 pm that day in preparation for the hurricanes hitting.

Hopefully you can sense the problems that arose during this exercise. Some observations:

  • The plan needed to be updated ASAP! An old plan is worthless and you waste a lot of time trying to figure out what the current practices are for production processing and recovery of those practices.
  • Some exercise participants weren’t aware that they needed to DO something – they sat at the team table and just talked about what they would do in an actual recovery rather than going over to the table of the team with whom they needed to communicate.
  • The exercise showed that internal recovery team roles need to be defined in advance so that when an incident occurs, everyone knows their role.
  • Communication between the teams was strained at first – no one knew who the coordinators were. Although Can-Do subscribed to NIMS, neither crisis command procedures nor crisis communication procedures were part of the plan.
  • It was not clear that there was an emergency/mass notification tool available. It took a good 30 minutes for the tool to start being used consistently. AND the first time recovery teams wanted to send a message, they were informed that they needed to have that message approved by the incident command center – also not in the recovery plan and obviously not tested. This was something of a thorn in the side of IT because we thought we should be able to send out our own IT messages – WAKE UP CALL that external communications must be reviewed by the right people internally AND that certain kinds of messages should be set up in advance so that when the interruption occurs, you have much of the messaging in place and approved.
  • Releasing recovery status information to someone who turned out to be from the FDIC exposed a loophole in the crisis communications process.
  • Command center check-in calls with all team coordinators need to be scheduled on a regular basis.
  • There was no conference bridge capability for the incident command center to use so everyone had to physically be present at the command center.
  • An activity log of actions and tasks and their status needs to be created and updated throughout the incident.
  • Finally, a best practice for recovery exercising is to bring in outside observers to identify things during the exercise that you can’t see for yourself.
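For readers who want to try the activity log idea from the list above before buying a tool, here is a minimal sketch in Python. It is only an illustration under my own assumptions: the class names (`ActivityLog`, `LogEntry`), the status values and the sample entries are hypothetical and were not part of the exercise or of any product used that day.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum


class Status(Enum):
    # Hypothetical status values; adjust to your own incident procedures.
    OPEN = "open"
    IN_PROGRESS = "in progress"
    BLOCKED = "blocked"
    DONE = "done"


@dataclass
class LogEntry:
    team: str
    action: str
    status: Status = Status.OPEN
    # Each history item is (UTC timestamp, status, free-text note).
    history: list = field(default_factory=list)

    def update(self, status, note=""):
        """Record a status change so the log shows what happened and when."""
        self.status = status
        self.history.append((datetime.now(timezone.utc), status, note))


class ActivityLog:
    def __init__(self):
        self.entries = []

    def add(self, team, action):
        """Log a new action for a recovery team and return the entry."""
        entry = LogEntry(team, action)
        entry.update(Status.OPEN, "logged")
        self.entries.append(entry)
        return entry

    def outstanding(self):
        """Everything not yet done, for the next command center check-in."""
        return [e for e in self.entries if e.status is not Status.DONE]


# Sample entries loosely modeled on the IT team's tasks in the exercise.
log = ActivityLog()
fuel = log.add("IT", "Top off generator fuel tanks at the CT data center")
portal = log.add("IT", "Update the employee and franchisee crisis portal")
fuel.update(Status.DONE)
portal.update(Status.BLOCKED, "CT power outage")

for entry in log.outstanding():
    print(f"[{entry.team}] {entry.action}: {entry.status.value}")
```

The point of keeping a per-entry history rather than a single status field is that the scribe can reconstruct the timeline afterward, which is what outside observers and auditors tend to ask for.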

We had a great event and we all learned a lot about how important it is to have a current plan and clear recovery role assignments, how we personally respond in a crisis, and how communication is KEY to recovery success.



Thoughts on Attend Your Local ACP Tabletop Exercise! You Will Learn A Lot (and in the safety of a non-workplace environment)

  1. As stated, this was an excellent exercise – one that should be emulated by many. Ed Goldberg, our host at NU, put together a program that had more chills and spills than one would normally find in a TTX program event. The silver lining was that while many of us were in unfamiliar roles (I was an HR executive?), each of us gravitated to our role and actually learned more about “what the other side was doing”.

    This type of exercise is necessary for individuals to bring into their OWN offices and engage with their own teams. As they say, the best way to get to Carnegie Hall is to “practice, practice, practice…” and the best way to ensure that your recovery chances are optimized is to follow the same advice…and to practice!

    I believe that this is the fourth such event at the Connecticut ACP Chapter and they keep getting better. Our thanks to the attendees, sponsors and the leadership of that fine Chapter. A job well done…now, go and practice some more 😉

  2. Roberta J. Witty says:

    Thanks Ralph. Yes indeed – practice is the key. I was just speaking with a client who went through Hurricane Irene unscathed (but that was not their expectation) and he said “Failure to prepare is preparation for failure”.

