Business Continuity

A member of the Gartner Blog Network

Roberta J. Witty
Research VP
11 years at Gartner
33 years IT industry

Roberta Witty is a research VP in Gartner Research, where she is part of the Compliance, Risk and Leadership group. Her primary area of focus is business continuity management and disaster recovery. Ms. Witty is the role specialty lead for…

Are You Prepared for a Zombie Attack?

by Roberta J. Witty  |  November 2, 2011  |  1 Comment

Most of us think about computer-based zombies, but this survey provides a much-needed humorous diversion from all of the weather-related power outages in the Northeast U.S. Take the Zombie Risk Management Assessment to see why. Amazingly, it provides some core best practices from BCM. Belated Happy Halloween!!!

Category: Uncategorized

As a woman in IT and security, we need to change this situation…

by Roberta J. Witty  |  October 12, 2011  |  Comments Off

I rarely just re-post articles, but this one needs to be taken seriously: RT @bnkinfosecurity White men dominate IT security profession; IT security unemployment at 0 perce.. http://bit.ly/nQnwKL

Category: Advisory

Attend Your Local ACP Tabletop Exercise! You Will Learn A Lot (and in the safety of a non-workplace environment)

by Roberta J. Witty  |  September 28, 2011  |  5 Comments

On Tuesday, September 20, 2011, I attended my local CT ACP chapter’s annual tabletop exercise. Hosted at Northeast Utilities, the ACP management team, in conjunction with NU’s BCM team, conducted a great two-hour exercise that was filled with lots of changing conditions, misinformation, and even an Elvis impersonator! Also in attendance were an EMNS vendor providing real-time emergency notification services, two recovery services firms for the data center and workforce, two physical security services firms, and one BCM advisory firm that provided overall exercise planning support.

The Scenario: The Can-Do organization is a heavy construction equipment leasing company headquartered in Madison, CT, with additional operating facilities in Virginia and Nevada. Its data centers are located in Madison, CT and Richmond, VA. To support the leasing business, it also has a credit company that has evolved into a full-service bank and insurance underwriter and broker located in the firm’s 600 retail centers. Banking applications were housed in the Richmond, VA data center.

The Plan: The plan we were given was called the “Emergency Response Plan” dated July 15, 1998 and prepared by the VP of Training, Cafeteria Services and Vehicle Maintenance. HA-HA.

Recovery teams included Operations, HR, Finance, IT and physical security. I was on the IT recovery team.

Although the plan stated that Can-Do subscribes to the U.S.’s National Incident Management System (NIMS), no one knew who was in charge of the incident or how to communicate with the incident management team. It was even questioned why a command center was stood up on the first day of the hurricane warning; that foresight proved fortunate, as it turned out. The first thing we all needed to do was elect an incident commander (assigned to the Operations recovery team) and one for each recovery team. All recovery teams also assigned coordinating responsibilities to their sister recovery teams. The IT recovery team was smart (wink wink) and assigned a supply chain/vendor management coordinator and a scribe – guess who took that role (me).

We started the exercise by getting a hurricane warning for Madison, CT along with a warning that a mild flu pandemic was possible. The first step that the IT team performed was topping off the fuel tanks for the CT data center generators. We then sent out a message to the IT staff instructing them to test their work-at-home capability. We also checked with the VPN service provider on our ability to add bandwidth if needed – 60% of the workforce was able to work from home, but looking ahead, we wanted to ensure that if we needed more bandwidth for additional staff, we would have it available to us.

The scenario intensified throughout the exercise and we received additional information including:

  • flu shot availability in the cafeteria (a red herring);
  • a Category 4 hurricane, Mary, was off the coast of Bermuda – Important!;
  • a fuel tanker accident at the Richmond, VA data center which closed down the facility – major crisis and also on the evening news in VA;
  • the CT data center lost all power because a security guard had an adverse reaction to some over-the-counter flu medication, passed out and hit his head on the emergency power shutoff button. Physical security was automatically notified and the police were called to secure the disabled data center; and
  • Richmond, VA schools were closing at 1 pm due to the hurricane warning.

The IT recovery team notified the Richmond, VA data center recovery service provider that recovery was required; the turnkey arrangement ensured that the data center was up and running within a few hours. The Madison, CT data center had an active-active arrangement with a data center service provider (which the 1998 plan did not identify), so operations were automatically cut over to it once the power went out due to the untimely power shutoff. Prior to the power going out at the CT facility, the incident commander asked IT to update the employee and franchisee crisis portal with information about how to communicate with Can-Do if the hurricanes hit either location. This activity was not completed because of the power outage in CT.

The power outage resulted in an IT recovery ETA of two days. That estimate was based on the very old emergency response plan, which contained neither the current IT configuration at either data center nor recovery procedures for the current configurations. In reality, there was no loss of IT, even through the power outage, thanks to the prior arrangements made with the DR service providers.

Finance issued a communication requesting that they be notified of any needed hotel and travel arrangements and that all expenses incurred needed to be justified correctly as storm-related or fuel spill-related. Due to the escalating conditions, they subsequently upped the credit limits on all corporate credit cards.

An interesting twist to the exercise was that somehow the FDIC got into the facility and was snooping around asking about Can-Do’s recovery ability. As it turned out, the FDIC person had faked her identity to the IT recovery team. Though the team members were smart enough to ask who she was, the team was a bit disjointed and she was given the information she asked for.

Multiple emergency messages were being sent to employees from the various recovery teams throughout the exercise but not everyone was getting the messages. As it turned out, some cell phone towers were down in Richmond, VA due to Hurricane Mary hitting land.

Finally, staff in CT were sent home at 7 pm that day in preparation for the hurricanes hitting.

Hopefully you can sense the problems that arose during this exercise. Some observations:

  • The plan needed to be updated ASAP! An old plan is worthless and you waste a lot of time trying to figure out what the current practices are for production processing and recovery of those practices.
  • Some exercise participants weren’t aware that they needed to DO something – they sat at the team table and just talked about what they would do in an actual recovery rather than going over to the table of the team with whom they needed to communicate.
  • The exercise showed that internal recovery team roles need to be defined in advance so that when an incident occurs, everyone knows their role.
  • Communication between the teams was strained at first – no one knew who the coordinators were. Although Can-Do subscribed to NIMS, neither crisis command procedures nor crisis communication procedures were part of the plan.
  • It was not clear that there was an emergency/mass notification tool available. It took a good 30 minutes for the tool to start being used consistently. AND the first time recovery teams wanted to send a message, they were informed that they needed to have that message approved by the incident command center – also not in the recovery plan and obviously not tested. That was rather a thorn in the side of IT because we thought we should be able to send out our own IT messages – WAKE UP CALL that external communications must be reviewed by the right people internally AND that certain kinds of messages should be set up in advance so that when the interruption occurs, you have much of the messaging in place and approved.
  • Releasing recovery status information to someone who turned out to be from the FDIC presented a loophole in the crisis communications process.
  • Command center check-in calls with all team coordinators need to be scheduled on a regular basis.
  • There was no conference bridge capability for the incident command center to use so everyone had to physically be present at the command center.
  • An activity log of actions and tasks and their status needs to be created and updated throughout the incident.
  • Finally, a best practice for recovery exercising is to bring in outside observers to identify things during the exercise that you can’t see for yourself.
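For the scribes out there, the activity log observation above could be sketched in a few lines of Python. This is only a minimal illustration, not anything we used in the exercise: the names `ActivityLog`, `LogEntry` and `Status`, and the set of status values, are all hypothetical, and a real log would also need to be stored somewhere that survives a power outage.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import List, Optional


class Status(Enum):
    # Hypothetical status values; use whatever your plan defines.
    OPEN = "open"
    IN_PROGRESS = "in progress"
    BLOCKED = "blocked"
    DONE = "done"


@dataclass
class LogEntry:
    team: str                      # recovery team that owns the task
    action: str                    # what was requested or done
    status: Status = Status.OPEN
    opened_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    updated_at: Optional[datetime] = None


class ActivityLog:
    """In-memory log of actions and their status during an incident."""

    def __init__(self) -> None:
        self._entries: List[LogEntry] = []

    def record(self, team: str, action: str) -> int:
        """Add a new task and return its id (its position in the log)."""
        self._entries.append(LogEntry(team, action))
        return len(self._entries) - 1

    def update(self, entry_id: int, status: Status) -> None:
        """Change a task's status and timestamp the change."""
        entry = self._entries[entry_id]
        entry.status = status
        entry.updated_at = datetime.now(timezone.utc)

    def outstanding(self) -> List[LogEntry]:
        """Everything that is not done yet, for the next check-in call."""
        return [e for e in self._entries if e.status is not Status.DONE]


if __name__ == "__main__":
    log = ActivityLog()
    fuel = log.record("IT", "Top off generator fuel tanks")
    portal = log.record("IT", "Update crisis portal with contact information")
    log.update(fuel, Status.DONE)
    log.update(portal, Status.BLOCKED)  # e.g. interrupted by a power outage
    for entry in log.outstanding():
        print(entry.team, "|", entry.action, "|", entry.status.value)
```

Reading out the `outstanding()` list at each command center check-in call is one simple way to keep all team coordinators working from the same picture.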

We had a great event and we all learned a lot about how important it is to have a current plan and clear recovery role assignments, how we personally respond in a crisis, and how communication is KEY to recovery success.

Category: Advisory

What’s Changed in BCM Since 9/11: A Ten Year Review

by Roberta J. Witty  |  September 13, 2011  |  1 Comment

Anniversaries of major events – personal and public – trigger much reflection on what has changed since the event, and 9/11 is no different. I went back to my experience in the months following 9/11 to find nuggets of information about how 9/11 changed the ability of organizations to respond to and recover from major business disruptions. Colleagues and I conducted many advisory sessions across the U.S. regarding business continuity management (BCM), IT disaster recovery management (IT DRM) and crisis/incident management (CIM). That lasted for about nine months and then there was a profound “thud” as most private enterprises of all sizes – small, medium and large – moved on to more pressing issues. I think most of them were not ready for the commitment required to turn their IT DRM programs – which most recovery programs were at that time – into full-fledged BCM programs that encompassed IT, the workforce, customers, partners, the supply chain and so forth. The areas where we did see some focus in the first few years after 9/11 were workforce resilience and crisis management. Obviously there were exceptions, but overall we did not see a huge rush to BCM program maturity in the private sector as a result of 9/11.

However, we did see a major change directly related to 9/11 on the federal, state and local government side. The formation of the U.S. Department of Homeland Security in 2002 started the ball rolling. DHS/FEMA has done a very good job in maturing the readiness of federal, state, local and tribal nation emergency operations, but it has taken years for DHS/FEMA to have an impact on private sector BCM programs. The focus on improved public/private sector communications through multi-state and national-level exercises (especially for the healthcare, financial services and public utilities sectors), the introduction of Ready.gov, and the Voluntary Private Sector Preparedness Accreditation and Certification Program (PS-Prep) are three influential changes for private enterprises.

Even though 9/11 did not have an immediate impact on BCM maturity, it did set up the framework for the preparedness, response and recovery improvements made since then in both the public and private sectors. The majority of these improvements have been a result of the confluence of three areas:

  1. Increasing natural and man-made disaster events such as SARS, Hurricane Katrina, the bird and swine flu threats, the London and Mumbai bombings, the Iceland volcanic ash event, earthquakes in Haiti, Chile, New Zealand and Japan, oil spills, the global financial crisis of 2008, major ice and snow storms and so forth;
  2. Technology innovations such as Internet broadband in the home, the real-time infrastructure, virtualization, hosting/outsourcing, smartphones and tablets, social media and cloud computing; and
  3. Business operating practices such as regulatory changes in response to financial fraud, telework initiatives and outsourcing non-core competencies.

Without these changes to business and IT practices, many of the improvements we see today in BCM maturity would not be possible.

We have come a long way in BCM since 9/11, and we have a longer way to go before organizations of all sizes and operating models are prepared for even the smallest, localized threat. Gartner is committed to your success in preparedness, response and recovery activities and continues to offer clients foundational and timely research in BCM and IT-DRM through our BCM key initiative for business and IT leaders. Take our maturity self-assessment called ITScore for Business Continuity Management to jump-start your journey.

Category: Advisory

In Memory of 9/11

by Roberta J. Witty  |  September 11, 2011  |  Comments Off

Love.

Hope..

Compassion…

Category: Uncategorized

REPOST: U.S. Again Under Threat: Published 9 September 2011 > Homeland Security Newswire

by Roberta J. Witty  |  September 9, 2011  |  Comments Off

Nice explanations provided for “Specific” and “credible” in the body:  http://www.homelandsecuritynewswire.com/us-again-under-threat

U.S again under threat

Published 9 September 2011

New York City and the District of Columbia respond to “specific, credible but unconfirmed” intelligence of an impending attack; information obtained indicates a vehicle-borne bomb; NYPD deploys boats, armored vehicles and a 1,000-member counter-terror force

As the nation prepares to commemorate the tenth anniversary of 9/11, New York City and Washington, D.C. again find themselves responding to an al Qaeda attack threat.

Federal authorities have advised local officials of a “specific, credible but unconfirmed threat” to the cities centered around the commemoration of the World Trade Center and Pentagon attacks.

The intelligence community has developed a “general description” of two or three individuals who may already be in the country. Making the effort more difficult is that the individuals in question have common names.

That intelligence came from the tribal region of Pakistan, from a source acknowledged as having a reliable record by U.S. intelligence officials.

It is believed that the attackers originated their journey in Afghanistan, with a possible third-country waypoint. That third country may have been Iran.

In the language of counter-terror officials, “specific” means that there is information of the type of attack that may occur. In this case, the information indicated that a vehicle-borne explosive device, a car- or truck-bomb, was the chosen method.

Last night, official focus was on two missing rental trucks, from different rental agencies in the Kansas City, Kansas area. They were later found and determined to be unconnected with the present threat.

“Credible” is used to indicate that the source of the information is believable, comes from a reliable, knowledgeable source. U.S. signals intelligence has been listening in on the communications of one particular al Qaeda source in Pakistan, from whom officials have gleaned confirmed information in the past.

Also supporting the credibility of the intelligence is an increase in “chatter” on the communication channels that are known to be used by al Qaeda operatives.

In the trove of documents gathered from Osama bin Laden’s compound in Abbottabad, Pakistan, during the raid that killed him, bin Laden showed a predilection for attacking the United States on significant dates and anniversaries, such as the upcoming 9/11 commemorations.

What has not yet been uncovered is the type of corroboration that would provide confirmation indicating that the plot is active and in progress.

New York City wasted no time in responding to the threat notification.

All bridges and tunnels entering Manhattan have been staffed with additional police and national guard personnel. Cars and trucks entering the city are being searched. Additionally, there have been checkpoints set up at various locations in Manhattan, such as Times Square and Lower Manhattan, approaching the financial district.

Key rail and subway stations operated by the Metropolitan Transportation Authority, Port Authority of NY and NJ and NJ Transit have been staffed with additional officers accompanied by national guard troops, watching traffic outside stations and randomly searching backpacks and baggage inside.

New York City’s response has been thorough. Besides police officers at the bridges, tunnels and rail stations, the city has deployed radiation-detecting boats, and cameras have been placed throughout midtown and lower Manhattan. If required, the NYPD has a small, unmanned submersible craft available to search the hulls of ships and boats.

Also deployed or on standby is an “army” of 1,000 anti-terror officers, armored vehicles and weapons and EOD (explosive ordnance disposal) specialists.

Category: Advisory

Why Does It Take So Long to Restore Electricity?

by Roberta J. Witty  |  September 6, 2011  |  2 Comments

Lots of citizens in Connecticut are complaining about how long it took (or is taking) to restore power to their residences after the impact of Hurricane Irene. I heard some interesting information today on my local NPR radio station: The Colin McEnroe Show spent the entire time on disaster preparedness. One of the guests – Arnold Chase – talked about why it can take so long today to restore power. His point was that today’s telephone pole, more accurately termed a utility pole, is not yesterday’s pole. Years ago, when there were only telephone wires to support (and no longer telegraph wires, which were the original use of the pole), the pole was shorter. Today’s poles are taller – which automatically creates more danger in repair situations – and support not only telephone wires, but also electric power lines, cable TV and Internet wires and others. There are two types of electric power lines: distribution lines, which carry the power that is dropped to the home, and sub-transmission lines, which carry higher-voltage power from regional substations to local substations.

The electric power lines are the highest set of wires, so they are the first impacted by falling trees. Also, they are the most dangerous – you want them out of the way of telephone and cable company workers when they need access to the pole. In addition, the telephone, cable and other wires are bundled together and strung along from pole to pole on messenger lines – more stable than the electric power lines. When a pole is damaged, it takes coordination among all the utility service companies to get things back in working order. In addition, the sub-transmission lines are high-voltage wires that require specialized skills to install and maintain. And to get that job, you need at least five years of experience under your belt, resulting in a smaller pool of qualified electric utility personnel able to repair these lines.

I found this information to be useful in understanding why it takes so long to restore power. We in CT were warned that we could be out of power for days – not just one or two but five or more – so this explanation puts that warning into perspective. I am buying an automatic-cutover propane-powered generator. And I just bought camping LED light lanterns. I wanted to buy a floor sump pump in addition to the sump pump already in my basement floor, but the salesman at the local home repair store told me I was overdoing it – Thank You Sir.

Category: Uncategorized

New “Get Tech Ready” Web Resource from FEMA’s Ready Campaign

by Roberta J. Witty  |  August 31, 2011  |  2 Comments

Ensuring your staff is prepared and safe before, during and after a disaster goes a long way in ensuring workforce resilience – in other words, that your workforce will be ready and able to come to the aid of the organization during a crisis event. To that end, a new web resource – Get Tech Ready – is being stood up by the U.S. Federal Emergency Management Agency (FEMA), the American Red Cross (ARC), the Ad Council and Google Crisis Response on behalf of FEMA’s Ready campaign. This new web resource is being released just ahead of schedule – September – which in the U.S. is designated as the annual “National Preparedness Month”, and 2011 marks the 10-year anniversary of 9/11.

According to FEMA, “this new resource educates individuals and families about how using modern-day technology can help them prepare, adapt and recover from disruptions brought on by emergencies or disasters. Get Tech Ready provides Americans with tips on how to use technological resources before, during and after a crisis to communicate with loved ones and manage your financial affairs. Preparedness tips on the website include:

  • Learn how to send updates via text and internet from your mobile phone to your contacts and social channels in case voice communications are not available;
  • Store your important documents such as personal and financial records in the cloud or on a secure and remote area or flash or jump drive that you can keep readily available so they can be accessed from anywhere; and
  • Create an Emergency Information Document using the Ready.gov Family Emergency Plan template in Google Docs or by downloading the Ready Family Emergency Plan to record your emergency plans.”

Check it out and let us know what you think.

Category: Advisory

Surviving Hurricane Irene: What Worked, What Didn’t, What Was New?

by Roberta J. Witty  |  August 30, 2011  |  2 Comments

As most of us are now on the other side of Hurricane Irene, we want to ask everyone what recovery controls worked, which didn’t and which were new for your organization or your town. For example, the local fire departments around my area (Northwest CT) are offering charging stations for citizens to use for devices such as cell phones, laptops and so forth. This service is a big boost to telework programs which depend on the workforce having power from their home or distributed location.

Also, it seems that emergency/mass notification services (EMNS) were used extensively to alert the population of storm status: NYC, through NotifyNYC and NYC-OEM, sent regular pre- and post-event alerts, and I received voicemails or emails from my local CT town management, Connecticut Light & Power, and JPMorganChase alerting me about disaster preparedness status and steps to take if I needed assistance.

Another new feature was the use of texting: If you texted the name “Irene” to 501-01, National Grid would text regular updates on electrical power restoration status to your cell phone. This feature definitely was not around back in the days of Hurricane Gloria (1985) or Bob (1991) and was quite useful since over 500K National Grid customers lost power.

Also, going to wifi hot spots at venues like Starbucks and McDonald’s is certainly a new capability. How many of you used one of these options?

And, on August 26, 2011, FEMA launched its first-ever smartphone application and text messaging updates. The app is available right now only on Android smartphones; BlackBerry and iPhone support will be coming in a few weeks.

What were your experiences if you were in an impacted area?

Roberta Witty and John Morency

Category: Uncategorized

Best Practices for IT Organizations in Response to the ‘Rolling Blackouts’

by Roberta J. Witty  |  May 18, 2011  |  Comments Off

The rolling blackouts designed to conserve electricity following the earthquake and tsunami in northern Japan continue to present serious challenges for enterprises. Gartner’s best practices can help IT organizations protect their infrastructures and support their workforces.

Key Findings

  • The earthquake and tsunami that struck the Tohoku district in March, and the power plant failures and other infrastructure problems that followed, continue to disrupt communications, transportation and other infrastructure.
  • The Japanese government and Tepco have implemented a plan for rolling electrical blackouts across Tepco’s coverage area, designed to reduce power usage and avoid total power failures.
  • These blackouts present serious challenges for Japanese enterprises, particularly in maintaining the operational integrity of their data centers and offering alternative system access to remote workers.

Tepco has said it will not carry out its planned rolling blackouts this summer, but electrical supply continues to present challenges for Japanese enterprises. Gartner has developed a set of best practices for various scenarios and affected parties for IT organizations in Japan and worldwide. The appropriate response to the rolling blackout depends heavily on whether or not the enterprise’s data center has its own dedicated backup power generator.

Read more about the best practices – whether or not your organization was impacted by the earthquake/tsunami – in the full report by my colleagues Masahiko Ishibashi, Eiichi Matsubara, Nagayoshi Nakano and Katsuo Hori. You may need to be a Gartner customer to access it.

Category: Uncategorized