I once worked as a consultant for a CIO at a midsized bank in the Northeast who had a novel way to test his disaster recovery plan. Let's call him Tom, not his real name, because his real name is the same as our CEO's and this is not a story about our CEO.
Tom would periodically walk into the data center unannounced and throw the main circuit breaker, effectively mimicking a data center disaster. He did this at random, whenever he felt like it, which was not too often, but often enough to keep people on their toes. I was in a meeting with him one summer Friday afternoon when he got up and said, “Take a walk with me.”
We crossed town to the data center; it was about a 15-minute walk on a beautiful day. As he walked in, the guys in operations said, “Oh, Tom, not again.” He flipped the power off, the disaster started, and the recovery kicked in. They all had smiles on their faces and, you guessed it, the systems rolled over, the bank stayed in operation, and everything worked fine.
Some would call this irresponsible. I happened to think it was pretty brilliant, because disasters happen when they happen, not when you have planned and are ready for them. If you are concerned that such a random outage would disrupt your business, then you may have a great plan but no execution. When a disaster strikes, something will be executed; it's up to you to determine whether it is the plan or your job.
We may need more of these disaster tests on a whim to truly evaluate our ability to stay operational when bad things happen to good IT departments.
So, how are you feeling today? Want to take a walk?