Yesterday, I experienced firsthand how a cloud crash can impact a business. My Outlook provider, Intermedia, went down hard for several hours. Not only did the network of the world’s largest third-party provider of hosted Microsoft Exchange services crash, but its VoIP-based phone system, which also runs in its data center, crashed too. While Intermedia’s staff were busy trying to fix the outage, they were unable to make or receive calls, which compounded the problem.
The outage affected my productivity in ways I didn’t think possible. Because of the way my Outlook was configured, the outage caused the program to freeze up, which initially made me think my computer was having a hardware meltdown or had a virus. After spending a few hours running scans and even doing a system rollback, I finally discovered what was really going on (thanks to a text message from a peer who also uses Intermedia) and tried to work around the problem as best I could. All in all, I probably lost half a day of productivity.
Reflecting on the incident, I realize it was a good reminder that there’s no single bulletproof computing platform, not even the cloud. What this situation also taught me, and what I hope it reminds you, is not to get too comfortable with your business continuity offering. Sure, in theory, if your customer’s local server goes down you can get their data back and have them up and running in less than 30 minutes, right? But when was the last time you really tested this claim? Is it possible there’s something you’re overlooking, such as backups that aren’t being completed on a regular basis, corrupted backups, or a missing step in your business continuity plan that could turn your 30-minute restore into a full-day debacle?
The world’s largest third-party provider of hosted Microsoft Exchange services just experienced an outage that lasted from 7 a.m. to 3:30 p.m. and affected millions of customers. I can only imagine the “We should have” conversations floating around the Intermedia boardroom today (e.g., “We should have had our VoIP phones set up to fail over to a traditional PBX system”). If this kind of slip-up can happen to a company of this size, what steps do you have in place, or are you going to put in place, to ensure a similar debacle doesn’t happen at your company?