American Airlines - Quality Control & Disaster Recovery Lesson

Ask American Airlines right now about quality control and safety. The airline and its staff are working through the fallout: thousands of affected customers and a damaged reputation. Many critics suggest that this could have been prevented with earlier testing and detection. In every business there are certain things that we rely on to just work. When they don’t, the result can be huge setbacks in time, scheduling, and revenue. Having a strategic plan in place for when something does go wrong is very important.

Planning takes two forms: having proper testing and maintenance in place for prevention, and having a place to turn when something bad does happen. Both should be documented and available, in part or in whole, to managers, staff, and vendors. Vendors should be required to write test plans against all deliverables, and internal staff should have maintenance procedures to ensure that their systems are functioning properly. These often simple tests can act as a sanity check that a seemingly minor change doesn’t cause problems today or tomorrow. The maintenance plan should also include things such as backups, log reviews, and security audits.
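
As an illustration of how simple such a sanity check can be, here is a minimal sketch that flags backups that are missing or too old. The 24-hour threshold, the function names, and the idea of judging freshness by file modification time are all assumptions made for this example, not a prescribed procedure; a real maintenance plan would set its own limits and would also verify that the backups can actually be restored.

```python
import os
import time

# Hypothetical threshold: treat a backup older than 24 hours as stale.
MAX_BACKUP_AGE_SECONDS = 24 * 60 * 60


def backup_is_fresh(path, max_age_seconds=MAX_BACKUP_AGE_SECONDS, now=None):
    """Return True if the file at `path` exists and was modified recently enough."""
    if not os.path.isfile(path):
        return False
    now = time.time() if now is None else now
    age = now - os.path.getmtime(path)
    return age <= max_age_seconds


def run_sanity_checks(backup_paths):
    """Return the backup paths that are missing or stale."""
    return [p for p in backup_paths if not backup_is_fresh(p)]
```

Run on a schedule against the list of backup locations, a check like this returns an empty list when all is well, and anything else is a reason to notify the staff responsible.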

The second part is having a plan for when things go wrong. In my experience, trying to plan for every possible system failure is nearly impossible. Things often come out of left field at the most inconvenient times. Still, here are a few common-sense things you should have in place.

  • List of all vendors and their primary and secondary contacts. This should include both normal support and emergency after-hours support.
  • List of all key customers and their contacts. Owning up to issues with customers may seem counterintuitive on the surface, but it is actually good business. The guideline: if the issue is going to affect their business, it will affect yours. Having messages for the media and customers crafted ahead of time can make this part of the plan much smoother and prevent mistakes that cause future publicity and legal headaches.
  • List of staff members’ emergency contact information. Mailing lists of SMS (pager) addresses can be helpful.
  • Access to a current list of system passwords and other key information for recovery, such as locations of backups.
  • If the budget allows, have spare equipment ready.
  • Remote access and remote backup. These can be key to recovering data quickly, especially after hours.
  • Legal and financial safeguards with respect to the service level expectations of clients. Also consider how these relate to safety, security of data, etc.

Nobody likes disaster. It makes for great headlines and stories, but the best stories are the ones that never get written. These are the stories of emergencies that are handled before they get out of hand by a dedicated team with the right information and tools.
