Single Point of Failure

I keep revisiting old articles, not necessarily because I’ve run out of things to say, but because I often feel I did not state my case well enough. This mainly happened because I did not have enough practical experience outside of consulting to be able to be relevant. Here’s for a second go at my article on process redundancies and why they can make sense.

Defining a single point of failure

A single point of failure is a point in a system (such as a (manual) process) which, if it fails, will bring the whole system down. Not all redundancies were created because people felt they could afford to add an additional, non value added layer. Rather, redundancy in systems were created to avoid single point of failure.

Of course, during crisis times, process efficiency reviews such as Process Impact Analyses (PIA’s) are likely to identify redundant activities. The subsequent reengineering will lead to significant optimizations in process efficiency. However, while aiming to reduce the unnecessary redudancy in systems there are a lot of reengineering projects which overshoot.

The butcher’s apprentice

Large scale reengineering projects sometimes cut indiscriminately through processes and systems. In my experience, especially projects which originate out of a dire need to reduce expenditure will have overall expenditure reduction targets that are most easily applied across the board. That this approach often makes no sense at all is never considered. These projects result in the creation of single points of failure in processes where prior to the reengineering effort a necessary redundancy existed. Indiscriminate cutting into a process is a bit like being a butcher’s apprentice … mistakes by cutting the wrong piece of the meat can be quite costly.

And let’s face the hard truth we failed to face so often in the past: our people (who are responsible for doing the process work) are expensive to make mistakes with. Especially, but not only if these people are good in what they do.

It gets worse. Given these reengineering exercises are most often executed by external parties, those mistakes are mostly made by young, ambitious but necessarily inexperienced consultants with little to no real life process background.

Is there a way to avoid this? Can you detect and fix a single point of failure?

Detecting single points of failure

To detect a single point of failure, it’s important to take the time and talk to the people that execute the process every single day. They’re pretty much the only ones that are aware how the process actually functions. And talking, preceeded by gaining the trust of those people and earning that trust is essential. Not only because you may be surprised about the amount of process knowledge that was not captured in the process documentation. No, quite often these people are the only ones that are aware of the existence of the single point of failure and all possible implications that brings with it. Even better, they can quite likely help you avoid it. There are other ways, which I list below, but none as powerful as talking to people.

Alternative or additional actions

The following alternative but most often additional actions can help you in your search for single points of failure. Note that these require a significant investment in time and means to properly execute. In my opinion, only coupled with in depth discussion with process owners and operators will they yield the most relevant results. You can:

  1. Map the process: understanding the process and the interactions in the process is a powerful way to understand what is actually going on. Again, do not base yourself on procedural descriptions. Rather, interview people, monitor process execution and look at what’s really there. You may be able to identify your single points of failure in the process itself;
  2. Identify choke points during a walk-through or based on process indicators: quite often, a choke point where backlog piles up is a good indicator of a single point of failure. Process indicators which are often indicative of a problem in a process will be a relevant factor as well.

After identification, you need to solve the issue

As I stated above, a significant number of reengineering projects stripped operations to the bare bones, only to find out that because of the existence of single points of failure the process completeness, timeliness and accuracy were largely jeopardized.

That’s why I dare you to create or maintain the necessary redundancy in the process. This is often a cost benefit decision. In case the single point of failure risks risks collapsing or significantly disturbing the process, creating a redundancy in your process is not overspending. Rather, it makes good business sense if the cost of failure is significantly higher than the cost of the redundancy, over time.