By Robert J. Latino, CEO, Reliability Center, Inc.
The Swiss Cheese Model
Many are familiar with Dr. James Reason’s Swiss Cheese Model (see Figure 1) from his text ‘Managing the Risks of Organizational Accidents’ (Reason, J. 1997. Managing the Risks of Organizational Accidents. Burlington, VT: Ashgate Publishing Company).
This is a very effective graphical representation that tends to stick in people’s minds when it comes to safety systems and why bad things happen to good people (it also made the ‘cheese head’ fans of the Green Bay Packers very happy).
Breaking down the ‘Cheese’
In this graphical expression, each slice of Swiss cheese represents a defense mechanism or a safety system in our workplaces. Such systems include how we:
- communicate with each other
- train our employees
- purchase quality products to maintain our operations
- operate and maintain our processes via policies and procedures
- manage our workforce via our human resources department
- keep our people safe via our environmental, health & safety (EH&S) policies and procedures
We will refer to these collectively as our overall ‘management systems’.
As long as humans are involved in running our organizations, there will be holes (vulnerabilities) in our cheese. This is why I state that we will never have American cheese (no holes) representing our safety systems, because this would imply they are failsafe and impenetrable.
The Swiss cheese metaphor is more appropriate because we as human beings are NOT infallible. We are not perfect and we must acknowledge that in order to progress.
The holes in the Swiss cheese represent vulnerabilities that our safety systems are subject to. Many of these vulnerabilities are identified through our risk assessments such as Failure Modes and Effects Analysis (FMEA). This is where we seek to quantify risk by assessing our potential failure modes in a system via the following simple calculation:
Probability (P) x Severity (S) = Criticality (or Risk Prioritization Number [RPN])
The RPN allows us to identify the magnitude of a risk, which then becomes a hole in our cheese. The diameter of the hole is proportional to the magnitude of the risk.
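As a minimal sketch of the calculation above, the Python below scores a few failure modes and ranks them from largest hole to smallest. The `FailureMode` class, the `rank_by_rpn` helper, the 1-10 rating scale and every rating shown are hypothetical, chosen only for illustration; they are not taken from a real FMEA.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    name: str
    probability: int  # P, on an assumed 1-10 rating scale
    severity: int     # S, on an assumed 1-10 rating scale

    @property
    def rpn(self) -> int:
        # Criticality (RPN) = Probability x Severity
        return self.probability * self.severity


def rank_by_rpn(modes):
    """Sort failure modes from the largest hole (highest RPN) to the smallest."""
    return sorted(modes, key=lambda m: m.rpn, reverse=True)


# Hypothetical ratings, for illustration only
modes = [
    FailureMode("gauge misread", probability=6, severity=3),
    FailureMode("pump bearing failure", probability=4, severity=9),
    FailureMode("seal leak", probability=2, severity=5),
]

for mode in rank_by_rpn(modes):
    print(f"{mode.name}: RPN = {mode.rpn}")
```

The ranking, not the absolute number, is what matters here: it tells us which holes deserve attention first.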
Given this knowledge, we essentially have three (3) ways in which we can manage the risks of our safety systems:
- Minimize the number of holes in our cheese – eliminate identified risks where possible, via human error reduction strategies such as error-proofing or mistake-proofing.
- Minimize the diameter of the remaining holes in our cheese – where risks are unacceptably high, conduct proactive Root Cause Analyses (RCA) on why the risks are so high. This may not eliminate a risk, but strategies can be employed to reduce the risks to an acceptable level.
- Ensure the remaining holes in each slice of cheese do not line up on any given day – if we know where our vulnerabilities are in our systems, then we can better manage those systems. We can be proactive by ensuring those vulnerabilities (holes) are not permitted to sync up on any given day.
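To see why shrinking holes pays off, here is a rough sketch of the three strategies above. It assumes, purely for illustration, that each slice has some daily chance of an exposed hole and that the slices fail independently of one another, which real defenses rarely do. The layer names, the probabilities and the `alignment_probability` helper are all hypothetical.

```python
import math


def alignment_probability(layer_gaps):
    """Chance that every slice has an open hole on the same day.

    Assumes the slices fail independently (a simplifying assumption),
    so the individual chances are simply multiplied together.
    """
    return math.prod(layer_gaps)


# Hypothetical daily chance that each defense has an exposed vulnerability
before = {"training": 0.10, "procedures": 0.20, "tools": 0.10, "oversight": 0.30}

# The same slices after shrinking two of the holes
after = dict(before, training=0.05, procedures=0.05)

print(f"before: {alignment_probability(before.values()):.6f}")
print(f"after:  {alignment_probability(after.values()):.6f}")
```

With these made-up numbers, shrinking just two holes cuts the chance of all four lining up by a factor of eight. Eliminating a hole entirely would drive that slice's value, and therefore the product, to zero.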
So let’s move away from the Swiss cheese metaphor and apply it to a real-world example.
Summary Case Background: A production process is unexpectedly shut down, stopping all operations. A bearing failure in a critical pump is determined to be the primary culprit.
An in-depth Root Cause Analysis (RCA) into the failure reveals the following:
- Physical Root Cause(s) – metallurgical evidence concludes the bearing failed due to mechanically induced fatigue. Drilling down further in our RCA, we ask “How could we have had mechanically induced fatigue?”
- Human Root Cause(s) – interviews, review of HR and training records and observation of alignment techniques reveal the mechanic was not properly conducting the alignments. Because we are at a human level (decision-maker level), we switch our questioning to ‘why?’ “Why would the mechanic not properly align the critical pump?”
- Latent Root Cause(s) – evidence described above concludes that:
  - The outdated reverse-indicator alignment tools the mechanic was using were worn and therefore inaccurate.
  - The mechanic was not properly trained in alignment. A more senior mechanic had recently retired, and this mechanic was given his alignment responsibilities with no additional training, so he relied on his past on-the-job training (OJT), which was not adequate.
  - The alignment procedures in place were obsolete for the updated process. Therefore, even if the mechanic had followed them, the results would have been similar.
  - There was less than adequate (LTA) management oversight. It was a supervisory responsibility to know the qualifications of personnel. Supervisors should have observed the mechanic’s alignment techniques, especially after he had to assume the role of a retiree. Supervisors should also have been aware that the applicable procedures were obsolete and that the mechanic needed additional training to be proficient at his new tasks.
Note: These terms and the RCA process are further described in our last article, ‘Mistakes Were Made, But Not By Me….Facing the Mirror’.
So in Figure 1, we can now see how the holes in the cheese lined up on that day, from the planted seeds of the failure (management system flaws) all the way through to the undesirable outcome of impacting production.
While this is a manufacturing example, look at it from a systems perspective. The same cause-and-effect processes happen every day, wherever each of us works. Awareness of our surroundings is the greatest defense we can have to prevent the holes in the Swiss cheese from lining up!
For another example of this root system related to a customer complaint situation at a manufacturing plant, please view this short video case study – “Root Cause Analysis on Customer Complaints”.
Remember, “We NEVER seem to have the time and budget to do things right, but we ALWAYS seem to have the time and budget to do them again!”
Robert J. Latino is CEO of Reliability Center, Inc. Mr. Latino has been a practitioner, trainer, author and international speaker on the topics of Reliability and Root Cause Analysis for over 30 years. He can be contacted at 800/457-0645 or firstname.lastname@example.org. Visit our website at www.reliability.com to learn more.