For some time now, Inland Steel has been committed to improving the reliability of our plant operations as a way to reduce costs and downtime and to increase productivity. Recognizing that substantial savings could be achieved by reducing or eliminating chronic system and equipment failures, the company provided extensive training in Root Cause Failure Analysis (RCFA) to 50 reliability engineers and more than 150 field employees.
The RCFA training, conducted by Reliability Center, Inc. (RCI), emphasizes the importance of approaching failure problems in a systematic, logical process and provides a proven methodology to identify, analyze and verify the underlying root causes of these recurring failures.
The classroom training provided a valuable base of knowledge and skills but, as the RCI consulting team pointed out many times, the proverbial “proof of the pudding” lies in applying these skills in actual workplace situations.
We recently had an opportunity to do just that, and the results were immensely gratifying, not only in measurable savings for the company but also in the satisfaction that comes from tackling a problem and really solving it.
We had experienced nine catastrophic failures of our lance carriage assemblies in the Basic Oxygen Furnaces (BOF) at an approximate cost of $250,000 each. The Mean Time Between Failures (MTBF) of these occurrences was 2.5 months.
For those unfamiliar with the industry, the BOFs are the steelmaking operations in a steel mill. The lance carriage assembly is a crane weighing approximately 11 tons that raises and lowers an oxygen lance about 80 feet in and out of the steelmaking vessel. The lance itself is a mechanism that blows pure oxygen into the vessel at Mach 2 speed, providing the agitation to mix the “recipe” according to customer specifications.
The recurring failure of the lance carriage assembly presented an ideal opportunity to use RCFA in the field. In its classroom training, RCI teaches a five-step PROACT® methodology to address failures. The acronym stands for:
PReserving Failure Data
Ordering the Analysis
Analyzing the Data
Communicating Findings and Recommendations
Tracking to Ensure Results
As we had been told in our training classes, RCI’s PROACT® methodology compelled the failure analysis team to arrive at logical, verifiable, fact-based conclusions rather than solutions based on guesswork, hunches or “conventional wisdom.” Because of this, the team had great confidence in its analysis work and a high degree of certainty that its findings and recommendations would solve the problem.
Preserve Failure Data
We really took this part of the training to heart and decided to apply it to the lance carriage assembly failure. Our consultants from RCI provided assistance to ensure we were on track at all times and not acting on assumptions. Our first action was to make sure we collected all the information related to the failure. All data and parts were preserved, labeled, logged and analyzed.
Order the Analysis
The next step was to make sure the team we assembled to analyze this failure was multi-disciplined. Operations, maintenance and technical representatives were included as members of the team.
The Principal Analyst had an electrical background although the failure appeared to have mechanical, physical roots. His background allowed him to be unbiased and to facilitate the analysis process rather than guide the team to a pre-conceived conclusion.
Analyze the Data
The team was faced next with taking the pieces of the puzzle (using the failure information we had gathered) and putting them all together. We used what RCI calls a “Logic Tree” to do this. The Logic Tree promotes a disciplined, logical deduction process that forces the team to work backwards from failure to physical and latent root causes.
Hypotheses were constantly developed as to how a preceding event could have occurred. Once all possibilities were identified, strategies had to be developed to verify whether these events did or did not happen. This is where the failure information we had so carefully gathered really proved its worth. Without this data, it would not have been possible to verify our hypotheses.
The only verification tool we could not use was “conventional wisdom.” Only hypotheses which could be verified as true based on the data were kept and followed.
The team kept driving or, as we called it, “Deep Drilling,” until we drove the tree down to its basic roots. Two questions were asked repeatedly until all the roots were uncovered: first, “How can that happen?” and second, “Why did it happen?” By doing this, we were able to uncover not only the physical root causes of the failure but also the human and organizational root causes. It became clear that certain management or organizational systems (improper hug nut specifications, improper torquing and alignment procedures, lack of parts inventory for new equipment) were influencing field employees in making decisions that led to failure.
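For readers who think in code, the deduction process described above can be pictured as a small tree structure. The sketch below is only an illustration of the idea, not RCI's Logic Tree tool or the tree our team actually built: the `Node` class, the `verified_roots` function and the placeholder hypotheses are all hypothetical names, and the only real item is the torquing procedure root mentioned in the text.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """One block in a logic tree: an event, a hypothesis, or a root cause."""
    description: str
    # True = verified by the preserved evidence, False = disproved,
    # None = not yet checked. "Conventional wisdom" never sets this to True.
    verified: Optional[bool] = None
    # For leaf roots only: "physical", "human" or "organizational" (assumed labels).
    root_type: Optional[str] = None
    children: List["Node"] = field(default_factory=list)

    def add(self, child: "Node") -> "Node":
        """Attach an answer to 'how can that happen?' beneath this node."""
        self.children.append(child)
        return child


def verified_roots(node: Node) -> List[Node]:
    """Return the leaves reached only through hypotheses verified as true."""
    if node.verified is not True:
        return []  # disproved or unchecked branches are not followed
    if not node.children:
        return [node]
    roots: List[Node] = []
    for child in node.children:
        roots.extend(verified_roots(child))
    return roots


# Illustrative tree: the placeholders stand in for real hypotheses.
event = Node("lance carriage assembly failure", verified=True)
branch_a = event.add(Node("placeholder hypothesis A", verified=True))
branch_a.add(Node("improper torquing procedure", verified=True,
                  root_type="organizational"))
branch_a.add(Node("placeholder hypothesis C"))            # not yet checked
event.add(Node("placeholder hypothesis B", verified=False))  # disproved

for root in verified_roots(event):
    print(root.root_type, "-", root.description)
```

Only the branch backed by evidence at every level survives, which mirrors the rule that unverified hypotheses were dropped rather than followed.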
Communicate Findings and Recommendations
When the RCFA process was completed, the solutions to the failure were apparent. The next step was to present our findings and recommendations in a way that would encourage actions to correct the problem.
Our team had developed recommendations based on consensus and what we felt management would accept. On the advice of our RCI consultants, we reviewed our recommendations a second time because they represented what we, as a team, felt would be a permanent fix. We were encouraged to reject our subjectivity and stick to objective solutions based on root causes.
With the assistance of RCI, our final presentation to management was designed to convey the information from the RCFA in the most effective manner possible. It was totally electronic with pertinent photographs digitized and viewed on a screen. Appropriate parts were passed around so management could see firsthand what the slides depicted.
The meeting was a total success. Management appreciated the level of detail and logical approach we provided. The recommendations were accepted, long-term plans were developed, responsibilities assigned and timetables set.
Track to Ensure Results
We all realized that, despite good intentions and consensus on a course of action, the battle was not won until we had a bottom-line performance indicator that validated the correctness of our analysis. Our indicator was the MTBF of lance drops (previously 2.5 months).
The immediate actions were to realign the equipment properly to precision specifications and do some of the mechanical fixes that would address the physical roots. Because of perceived higher priority items, we did not address the organizational recommendations (providing alignment training, developing a torquing procedure, clearing the air between central and assigned maintenance and inspection responsibilities). These inexpensive human solutions are sometimes the most difficult to implement.
However, by making the changes we did regarding the physical roots, we did not experience another lance drop for 10 months, resulting in a savings of approximately $1.15 million compared to past performance.
Especially noteworthy, when we performed a “mini” RCFA on the last lance drop, we found that 10 of the 11 contributors to this failure were the same roots identified in the initial RCFA. Because we delayed action on these recommendations, we incurred another failure. This event certainly reinforced to us that prompt follow-through on all the recommendations of the RCFA was vital. At the moment, our MTBF has improved from 2.5 months to 10 months and continues to increase month by month.
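A rough way to sanity-check a savings figure of this kind is to count the failures the old MTBF would have predicted over the failure-free period and multiply by the cost per failure. The sketch below is an illustration only, not the calculation our team used: the function names are hypothetical and it assumes failures arrive at a constant average rate. With the figures quoted in this article it gives about $1.0 million, the same order as the approximately $1.15 million reported; the article does not itemize what else went into that figure.

```python
def expected_failures(months: float, mtbf_months: float) -> float:
    """Failures expected over a period, assuming a constant MTBF."""
    return months / mtbf_months


def avoided_cost(months_without_failure: float,
                 baseline_mtbf_months: float,
                 cost_per_failure: float) -> float:
    """Cost of the failures the old MTBF would have predicted."""
    failures = expected_failures(months_without_failure, baseline_mtbf_months)
    return failures * cost_per_failure


# Figures from the article: 10 months without a lance drop,
# a previous MTBF of 2.5 months, roughly $250,000 per failure.
estimate = avoided_cost(10, 2.5, 250_000)
print(f"Failures avoided: {expected_failures(10, 2.5):.1f}")
print(f"Estimated avoided cost: ${estimate:,.0f}")
```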
Management Support–Key to Success
Without the strong support of management, results such as we were able to achieve would be impossible. That support must be active and include, among other things, providing the principal analysts and their teams with the time to meet and analyze failure information. Managers must also provide any technical, field or administrative resources required to verify hypotheses. In today’s re-engineered environment, this is a difficult commitment for any manager to make.
In our particular case, #4BOF Manager Larry Coe and Maintenance Section Manager Dean Schramm exhibited enormous courage and foresight by authorizing the RCFA, supporting it and seeing it through to completion. Had they not taken the risk, our success could never have been realized.
The Larger Impact of RCFA
When we identified organizational roots (the systems or lack of systems by which the organization runs), we were able to leverage the benefits of analyses to other organizational units. For example, when we confirmed that we did not know how to properly align our equipment, we checked other areas and found that the problem was widespread. When we found that we did not have an effective torquing procedure, we checked other areas and found the same situation.
Many recommendations from the RCFA on the lance carriage assembly failure were implemented site-wide. Perhaps the greatest benefit of the entire process has been that other areas learned from our failure and were able to avoid failures of their own.
We also learned another important lesson. When we have a failure now, we do not “jump right in and try to fix it.” We take a step back and perform a mini-RCFA, including the collection of appropriate failure information. We use this information to develop countermeasures and implement them. The results are fewer delays, longer MTBFs and more reliable production.