Root Cause Analysis: Putting It to Work for You
Investing the time and effort to clearly define the problem that needs to be solved is a critical step in getting to the true root cause
In writing about how to approach the critical challenges of his day, H.L. Mencken said, "There is always a well-known solution to every human problem: neat, plausible, and wrong."
Mencken's words from 1920 still ring true today. To develop truly effective solutions, the first step is to properly understand the problem one is trying to solve. Unfortunately, this is not always as straightforward as it might seem. In the context of food safety assurance in a production environment, a superficial analysis of a problem (a contamination issue, a compliance failure, or some other shortfall) may result in a work team mistakenly developing a solution that does not address the real source of the trouble. This "fix that doesn't fix it" represents lost time and effort, and worse, a problem that is still a problem.
At the 2023 Food Safety Summit, a dynamic workshop brought together leading experts to introduce the concepts, methods, and tools of root cause analysis (RCA). In that hands-on session, participants learned how to dig into systems, how to sort the meaningful data from the distractions, how to identify the root causes of performance issues, and how to develop impactful, efficient solutions that prevent problems from recurring. In this article, we will explore these core aspects of RCA, provide RCA tools for food producers and processors to understand and apply, and share links to RCA resources and further reading.
Why Root Cause Analysis Is Important
Teams often approach a problem with a bias for what is believed to be the best solution. This is typically based on the most visible issues, which are likely to be only symptoms of deeper problems. Sometimes, the implemented solution can provide a temporary fix, especially if targeted toward the symptoms of a nonconformity, misleading the team about the true efficacy of the bandage they have applied. If the team addresses only the immediately visible symptoms, then the core of the problem, or true root cause, may remain and the symptoms will likely recur.
The root cause is the core of the issue: the underlying cause of a problem and the ultimate reason behind its symptoms. Ineffective solutions can result when the problem statement does not accurately define what needs to be solved. In an effort to be comprehensive, problem statements sometimes circumvent the RCA process. Such problem statements may target the symptoms rather than the true root cause.
When a nonconformance is reported but the true root cause is not identified, the ongoing impact is seen in plant, process, and people efficiencies. For example, if a process change is not properly identified and training ensues, then un-training and retraining will be required. This will cost valuable time and resources and likely cause chaos and confusion. For these reasons, investing the time and effort to clearly define the problem that needs to be solved is a critical step in getting to the true root cause.
Regulatory authorities expect that firms evaluate failures and near-misses and take preventive or corrective actions to prevent recurrence. While the U.S. Food and Drug Administration's (FDA's) Food Safety Modernization Act (FSMA) Human Foods Rule does not explicitly require root cause analysis, it does require that a food operator take steps to understand the reason for a failure of a process, allergen, or sanitation control in order to prevent recurrence. For regulators, the execution and documentation of this process provides confidence that a food operator's food safety management system is working effectively. Evidence of recurring root causes or potential causal factors across a region, product type, or category can highlight the need for specific preventive actions, such as research, guidance development, training, or even the development of a multi-stakeholder program for prevention.
Root Cause Analysis Tools: Choosing and Using the Right Ones
A variety of RCA tools can be used to investigate a problem and narrow down that investigation to the root cause(s) and/or possible contributing factors. Each tool has its advantages and disadvantages, and each is used in very different ways. What the tools have in common is that they are all used to evaluate investigation information and data, and they all ensure that the RCA investigation was systematically performed and adequately documented.
When a problem occurs and RCA must be performed, the investigating team may have three areas of RCA to pursue:
- Why did the process fail?
- After the process failed, why did the food safety and quality (FSQ) systems fail, allowing the issue to move forward?
- After it moved forward, why did management fail to identify the issue, allowing it to continue unidentified?
No matter what tool is used to conduct the RCA, some important foundations need to be in place to lay the grounds for success:
- Corrective and preventive action (CAPA) process: A detailed, logic-driven process that incorporates RCA tools to investigate problems, identify causes, define corrective actions, and prevent recurrence. A robust CAPA may be sufficient to ensure proper operations.
- Team approach: An RCA by a team of problem-solvers, including those who have a practical understanding of the process being evaluated, provides better results than the work of any single person. Note: It is essential to make sure the entire team has had training on the RCA tool(s) to be used so that each member can contribute more fully during the RCA process.
- Frontline engagement: The frontline is where the process lives! Getting frontline input provides insight on the actual operation of the farm, facility, or process, including challenges. It also builds buy-in for effective RCA and helps establish the ownership of quality where it belongs—with those who operate the process.
- Focus on the process: Corrective actions must be focused on building the capability of the process and not placing blame on people. Citing "human error" as the root cause is simplistic and misleading. Modern RCA regards this as unacceptable.
- Escapes: When nonconformances escape to the customer or marketplace, it becomes necessary to perform RCA on controls, testing, and inspection to address and prevent the recurrence of escapes.
- System contribution: FSQ systems are designed to manage risks and prevent failures during everyday operations. When similar issues keep arising, or are seen in various areas of the operations, it becomes necessary to also find root cause within the system itself.
- Human factors: Beyond the easy cause of "human error," this approach branches out and drills down into the process conditions that contributed to that "error."
A brief overview of some common RCA tools follows. While each of these tools can seem easy and simple at first glance, each can be much harder to use properly in the absence of training or a facilitator who can help the team use the tool correctly and effectively. Outside facilitators help the RCA team follow the evidence and avoid harmful groupthink. At any given company, an RCA team may use several of these tools at different points in the investigation. To that end, facilitators also help provide a "sense-check" during the RCA to make sure the team is using the right tool(s) and going down the right path. Depending upon how questions are being asked and answered, different answers may be reached even with "correct" application of the tool.
"Five Whys" Analysis. This tool is a pure investigation tool. At each stage of this linear process, the team repeatedly asks, "Why?" and digs deeper and deeper into causes. (Example: "The product went bad." "Why?" "Because the freezer went out of specification." "Why?" "Because of a lack of preventive maintenance." "Why was the preventive maintenance not performed?" And so on.) The Five Whys approach was first developed at Toyota and spread around the world as a "laser-like" tool to root out the primary cause and contributing factors of a problem. Its name reflects that the tool involves drilling down from a problem statement, through various symptoms, until a root cause is uncovered that involves one or more areas of the industrial conditions, processes, and/or environment—i.e., the Five Ms & E (see Table 1).
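The linear structure of a Five Whys record can be sketched in a few lines of Python. This is only an illustration of the freezer example above, not a prescribed format; the class name `WhyChain` and its methods are hypothetical names chosen for this sketch.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class WhyChain:
    """A linear Five Whys record: a problem statement plus ordered answers."""
    problem: str
    answers: List[str] = field(default_factory=list)

    def ask_why(self, answer: str) -> None:
        # Each recorded answer becomes the subject of the next "Why?"
        self.answers.append(answer)

    def candidate_root_cause(self) -> str:
        # The deepest answer so far. It remains a candidate until the
        # team validates it with evidence.
        return self.answers[-1] if self.answers else self.problem

    def render(self) -> str:
        # Indent each answer one level deeper to show the drill-down.
        lines = [f"Problem: {self.problem}"]
        for depth, answer in enumerate(self.answers, start=1):
            lines.append(f"{'  ' * depth}Why #{depth}: {answer}")
        return "\n".join(lines)


chain = WhyChain("The product went bad.")
chain.ask_why("The freezer went out of specification.")
chain.ask_why("Preventive maintenance was not performed.")
print(chain.render())
```

Keeping the answers in order preserves the reasoning path, so the team can later show how it moved from the symptom to the candidate root cause.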
Fishbone (Cause–Effect) Diagram. This tool (Figure 1) is named after its shape, which resembles a filleted fish. The problem statement is the "effect." It is placed at the head of the fish. The Five Ms & E are the main cause areas to explore, and each has its own main bone that comes off of the horizontal spine. Unlike the Five Whys tool, the fishbone diagram is a brainstorming tool that encourages lateral thinking. The outcome of this exercise is a quantity of possible causes that must be investigated to determine what, if anything, they contributed to the occurrence of the problem. In other words, the fishbone diagram throws a "wide net" over the problem, pulls in all of the details, then looks for the "keepers" and tosses the rest out.
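As a rough sketch, a fishbone diagram can be held as a simple mapping from each main bone to its brainstormed causes. The category names below are an assumption (the commonly cited Man, Machine, Method, Material, Measurement, and Environment), and the example causes are hypothetical; a team should substitute the categories it actually uses.

```python
# Assumed category names: the commonly cited Five Ms & E.
FIVE_MS_AND_E = ("Man", "Machine", "Method", "Material", "Measurement", "Environment")


def build_fishbone(effect, brainstormed):
    """Group (category, cause) pairs under each main bone of the diagram."""
    bones = {category: [] for category in FIVE_MS_AND_E}
    for category, cause in brainstormed:
        if category not in bones:
            raise ValueError(f"Unknown category: {category}")
        bones[category].append(cause)
    return {"effect": effect, "bones": bones}


def open_items(fishbone):
    """List every brainstormed cause; each one still needs investigation."""
    return [
        (category, cause)
        for category, causes in fishbone["bones"].items()
        for cause in causes
    ]


# Hypothetical brainstorming output for illustration only.
diagram = build_fishbone(
    "Freezer out of specification",
    [
        ("Machine", "Worn compressor"),
        ("Method", "No preventive maintenance schedule"),
        ("Measurement", "Temperature probe not calibrated"),
    ],
)
for category, cause in open_items(diagram):
    print(f"{category}: {cause}")
```

The list returned by `open_items` is the "wide net": every entry is only a possibility until the investigation either keeps it or tosses it out.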
Cause–Logic Map. This tool is a blend of Five Whys (linear) and formal fault tree analysis (branching). It is a systematic, deductive approach that begins with a general conclusion and works backward to potential causes. It allows the team to depart from the traditional Five Ms & E categories, and instead define the categories that make the most sense given the nature of the problem. One advantage of this tool is that it has a built-in means to document the investigation at every stage. A related tool, the cause–effect matrix, blends the Five Whys tool with the fishbone diagram, except in an easier (matrix) format than that of the traditional fishbone diagram.
Is/Is Not Diagram. Instead of linear or branching analysis, this tool uses dichotomous thinking—establishing insight through exploring opposites or opposing views. The "Is" column lists where the problem does exist, and the "Is Not" column lists where the problem does not exist. These are listed in paired sets, such as: "It Is happening on weekdays, but it Is Not happening on weekends." Once the diagram is built, the investigations focus on seeking why each dichotomy is occurring and what that insight might indicate regarding possible root cause(s) and contributing factors.
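The paired structure of an Is/Is Not diagram lends itself to a small sketch in which each pair generates the question the team must then investigate. The names `Contrast` and `contrast_questions` are hypothetical, and the second pair is an invented example; only the weekday/weekend pair comes from the text above.

```python
from typing import List, NamedTuple


class Contrast(NamedTuple):
    """One paired row of an Is/Is Not diagram."""
    dimension: str  # e.g., "When", "Where", "What"
    is_: str        # where the problem does exist
    is_not: str     # where the problem does not exist


def contrast_questions(pairs: List[Contrast]) -> List[str]:
    # Each dichotomy becomes a question for the investigation to answer.
    return [
        f"[{p.dimension}] Why is the problem seen {p.is_} but not {p.is_not}?"
        for p in pairs
    ]


pairs = [
    Contrast("When", "on weekdays", "on weekends"),
    Contrast("Where", "on Line 2", "on Line 1"),  # hypothetical pair
]
for question in contrast_questions(pairs):
    print(question)
```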
Validation and Verification
The real-world considerations of food production require structuring the root cause validation process in a streamlined, efficient manner. The U.S. Food and Drug Administration's (FDA's) Food Safety Modernization Act identifies preventive controls as the method to control food safety risks, only some of which are directly controllable by critical control points (CCPs). Risks managed by preventive controls must have those procedures validated and then verified on an ongoing basis. This model summarizes the process from validation of root cause through verification of the preventive controls.
In U.S. Department of Agriculture (USDA)-regulated establishments, preventive controls are sometimes known as partial quality control programs. Effective systems show validation of the controls as they are implemented, as well as ongoing verification of the application of those controls as they are used. Verification activities can provide information over time that the issue was addressed and the problem does not recur, essentially "validating" that the RCA identified the right root cause(s) or potential causal factors.
A flawed or partial RCA can terminate an issue without the identification of a failed Good Agricultural Practice (GAP), Good Manufacturing Practice (GMP), or process control factor; it reveals what went wrong without recognition of why it went wrong. The effects of the problem are mitigated, but the problem is not solved. In this situation, there is not only a high likelihood of the event being repeated, but the partial RCA also misses the true factor (or multiple factors) involved in the failure. Control of contributing factors without addressing the underlying reason why they were present can result in a repetitive cycle of short-term correction, followed by gradual loss of food safety controls and recurring problems. This cycle costs time, money, and (potentially) reputation and market share. Companies typically go through several stages of "maturity" in implementing RCA, with predictable impacts on outcomes (Table 2).
People- and/or training-related issues are frequently found to be root causes when a full RCA is conducted. The investigation team should be cautious, however, about biasing investigations toward staff nonconformities over process issues, as this may discourage staff from participating in the RCA. Nonetheless, investigations may be incomplete until people-related issues have been at least considered or identified. One purpose of validating the root cause is to cover all issues as to why the failure occurred.
Once a list of potential factors has been identified, the RCA team should test each factor to establish the potential impact it might have on the incident being investigated. This testing must be based on facts and data, relying on a rigorous cycle of hypothesis testing and further data analysis. Next, what is known about how the identified factors had been associated with previous incidents must be reviewed. This information can be used to create a short summary of the contributing factors and categories of potential root causes. The formal tools of RCA—Five Whys, fishbone diagram, cause–effect matrix, and others—should be used to systematically evaluate each possible root cause so as to provide some assurance that the relevant factors have indeed been found.
Validation is a data-driven process that measures the RCA team's success in identifying the true root cause. The validation process measures whether the identified root cause is a controlling factor, or only a contributor to process control. Often, root causes can be addressed by engineering the cause out of the system (e.g., by redesigning a facility or by upgrading, retooling, or eliminating a particular piece of equipment). However, the realities of commercial production can constrain response options to identified hazards. If process or facility redesign to eliminate the risk is not an option, then that risk must be managed.
Validating a preventive control (i.e., showing that it operates reliably, without deviations or failures) is different from validating an RCA process (i.e., the true root cause was identified) (Figure 2). Effective preventive controls must be developed and implemented as presented in FDA's Food Safety Modernization Act rule:1
"Risk-based preventive controls will not give you a 'zero-risk' system for manufacturing, processing, packing, and holding food; rather, risk-based preventive controls are designed to minimize the risk of known or reasonably foreseeable food safety hazards that may cause illness or injury if they are present in the products you produce."
After the preventive controls have been implemented, it is essential to verify and validate that they remain in place and are effective. Validation of the preventive controls proves that they will be effective in controlling the hazard. Verification is the monitoring system that holds the gain over time. The key to validating the effectiveness of the preventive controls is in the measurement system developed to assess process control. Traditional statistical process control (SPC) offers an excellent approach to analyze the data generated. Specifications need to be established, control limits must be calculated, and capability must be measured.
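To make the SPC steps concrete, the following minimal sketch computes control limits and a capability index for individual measurements. It assumes an individuals (X) chart with sigma estimated from the average moving range (divided by the standard constant 1.128), three-sigma limits, and Cpk as the capability measure; a given operation may use a different chart type or index. The temperature readings and specification limits are hypothetical.

```python
from statistics import mean

D2_FOR_N2 = 1.128  # standard control-chart constant for moving ranges of size 2


def individuals_chart_limits(values):
    """Three-sigma limits for an individuals chart, sigma from moving ranges."""
    if len(values) < 2:
        raise ValueError("Need at least two measurements")
    center = mean(values)
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    sigma = mean(moving_ranges) / D2_FOR_N2
    return {
        "lcl": center - 3 * sigma,
        "center": center,
        "ucl": center + 3 * sigma,
        "sigma": sigma,
    }


def cpk(center, sigma, lsl, usl):
    """Capability index: distance to the nearer spec limit in 3-sigma units."""
    if sigma <= 0:
        raise ValueError("Sigma must be positive to compute capability")
    return min(usl - center, center - lsl) / (3 * sigma)


def out_of_control(values, limits):
    """Indices of points that fall outside the control limits."""
    return [
        i for i, v in enumerate(values)
        if v < limits["lcl"] or v > limits["ucl"]
    ]


# Hypothetical cooler temperatures (degrees F) and specification limits.
readings = [36.0, 37.0, 36.5, 35.5, 36.0, 37.5, 36.5, 36.0]
limits = individuals_chart_limits(readings)
print(limits)
print("Cpk:", cpk(limits["center"], limits["sigma"], lsl=33.0, usl=40.0))
print("Out-of-control points:", out_of_control(readings, limits))
```

Control limits come from the process data, while specification limits come from the food safety requirement; keeping the two separate in the code mirrors the distinction between "is the process stable?" and "is the process capable?"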
Verification monitoring confirms that the process is operating as it was when it was first validated. Root cause validation and verification must identify cost-effective control measures that can be feasibly implemented. A root cause culture creates expectations while stressing investigation over settling for process changes that are not the true cause of the failure. Such analyses can provide a data-driven prioritization of improvements and may serve as a powerful justification for needed but potentially difficult-to-implement changes. From the viewpoint of management, additional research into the cost-effectiveness of RCAs helps justify the investment of time and money to investigate food safety risks and mitigate economic exposures.
An RCA is often a retrospective analysis of an event or failure that has occurred in the past. For example, an environmental monitoring result is a lagging indicator of environmental contamination. Information from a consumer complaint, illness report, or customer report may be linked to a production lot(s) that was grown or manufactured at a significant time in the past, particularly if it is a product with a long shelf life. Unless the contamination is widespread or persistent, it may be difficult to identify the source of the contamination. Product or ingredient testing is limited by the availability of the implicated lot(s) and the statistical probability of finding a defective unit, even from a defective lot, if the contamination is unevenly distributed. In many cases, it is not possible for the RCA team to identify a specific root cause; in these cases, the identification of likely causal factors can lead to reinforcement of relevant processes, heightened verification, and the conduct of needed research to better understand the nature or origin of the failure.
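The sampling limitation described above can be illustrated with a simple calculation. The sketch below assumes that contaminated units are randomly distributed through the lot, that samples are independent, and that the test detects every contaminated unit it is given. Under those assumptions, the chance of at least one positive among n samples is 1 - (1 - p)^n, where p is the fraction of contaminated units. Unevenly distributed contamination, as noted above, can make real detection less likely than this estimate. The function names are hypothetical.

```python
import math


def detection_probability(prevalence, n_samples):
    """Chance that at least one of n independent samples is contaminated."""
    if not 0.0 <= prevalence <= 1.0:
        raise ValueError("prevalence must be between 0 and 1")
    return 1.0 - (1.0 - prevalence) ** n_samples


def samples_needed(prevalence, confidence):
    """Smallest n whose detection probability reaches the given confidence."""
    if not 0.0 < prevalence < 1.0 or not 0.0 < confidence < 1.0:
        raise ValueError("prevalence and confidence must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - prevalence))


# Hypothetical lot in which 1% of units are contaminated.
print(detection_probability(0.01, 60))
print(samples_needed(0.01, 0.95))
```

Even under these favorable assumptions, a lot with 1% contaminated units needs roughly 300 samples for a 95% chance of a single positive, which is why a negative test result on a handful of retained samples cannot rule a lot out as the source.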
Successful implementation of RCA and the promotion of a food safety culture go hand in hand. In contrast to a reactive state of constant firefighting (the dreaded "whack-a-mole" approach to problem-solving), the effective identification, elimination, and/or management of root causes supports company values, productivity, and market share. A critical responsibility of food safety and quality management is to facilitate successful RCA processes and deploy preventive controls that allow organizations to live their values and establish a strong food safety culture.
Common Pitfalls and Considerations for Success
One question facing food safety professionals is: How can they ensure the RCA they facilitate will be effective? There are several considerations for success to keep in mind when implementing an RCA process within an organization. Just like food safety culture, a successful RCA process starts with tone at the top. The importance an organization places on a project or process starts with support and commitment from senior leadership. When implementing a formal process for RCA, participation and the quality of the outcome are significantly affected by having support from senior management. Those in senior management may not directly participate in the process, but their awareness and endorsement of the activities will ultimately drive success, particularly in the visible backing they provide when it comes time to implement the necessary preventive controls. What is important to the boss will be important to the organization.
Once senior leadership supports the efforts of the RCA process, it is important to ensure that the correct participants are assembled to kick off the activities. An effective RCA requires input from a variety of different stakeholders, as each will have their own areas of experience and perspective from which they view a problem. Consider the old adage, "if the only tool you have is a hammer, then every problem looks like a nail." If you have only food safety professionals trying to determine a root cause for a failure in operations, then you may miss critical data points or the subtle process nuances that are only meaningful to an operations person. When looking at the same operations failure, engineering or maintenance will approach the problem from a completely different perspective. Having these different perspectives will ensure that the RCA process can be more comprehensive by considering several different factors.
Another consideration for implementing a successful RCA program is to leverage existing processes with which the business is already familiar. RCA is not a process unique to food safety; it can be used to address a variety of issues. Partner with occupational safety and hazardous materials teams and create a single process of investigation to be used any time an incident or near miss occurs. Using the same process to investigate multiple types of failures will provide more opportunities to execute and refine the skill of completing the exercise, as RCA is most effective when used as a routine tool (and not only in a time of crisis). It also simplifies expectations for senior leadership and line management, as they only have to learn and execute a single process.
However, even if you have what you need for success—the support of leadership, a common process adopted across the business, and the right people in the room—there are still some common pitfalls that can derail the effectiveness of an RCA.
One of the most common mistakes made with RCA is not clearly defining the problem at the start. If the problem is not clearly defined from the beginning of the process, regardless of the analysis tools selected, then the RCA team may go down the wrong path and miss the root cause altogether. Once the RCA team is assembled, make sure the group takes the time to collectively discuss and align on a description of the problem that needs to be resolved. This will help ensure that the RCA team is focusing on the correct problem and not just a symptom of a larger issue. Focusing on the symptom may lead to a solution and a corrective action, but the same root cause failure is likely to continue, demanding more time to conduct the same RCA again in the future.
Unfortunately, despite the support of senior leadership and a strong team working on a well-defined problem, the RCA can still go sideways if the process allows personal opinions, turf issues, and preconceived ideas to hold weight instead of allowing the facts to guide the analysis. Everyone will come in with biases on what they think happened. Some will even think that because they have been doing the job for years, they know exactly what happened. It is critical to acknowledge these biases so as to better set them aside. The best RCA teams do not let opinions and biases act as red herrings that lead to addressing only a symptom that is covering up the real issue. It is essential to rely on the RCA process—follow the evidence to come up with a theory instead of starting with a theory and making the evidence fit that scenario.
Final Thoughts
Root cause analysis can no longer be an ad hoc process. Rushing in and "winging it" will waste valuable time and energy, likely leading to failure. More than an after-action discussion or a firefighting attempt, modern RCA requires effective use of tools developed for the purpose. Beginning with the right team, representing key areas of the process and frontline personnel, credible evidence must be identified and documented as the RCA investigation progresses. RCA tools can guide, but the burden falls on the team to think critically and ask the important questions. Is this tool getting us to the root cause(s)? Are we using the tool properly? Can a different tool also help our investigation? Are we missing some aspect of the problem that could help shed light on the root cause? Can we defend our logic to management, regulators, auditors, or our customer? The authors will discuss and share new RCA tools, such as the Go-See-Think-Do tool, and more at the RCA session at the 2024 Food Safety Summit in May.
Production of consistently safe and wholesome food is complex and challenging. Even in the best production environments, problems can arise. With the analytical tools of root cause analysis, workflow teams can systematically cut through the noise and get at the true causes of performance failures. With that clarity, effective and efficient solutions can be developed and deployed, thereby preventing problems from recurring.
Additional Reading
- U.S. Food and Drug Administration (FDA). Draft Guidance for Industry: Hazard Analysis and Risk-Based Preventive Controls for Human Food. September 2023. https://www.fda.gov/regulatory-information/search-fda-guidance-documents/draft-guidance-industry-hazard-analysis-and-risk-based-preventive-controls-human-food.
- FDA. "Corrective and Preventive Actions (CAPA)." Current as of March 28, 2023. https://www.fda.gov/inspections-compliance-enforcement-and-criminal-investigations/inspection-guides/corrective-and-preventive-actions-capa.
- Young, Danielle. "Five Whys Jefferson Memorial Example." YouTube. March 23, 2016. https://www.youtube.com/watch?v=BEQvq99PZwo.
- American Society for Quality (ASQ). "Fishbone Diagram." 2023. https://asq.org/quality-resources/fishbone.
- ASQ. "The 7 Basic Quality Tools for Process Improvement." 2023. https://asq.org/quality-resources/seven-basic-quality-tools.
- Produce Safety Science. "Root Cause Analysis (Part 1: What & Why)." YouTube. September 12, 2023. https://www.youtube.com/watch?v=nJsN8asmx9I.
- Produce Safety Science. "Root Cause Analysis (Part 2: How)." YouTube. October 22, 2023. https://www.youtube.com/watch?v=uwrTpv3sEMM.
Deb Kane, M.S., is the Vice President for Food Safety, Quality, EHSS, and Regulatory for J&J Snack Foods Corp. Ms. Kane is a Certified Instructor for HACCP and ServSafe, and is a subject matter expert in SQF, FSMA, and global standards/quality policies.
John Butts, Ph.D., is the Founder and Principal of FoodSafetyByDesign LLC, a company founded to help producers of high-risk products learn how to prevent and manage food safety risks. Dr. Butts received the NSF Lifetime Achievement Award in 2016 and was inducted into the Meat Industry Hall of Fame in 2020.
Natalie Dyenson, M.P.H., is the Chief Food Safety and Regulatory Officer for the International Fresh Produce Association (IFPA). Ms. Dyenson is an internationally recognized food safety leader in retail, foodservice, and food production in the food and beverage industry. She is a subject matter expert in quality food safety systems, HACCP, and sanitation.
Tim Jackson, Ph.D., is the Senior Science Advisor for Food Safety at the U.S. Food and Drug Administration's Center for Food Safety and Applied Nutrition (FDA CFSAN). Dr. Jackson is a leader in food safety research, regulation, food microbiology, and food safety for global markets, guiding the implementation of Quality Management Systems, FSMA, and the Safe Foods for Canadians Act. Dr. Jackson currently serves as the President of the International Association for Food Protection (IAFP).
Tim King, M.S., M.A., is the Founder of and a Senior Partner with Quality Matters LLC. He is a quality SME and instructor for the American Society for Quality in Milwaukee, Wisconsin. Mr. King specializes in QMS implementations, process improvement teams, and corrective action investigations.
Brendan A. Niemira, Ph.D., is a Lead Scientist for the U.S. Department of Agriculture's Agricultural Research Service (USDA ARS). His research deals with non-thermal food processing technologies such as cold plasma, pulsed light, and novel antimicrobials. Dr. Niemira is a Fellow of the Institute of Food Technologists (IFT) and a past member of the IFT Board of Directors.