Many executives believe that all failure is bad (although it usually provides lessons) and that learning from it is pretty straightforward. The author, a professor at Harvard Business School, thinks both beliefs are misguided. In organizational life, she says, some failures are inevitable and some are even good. And successful learning from failure is not simple: It requires context-specific strategies. But first leaders must understand how the blame game gets in the way and work to create an organizational culture in which employees feel safe admitting or reporting on failure.
Failures fall into three categories: preventable ones in predictable operations, which usually involve deviations from spec; unavoidable ones in complex systems, which may arise from unique combinations of needs, people, and problems; and intelligent ones at the frontier, where “good” failures occur quickly and on a small scale, providing the most valuable information.
Strong leadership can build a learning culture – one in which failures large and small are consistently reported and deeply analyzed, and opportunities to experiment are proactively sought. Executives commonly and understandably worry that taking a sympathetic stance toward failure will create an “anything goes” work environment. They should instead recognize that failure is inevitable in today’s complex work organizations.
The wisdom of learning from failure is incontrovertible. Yet organizations that do it well are extraordinarily rare. This gap is not due to a lack of commitment to learning. Managers in the vast majority of enterprises that I have studied over the past 20 years – pharmaceutical, financial services, product design, telecommunications, and construction companies; hospitals; and NASA’s space shuttle program, among others – genuinely wanted to help their organizations learn from failures to improve future performance. In some cases they and their teams had devoted many hours to after-action reviews, postmortems, and the like. But time after time I saw that these painstaking efforts led to no real change. The reason: Those managers were thinking about failure the wrong way.
Most executives I’ve talked to believe that failure is bad (of course!). They also believe that learning from it is pretty straightforward: Ask people to reflect on what they did wrong and exhort them to avoid similar mistakes in the future – or, better yet, assign a team to review and write a report on what happened and then distribute it throughout the organization.
These widely held beliefs are misguided. First, failure is not always bad. In organizational life it is sometimes bad, sometimes inevitable, and sometimes even good. Second, learning from organizational failures is anything but straightforward. The attitudes and activities required to effectively detect and analyze failures are in short supply in most companies, and the need for context-specific learning strategies is underappreciated. Organizations need new and better ways to go beyond lessons that are superficial (“Procedures weren’t followed”) or self-serving (“The market just wasn’t ready for our great new product”). That means jettisoning old cultural beliefs and stereotypical notions of success and embracing failure’s lessons. Leaders can begin by understanding how the blame game gets in the way.
The Blame Game
Failure and fault are virtually inseparable in most households, organizations, and cultures. Every child learns at some point that admitting failure means taking the blame. That is why so few organizations have shifted to a culture of psychological safety in which the rewards of learning from failure can be fully realized.
Executives I’ve interviewed in organizations as different as hospitals and investment banks admit to being torn: How can they respond constructively to failures without giving rise to an anything-goes attitude? If people aren’t blamed for failures, what will ensure that they try as hard as possible to do their best work?
This concern is based on a false dichotomy. In actuality, a culture that makes it safe to admit and report on failure can – and in some organizational contexts must – coexist with high standards for performance. To understand why, look at the exhibit “A Spectrum of Reasons for Failure,” which lists causes ranging from deliberate deviation to thoughtful experimentation.
Which of these causes involve blameworthy actions? Deliberate deviance, first on the list, obviously warrants blame. But inattention might not. If it results from a lack of effort, perhaps it’s blameworthy. But if it results from fatigue near the end of an overly long shift, the manager who assigned the shift is more at fault than the employee. As we go down the list, it gets more and more difficult to find blameworthy acts. In fact, a failure resulting from thoughtful experimentation that generates valuable information may actually be praiseworthy.
When I ask executives to consider this spectrum and then to estimate how many of the failures in their organizations are truly blameworthy, their answers are usually in single digits – perhaps 2% to 5%. But when I ask how many are treated as blameworthy, they say (after a pause or a laugh) 70% to 90%. The unfortunate consequence is that many failures go unreported and their lessons are lost.
Not All Failures Are Created Equal
A sophisticated understanding of failure’s causes and contexts will help to avoid the blame game and institute an effective strategy for learning from failure. Although an infinite number of things can go wrong in organizations, mistakes fall into three broad categories: preventable, complexity-related, and intelligent.
Preventable Failures In Predictable Operations
Most failures in this category can indeed be considered “bad.” They usually involve deviations from spec in the closely defined processes of high-volume or routine operations in manufacturing and services. With proper training and support, employees can follow those processes consistently. When they don’t, deviance, inattention, or lack of ability is usually the reason. But in such cases, the causes can be readily identified and solutions developed. Checklists (as in the Harvard surgeon Atul Gawande’s recent best seller The Checklist Manifesto) are one solution. Another is the vaunted Toyota Production System, which builds continual learning from tiny failures (small process deviations) into its approach to improvement. As most students of operations know well, a team member on a Toyota assembly line who spots a problem or even a potential problem is encouraged to pull a rope called the andon cord, which immediately initiates a diagnostic and problem-solving process. Production continues unimpeded if the problem can be remedied in less than a minute. Otherwise, production is halted – despite the loss of revenue entailed – until the failure is understood and resolved.
Unavoidable Failures In Complex Systems
A large number of organizational failures are due to the inherent uncertainty of work: A particular combination of needs, people, and problems may have never occurred before. Triaging patients in a hospital emergency room, responding to enemy actions on the battlefield, and running a fast-growing start-up all occur in unpredictable situations. And in complex organizations like aircraft carriers and nuclear power plants, system failure is a perpetual risk.
Although serious failures can be averted by following best practices for safety and risk management, including a thorough analysis of any such events that do occur, small process failures are inevitable. To consider them bad is not just a misunderstanding of how complex systems work; it is counterproductive. Avoiding consequential failures means rapidly identifying and correcting small failures. Most accidents in hospitals result from a series of small failures that went unnoticed and unfortunately lined up in just the wrong way.
Intelligent Failures At The Frontier
Failures in this category can rightly be considered “good,” because they provide valuable new knowledge that can help an organization leap ahead of the competition and ensure its future growth – which is why the Duke University professor of management Sim Sitkin calls them intelligent failures. They occur when experimentation is necessary: when answers are not knowable in advance because this exact situation hasn’t been encountered before and perhaps never will be again. Discovering new drugs, creating a radically new business, designing an innovative product, and testing customer reactions in a brand-new market are tasks that require intelligent failures. “Trial and error” is a common term for the kind of experimentation needed in these settings, but it is a misnomer, because “error” implies that there was a “right” outcome in the first place. At the frontier, the right kind of experimentation produces good failures quickly. Managers who practice it can avoid the unintelligent failure of conducting experiments at a larger scale than necessary.
Leaders of the product design firm IDEO understood this when they launched a new innovation-strategy service. Rather than help clients design new products within their existing lines – a process IDEO had all but perfected – the service would help them create new lines that would take them in novel strategic directions. Knowing that it hadn’t yet figured out how to deliver the service effectively, the company started a small project with a mattress company and didn’t publicly announce the launch of a new business.
Although the project failed – the client did not change its product strategy – IDEO learned from it and figured out what had to be done differently. For instance, it hired team members with MBAs who could better help clients create new businesses and made some of the clients’ managers part of the team. Today strategic innovation services account for more than a third of IDEO’s revenues.
Tolerating unavoidable process failures in complex systems and intelligent failures at the frontiers of knowledge won’t promote mediocrity. Indeed, tolerance is essential for any organization that wishes to extract the knowledge such failures provide. But failure is still inherently emotionally charged; getting an organization to accept it takes leadership.
Building A Learning Culture
Only leaders can create and reinforce a culture that counteracts the blame game and makes people feel both comfortable with and responsible for surfacing and learning from failures. (See the sidebar “How Leaders Can Build a Psychologically Safe Environment.”) They should insist that their organizations develop a clear understanding of what happened – not of “who did it” – when things go wrong. This requires consistently reporting failures, small and large; systematically analyzing them; and proactively searching for opportunities to experiment.
How Leaders Can Build A Psychologically Safe Environment
If an organization’s employees are to help spot existing and pending failures and to learn from them, their leaders must make it safe to speak up. Julie Morath, the chief operating officer of Children’s Hospital and Clinics of Minnesota from 1999 to 2009, did just that when she led a highly successful effort to reduce medical errors. Here are five practices I’ve identified in my research, with examples of how Morath employed them to build a psychologically safe environment.
Frame The Work Accurately
People need a shared understanding of the kinds of failures that can be expected to occur in a given work context (routine production, complex operations, or innovation) and why openness and collaboration are important for surfacing and learning from them. Accurate framing detoxifies failure.
In a complex operation like a hospital, many consequential failures are the result of a series of small events. To heighten awareness of this system complexity, Morath presented data on U.S. medical error rates, organized discussion groups, and built a team of key influencers from throughout the organization to help spread knowledge and understanding of the challenge.
Embrace Messengers
Those who come forward with bad news, questions, concerns, or mistakes should be rewarded rather than shot. Celebrate the value of the news first and then figure out how to fix the failure and learn from it.
Morath implemented “blameless reporting” – an approach that encouraged employees to reveal medical errors and near misses anonymously. Her team created a new patient safety report, which expanded on the previous version by asking employees to describe incidents in their own words and to comment on the possible causes. Soon after the new system was implemented, the rate of reported failures shot up. Morath encouraged her people to view the data as good news, because the hospital could learn from failures – and made sure that teams were assigned to analyze every incident.
Acknowledge Limits
Being open about what you don’t know, mistakes you’ve made, and what you can’t get done alone will encourage others to do the same.
As soon as she joined the hospital, Morath explained her passion for patient safety and acknowledged that as a newcomer, she had only limited knowledge of how things worked at Children’s. In group presentations and one-on-one discussions, she made clear that she would need everyone’s help to reduce errors.
Invite Participation
Ask for observations and ideas and create opportunities for people to detect and analyze failures and promote intelligent experiments. Inviting participation helps defuse resistance and defensiveness.
Morath set up cross-disciplinary teams to analyze failures and personally asked thoughtful questions of employees at all levels. Early on, she invited people to reflect on their recent experiences in caring for patients: Was everything as safe as they would have wanted it to be? This helped them recognize that the hospital had room for improvement. Suddenly, people were lining up to help.
Set Boundaries And Hold People Accountable
Paradoxically, people feel psychologically safer when leaders are clear about what acts are blameworthy. And there must be consequences. But if someone is punished or fired, tell those directly and indirectly affected what happened and why it warranted blame.
When she instituted blameless reporting, Morath explained to employees that although reporting would not be punished, specific behaviors (such as reckless conduct, conscious violation of standards, failing to ask for help when over one’s head) would. If someone makes the same mistake three times and is then laid off, coworkers usually express relief, along with sadness and concern – they understand that patients were at risk and that extra vigilance was required from others to counterbalance the person’s shortcomings.
Leaders should also send the right message about the nature of the work, such as reminding people in R&D, “We’re in the discovery business, and the faster we fail, the faster we’ll succeed.” I have found that managers often don’t understand or appreciate this subtle but crucial point. They also may approach failure in a way that is inappropriate for the context. For example, statistical process control, which uses data analysis to assess unwarranted variances, is not good for catching and correcting random invisible glitches such as software bugs. Nor does it help in the development of creative new products. Conversely, though great scientists intuitively adhere to IDEO’s slogan, “Fail often in order to succeed sooner,” it would hardly promote success in a manufacturing plant.
Often one context or one kind of work dominates the culture of an enterprise and shapes how it treats failure. For instance, automotive companies, with their predictable, high-volume operations, understandably tend to view failure as something that can and should be prevented. But most organizations engage in all three kinds of work discussed above – routine, complex, and frontier. Leaders must ensure that the right approach to learning from failure is applied in each. All organizations learn from failure through three essential activities: detection, analysis, and experimentation.
Detecting Failure
Spotting big, painful, expensive failures is easy. But in many organizations any failure that can be hidden is hidden as long as it’s unlikely to cause immediate or obvious harm. The goal should be to surface it early, before it has mushroomed into disaster.
Shortly after arriving from Boeing to take the reins at Ford, in September 2006, Alan Mulally instituted a new system for detecting failures. He asked managers to color code their reports green for good, yellow for caution, or red for problems – a common management technique. According to a 2009 story in Fortune, at his first few meetings all the managers coded their operations green, to Mulally’s frustration. Reminding them that the company had lost several billion dollars the previous year, he asked straight out, “Isn’t anything not going well?” After one tentative yellow report was made about a serious product defect that would probably delay a launch, Mulally responded to the deathly silence that ensued with applause. After that, the weekly staff meetings were full of color.
That story illustrates a pervasive and fundamental problem: Although many methods of surfacing current and pending failures exist, they are grossly underutilized. Total Quality Management and soliciting feedback from customers are well-known techniques for bringing to light failures in routine operations. High-reliability-organization (HRO) practices help prevent catastrophic failures in complex systems like nuclear power plants through early detection. Electricité de France, which operates 58 nuclear power plants, has been an exemplar in this area: It goes beyond regulatory requirements and religiously tracks each plant for anything even slightly out of the ordinary, immediately investigates whatever turns up, and informs all its other plants of any anomalies.
Such methods are not more widely employed because all too many messengers – even the most senior executives – remain reluctant to convey bad news to bosses and colleagues. One senior executive I know in a large consumer products company had grave reservations about a takeover that was already in the works when he joined the management team. But, overly conscious of his newcomer status, he was silent during discussions in which all the other executives seemed enthusiastic about the plan. Many months later, when the takeover had clearly failed, the team gathered to review what had happened. Aided by a consultant, each executive considered what he or she might have done to contribute to the failure. The newcomer, openly apologetic about his past silence, explained that others’ enthusiasm had made him unwilling to be “the skunk at the picnic.”
In researching errors and other failures in hospitals, I discovered substantial differences across patient-care units in nurses’ willingness to speak up about them. It turned out that the behavior of midlevel managers – how they responded to failures and whether they encouraged open discussion of them, welcomed questions, and displayed humility and curiosity – was the cause. I have seen the same pattern in a wide range of organizations.
A horrific case in point, which I studied for more than two years, is the 2003 breakup of the Columbia space shuttle during reentry, which killed seven astronauts. NASA managers spent some two weeks downplaying the seriousness of a piece of foam’s having broken off the left side of the shuttle at launch. They rejected engineers’ requests to resolve the ambiguity (which could have been done by having a satellite photograph the shuttle or asking the astronauts to conduct a space walk to inspect the area in question), and the major failure went largely undetected until its fatal consequences 16 days later. Ironically, a shared but unsubstantiated belief among program managers that there was little they could do contributed to their inability to detect the failure. Postevent analyses suggested that they might indeed have taken fruitful action. But clearly leaders hadn’t established the necessary culture, systems, and procedures.
One challenge is teaching people in an organization when to declare defeat in an experimental course of action. The human tendency to hope for the best and try to avoid failure at all costs gets in the way, and organizational hierarchies exacerbate it. As a result, failing R&D projects are often kept going much longer than is scientifically rational or economically prudent. We throw good money after bad, praying that we’ll pull a rabbit out of a hat. Intuition may tell engineers or scientists that a project has fatal flaws, but the formal decision to call it a failure may be delayed for months.
Again, the remedy – which does not necessarily involve much time and expense – is to reduce the stigma of failure. Eli Lilly has done this since the early 1990s by holding “failure parties” to honor intelligent, high-quality scientific experiments that fail to achieve the desired results. The parties don’t cost much, and redeploying valuable resources – particularly scientists – to new projects earlier rather than later can save hundreds of thousands of dollars, not to mention kick-start potential new discoveries.
Analyzing Failure
Once a failure has been detected, it’s essential to go beyond the obvious and superficial reasons for it to understand the root causes. This requires the discipline – better yet, the enthusiasm – to use sophisticated analysis to ensure that the right lessons are learned and the right remedies are employed. The job of leaders is to see that their organizations don’t just move on after a failure but stop to dig in and discover the wisdom contained in it.
Why is failure analysis often shortchanged? Because examining our failures in depth is emotionally unpleasant and can chip away at our self-esteem. Left to our own devices, most of us will speed through or avoid failure analysis altogether. Another reason is that analyzing organizational failures requires inquiry and openness, patience, and a tolerance for causal ambiguity. Yet managers typically admire and are rewarded for decisiveness, efficiency, and action – not thoughtful reflection. That is why the right culture is so important.
The challenge is more than emotional; it’s cognitive, too. Even without meaning to, we all favor evidence that supports our existing beliefs rather than alternative explanations. We also tend to downplay our responsibility and place undue blame on external or situational factors when we fail, only to do the reverse when assessing the failures of others – a psychological trap known as the fundamental attribution error.
My research has shown that failure analysis is often limited and ineffective – even in complex organizations like hospitals, where human lives are at stake. Few hospitals systematically analyze medical errors or process flaws in order to capture failure’s lessons. Research in North Carolina hospitals, published in November 2010 in the New England Journal of Medicine, found that despite a dozen years of heightened awareness that medical errors result in thousands of deaths each year, hospitals have not become safer.
Fortunately, there are shining exceptions to this pattern, which continue to provide hope that organizational learning is possible. At Intermountain Healthcare, a system of 23 hospitals that serves Utah and southeastern Idaho, physicians’ deviations from medical protocols are routinely analyzed for opportunities to improve the protocols. Allowing deviations and sharing the data on whether they actually produce a better outcome encourages physicians to buy into this program.
Motivating people to go beyond first-order reasons (procedures weren’t followed) to understanding the second- and third-order reasons can be a major challenge. One way to do this is to use interdisciplinary teams with diverse skills and perspectives. Complex failures in particular are the result of multiple events that occurred in different departments or disciplines or at different levels of the organization. Understanding what happened and how to prevent it from happening again requires detailed, team-based discussion and analysis.
A team of leading physicists, engineers, aviation experts, naval leaders, and even astronauts devoted months to an analysis of the Columbia disaster. They conclusively established not only the first-order cause – a piece of foam had hit the shuttle’s leading edge during launch – but also second-order causes: A rigid hierarchy and schedule-obsessed culture at NASA made it especially difficult for engineers to speak up about anything but the most rock-solid concerns.
Promoting Experimentation
The third critical activity for effective learning is strategically producing failures – in the right places, at the right times – through systematic experimentation. Researchers in basic science know that although the experiments they conduct will occasionally result in a spectacular success, a large percentage of them (70% or higher in some fields) will fail. How do these people get out of bed in the morning? First, they know that failure is not optional in their work; it’s part of being at the leading edge of scientific discovery. Second, far more than most of us, they understand that every failure conveys valuable information, and they’re eager to get it before the competition does.
In contrast, managers in charge of piloting a new product or service – a classic example of experimentation in business – typically do whatever they can to make sure that the pilot is perfect right out of the starting gate. Ironically, this hunger to succeed can later inhibit the success of the official launch. Too often, managers in charge of pilots design optimal conditions rather than representative ones. Thus the pilot doesn’t produce knowledge about what won’t work.
In the very early days of DSL, a major telecommunications company I’ll call Telco did a full-scale launch of that high-speed technology to consumer households in a major urban market. It was an unmitigated customer-service disaster. The company missed 75% of its commitments and found itself confronted with a staggering 12,000 late orders. Customers were frustrated and upset, and service reps couldn’t even begin to answer all their calls. Employee morale suffered. How could this happen to a leading company with high satisfaction ratings and a brand that had long stood for excellence?
A small and extremely successful suburban pilot had lulled Telco executives into a misguided confidence. The problem was that the pilot did not resemble real service conditions: It was staffed with unusually personable, expert service reps and took place in a community of educated, tech-savvy customers. But DSL was a brand-new technology and, unlike traditional telephony, had to interface with customers’ highly variable home computers and technical skills. This added complexity and unpredictability to the service-delivery challenge in ways that Telco had not fully appreciated before the launch.
A more useful pilot at Telco would have tested the technology with limited support, unsophisticated customers, and old computers. It would have been designed to discover everything that could go wrong – instead of proving that under the best of conditions everything would go right. (See the sidebar “Designing Successful Failures.”) Of course, the managers in charge would have to have understood that they were going to be rewarded not for success but, rather, for producing intelligent failures as quickly as possible.
Designing Successful Failures
Perhaps unsurprisingly, pilot projects are usually designed to succeed rather than to produce intelligent failures – those that generate valuable information. To know if you’ve designed a genuinely useful pilot, consider whether your managers can answer yes to the following questions:
- Is the pilot being tested under typical circumstances (rather than optimal conditions)?
- Do the employees, customers, and resources represent the firm’s real operating environment?
- Is the goal of the pilot to learn as much as possible (rather than to demonstrate the value of the proposed offering)?
- Is the goal of learning well understood by all employees and managers?
- Is it clear that compensation and performance reviews are not based on a successful outcome for the pilot?
- Were explicit changes made as a result of the pilot test?
In short, exceptional organizations are those that go beyond detecting and analyzing failures and try to generate intelligent ones for the express purpose of learning and innovating. It’s not that managers in these organizations enjoy failure. But they recognize it as a necessary by-product of experimentation. They also realize that they don’t have to do dramatic experiments with large budgets. Often a small pilot, a dry run of a new technique, or a simulation will suffice.
The courage to confront our own and others’ imperfections is crucial to solving the apparent contradiction of wanting neither to discourage the reporting of problems nor to create an environment in which anything goes. This means that managers must ask employees to be brave and speak up – and must not respond by expressing anger or strong disapproval of what may at first appear to be incompetence. More often than we realize, complex systems are at work behind organizational failures, and their lessons and improvement opportunities are lost when conversation is stifled.
Savvy managers understand the risks of unbridled toughness. They know that their ability to help resolve problems depends on their ability to learn about them. But most managers I’ve encountered in my research, teaching, and consulting work are far more sensitive to a different risk – that an understanding response to failures will simply create a lax work environment in which mistakes multiply.
This common worry should be replaced by a new paradigm – one that recognizes the inevitability of failure in today’s complex work organizations. Those that catch, correct, and learn from failure before others do will succeed. Those that wallow in the blame game will not.
Originally posted on hbr.org by Amy C. Edmondson.
About the Author: Amy C. Edmondson is the Novartis Professor of Leadership and Management at Harvard Business School. Her most recent book is The Fearless Organization (Wiley, 2019).