Learning From Success: A Discussion Paper
"""''' "If the lessons from incidents were as simple as our current approach to incident investigations suggests, why haven’t we learn the lessons already?"
Michael Tooma, Partner Clyde & Co
Contents

Introduction
The limitations of only looking at failure
    The limitations of our traditional approach to looking at what goes wrong for learning
The benefits of a more positive approach
    What can we learn from positive psychology?
    The value of near miss investigation as an examination of success
    Why find out about what went right?
A positive methodology for learning from success
    Near miss/near hit investigations
    Scenario 1 Case Study: Near miss/near hit involving Qantas Flight 32
    Serious Incidents
    Scenario 2 Case Study: Australian Transport Safety Bureau Report – Safe working irregularity following rail track maintenance
    Learning from success during projects
    Tips for exploring success
    Scenario 3 Case Study: The London Summer Olympics
The key to improving at anything is learning. But what are we learning from? So much of the time in safety, the conversation is about learning from what has gone wrong. An incident has occurred. What went wrong, we ask? How do we prevent that incident from re-occurring?
But if that is all we do, are we not missing out on many potentially fruitful lessons?
What happens if we turn this paradigm on its head? Instead of focusing all our attention on failures in safety performance, what if we started to learn from success? What would that look like? There is success in everything we do. Even in failure, there may be successes from which we can learn. While we have a tendency to cherry-pick what went wrong in the lead-up to an incident, the reality is not so simple. Undoubtedly, there will also be a number of things that proceeded according to plan and worked successfully. There is benefit in learning from that.
In this paper, we will explore:
the limitations of only exploring past failure for 'lessons learnt'
the benefits of a more positive approach
a methodology for learning from success, and
scenarios using the methodology.
We need to find the passion for learning from all that we do: our successes as well as our failures. We hope that this paper provides some inspiration for moving beyond simply looking at failure to derive our lessons in order to achieve better safety outcomes.
Contact: Alena Titterton Partner T: +61 439 077 702 E: firstname.lastname@example.org
Contact: Michael Tooma Partner T: +61 457 087 952 E: email@example.com
The limitations of only looking at failure
The limitations of our traditional approach to looking at what goes wrong for learning
The challenge for everyone involved in incident management and investigation is that we fall victim to the mindset of over-simplifying the causes of disasters. Disaster reports are starting to sound the same. They all tell the tale of a company that preferred profit over safety. They make recommendations for greater safety leadership, the need for a safety culture and for a more effective regulator. We analyse things in the same way, make the same recommendations and somehow expect the result next time around to be different.

We do this at a smaller scale also. The imperative in the aftermath of an incident is to minimise the impact of the incident. That means reducing shutdown time associated with damaged equipment, regulatory notices or industrial action. This often leads to reactive and narrow-focused decision making on corrective actions: a new safe working procedure and training course, for example, is the most popular corrective action. The assumption is that if we identify the cause of the incident, we can simply develop a procedure for addressing it, train workers in the procedure and require them to follow it.

That thinking satisfies the regulators, who are more eager than most to move on to the next incident being investigated. It limits liability in that it is usually accompanied by a third feature: the implicit or explicit blame of the workers involved, either for needing a procedure or for failing to follow it. It also satisfies the conscience of managers, who feel they have addressed the issue as soon as it came onto their radar. Crucially, it is cost effective. A procedure is relatively cheap compared to an engineering solution.

Of course, if that procedure works and will be followed, then the problem is truly solved and all objectives have been met. In a perfect world, that would be the case since, for example, it is in the best interest of the company to ensure that the procedure is comprehensive and effective.
It is in the best interest of the workers to understand the procedure and follow it; after all, it is there for their health and safety. Unfortunately, the world is not that simple. There are a number of biases that prevent us from understanding what goes wrong. For example, investigations suffer from hindsight bias – we know exactly what happened and how it happened so it seems obvious to us where the operator went wrong. But if it seems so obvious to us, why would they have done what they have done? Surely no one goes to work with the intention of hurting themselves
or others around them. If they did something wrong, it must have seemed like a good idea at the time, right? Further, we engage in cherry picking of events. Things take on more significance with the benefit of hindsight than they did in real time. We focus on those things because they fit a certain narrative, an explanation for the wrong path taken. But identifying them in this way has little preventative value because they clearly did not seem obvious to workers, managers and/or leaders in real time.
The problem with focusing on what went wrong is that we never get the full story
In addition, people get defensive when you ask them about what went wrong. They either deliberately or subconsciously distort the truth to sound more favourable, or to make their mates sound more favourable. While you can improve that somewhat by creating a just culture in the workplace, you can't eliminate it altogether. What you get, ultimately, is the version of the truth that people are willing to share with you, so you never fully capture the learning from any incident.

These problems are further exacerbated by the fact that most of our investigation techniques are better designed for storytelling than for analysis. They are based on uncovering the linear timeline of events: "This happened and then that happened." But often the order in which things happen distorts the understanding of why they happened, in that conditions may have existed within the system for many years that are important to the incident. This limits our analysis of incidents.
So how do we factor that into incident investigations? By accepting that, faced with the same facts, people will not necessarily behave in the same way. The world we live in is far more complex than that. Dekker (2011) observes:

"Rational decision-making requires a massive amount of cognitive resources and plenty of time. It also requires a world that is, in principle, completely describable. Complexity denies the possibility of all of these. In complex systems (which our world increasingly consists of) humans could not or should not even behave like perfectly rational decision-makers. In a simple world, decision-makers can have perfect and exhaustive access to information for their decisions, as well as clearly defined preferences and goals about what they want to achieve. But in complex worlds, perfect rationality (that is, full knowledge of all relevant information, possible outcomes, and relevant goals) is out of reach ... In complex systems, decision-making calls for judgments under uncertainty, ambiguity and time pressure. In those settings, options that appear to work are better than perfect options that never get computed. Reasoning in complex systems is governed by people's local understanding, by their focus of attention, goals, and knowledge, rather than some (fundamentally unknowable) global ideal. People do not make decisions according to rational theory. What matters for them is that the decision (mostly) works in their situation." [i]

Dekker (2011) goes on to explain that this perfectly normal reaction to the rules being imposed on us at a local level can accumulate at an organisational level with harmful consequences. He explains:

"Local decisions that made sense at the time given the goals, knowledge and mindset of decision-makers, can cumulatively become a set of socially organized circumstances that make the system more likely to produce a harmful outcome.
Locally sensible decisions about balancing safety and productivity, once made and successfully repeated, can eventually grow into unreflective, routine, taken-for-granted scripts that become part of the worldview that people all over the organization or system bring to their decision problems. Thus, the harmful outcome is not reducible to the acts or decisions by individuals in the system, but a routine by-product of the characteristics of the complex system itself." [ii]

These inconvenient truths of the complex reality we face pose a challenge to the conventional wisdom around incident investigation, which is typically concerned with uncovering "the truth" and, indeed, more so, "the root cause" of an incident. Consider the pioneering approach of James Reason in the "Swiss cheese" theory, a theory on which most modern incident investigation techniques are based.
The theory is that, like the holes in slices of Swiss cheese, all systems have deficiencies or inadequate defences. An incident occurs when those deficiencies or failed defences in the system line up along a causal trajectory. It follows, then, that proactively increasing the defence layers reduces the likelihood of an incident. It also follows that, in attempting to analyse an incident, a better understanding of that trajectory will uncover the absent or failed defences which enabled the system failure. Logically, then, in the aftermath of an incident, those system deficiencies are identified and addressed through corrective actions, thus reducing the holes in the Swiss cheese slices and therefore reducing the likelihood of a recurrence of the incident.

But complex systems' incident trajectories are often unique. That is, addressing what went wrong in a particular incident will only help prevent that exact sequence from recurring. The likelihood of the planets, or Swiss cheese slices, aligning in exactly the same way is very remote. It is more likely that the next incident will involve a different trajectory and different holes in the Swiss cheese slices. Addressing the specific sequence that caused the incident will not address the potential paths that the incident trajectory could have taken but for certain events.

We often look at an incident sequence amazed, but relieved, that things were not much worse and that they could, in fact, have been, had it not been for some "lucky" event. But other than such a casual observation or a remark in an incident investigation report, little is done about those "other" non-causal events: the events that either prevented the incident from being of greater impact, or the possible trajectory that did not occur, the road not travelled but which could have been travelled.
"most of the root causes of serious accidents in complex technologies are present within the system long before an obvious accident sequence can be identified". 2 F iii
That is, the holes are there, if only our investigation techniques could uncover all of them and not just those involved in the incident. Yet most investigation techniques are linear in their approach, seeking out the exact causal sequence, the truth of what happened, and then uncovering the root cause(s) which led to that factual sequence.
But is it appropriate in a complex world to maintain a linear view of incident causation? Isn't the road that is not travelled just as instructive to further incident prevention as the road that is actually travelled? Indeed, in many respects, the "lucky" control is more instructive for incident prevention than the failed or absent control. If we adopt that approach, building system resilience is not simply achieved by adding Swiss cheese layers as the orthodox view of the theory may suggest, but also by uncovering and plugging holes within each layer that are reasonably related to the incident although not causally connected to it.
The benefits of a more positive approach
What can we learn from positive psychology?
Health and safety is not the only discipline that has had to grapple with a focus on the negative and move toward a more positive approach. The psychology discipline, which is often a source of inspiration and empirical data for health and safety specialists, has undergone such a transformation. Throughout most of the 20th century, psychology focused on the pathology of disease (i.e. looking at and understanding an illness like depression and attempting to cure it). Proponents of the 'positive psychology' movement, the leading psychologists Martin Seligman and Mihaly Csikszentmihalyi (2000), explain:

"What psychologists have learned over 50 years is that the disease model does not move psychology closer to the prevention of these serious problems in the first place. Indeed, the major strides in prevention have come largely from a perspective focused on systematically building competency, not on correcting weakness. Prevention researchers have discovered that there are human strengths that act as buffers against mental illness: courage, future mindedness, optimism, interpersonal skill, faith, work ethic, hope, honesty, perseverance, and the capacity for flow and insight, to name several. Much of the task of prevention in this new century will be to create a science of human strength whose mission will be to understand and learn how to foster these virtues in young people." (emphasis added) [iv]

The same could be said of all of us who have a passion for the prevention of harm to people in the health and safety context. As Seligman and Csikszentmihalyi would say:
"treatment is not just fixing what is broken; it is nurturing what is best." 4 F v
We know that positive reinforcement is far more powerful than negative reinforcement. We say someone has received positive reinforcement if a desired behaviour they display has been followed with a reward or stimulus. Negative reinforcement is when someone receives punishment, an aversive stimulus or a withholding of a stimulus after displaying certain behaviour, usually undesirable behaviour.
People are more likely to adjust their behaviour to seek out praise and acceptance than out of fear of punishment.
We understand this well in our private lives. When people do things we like, we reward them with expressions of gratitude so that such behaviour is repeated. We avoid rewarding people for negative or undesired behaviour but, with few exceptions, punishing people for such behaviour will rarely be effective. Harsh words directed at a fellow motorist who cuts you off on the road are unlikely to alter their driving behaviour. That is because they receive an immediate reward for their bad behaviour in reaching their destination sooner, and the added attention received from you as a fellow motorist, over time, is either ignored or becomes associated with the positive reward of beating the traffic.

At work, people do their jobs well every day. They follow safety procedures. They engage in the safety programs required of them. For that, we seem to think they deserve no reward or recognition. Indeed, people strongly believe that it is wrong to reward them for "doing their job", as if to single out that conduct would undermine the integration of safety into operational requirements. By contrast, if they take a short cut, they are rewarded immediately by being able to do their job faster and, depending on the employment terms, either going home sooner, being recognised by their superiors or, in the case of contractors, making more money. Yet we are surprised when, over time, people gravitate towards short cuts.

The same is true of managers. Beyond a certain threshold, improved productivity with no additional innovation or capital expenditure comes at the cost of the health and safety of workers. But no such distinction is made in relation to managerial recognition and reward. Managers receive instant positive reinforcement for day-to-day decisions they make which improve shareholder value, such as staff reductions and productivity improvements. Doing more with less has become a management mantra, a boast of success. That assumes that the status quo has inefficiencies.
But where no such buffers exist, the value is derived at the long–term expense of current workers and future shareholders. That was the experience of the BP Texas Refinery.
The value of near miss investigation as an examination of success
Given what we know about human behaviour and psychology, it is remarkable that when a near miss occurs, we don't pause to recognise the positive behaviour exhibited by people that may have averted a disaster. That behaviour may well be expected because it is consistent with the system, but so are many everyday private behaviours for which we receive an acknowledgement or other positive reaction. As a society we expect that positive reinforcement in our private lives. We regard it as part of our culture. We drum it into our children. But in our working lives we seem to take safe behaviour for granted. In incident investigations we regard it as irrelevant.

How can that be? That is the moment when we are most vulnerable, when we all feel we are to blame, that we have somehow contributed to the incident. Pausing to reflect on success and recognise what we did right can ease much of that. If the focus of incident investigation remains solely on what went wrong, it is inevitable that it is about blame. Even in organisations where a just culture is in place, the singular focus on negative behaviour can be detrimental to the overall functioning of the system.

It is also a missed opportunity. Most effective innovations and discoveries were born out of unintended acts: observing the consequences of a deviation from normal procedure and learning from it. The same is also true of cooking recipes. Experimentation is a key to improvement. While such behaviour should not necessarily be encouraged, because its consequences are unpredictable, as any amateur cook well knows, where the experimentation works, in that it avoided or tempered the effects of an incident, we should explore it, understand it and celebrate it. As Dekker (2011) observes:

"Complex systems can remain resilient if they retain diversity: the emergence of innovative strategies can be enhanced by ensuring diversity. Diversity also begets diversity: with more inputs into problem assessment, more responses get generated, and new approaches can even grow as the combination of those inputs." [vi]

In one of our incident investigations, a worker survived a 20-metre fall in mobile plant principally because, contrary to procedures, he had not buckled his seatbelt. On balance, more lives would be lost by failing to buckle a seatbelt than by doing so. But this life was saved for that reason. Shouldn't we learn from that and attempt to improve the design of our plant? Even if someone did something wrong that worked, we need to understand why it worked so we can capture its positive features.
That, in essence, is what Reason (1997) [vii] was describing in the flexible culture component of his safety culture model: the empowerment of well-trained workers to make decisions that deviate from normal procedures but that are consistent with the objectives of those procedures. Once those decisions are made, we need to understand why they worked. That is where asking "what went right?" comes in.

The seriousness of, and the learning opportunities which can be garnered from, an event should not be downplayed just because the event does not itself result in injury or damage to plant. To the contrary, near misses present valuable opportunities to learn from mistakes and system deficiencies. In order to avoid disaster it is necessary to understand the risks that arise within an organisation. To allow this, a culture of reporting must be encouraged within the organisation. Without a reporting culture, an organisation will be unable to gather information on the incidents that have occurred and will be unable to discover their causes. Underreporting of near misses will hide issues that could be remedied before the problem develops into a disaster. As Kletz (2001) observes:

"we should investigate all accidents, including those that do not result in serious injury or damage, as valuable lessons can be learnt from them. 'Near misses', as they are often called, are warnings of coming events. We ignore them at our peril, as next time the incidents occur the consequences may be more serious". [viii]

Part of the difficulty of encouraging incident reporting, and thereby creating a reporting culture, is that incident investigation findings are invariably negative in outlook and approach. No matter the rhetoric
around "just culture", a finding of an error, even an expected and understandable one which is caused by system deficiencies, is a negative finding for the individual concerned. Surrounding that single error were a number of steps correctly followed and many controls which operated effectively, both prior to and following the incident. Those effective controls are taken for granted. It is the lack of attention given to those effective controls that undermines the resilience of a system. That vulnerability in the traditional approach is the motivation behind exploring a new technique in the next section.
Why find out about what went right?
What went right in an incident can be just as instructive as what went wrong. By identifying effective control features, they can be replicated across the system. Controls that work at a local level, that are accepted by operators and fit into other complex systems, are rare. Their effectiveness should be celebrated. That is particularly the case in near misses where, had it not been for those controls, an incident would have occurred. Indeed, even if what went right was not a control at all but a "lucky event", an analysis of it may be instructive as to the type of controls that might work as a final barrier to an incident's causal trajectory.

The reality is that we have been attempting to learn the negative lessons from disasters since the inception of safety science as a discipline. Major disaster report after major disaster report sets out the facts of the incident and the deficiencies in the system, expresses outrage as to how society can allow these conditions to exist, and makes recommendations in relation to safety leadership and safety culture, with some specific design recommendations for industry consumption. This was the case in the Columbia, Piper Alpha, Exxon Valdez, BP Texas refinery, Upper Big Branch and Deepwater Horizon reports, to name a few. The problem with that approach is that it is entirely negative.
If it was that simple to learn the lessons from disasters, surely we would have learnt them by now.
The legal and commercial consequences of failing to do so are very significant globally. We have to assume that most leaders are, at worst, agnostic towards safety. Some may be passionate about safety, but certainly none display the psychotic behaviour which would mean that lessons, if capable of being easily applied, would be ignored. I have never encountered a managing director who wakes up in the morning wanting to hurt their people, yet even in Australia, which prides itself on its safety standards, on average one person is still killed every working day. Globally the figure is much worse.

The reality is that the lessons from disasters, instructive as they may be, are entirely superficial. Traditional linear incident investigations have limited ability to impact incident prevention because lightning does not strike twice. As Dekker (2011) observes:

"Reconstructing events in a complex system, then, is nonsensical: the system's characteristics make it impossible. Investigations of past failures thus do not contain much predictive value for
a complex system. After all, things rarely happen twice in exactly the same way, since the complex system itself is always in evolution, in flux." [ix]
The utility of the lessons, therefore, is translated into motherhood statements about safety leadership and safety culture without any specific means of achieving them in the localised context. That is not to say that those lessons have no value or meaning. They do. The issue is: can we extract more from our incident investigations? Can we derive practical lessons of real meaning, value and application, and can we do this on a regular and systematic basis?

It may be more useful to find out how a potential serious incident became a near miss, or how the consequences of an incident were tempered, rather than just finding out what caused the incident. A better understanding of "what went right" can assist in creating a more resilient system. The attraction of asking "what went right?" is its positive character. We know that positive reinforcement is the most important principle of behaviour and a key element of most behaviour change programs (as discussed above). [x]

So what would a more positive methodology, based on learning from success, look like? We provide some thoughts in the next section.
A positive methodology for learning from success
"What goes right is as instructive as what goes wrong."
Michael Tooma, Due Diligence Series: Incident Notification, Management and Investigation, 2nd Ed, CCH, 2017
There are three key moments when we can learn from success to support achieving better safety outcomes. We can do this by further exploring:
- our near miss/near hit events, uncovering the success that prevented a more serious incident
- what goes right in our serious incidents: that is, investigate success in the lead-up to an incident, and test the potential success (resilience) of controls identified as part of the investigation and close-out of preventative and corrective actions. This ensures controls will be effective across a broader range of future potential incidents than the single trajectory of that past incident. We get to higher quality corrective/preventative actions for serious incidents through a 'what if' reliability analysis, and
- successful projects and teams with excellent safety performance (such as where there is no reported incident or injury, or a lower level of such incidents than average).
We explore how we might investigate success at these three key trigger points with the Positive Investigation Methodology outlined below.
The Positive Investigation Method can be used to:
- Investigate serious near miss/near hit events with catastrophic potential by focusing on what factors allowed that potential injury-causing incident to become harmless (in the sense of causing no injury or harm)
- Increase the quality and resilience of controls recommended in serious incident investigations, and support quality decision making by leaders when they subject recommendations for preventative and corrective actions to scrutiny
- Uncover success in your projects and teams
Near miss/near hit investigations
Near misses present valuable opportunities to learn from mistakes and system deficiencies. In order to avoid disaster it is necessary to understand the risks that arise within an organisation. Underreporting near misses will hide issues that can be remedied before the problem develops into a disaster.
As Kletz observed:
"we should investigate all accidents, including those that do not result in serious injury or damage, as valuable lessons can be learnt from them. 'Near misses', as they are often called are warnings of coming events. We ignore them at our peril, as next time the incidents occur the consequences may be more serious." 10 F xi Surrounding any one single error are layers of steps, procedures and controls which were adhered to and followed. These are often taken for granted in the investigation of incident, which in turn undermines the resilience of a system. That vulnerability in the traditional is the motivation to focusing on what went right. But we can go further. We can uncover the success stories that meant the near miss did not result in a more serious incident. In that regard, the approach that we recommend is to investigate serious near misses by focusing on what factors allowed that potential injury causing incident to become 'harmless' (in the sense of causing no injury or harm). That is, trying to uncover what when right to avoid the incident's potential. Some may have effective defences. Understanding those deliberate or accidental defences provides rich material for building and maintaining resilience of a system. At the very least you know those defences work or at least worked on one occasion. The benefit of that approach is that the conversation with witnesses is an entirely positive one. It's not about what could have happened. It is not about the doom and gloom narrowly averted. Rather, it is about their heroic act, well designed process or lucky event that allowed us to avoid the adverse outcome. People love talking about positive things particularly if they have had something to do with them. Of course, few accidental defences can be adopted without modification or systemisation. The idea is not to blindly record this event but rather attempt to learn from what works and find a way to make that work by design rather than by accident. 
Method

Identify what happened. In questioning witnesses and persons involved, use open questions. Start with positive statements such as: 'It's incredible that you were able to avoid this situation becoming worse. How did you do it?' Look for the factors that meant that the incident did not result in further harm. Conversely, identify what could have happened. Ask what went right that prevented the incident from causing serious injury or damage. The question 'what went right?' is asked multiple times in order to build a comprehensive picture of what worked (there is no magic in the number, although we ask it five times in our case studies below).
Would we replicate this success in avoiding a similar incident in future? Or were these lucky adaptations on the day?
o What if the consequences were different?
o What if the process/system was different?
o What if the environment was different?
o What if the plant was different?
o What if the operator was different?
How can we systematise the learnings so we repeat our success?
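For teams that record investigations digitally, the questioning sequence above can be captured in a simple checklist structure. The following Python sketch is purely illustrative: the class, field and example names are our own invention for this paper, not part of any published investigation tool. It encodes the repeated 'what went right?' probes, the five 'what if' variations and a place to note how the learnings will be systematised:

```python
# Illustrative sketch only: a minimal record for a positive investigation
# of a near miss, following the method described above.
from dataclasses import dataclass, field

# The five 'what if the circumstances were different?' dimensions.
WHAT_IF_DIMENSIONS = [
    "consequences", "process/system", "environment", "plant", "operator",
]

@dataclass
class PositiveInvestigation:
    event: str                                           # what happened
    what_went_right: list = field(default_factory=list)  # answers to repeated probes
    what_if: dict = field(default_factory=dict)          # dimension -> assessment
    systematise: list = field(default_factory=list)      # actions to repeat success

    def probe_what_went_right(self, answer: str):
        """Record one answer to 'what went right?'. The method suggests
        asking the question several times (five in the case studies)."""
        self.what_went_right.append(answer)

    def vary_circumstances(self, dimension: str, assessment: str):
        """Record whether the success would survive a change in one dimension."""
        if dimension not in WHAT_IF_DIMENSIONS:
            raise ValueError(f"unknown dimension: {dimension}")
        self.what_if[dimension] = assessment

# Hypothetical example of one near miss worked through the method.
inv = PositiveInvestigation(event="Load slipped but was caught by secondary lashing")
for answer in ["Secondary lashing fitted", "Exclusion zone enforced",
               "Spotter raised the alarm", "Crane operator stopped the lift",
               "Pre-start check flagged a worn sling"]:
    inv.probe_what_went_right(answer)
inv.vary_circumstances("operator", "Success relied on an experienced spotter")
inv.systematise.append("Make secondary lashing a standard requirement for this lift type")
```

The point of the structure is only that each 'what went right?' answer and each 'what if' assessment is captured explicitly, so the success factors can be reviewed and systematised rather than remarked upon and forgotten.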
SCENARIO 1 CASE STUDY: NEAR MISS/NEAR HIT INVOLVING QANTAS FLIGHT 32
QF32: How the world's worst air disaster was averted
On 4 November 2010, 469 people on a flight from Singapore to Sydney were at the centre of dramatic events that had the potential to go down as one of the world's worst air disasters. Shortly after leaving Changi Airport, an explosion shattered Engine 2 of Qantas Flight QF32, an Airbus A380. Hundreds of pieces of shrapnel ripped through the wing and fuselage, creating chaos as vital flight systems and back-ups were destroyed or degraded. The crisis showed every sign of ending tragically. Instead, the plane landed safely a few hours later at Changi Airport in Singapore with all people on board unharmed.

At the height of the crisis, the A380's electronic centralised monitoring system (ECAM) was spitting out hundreds of alerts on what was wrong with the plane and instructions on what the pilots were required to do to address those defects. It was at this pivotal moment that Captain Richard de Crespigny made a profound decision. As he recalls in his book on the crisis:

"I was growing tired of being reactive to the ECAM and I wanted something positive to focus on. There were too many alerts, too many things broken and not much to be achieved by dwelling on them."
Richard de Crespigny, QF32 (Macmillan, 2012), p 218.
At that moment, Captain de Crespigny had what he called an epiphany:
"I inverted the logic. I remembered what Gene Kranz, NASA's Flight Director said during the Apollo 13 mission: 'Hold it, gentlemen, hold it! I don’t care what went wrong. I need to know what is still working on that space craft.' We went back to basics and it became easy."
Richard de Crespigny, QF32 (Macmillan, 2012), p 198.
Captain de Crespigny's focus on what was right with the plane that day, rather than on what was wrong with it, allowed him and his crew to land the plane safely, saving the 469 lives on board.
Source: Case study extracted from Michael Tooma, Due Diligence Series: Incident Notification, Management and Investigation, 2nd Ed, CCH, 2017, pp 83-84.
STEP 1: ASK WHAT WENT RIGHT?
STEP 2: WHAT IF THE CIRCUMSTANCES WERE DIFFERENT?
QF32 CASE STUDY
While many of the controls identified in the analysis above may already exist in the system, the specific interests of the pilot flying QF32, Captain Richard de Crespigny, may need to be further systematised for all pilots. Captain de Crespigny was an avid follower of aircraft and spacecraft disaster investigations; so much so that he was planning to write a book on the subject. When it came to responding to the emergency on the day of the engine loss, in 'inverting the logic' to focus on what was still working, he was relying as much on what he had learnt from disaster investigations in his own reading outside work as on his Qantas training. In learning from what went right here, it would be useful to consider whether this aspect, specific to this pilot's own interests, could be systematised at an organisational level. The question becomes: is there more the organisation should be doing to encourage all pilots to be curious about, and to read and learn from, aircraft disasters on an ongoing basis as part of their continuing education?
What went right in an incident can be just as instructive as what went wrong. By identifying effective control features, they can be replicated across the system. Even if what went right was not a control at all but a "lucky event", analysing it can point to the type of controls that might work as a final barrier in the incident's causal trajectory. Lessons from disasters, instructive as they may be, are inherently limited: traditional linear incident investigations have limited ability to improve incident prevention because lightning does not strike the same place twice.

Even in serious incidents, there is often something worse that could have occurred but did not, because of effective mitigating factors or an effective incident response. It is useful to recognise this in the context of an investigation because it acknowledges the efforts of the individuals involved.

The attraction of asking "what went right?" is its positive character. We know that reinforcement is the most important principle of behaviour and a key element of most behaviour change programs. We also know that positive reinforcement is far more powerful than punishment. We say someone has received positive reinforcement when a desired behaviour they display is followed by a reward or pleasant stimulus. Punishment, by contrast, follows a behaviour, usually an undesirable one, with an aversive stimulus or the withholding of a reward.
The focus in the PIM approach adapted to serious incident investigation is two-fold:
• Firstly, understanding what could have happened if circumstances were different, and recognising and learning from what made the incident less serious; and
• Secondly, verifying that whatever controls are being implemented would be effective in different circumstances.
If we were to add the "what went right?" dimension to the root cause analysis technique, we would need to adopt a multidimensional diagrammatical representation.

Improving resilience and investigating the success of corrective action controls

If we are to appreciate the importance of the alternate trajectory of an incident, we need to adapt our traditional incident investigation methods. In incident investigation, we say an incident occurred because of a series of primary events leading up to it, each causally related to the incident event. These are the direct causes of the event. By asking why these events happened, we uncover the root causes of the event. Take, for example, the popular incident investigation technique adapted from the quality assurance methods of Sakichi Toyoda for the Toyota Motor Company: the "5 Whys". The technique consists of asking "why?" five times to get to the root cause of an event. An example is provided below.
• Why did the incident happen? Because the worker didn’t follow procedure.
• Why did the worker not follow procedure? Because she was not trained in the procedures.
• Why was she not trained in the procedures? Because training needs of workers were not systematically identified.
• Why were training needs of workers not systematically identified? Because there is no system for training needs analysis to be undertaken.
• Why was there no system for training needs analysis to be undertaken? Because training is inadequately addressed in the safety management system.
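The chain above can be represented as a simple linked structure in which each answer becomes the subject of the next "why?". The following is a minimal sketch, not any standard investigation tool; the function name and approach of supplying the answers up front (rather than gathering them in interviews) are illustrative assumptions.

```python
def five_whys(event: str, answers: list[str]) -> list[tuple[str, str]]:
    """Pair each 'why?' question with its answer; each answer becomes
    the subject of the next question. In a real investigation the
    answers emerge from interviews; here they are supplied up front
    purely to show the structure of the technique."""
    chain = []
    subject = event
    for answer in answers:
        chain.append((f"Why did this happen: {subject}?", answer))
        subject = answer  # the answer is questioned in the next round
    return chain

# The worked example from the text, encoded as a chain of answers.
chain = five_whys(
    "the incident",
    [
        "the worker didn't follow procedure",
        "she was not trained in the procedures",
        "training needs of workers were not systematically identified",
        "there is no system for training needs analysis",
        "training is inadequately addressed in the safety management system",
    ],
)

root_cause = chain[-1][1]  # the final answer is treated as the root cause
print(root_cause)
```

The linear shape of the structure is the point: each "why?" produces exactly one next step, which is precisely the limitation the "what if?" overlay below is designed to address.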
As we have discussed, for every event sequence there is a road not travelled, because some control worked or some other barrier operated, whether intended or unintended. That alternative pathway can be uncovered by asking 'what if?'. This approach discloses an alternate event sequence, together with its own root cause factors. We can now overlay the '5 Whys' technique with a new set of questions for every factual scenario, asking five "what if?" questions to uncover the alternate trajectory of the incident. Through that process, the investigator can uncover any holes not otherwise identified on each Swiss cheese slice, and also identify which controls are in fact working. The latter category is useful to the analysis of "what went right?" as opposed to just "what went wrong?".
In this methodology, we conduct a reliability analysis of what went right by asking a series of 'what if' questions to reveal deeper learnings:
o What if the consequences were different?
o What if the process was different?
o What if the environment was different?
o What if the plant was different?
o What if the operator was different?
If we ask the "what if?" question five times, we get five alternate realities in addition to the actual sequence that occurred. That provides a much richer tapestry of system deficiencies and, therefore, of recommendations for system improvement. Each sequence in the linear incident trajectory, and each alternate trajectory, can be linked back to the event and causal factor analysis diagram, with each dot on the diagram representing an actual failed or absent defence or, in an alternate trajectory, a point of vulnerability. The investigation report is then not just about linear causes but about vulnerability to future events which may take a different trajectory. The investigation is therefore not simply a root cause analysis but a vulnerability analysis, where the aim is to build resilience in the system to safeguard against an unpredicted and unexpected future event.
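The overlay of "what if?" trajectories onto the linear chain can likewise be sketched as a data structure: the actual sequence plus one alternate trajectory per "what if?" dimension listed above, each carrying its own points of vulnerability. This is a minimal illustrative sketch; the class and function names, and the example entries, are assumptions for illustration rather than an established tool.

```python
from dataclasses import dataclass, field

# The five "what if?" dimensions from the methodology in the text.
WHAT_IF_DIMENSIONS = ["consequences", "process", "environment", "plant", "operator"]

@dataclass
class Trajectory:
    # Which dimension was varied ("actual" marks the real sequence).
    dimension: str
    # Failed or absent defences (actual) or points of vulnerability (alternate).
    vulnerabilities: list[str] = field(default_factory=list)

def vulnerability_analysis(actual: Trajectory,
                           alternates: list[Trajectory]) -> dict[str, list[str]]:
    """Collect vulnerabilities across the actual and alternate trajectories,
    so recommendations address future events that may take a different path."""
    report = {actual.dimension: actual.vulnerabilities}
    for alt in alternates:
        report[alt.dimension] = alt.vulnerabilities
    return report

# Illustrative data only: one finding on the actual trajectory, plus one
# placeholder finding per alternate trajectory.
actual = Trajectory("actual", ["no system for training needs analysis"])
alternates = [
    Trajectory(d, [f"vulnerability if {d} were different"])
    for d in WHAT_IF_DIMENSIONS
]
report = vulnerability_analysis(actual, alternates)
print(len(report))  # the actual sequence plus five alternate trajectories
```

The design choice mirrors the argument in the text: the report is keyed by trajectory rather than by a single causal chain, so every recommendation is tied either to a defence that actually failed or to a vulnerability that a different trajectory would expose.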
SCENARIO 2 CASE STUDY: Australian Transport Safety Bureau Report – Safe Working Irregularity following Rail Track Maintenance
ATSB Report: What went right in this safe working irregularity?
On 30 October 2015, the Australian Rail Track Corporation (ARTC) scheduled track maintenance work (placement of ballast) to occur between Braxton and Farley on the Up main north line, Hunter Valley in NSW. During this planned track maintenance work, a safe working irregularity occurred on the rail corridor.

To facilitate the work, the network controller at the ARTC network control centre north (NCCN) situated at Broadmeadow issued the protection officer a work on track authority. The authority excluded rail traffic from the defined track section between Allandale and Farley, permitting safe access for the work to proceed. The track maintenance work took place without incident. At about 0856, the track work was completed and the protection officer contacted the network controller to fulfil the work on track authority and return the track for rail traffic.

While travelling on an access road within the rail corridor during its exit, a road vehicle involved in the maintenance work became bogged in a drain. The vehicle was situated approximately five metres from the nearest rail of the Up Relief line. The vehicle was located outside the danger zone, and therefore there was no requirement to employ a work on track method to retrieve the vehicle. Nevertheless, the protection officer in charge of the worksite made a safety assessment to exclude rail traffic from the portion of track to ensure the presence of workers in the rail corridor did not alarm a driver of an approaching train. The protection officer contacted the network controller and requested controlled signal blocking. The network controller authorised the work to commence, but only outside of the danger zone. After initial attempts to recover the vehicle failed, they decided that a front-end loader was required to free the vehicle. During this conversation and a subsequent conversation, the parties did not confirm their common understanding about the location of the worksite.
Consequently, when controlled signal blocking was put into effect, a train had already passed the protecting signal and was travelling toward the worksite. Workers on the site assumed the track was protected and were preparing to remove the vehicle when they noticed an approaching train. There were no injuries to people or damage to property.

The Australian Transport Safety Bureau (ATSB) found that communication between the protection officer and the network controller resulted in a misunderstanding of information that contributed to a safe working irregularity, where controlled signal blocking was issued with a train (HV130) located between the protecting signal and the worksite. The rule and procedure associated with the issuing of controlled signal blocking did not manage the sequential communication of sufficient information to identify the worksite location before controlled signal blocking was acted upon. There is also no requirement in the rule or procedure applicable to controlled signal blocking to keep a permanent record detailing the specific information relating to its implementation, increasing the likelihood of error during the read-back process.

The ATSB investigation report is almost exclusively focused on what went wrong, even though no harm came to workers. Let's look at what went right, and whether the outcome would still have been harm-free had the circumstances been different.
Source: Case study extracted from ATSB Transport Safety Report, Rail Occurrence Investigation RO-2015-021, Safe Working irregularity involving Controlled Signal Blocking between Allandale and Farley, Hunter Valley, NSW, Final Report, 4 May 201
STEP 1: ASK WHAT WENT RIGHT?
STEP 2: WHAT IF THE CIRCUMSTANCES WERE DIFFERENT?
RAIL TRACK MAINTENANCE CASE STUDY
Serious near miss/near hit events (such as a safe working irregularity on the rail corridor) are more often than not investigated in the same way as serious incidents involving harm to people. It may be possible to gain richer information about the resilience of controls, and about necessary changes to systems, if we start from the perspective of what went right in such a near miss/near hit event.
Learning from success during projects
While near miss/near hit events and serious incidents present valuable opportunities to learn from mistakes and system deficiencies, our usual approach raises an obvious question: why do we wait until something has gone wrong, or nearly gone wrong, to identify success? As we have outlined above, success is happening every day. We don't have an incident or near miss/hit every moment of every day. So what do we know about what is happening in between these moments? Beyond audits and inspections that typically focus on deviance from system requirements (again, focusing on failure), the answer is: not very much.

We can investigate success when nothing remotely negative has happened. Suppose you have a successful project. The team on the project is delivering high performance and has had no incidents or serious near misses. The client's time and budgetary pressures exist on this successful and safe project as much as on the other projects on your books. What is sitting behind this success? How are the workers managing to achieve it under challenging circumstances where others have not? It is worth looking under the bonnet to see what exactly is powering the engine, to accumulate further knowledge, and to leverage it so as to systematise any excellence you find that may have broader application. What if we started to explore what is happening when things are going right?
At what juncture should we look at success in a project/undertaking?
Obviously, this will depend upon the type of activity or area of operations. You can consider exploring, for example:
A particular project where safety performance was better than typical levels (taking into account the complexity and risk profile of the project)
A particular division or operation undertaken by the organisation which is celebrated as more efficient than others
A particular division or type of operation within the organisation that has fewer reported incidents than others
A particular area of operations that reports more near miss/hit events in general and also has fewer incidents
In broader management practice, we already investigate success from an operational perspective, most often at the end of a project. We will often conduct a project debrief with our teams and with our clients, customers, suppliers or other stakeholders to find out what we can learn from how the project progressed, and feed that forward into the next projects. We should think about exploring success not just at the end of a project, but at key junctures during it. Many valuable lessons will be lost if we wait until the end, when many of those moments (and what made them great) are long forgotten. Think about looking at success when a project commences: why is the set-up going so well? Think about looking at success when the tempo changes and overlapping operations are at their peak, and observe why success is being achieved in practice, in real time. Investigating success in this context does not require a retrospective approach, which means we can, to some extent, avoid many of the limitations of incident investigation discussed above.
Tips for Exploring Success
Unlike traditional incident investigations, exploring success (particularly in day-to-day operations, outside of a near miss or incident investigation) is not about checking boxes off a list. There are no pre-set factors to complete, and that is quite deliberate. We want to remain open to whatever we learn from listening to, or watching, how things are actually done on that project or in that operation day to day. When exploring success, you want to understand how that success was achieved. In doing so, it is unhelpful to have a complete and exhaustive list of pre-determined questions, because you want to listen, watch and explore how operations are performed based on what you learn as you go.
That requires you to be open.
In being open to learning from success in practice, you want to explore matters such as:
the context within which that success is occurring (understanding the nature of the environment, the operational objectives, the milestones and KPIs, stakeholders involved, types of hazards and risks being faced, controls planned to be in place, controls actually in place, processes, equipment, training etc)
normalised deviance - the distinction between the 'work as planned' and the 'work as done'
adaptation - how workers may be required to adapt every day to make their operations work in practice to meet the demands placed upon them
what kind of autonomy the workers have to perform their roles and how it is that they have been (or simply feel) empowered to work effectively
whether workers are being provided with what they need to do their work (resources including time, money and processes)
In asking those questions and identifying lessons for improvement of your approach to safety, you want to think about:
whether it is possible to narrow the gap between work as planned and work as done
whether work is being done safely using a different method to the approach identified in the documented system and, in light of that, whether there is a need to change the personnel's approach to the task or whether it is possible to change the system/process/procedure to match practice (noting that we should not prefer the former), and
whether it is possible to streamline the systems more broadly, in other contexts, to adapt our systems to match how those workers are operating in practice.
You want to prompt that listening exercise in a way that engages people and encourages them to open up to you honestly and transparently. This obviously requires a quite different skill set from the one you bring to bear in incident investigation. Here, you are required to facilitate an open conversation about excellence.
Consider formulating your approach in learning from success in projects based on an open dialogue:
Set the passion and excitement for learning by treating personnel as the experts. Tell them how fantastic they are. Recognise the excellence in their achievements and tell them you want the whole business to be able to learn from their approach and experiences. Use language such as 'Talk me through how it all works.' When we make them the experts, people are willing to explain how things work and the challenges they really face in practice. This leverages what we know about positive reinforcement. Resist the urge to be the expert: you are learning from them, not the other way around.