Teaching process safety in 2017

For the last four years I have given guest lectures in process safety for undergraduate chemical engineering students at the Norwegian University of Science and Technology, and I have promised to do so again this year. This is my annual pro bono event :).

I used to work as a consultant with Lloyd’s Register, and my earlier lectures used slides based on their internal process safety course, which I also used to teach. Now I have a new job at a different firm in a different sector (information security in a devops environment – in other words, something completely different and unrelated to process safety or chemical engineering).

Obviously, I need to create some new content for this year’s lectures. I’m looking forward to it, as it is also a great opportunity to brush up on the form of delivery. The plan so far is:

  • Basic principles (no single point of failure, risk-based design thinking, observable risks, usability)
  • Process accident examples (the “Fire From Ice” example from the CSB is still great, but perhaps I can find something new to add)
  • Key safety standards, and some examples of how to use them
    • ISO 10418 / API RP 14C / NORSOK P-002 (process design and safety)
    • IEC 61511 (safety instrumented systems and safety integrity levels)
    • IEC 62443-3-3 (New! Cybersecurity in process systems; I think this one is going to be increasingly relevant)
  • The mother of all accidents: overpressure
    • Blowdown systems
    • How to simulate blowdown in a simple process segment
    • Pressure equalization in compressor trains
  • New threats to process plants
    • Cyber attacks
    • Practices to make your plant less vulnerable

What more do you think undergrad chemical engineering students need to learn about safety in design?

Safety versus convenience 

Risk-based asset management frameworks force us to be systematic in our approach. Multiple layers of defense are commonly applied to bring risks down to what we see as an acceptable level. In many cases each layer of defense will feel like a layer of inconvenience.

 

Do we maximize the number of spikes (or layers of protection) to feel safe?

The ALARP principle is often used to evaluate whether a certain defense layer is worth the investment. This type of analysis tends to be CAPEX focused. Very cumbersome operations, however, tend to make people invent more convenient bypasses. In addition to investment cost, maybe we should also consider how the mitigation affects convenience and how humans react to it. If people bypass the intended operating procedure, the new risk mitigation investment could end up increasing the overall risk.
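To make the last point concrete, here is a minimal sketch with entirely hypothetical numbers. It assumes a single hazard with a given demand frequency, one barrier with a probability of failure on demand, and a fraction of the time in which people bypass the barrier. It also assumes, purely for illustration, that the improvised workaround multiplies the demand rate. The function name and all values are made up for this example and do not come from a real plant or from any standard.

```python
def accident_frequency(demand_per_year, barrier_pfd,
                       bypass_fraction=0.0, bypass_demand_factor=1.0):
    """Expected accidents per year for one hazard protected by one barrier.

    All parameters are hypothetical illustration values:
    demand_per_year      -- how often the barrier is challenged
    barrier_pfd          -- probability that the barrier fails on demand
    bypass_fraction      -- share of the time the barrier is bypassed
    bypass_demand_factor -- how much the workaround multiplies the demand rate
    """
    # Share of the time the barrier is in place and working as designed.
    protected = (1.0 - bypass_fraction) * demand_per_year * barrier_pfd
    # While bypassed, the barrier gives no protection, and the workaround
    # may itself create additional demands.
    bypassed = bypass_fraction * demand_per_year * bypass_demand_factor
    return protected + bypassed


baseline = accident_frequency(0.1, 1.0)      # no barrier at all
as_designed = accident_frequency(0.1, 0.01)  # barrier always used
as_operated = accident_frequency(0.1, 0.01,
                                 bypass_fraction=0.5,
                                 bypass_demand_factor=3.0)

print(f"no barrier:  {baseline:.4f} per year")
print(f"as designed: {as_designed:.4f} per year")
print(f"as operated: {as_operated:.4f} per year")
```

With these made-up inputs the barrier looks excellent on paper, but the "as operated" case ends up worse than having no barrier. Note that in this simple model the risk only exceeds the baseline when the workaround is itself more hazardous than the original operation. A bypass that merely removes the barrier brings you back towards the baseline, not above it.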

Are you aware of the effect of work life balance issues on the quality of your team’s work?

Make sure your people do not feel like a hot kettle with nowhere to let the steam out. That can lead to broken designs, and if your line of work is designing safety-critical systems, broken designs usually mean a greater chance of loss of life, environmental pollution and large financial losses.

Having a way to control the internal steam pressure of your team members may be utopia - but you should still look for ways to avoid disasters, together with your people.

We all know that the quality of our work varies with a large number of factors. If we are overworked or really worried about something in our personal lives, the quality of our work will most likely suffer. If you are responsible for functional safety in a large project, human error can be disastrous, not only for the project, but for the people working in the plant once it has become operational. Whether it is yourself or an entire team that you are responsible for, you need to be aware of the key performance shaping factors. These factors are described in detail in human reliability analysis methods, such as the one developed by Idaho National Laboratory for the nuclear industry: http://www.nrc.gov/reading-rm/doc-collections/nuregs/contract/cr6883/cr6883.pdf. These techniques can lend some useful terminology and thinking to the project itself, to help manage the risk of significant human errors in the project phase. Remember: misunderstanding the risk factors and barrier elements themselves may lead to insufficient barriers against major accident hazards in a real plant! The factors in the SPAR-H methodology described in the linked document are:

  • Available time
  • Stress/stressors
  • Complexity
  • Experience/training
  • Procedures
  • Ergonomics/HMI
  • Fitness for duty
  • Work processes
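As a rough illustration of how these factors are used, SPAR-H multiplies a nominal human error probability (HEP) by one multiplier per performance shaping factor, and applies an adjustment when three or more factors are worse than nominal so that the result stays below 1. The sketch below follows that scheme as I read it in the linked report. The function name and the example multipliers are my own illustration, so take the actual multiplier values from the tables in the report before using this for anything real.

```python
from math import prod

# Nominal human error probabilities per task type, as given in SPAR-H.
NOMINAL_HEP = {"diagnosis": 1e-2, "action": 1e-3}


def spar_h_hep(task_type, psf_multipliers):
    """Estimate a human error probability in the style of SPAR-H.

    psf_multipliers maps each performance shaping factor to its multiplier
    (1 = nominal, >1 = degrades performance, <1 = improves performance).
    """
    nhep = NOMINAL_HEP[task_type]
    composite = prod(psf_multipliers.values())
    negative = sum(1 for m in psf_multipliers.values() if m > 1)
    if negative >= 3:
        # Adjustment used when three or more factors are negative.
        hep = nhep * composite / (nhep * (composite - 1) + 1)
    else:
        hep = nhep * composite
    return min(hep, 1.0)


# Illustrative multipliers only: a rushed, stressed engineer doing a
# moderately complex diagnosis-type task, all other factors nominal.
rushed_review = {
    "available_time": 10,   # barely adequate time
    "stress": 2,            # high stress
    "complexity": 2,        # moderately complex
    "experience": 1,
    "procedures": 1,
    "ergonomics": 1,
    "fitness_for_duty": 1,
    "work_processes": 1,
}

nominal = spar_h_hep("diagnosis", {})
stressed = spar_h_hep("diagnosis", rushed_review)
print(f"nominal HEP:  {nominal:.3f}")
print(f"stressed HEP: {stressed:.3f}")
```

Even with only three factors degraded, the estimated error probability in this example rises from about one in a hundred to more than one in four, which is the kind of sensitivity that makes schedule pressure and stress worth managing actively.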

These factors have been defined for typical process operators’ actions in a nuclear power plant, but they are also relevant for other types of tasks. Functional safety work typically has a high degree of complexity. The experience and training of people involved in the safety lifecycle tend to vary a lot, and procedures and work processes are not always clear to everyone involved. All of this falls under “management of functional safety”, and project managers should think about what creates great quality when planning and managing the project. In many projects time is quite limited, and “schedule impact” is a rather frightening concept to many project managers. This can lead to tasks being perceived as less important simply because the schedule is prioritized over quality. For safety-critical tasks, this should not be allowed to happen.

Some factors from your project members’ personal lives may have a severe impact on performance. People working on the project team can be stressed or not “fit for duty” due to a number of challenges that are not only work related. How can we deal with this? Project managers need to know their teams beyond their tasks and work backgrounds. You need to create an environment of trust, so that you have a greater chance of catching performance limiting factors that originate outside the organization. For many people these factors are not something seen as “bad”, such as divorce, alcohol abuse or depression; they may simply be challenges in making daily life work. People tend to want balance in life, with room for work, family, friends, hobbies and so on. Working in a high-stakes project may itself be a threat to a balanced life. By knowing your people you can help them find the necessary balance, which will also improve their performance at work. Flexible working hours, part-time telecommuting and close follow-up with real feedback to every member of your team can help.

We consider human factors and the effect of the work environment as well as external performance shaping factors for operators. We should also strive for people to perform at their best when their work is to design the very systems used by the operators after commissioning.

Stages of process safety understanding

Defining process safety should be quite straightforward. However, what people mean by this term can vary quite a lot, and what to include in it depends heavily on the understanding people have of the anatomy of severe accidents. Personally, I have met the following different understandings of the topic:

  • Process safety is what is governed by API 521 (basically steel strength and dimensioning of pressure relief valves)
  • Process safety is the technical measures taken to stop an accident from occurring
  • Process safety is the sum of organizational and technical systems involved in mitigating risk of major accidents

The first statement is obviously too narrow – especially as we know that more than half of accidents are down to human factors! Definition number 2 is a traditional view, and slightly more mature as it includes both the safety instrumented system and alarm management (to a certain extent). The last definition is maybe the most “modern”, and includes organizational culture, safety leadership as well as the technologies included in the first and second definitions.

How people understand the term “process safety” tends to mature over time, from a strictly technical view to a more holistic view that includes both individual and organizational factors, as well as the technologies and how they are used in a system. A walk up this staircase from the technology-focused to the more holistic view can take a long time, but conscious reflection can help speed up the path to improved performance and risk management.

A complete understanding of barrier systems, which is really what risk management is about, requires an understanding of which factors are influencing accident risk, and what can be done to mitigate the risk. This requires that the asset owner thinks not only about “proof testing”, “compliance” or “asset management”, but also about:

  • Leadership
  • Barrier integrity
  • Maintenance
  • Monitoring
  • Design
  • Competence management
  • Permit to work system
  • Dynamics of plant and controls in normal and degraded modes
  • Etc.

In other words, to keep risk under control you need to take the full complexity of your operations into account. A purely technical view of process safety is simply not good enough.