What are the things that need to be considered when doing a risk assessment?


Answer by Håkon Olsen:

The process can be summed up in three layers – the continuous flows of stakeholder communication, risk assessment itself, and the risk treatment.


Here’s my answer from Quora:

Risk assessments can be performed at many levels of granularity but the same general structure of the process can be used for all such assessments. There is an ISO standard that describes this approach which is generally recognized as best practice (ISO 31000). This involves:

  • Defining the context
  • Identification of risk factors
  • Analysis of risk (likelihood and impact)
  • Evaluation of risk
  • Treatment planning
  • Monitoring of risk and treatment
  • Stakeholder communication and consulting

The context includes the scope of your assessment, who the stakeholders are, which risk levels are considered acceptable and which are not, and how the value chain is affected by the risk exposure.

Identification of risk factors can be done in many ways, but the use of “guidewords” is very common: hooks to get the ideas running. This is a sort of guided brainstorming, taking past experience into account but also avoiding disregarding events that have not yet happened. Typical guidewords for the risk to an office building could be: fire, bomb threat, hurricane, power outage, robbery. The list of guidewords must be tailored to the scope, and the context in general.

Analysis of risk means assessing how likely each scenario is, and what the potential impact can be. This can be done in a purely qualitative way, or it can be a sophisticated mathematical modeling exercise involving computer simulations and advanced statistics. The point is to arrive at an assessment of how likely something is to happen, and how bad it would be.

In evaluation of the risk you sort out which risks you must react to and which ones you can disregard. You typically prioritize risks that are both likely and have a potentially serious outcome. These risks are usually unacceptable to leave as they are. Then there is an intermediate ground with risks that are somewhat likely, or somewhat bad, or both, that you may want to do something about. In many areas these risks are treated if actions can be found that will reduce them without adding excessive cost – often referred to as keeping risk ALARP (as low as reasonably practicable – a UK legal concept).
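The evaluation step can be sketched in a few lines of Python. This is only an illustration: the 1 to 5 scales, the score thresholds and the example numbers for the office building are assumptions made up for the sketch, not values taken from ISO 31000 or from any regulation.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    name: str
    likelihood: int  # assumed scale: 1 (rare) to 5 (almost certain)
    impact: int      # assumed scale: 1 (negligible) to 5 (catastrophic)


def evaluate(risk, unacceptable_from=15, alarp_from=6):
    """Classify a risk by its likelihood x impact score (thresholds are assumptions)."""
    score = risk.likelihood * risk.impact
    if score >= unacceptable_from:
        return "unacceptable"
    if score >= alarp_from:
        return "reduce if practicable (ALARP)"
    return "acceptable"


# Made-up numbers for the office building guidewords
risks = [
    Risk("fire", 2, 5),
    Risk("power outage", 4, 2),
    Risk("bomb threat", 1, 4),
    Risk("robbery", 4, 4),
]

# Highest score first, so the risks that need attention come out on top
ranked = sorted(risks, key=lambda r: r.likelihood * r.impact, reverse=True)
for r in ranked:
    print(f"{r.name:13s} score={r.likelihood * r.impact:2d} -> {evaluate(r)}")
```

A real assessment would set the scales and thresholds when defining the context, together with the stakeholders, rather than hard-coding them like this.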

Treatment planning is all about what you do about your risks. You can build barriers that reduce the likelihood of the event happening (automatic pressure relief valves on pressure cookers), or that reduce the impact (sprinklers to fight fires). This is called mitigation. You can also in many cases transfer the risk to other parties by buying insurance – but this is not always possible. You can also avoid the risk if you cannot find a reasonable way to deal with it, by stopping the risky activity or redesigning whatever it is you do. Finally, you may also choose to accept the high risk because you think the rewards are great enough to justify it.

Over to practice; you need to monitor the risk level and the integrity or quality of the barriers you have built. If risk is building up you need to take action. This is a continuous activity, something banks, chemical factories and airlines do a lot of.

Finally, and perhaps one of the most overlooked parts of risk assessments, is communication. You have a lot of stakeholders that you should have identified when defining the context. Keeping them involved and engaged throughout your asset’s lifecycle is key to managing risk effectively. You can read more about the people management aspect of stakeholder engagement here: 4 steps to engaging people in risk conversations (my blog – lots of stuff about risk assessment there, have a look around!)


Are we ready security-wise to ditch the office?

Here’s an article I shared on LinkedIn some time ago – it spurred some interesting discussion about how digital transformation is changing the way we work and how we look at attendance. A key question not discussed in that piece is: “Are people capable of protecting corporate data and intellectual property when the social fabric of the physical office is dismantled?” I’d love to hear your thoughts about that!

Technically we can work from anywhere – but are people able to maintain the necessary level of information security? 

Excerpt: Telecommuting has been a thing for some years. It works well for some, and not at all for others. Technology has come a long way, and it should now be possible to interact and work remotely for most types of “knowledge work”. In spite of this, we just can’t make it really work. More often than not, when trying to have a video conference at work, we spend 20-25 minutes setting the meeting up and making everything work, usually because someone at the other end doesn’t know how to use his or her equipment. Clearly, technology is not enough by itself; it is necessary for people to learn how to use it. And, unfortunately, “professional” communication equipment has extremely bad UX design. Compare a top-of-the-line conferencing setup with Skype or Google Hangouts – there is a real difference in ease of use, and the feel of the whole thing.

Read the rest of the article on LinkedIn: Do we need physical attendance at work?

4 steps to engaging people in risk conversations

Risk management is about managing uncertainty; it is the planning, monitoring and handling of the unexpected. All of this happens in a specific context. You have something you want to protect from various risks, and you have the people who depend on that something. Communication with those people is key to all phases of risk management. If you cannot involve your colleagues, your suppliers and your customers in the way you deal with risk, you are going to fail. Let us first look at who the people you most likely need to deal with are.

The boss

The supplier

The workhorse

The consumer

The boss is responsible for the stuff you are trying to protect and must be involved in determining which risks are OK to take. The boss also needs to own the outcome and make sure everyone is on board pulling in the same direction. Getting the boss on your team should be a high priority.

The suppliers are all the people you depend on to do what you do, to make what you do. If the suppliers don’t want to play ball you are going to have a hard time understanding what can hit you, and you may not be able to deal with difficulties without their help. Communication can be difficult here, because the suppliers also have their own context and see the world from a different mountain top than you do.

The workhorse is the doer, the expert, your colleagues. These are the people you need to understand how things work, and the people you need to take action. If they don’t work with you on dealing with risk, you will definitely not succeed. Not all workhorses are going to want to help – this is where you need to engage through others; the boss, other workhorses that already are engaged, and perhaps even the suppliers and consumers (hopefully not).

The consumer is the customer, the client, the user. It is the people who depend on you to provide a service or product. Risks hitting you are hitting the entire supply chain, and the consumer may be the people who have the most to lose from bad risk management. The consumer may also be able to help with dealing with risks, and in resolving difficult situations. Involving the consumer in your risk management should always be a priority.

 

Your communication style must be tailored to the role of the person you are trying to involve, and to the ability of that person to contribute. If you do not think about this in advance, communication is not likely to be successful. This is why you need a plan.

Step 1: Make a communication plan

The different roles need different information to feel engaged. They may also have different interests in the asset you are trying to protect. The key to creating engagement through communication is to tailor your plan to the interests of your stakeholders. That being said, you also need this to be a two-way street; you need feedback, you need to gather information. Your communication plan shouldn’t be a long and formal document; a simple plan where you think through the key aspects of communication with each stakeholder is enough. The key steps are:

  • Identify the stakeholders and roles: who are they, what are their roles, what interest do they have in your asset, how much time do they have to support you and what do you need each of them to do?
  • Plan what each person needs to be involved in and how
  • Plan how you distribute information in various channels to the stakeholders – a matrix or table is a nice way of doing this in a condensed format. Think face-to-face meetings, town-halls, e-mails, intranet/web spaces, social media, phone calls, whatever channel you are planning to use. Keep in mind that effective communication works best in the channel the receiver prefers
  • Set up a schedule for how often you are going to communicate with each stakeholder – and make sure you don’t make it a “set and forget”
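The matrix and schedule from the steps above do not need special tooling. Here is a minimal sketch in Python of such a plan, where the stakeholder names, channels and intervals are hypothetical examples and the `overdue` helper is just one assumed way of avoiding “set and forget”.

```python
from dataclasses import dataclass


@dataclass
class PlanEntry:
    stakeholder: str
    role: str           # boss, supplier, workhorse or consumer
    channel: str        # the channel this stakeholder prefers
    interval_days: int  # planned maximum number of days between contacts


# Hypothetical plan: one row per stakeholder
plan = [
    PlanEntry("Operations director", "boss", "face-to-face meeting", 14),
    PlanEntry("Valve vendor", "supplier", "phone call", 30),
    PlanEntry("Process engineers", "workhorse", "town hall", 7),
    PlanEntry("Key customer", "consumer", "e-mail", 30),
]


def overdue(plan, days_since_contact):
    """Return stakeholders whose last contact is older than planned.

    Stakeholders missing from days_since_contact have never been
    contacted and are treated as overdue.
    """
    return [
        entry.stakeholder
        for entry in plan
        if days_since_contact.get(entry.stakeholder, entry.interval_days + 1)
        > entry.interval_days
    ]


days_since_contact = {
    "Operations director": 10,
    "Valve vendor": 45,
    "Process engineers": 7,
}
print(overdue(plan, days_since_contact))
```

A spreadsheet does the same job; the point is only that each stakeholder has a channel, an interval, and someone checking that the interval is kept.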

Step 2: Value relationships as much as results

Risk management is people management. Often risk managers are quite technocratic by nature and prefer to focus on results and technical matters. This is of course necessary, but you also need to value the relationships you have with the people you are communicating with. This means spending time with people, thanking them for their contributions and actively listening to what they have to say – even if it is not related to your risk management activities.

Step 3: Don’t give pole position to compliance

Compliance is important but far too often risk management is reduced to a checklist exercise of controls. This mindset is detrimental to good communication and can contribute to increased risk. The most important risk in risk assessments is overlooking the obvious – and the reason people do this is because they are not engaged in the process. Don’t forget compliance but use it as a driver for continuous improvement instead of being the focus in every activity.

Step 4: 30,000-foot view

At regular intervals, you should take a step back and reflect. Ask yourself open-ended questions and try to find answers based on your experience with the various stakeholders in the project.

  • What did Mr. X contribute?
  • What did Ms. Y not tell me and why not?
  • Do I have what I need?
  • Who is satisfied with their involvement and who is dissatisfied? Why?
  • What do I need to change to get what I need?
  • What do I need to change to make sure every stakeholder feels valued?

If you follow these steps, things may still go wrong. The chances are, however, that you will get much more useful involvement, much more engagement from the people you need to deal with, than if you go about communication in an unplanned ad-hoc way.

Automation and identity loss

Everybody automates. And everything can be automated. We are giving up human contact to achieve higher efficiency. Companies will need fewer workers, and most senior managers who are trained to view this through the “shareholder value” lens see this as a great development that reduces the cost base of their companies. As an example, Rune Bjerke, CEO of the biggest Norwegian bank DNB, recently said he is convinced that in 5 years the staff of the bank will be halved.

For the consumer this means that dealing with the bank is a personalized experience based on collected data and machine learning instead of human interaction. This may be efficient but it leaves less room for flexibility and for a more meaningful and real customer relationship. 

If we push hard on automating everything, the jobs humans do today will be mostly unnecessary. Interactions with firms will be by proxy through computers. We need to start thinking about the path we are taking. Today there is a vacuum in the area of regulations, and generally in the thinking around how people can find purpose in life when jobs are few and our identities can no longer be interchangeable with our professional titles. 

Teaching process safety in 2017

For the last 4 years I’ve given guest lectures in process safety at the Norwegian University of Science and Technology for undergrad chemical engineering students – and I’ve promised to do this again this year – this is my annual pro bono event :).

I used to work as a consultant with Lloyd’s Register, and previously I’ve used slides based on their internal course in process safety, which I also used to teach. Now I have a new job at a different firm in a different sector (information security in a devops environment – in other words something completely different and not related to process safety or chemical engineering).


Obviously, I need to create some new content for this year’s lectures. I’m looking forward to it, as this is a great opportunity to brush up also on the form of delivery. So, the plan so far is:

  • Basic principles (no single point of failure, risk-based design thinking, observable risks, usability)
  • Process accident examples (the fire from ice example from CSB is still great, but perhaps I can find something new to add)
  • Key safety standards, and some examples on how to use them
    • ISO 10418 / API RP 14C / NORSOK P-002 (process design and safety)
    • IEC 61511 (safety instrumented systems and safety integrity levels)
    • IEC 62443-3-3 (New! Cybersec in process systems, I think this one’s going to be increasingly relevant)
  • The mother of all accidents: overpressure
    • Blowdown systems
    • How to simulate blowdown in a simple process segment
    • Pressure equalization in compressor trains
  • New threats to process plants
    • Cyber attacks
    • Practices to make your plant less vulnerable

What more do you think undergrad chemical engineering students need to learn about safety in design?

IEC 61511 Security Requirements

IEC 61511-1 Ed. 2 is now out, and as I’ve mentioned previously on this blog, with new requirements for cybersecurity analysis for your safety instrumented systems. The new requirement makes it mandatory to perform a security risk and vulnerability assessment for your safety instrumented systems. It specifically requires you to identify threats, to assess impact and likelihood (or credibility), and to plan your mitigation strategy and response to identified threats. The standard allows you to use an overall cybersecurity assessment for your entire control system, provided you cover all relevant threats for the SIS.

You do not want your SIS to show you the blue screen of death – have you looked into the security integrity of your system?

It is important to tailor the approach to the setting and the need in the network environment where the SIS is operated. It is possible to go into a vulnerability study at great detail, developing attack trees for low-level attack scenarios. From a SIS design point of view this is not very useful – and a more conceptual level assessment based on network topologies, security policies and the risk context of the plant is more appropriate.

At LR I’ve had the pleasure of adapting a more in-depth cybersecurity assessment method to the SIS environment together with some of my great colleagues, and we are looking forward to serving our customers with this as a part of functional safety management.

If you want to be contacted about IEC 61511 security requirements and how to integrate security into your functional safety management, please fill out the contact form below.

Telecommuting and the environment

Modern cities are trying to become greener. One of the major contributors to pollution from cities is transport.

Toll roads – the only way to reduce congestion in cities?

In Norway, there are three main strategies to change people’s behaviors in this respect:

  1. Toll roads into and out of all cities to make it more expensive to drive your own car, in addition to high environmental taxes on buying new cars and on fossil fuel (gasoline costs about NOK 15 per liter, which is equivalent to about US $7/gal).
  2. Subsidized public transport (buses, trains, etc.). Buses in urban areas are modern, run on biofuels, run about every 10 minutes in dense areas, and a bus card valid for one month costs about NOK 700.
  3. Electric cars are not subject to heavy taxation, and they get a free pass on the toll roads. A Model S from Tesla thus costs about the same as a much smaller fossil fuel car from non-premium brands.

In spite of these efforts, traffic is increasing, primarily because a lot of people are buying electric cars. The problem is that electric cars still create congestion, and parking is hard to find. So what do politicians suggest to fix the problem? Primarily two things in Trondheim where I live: make parking even more scarce, and increase toll road prices by at least 50%. I suppose that may work, but is it the only option?

Norway has a skilled workforce. A lot of people work in offices with computers, and there is really limited need to actually be in that office to get your stuff done. I think telecommuting could help reduce congestion and help the environment in one go. This, however, requires a lot of companies to change the way they approach collaboration, management, and use of technology. A social network cannot fully replace coffee machine chit-chat, but it helps (use Yammer, closed Facebook groups, etc.). Use productivity software that allows online collaboration – most office programs allow this today (Microsoft Office and Google Apps perhaps being the biggest players). Managers should also empower their employees to take more decisions than they do in practice today – in many companies the hierarchy is still king – and that does not play well with a semi-virtual workforce.

So – what would happen if 50% of commuters worked from home 2 days per week? If we take 1000 commuters as the basis for our argument, we know that about 50% drive a car; the rest use a bicycle, walk or use public transport. That is about 500 cars per day. Further, we can assume that people will choose random days to work from home, so about the same number of people will work from home every day. With a five-day work week, about 20% of commuters are then at home on any given day (50% × 2/5), and we are down to about 400 cars per day (on average). This may of course be a bit optimistic, but making people telecommute part of their work weeks can have a real impact on pollution and congestion. Therefore politicians should try to give incentives to both companies and individuals if they actually choose to implement such a policy:

  • Reduce the tax burden on employers (e.g. by introducing a pro forma deductible per telecommuter)
  • Make broadband connections tax deductible for people who choose to telecommute
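The commuter arithmetic is easy to check with a few lines of Python. The sketch below uses the numbers from the argument above (1000 commuters, half of them driving, half of all commuters working from home 2 days per week on random days) and adds two assumptions of its own: a five-day work week, and that drivers are just as likely to telecommute as everyone else.

```python
def cars_per_day(commuters, driver_share, telecommuter_share,
                 home_days, work_days=5):
    """Expected number of cars on an average working day."""
    drivers = commuters * driver_share
    # Chance that a randomly picked commuter works from home on a given day
    at_home = telecommuter_share * home_days / work_days
    return drivers * (1 - at_home)


baseline = cars_per_day(1000, 0.5, telecommuter_share=0.0, home_days=0)
with_telecommuting = cars_per_day(1000, 0.5, telecommuter_share=0.5, home_days=2)

print(f"Cars per day without telecommuting: {baseline:.0f}")
print(f"Cars per day with telecommuting:    {with_telecommuting:.0f}")
print(f"Reduction: {1 - with_telecommuting / baseline:.0%}")
```

Under these assumptions the expected number of cars drops from 500 to 400 per day, a 20% reduction. Getting down to 250 cars would require half of the drivers to stay home every day.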

So, today I choose to work from the home office.

Make sure security does not stop your people from getting stuff done!

Cybersecurity is on the list of many organizations’ top priorities nowadays. Obviously, protecting the confidentiality, integrity and availability of business data is a crucial part of any modern enterprise’s risk management activities. However, in many cases, security measures are making simple things difficult, and hard things even harder. When this happens, users tend to find workarounds, often involving using private cloud services, private devices, or connecting via sneakernets to do their business. If this is the case at your company – you should rethink your approach to security.

Feel locked out by your security policy? Are you prevented from doing your job by the IT department?

What can organizations do to maintain security while still allowing people to get their work done?

Security measures need to have a sound basis in the threats you are trying to avoid. This means you should have at least a basic grasp of what kind of threats you are dealing with, and which measures will be effective in dealing with them. Here’s a 6-step list for how you can achieve that.

  1. Perform a cyber security threat identification to list all threats and sort them as “unacceptable” and “acceptable” based on both impact and credibility
  2. Deal with the threats by designing counter-measures; this can be technology, awareness training and response capabilities
  3. Educate your users on the threats and why it is important to avoid letting adversaries in.
  4. The principle of least privilege is sound – but it should not be interpreted as “no access given unless proven beyond doubt that access is needed”. It means – access shall only be given if it is meaningful for that user to have access, and in cases where this increases the attack surface, ensure the user is educated to understand what that means.
  5. Do not overuse filtering techniques for content. That is the same as inviting sneakernets where you have no control.
  6. Never forget that technology is there to help people get stuff done, not in order to prevent them from doing anything. If a user needs to do something (e.g. to download and test software from the internet), work with the user to find safe ways to do this instead of being an obstacle.

The daily reboot

Awareness is important when it comes to cyber security, and this awareness is often lacking in the control system domain because we are so used to looking for all sorts of other causes of upsets in production, or accidents for that matter. I’m going to give a talk on industrial cyber security at a workshop offered by my employer (LR) on Tuesday. I figured I wanted to tell a short story to set the mood – here’s the outline.


The reboot story

Aldo Tomation is responsible for the control systems at the specialist material manufacturing firm Composite Reinforcement Inc. Aldo is passionate about both the finances of the company and the health and safety of his coworkers. Because of this, Aldo has focused strongly on production regularity and on making sure that all requirements of the European Machinery Directive and the machinery safety standard ISO 13849-1 have been met. Lately he has noticed a certain reduction in the production regularity, and the downtime always occurs at the same time of day. Every day, just after 4pm CET, the plant goes down, and then comes back up again shortly after. Aldo thinks that this is very strange, so he studies the machine logs. There he can see that the control system is rebooted just after 4pm every day, and that there are no control system log entries or data entries in the historian before the night shift comes on at 10pm. During the night shift everything works as it should.

Curious about this strange behavior, Mr. Tomation talks to the operators. They tell him that every day, just after 4pm, the HMI screens and controls are locked, but that they have found a workaround: they just reboot the system and unplug the network cable just after the reboot – then everything works flawlessly! Aldo is impressed with the creativity shown by the operators to regain control but still puzzled by this strange behavior. He considers calling customer support to complain about the quality of the control system.

Question: If you had been Mr. A. Tomation – would you have considered the possibility of a cyber attack as the reason behind this strange need for rebooting the control system?

Three weeks before this strange behavior appeared, the firm had signed an agreement with the Japanese navy to deliver reinforcement fibers for a modernization project they were running on their submarine fleet. This deal had been kept secret by both parties. Could it still have something to do with the attacks?

What kind of awareness do I hope to give birth to with this story?

  • If your system is behaving in a strange way – it is worth checking out
  • Businesses can be targets for attacks that are motivated by geo-politics
  • Running a whole shift without logs shouldn’t be considered normal – why did the operators not report this as a possible security incident?
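The missing logs in the story are the kind of anomaly that is cheap to look for automatically. Below is a minimal sketch in Python that flags silent periods in a list of log or historian timestamps. The 30-minute threshold, the date and the 10-minute logging interval are assumptions made for the example, and the timestamps are generated here rather than read from any real historian.

```python
from datetime import datetime, timedelta


def find_gaps(timestamps, max_gap=timedelta(minutes=30)):
    """Return (start, end) pairs where consecutive entries are further apart than max_gap."""
    ordered = sorted(timestamps)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > max_gap]


# Hypothetical historian data: one entry every 10 minutes,
# nothing between the 4pm reboot and the night shift at 10pm
day = datetime(2016, 1, 12)  # arbitrary example date
entries = [day + timedelta(minutes=m) for m in range(0, 16 * 60 + 1, 10)]
entries += [day + timedelta(hours=22, minutes=m) for m in range(0, 120, 10)]

for start, end in find_gaps(entries):
    print(f"No entries between {start:%H:%M} and {end:%H:%M}")
```

A check like this does not say why the logs are missing, only that someone should ask. In the story, that question would have been raised on the first day instead of being left to Aldo’s curiosity.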

I’d love to hear your comments on whether you think this sort of story can be effective in awareness work. I’m going to test it with some clients, and then decide if I think it works.

 

 

LEAN thinking in functional safety

Whenever something works beautifully together, like pieces in a complex piece of machinery, it creates satisfaction for everyone involved. When the system consists of people working together on a complex project, we experience this type of satisfaction when there is a situation of flow in the project. Information is treated when it is received and flows without barriers to the next person who needs to perform a task or make a decision based on this information. Unfortunately, this is far from reality in most functional safety projects. These projects are typically complex, with a variety of stakeholders and various agendas and levels of competence.

Good project planning and execution across interfaces is necessary to achieve a lean functional safety organization throughout the lifecycle of a safety instrumented system. The opportunities for quality improvements and the banishing of waste are plentiful, and many of these opportunities are low-hanging fruit.

This field could benefit greatly from lean thinking and a culture geared towards achieving flow. This, however, requires better functional safety planning, more openness between stakeholders, and a clear picture of how functional safety activities fit with the bigger picture. Lean is all about banishing waste, and there is a lot of waste in functional safety projects. Typical types of waste encountered on most projects include:

  • People waiting for input to perform the next activity. A lot of this waiting is unnecessary and due to bad planning, follow-up or lack of understanding of follow-on effects of missing deadlines
  • Unnecessary work performed – a lot of documentation is created and never used. This is related to competence levels with stakeholders and “wrong” or “ultraconservative” interpretations of standards and regulations.
  • Re-work: work done several times due to lack of information, wrong people involved, inefficient review processes, bad quality, etc.

Optimizing the whole work process requires the whole value chain to be involved, so that inefficiencies can be rooted out across organizational interfaces. By systematically removing “waste” in the functional safety value chain, I would expect that better quality and lower costs could be obtained at once. And better quality in this respect means fewer fatalities, less pollution and better uptime.