OT and Cloud @Sikkerhetsfestivalen 2025

This week we celebrated “Sikkerhetsfestivalen”, or the “Security Festival”, in Norway. It is a big conference that tries to combine the feel of a music festival with cybersecurity talks and demos. I gave a talk on how the attack surface expands when we connect OT systems to the cloud, and on how OT engineers, cloud developers, and other stakeholders should collaborate to do it anyway if we want the benefits of cloud computing and better data access for our plants.

OT and cloud was a popular topic, by the way; several talks centered on it:

  • One of the trends mentioned by Samuel Linares (Accenture) in his talk “OT: From the Field to the Boardroom: What a journey!” was the integration of cloud services in OT, and the increased demand for plant data for use in other systems (for example AI-based analytics and decision support).
  • A presentation by Maria Bartnes (SINTEF) about a project SINTEF did for NVE on assessing the security and regulatory aspects of using cloud services in Class 1 and 2 control systems in the power and utility sector. The SINTEF report is open and can be read in its entirety here: https://publikasjoner.nve.no/eksternrapport/2025/eksternrapport2025_06.pdf. The key take-away from the report is that the regulatory barriers are more challenging than the technical challenge of implementing sufficiently secure solutions.

My take on this at the festival was from a more practical point of view.

(English text follows the Norwegian)

Sikkerhetsfestivalen 2025

Denne uka var jeg på Sikkerhetsfestivalen på Lillehammer sammen med 1400 andre cybersikkerhetsfolk. Det morsomme med denne konferansen, som tar mål av seg å være et faglig treffsted med festivalstemning, er at kvaliteten er høy, humøret er godt, og man får muligheten til å få faglig påfyll, treffe gamle kjente, og få noen nye faglige bekjentskaper på noen korte intense dager.

I år deltok jeg med et foredrag om OT og skytjenester. Stadig flere som leverer kontrollsystemer, tilbyr nå skyløsninger som integrerer med mer tradisjonelle kontrollsystemer. Det har lenge vært et tema med kulturforskjellen mellom IT-avdelingen og automasjon, men når man skal få OT-miljøer til å snakke med de som jobber med applikasjonsutvikling i skyen, får vi virkelig strekk i laget. På OT-siden ønsker vi full kontroll over endringer, og må sikre systemer som ofte har svake sikkerhetsegenskaper, men hvor angrep kan gi veldig alvorlige konsekvenser, inkludert alvorlige ulykker som kan føre til skader og dødsfall, eller store miljøutslipp. På den andre siden finner vi en kultur hvor endring er det normale, smidig utvikling står i høysetet, og man har et helt annet utvalg i verktøyskrinet for å sikre tjenestene. Skal vi få til å samarbeide her, må begge parter virkelig ville det og gjøre en innsats!

For å illustrere denne utfordringen tok jeg med meg et lite demoprosjekt, som illustrerer utviklingen fra en verden der OT-systemer opprinnelig var skjermet fra omgivelsene, til nåtiden hvor vi integrerer direkte mot skyløsninger og kan styre fysisk utstyr via smarttelefonen. Selve utformingen av denne demoen ble til som et slags hobbyprosjekt sammen med de yngste i familien i sommer: Building a boom barrier for a security conference – safecontrols

La oss se på hvordan vi legger til stadig flere tilkoblingsmuligheter her, og hva det har å si for angrepsflaten. 

1992: Angrepsflaten er kun lokal. Det er kun en veibom med en enkel knapp for å åpne og lukke. Denne betjenes av en vakt på stedet. 

2001: Vakt-PC i vaktbua, med serielltilkobling til styringsenheten på bommen. Angrepsflata er fortsatt lokal, men inkluderer nå vakt-PC. Denne er ikke koblet til eksterne nett, men det er for eksempel mulig å spleise seg inn på kabelen og sende kommandoer fra sin egen enhet til bommen om man har tilgang til den. 

2009: Systemet er tilkoblet bedriftens nettverk via OT-brannmur. Dette muliggjør fjerntilgang for å hente ut logger fra kontoret, uten behov for å reise ut til hver lokasjon. Angrepsflaten er nå utvidet; med tilgang i bedriftsnettet er det mulig å komme seg inn i OT-nettet og fjernstyre bommen.

2023: Aktiviteten på natt er redusert, og analyser av åpningstidspunkter viser at det ikke lenger er behov for nattevakt.

2025: Skybasert styringssystem “Skybom” implementeres, og man trenger ikke lenger nattevakt. De få lastebilene som kommer på natta, får tilgang; sjåførene kan selv åpne bommen via en kode på smarttelefonen. Nå er angrepsflaten ytterligere utvidet: en angriper kan gå på programvaren som styrer systemet, sende phishing til lastebilsjåfører, bruke mobil skadevare på sjåførenes mobiler, eller angripe selve skyinfrastrukturen. Det er også en ny hardwaregateway i OT-nettet som kan ha utnyttbare sårbarheter.

I det opprinnelige systemet var de digitale sikkerhetsbehovene moderate, på grunn av veldig lav eksponering. Etter hvert som man har økt antallet tilkoblinger og integrasjoner, øker angrepsflaten, og det gjør også at sikkerhetskravene bør bli strengere. I moderne OT-nett vil man typisk bruke risikovurderinger for å sette et sikkerhetsnivå, med tilhørende krav. Den mest vanlige standarden er IEC 62443. Her skal man bryte ned OT-nettet i soner og kommunikasjonskanaler, utføre en risikovurdering av disse og sette et sikkerhetsnivå fra 1-4, hvor 1 er grunnleggende, og 4 er strenge sikkerhetskrav. Her er det kanskje naturlig å dele nettverket inn i 3 sikkerhetssoner: skysonen, nettverkssonen, og kontrollsonen. 

Det finnes mange måter å vurdere risiko for et nettverk på. En svært enkel tilnærming vi kan bruke er å spørre oss 3 spørsmål om hver sone: 

  1. Er målet attraktivt for angriperen (juicy)?
  2. Er målet lett å kompromittere (lav sikkerhetsmessig modenhet)?
  3. Er målet eksponert (feks tilgjengelig på internett)?

Jo flere “ja”, jo høyere sikkerhetskrav. Her ender vi kanskje med SL-3 for skysonen, SL-2 for lokalt nettverk, og SL-1 for kontrollsonen. Da vil vi få strenge sikkerhetskrav for skysonen, mens vi har mer moderate krav for kontrollsonen, som i større grad også lar seg oppfylle med enklere sikkerhetsmekanismer. 

I foredraget viste jeg et eksempel vi dessverre har sett i mange slike reelle systemer: de nye skybaserte systemene er laget uten særlig tanke for sikkerhet. Det hender seg også (kanskje oftere) at systemene har gode sikkerhetsfunksjoner som må konfigureres av brukeren, men hvor dette ikke skjer. I vårt eksempel har vi en delt pinkode til bommen som alle lastebilsjåførene bruker, et system direkte eksponert på internett og ingen reell herding av systemene. Det er heller ingen overvåkning og respons. Dette gjør for eksempel at enkle brute-force-angrep er lette å gjennomføre, noe vi demonstrerte live.

Til slutt så vi på hvordan vi kunne ha sikret systemet bedre med tettere samarbeid. Ved å inkludere skysystemet i en “skysone” og bruke SL 3 som grunnlag, ville vi for eksempel ha definert krav til individuelle brukerkontoer, ratebegrensning på innloggingsforsøk og bruk av tofaktorautentisering, og hatt overvåkning og responsevne på plass. Dette ville i stor grad ha redusert utfordringene med økt eksponering, og gjort det vanskeligere for en ekstern trusselaktør å lykkes med å åpne bommen via et enkelt angrep.

Vi diskuterte også hvordan vi kan bruke sikkerhetsfunksjonalitet i skyen til å bedre den totale sikkerhetstilstanden i systemet. Her kan vi for eksempel sende logger fra OT-miljøet til sky for å bruke bedre analyseplattformer, vi kan automatisere en del responser på indikatorer om økt trusselnivå før noen faktisk klarer å bryte seg inn og stjele farlige stoffer fra et lager eller liknende. Skal vi få på plass disse gode sikkerhetsgevinstene fra et skyprosjekt i OT, må vi ha tett samarbeid mellom eieren av OT-systemet, leverandørene det er snakk om, og utviklingsmiljøet. Vi må bygge tillit mellom miljøene gjennom åpenhet, og sørge for at OT-systemets behov for forutsigbarhet ikke overses, men samtidig ikke avvise gevinstene vi kan få fra bedre bruk av data og integrasjoner.

OT and Cloud Talk – in English!

This week I was at the Security Festival in Lillehammer along with 1400 other cybersecurity professionals. The great thing about this conference, which aims to be a professional meeting place with a festival atmosphere, is that the quality is high, the mood is good, and you get the opportunity to gain new technical knowledge, meet old friends, and make new professional acquaintances over a few short, intense days.

This year I participated with a presentation on OT and cloud services. More and more companies that deliver control systems are now offering cloud solutions that integrate with more traditional control systems. The cultural difference between the IT department and the automation side has long been a topic of discussion, but when you have to get OT environments to talk to those who work with application development in the cloud, the gap in the team really widens. On the OT side, we want full control over changes and have to secure systems that often have weak security features, but where attacks can have very serious consequences, including severe accidents that can lead to injury and death, or major environmental spills. On the other hand, we find a culture where change is the norm, agile development is paramount, and there is a completely different set of tools available for securing services. If we are to collaborate here, both parties must truly want to and make an effort!

To illustrate this challenge, I brought a small demo project with me, which illustrates the development from a world where OT systems were originally isolated from their surroundings, to the present day where we integrate directly with cloud solutions and can control physical equipment via a smartphone. The design of this demo came about as a kind of hobby project with the youngest members of my family this summer: Building a boom barrier for a security conference – safecontrols.

Let’s look at how we are constantly adding more connectivity options here, and what that means for the attack surface.

1992: The attack surface is local only. It is just a boom barrier with a simple button to open and close. This is operated by an on-site guard.

2001: The guard has a PC in the guardhouse, with a serial connection to the control unit on the barrier. The attack surface is still local, but now includes the guard’s PC. This is not connected to external networks, but it is, for example, possible to splice into the cable and send commands from your own device to the barrier if you have access to it.

2009: The system is connected to the corporate network via an OT firewall. This enables remote access to retrieve logs from the office, without the need to travel to each location. The attack surface has now expanded; with access to the corporate network, it is now possible to get into the OT network and remotely control the barrier.

2023: Activity at night is reduced, and analyses of opening times show that there is no longer a need for a night watchman.

2025: A cloud-based control system “Skybom” is implemented, and a night watchman is no longer needed. The few trucks that arrive at night are granted access, and the drivers can open the barrier themselves via a code on their smartphone. Now the attack surface is further expanded; an attacker can go after the software that controls the system, send phishing emails to truck drivers, use mobile malware on the drivers’ phones, or attack the cloud infrastructure itself. There is also a new hardware gateway in the OT network that may have exploitable vulnerabilities.

In the original system, the digital security needs were moderate due to very low exposure. As the number of connections and integrations has increased, the attack surface also grows, which means that security requirements should become stricter. In modern OT networks, risk assessments are typically used to set a security level, with associated requirements. The most common standard is IEC 62443. Here, you should break down the OT network into zones and conduits, perform a risk assessment of these, and set a security level from 1-4, where 1 is basic and 4 is strict security requirements. Here, it is perhaps natural to divide the network into 3 security zones: the cloud zone, the network zone, and the control zone.

There are many ways to assess network risk. A very simple approach we can use is to ask ourselves 3 questions about each zone:

  • Is the target attractive to the attacker (juicy)?
  • Is the target easy to compromise (low security maturity)?
  • Is the target exposed (e.g., accessible on the internet)?

The more “yeses,” the higher the security requirements. Here we might end up with SL-3 for the cloud zone, SL-2 for the local network, and SL-1 for the control zone. This would give us strict security requirements for the cloud zone, while we have more moderate requirements for the control zone, which can also be fulfilled to a greater extent with simpler security mechanisms.
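
To make the scoring concrete, here is a minimal sketch of how the three questions could be turned into a suggested security level per zone. This is an illustration of the simplified approach above, not the risk assessment method defined in IEC 62443, and the answers per zone are just illustrative values from the boom barrier scenario.

```python
# Minimal sketch: map the three yes/no questions per zone to a suggested
# security level. Illustrative only - not the IEC 62443 risk assessment method.

def suggest_security_level(attractive: bool, easy_to_compromise: bool, exposed: bool) -> int:
    """The more 'yes' answers, the higher the suggested SL (floor at SL 1)."""
    return max(1, sum([attractive, easy_to_compromise, exposed]))

# Example answers for the boom barrier scenario (illustrative assumptions)
zones = {
    "cloud zone":   {"attractive": True, "easy_to_compromise": True,  "exposed": True},
    "network zone": {"attractive": True, "easy_to_compromise": True,  "exposed": False},
    "control zone": {"attractive": True, "easy_to_compromise": False, "exposed": False},
}

for name, answers in zones.items():
    print(f"{name}: SL-{suggest_security_level(**answers)}")
# cloud zone: SL-3, network zone: SL-2, control zone: SL-1
```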

In the presentation, I showed an example that we have unfortunately seen in many real systems like this: the new cloud-based systems are created with little thought for security. It also happens (perhaps more often) that the systems have good security features that must be configured by the user, but this does not happen. In our example, we have a shared PIN code for the barrier that all truck drivers use, a system directly exposed to the internet, and no real hardening of the systems. There is also no monitoring or response capability. This makes simple brute-force attacks easy to carry out, something we demonstrated live.
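
To show why the shared PIN is such a weak point, here is a minimal sketch of the kind of brute-force logic the demo relied on. The barrier API call is replaced by a stand-in function, and the PIN value, endpoint, and request rate are assumptions for illustration only.

```python
# Minimal sketch of brute forcing a shared 4-digit PIN when there is no
# rate limiting, lockout or alerting. The API call is a hypothetical stand-in.

def try_pin(pin: str) -> bool:
    # In a real demo this would be an HTTP request to the barrier's cloud API,
    # e.g. requests.post(BARRIER_URL, json={"pin": pin}) - hypothetical endpoint.
    return pin == "4932"  # simulate the shared secret

attempts = 0
for candidate in (f"{i:04d}" for i in range(10_000)):
    attempts += 1
    if try_pin(candidate):
        print(f"PIN found after {attempts} attempts: {candidate}")
        break

# Even at a modest 10 requests per second, the full 4-digit space takes
# under 17 minutes to exhaust - and with no monitoring, nobody notices.
print(f"Worst case at 10 req/s: {10_000 / 10 / 60:.1f} minutes")
```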

Finally, we looked at how we could better secure the system with closer collaboration. By including the cloud system in a “cloud zone” and using SL 3 as a basis, we would, for example, define requirements for individual user accounts, rate limiting on login attempts, the use of two-factor authentication, and have monitoring and response in place. This would largely reduce the challenges with increased exposure and make it more difficult for an external threat actor to succeed in opening the barrier via a simple attack.
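
As a concrete example of one of these SL 3-inspired requirements, here is a minimal sketch of rate limiting failed open-barrier attempts per user account. The thresholds and window are illustrative choices, not values taken from IEC 62443.

```python
# Minimal sketch: lock an account after too many failed attempts in a window.
# Thresholds are illustrative; a real system would also alert the SOC.
import time
from collections import defaultdict, deque

WINDOW_SECONDS = 300   # look at the last 5 minutes
MAX_FAILURES = 5       # lock out after 5 failed attempts in the window

_failures = defaultdict(deque)

def allow_attempt(user_id, now=None):
    """Return False if the user has too many recent failures (locked out)."""
    now = now if now is not None else time.time()
    recent = _failures[user_id]
    while recent and now - recent[0] > WINDOW_SECONDS:
        recent.popleft()            # drop failures outside the window
    return len(recent) < MAX_FAILURES

def register_failure(user_id, now=None):
    _failures[user_id].append(now if now is not None else time.time())

# Usage: call allow_attempt() before validating the PIN/2FA code, and
# register_failure() on every failed attempt. Combined with individual
# accounts and 2FA, this makes the brute-force attack above impractical.
```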

We also discussed how we can use security functionality in the cloud to improve the overall security posture of the system. For example, we can send logs from the OT environment to the cloud to use better analysis platforms, and we can automate some responses to indicators of an increased threat level before someone actually manages to break in and steal dangerous substances from a warehouse or similar. To realize these security benefits from a cloud project in OT, we must have close collaboration between the owner of the OT system, the relevant suppliers, and the development teams. We must build trust between the teams through openness and ensure that the OT system’s need for predictability is not overlooked, while at the same time not rejecting the benefits we can get from better use of data and integrations.

Connecting OT to Cloud: Key Questions for Practitioners

When we first started connecting OT systems to the cloud, it was typically to get access to data for analytics. That is still the primary use case, with most vendors offering some SaaS integration to help with analytics and planning. The cloud side of this is now more flexible than before, with more integrations, more capabilities, more AI, even starting to push commands back into the OT world from the cloud – something we will only see more of in the future. The downside of that, as seen from the asset owner’s point of view, is that the critical OT system, with its legacy security model and old components, is now connected to a hyperfluid black box making decisions for the physical world on the factory floor. There are a lot of benefits to be had, but also a lot of things that could go wrong.

How can OT practitioners learn to love the cloud? Let’s consider 3 key questions to ask when assessing the SaaS world from an OT perspective!

The first thing we have to do is accept that we’re not going to know everything. The second thing we have to do is ask ourselves, ‘What is it we need to know to make a decision?’… Let’s figure out what that is, and go get it.

Leo McGarry – character in “The West Wing”

The reason we connect our industrial control systems to the cloud is that we want to optimize. We want to stream data into flexible compute resources, to be used by skilled analysts to make better decisions. We are slowly moving towards allowing the cloud to make decisions that feed back into the OT system, making changes in the real world. From the C-suite, doing this is a no-brainer. How these decisions challenge the technology and the people working on the factory floors can be hard to see from the bird’s-eye view, where the discussion is about competitive advantage and efficiency gains instead of lube oil pressure or supporting a control panel still running on Windows XP.

The OT world is stable, robust, and traditional, whereas the cloud world is responsive, in constant flux, and adaptable. When the people managing stability meet the people managing flux, discussions can be difficult, like the disciples of Heraclitus debating the followers of Parmenides in ancient Greek philosophy.

Question 1: How can I keep track of changes in the cloud service?

Several OT practitioners have mentioned an unfamiliar challenge: the SaaS in the cloud changes without the knowledge of the OT engineers. They are used to strict management-of-change procedures, while the cloud is managed as a modern IT project with changes happening continuously. This is like putting Parmenides up against Heraclitus; we will need dialog to make this work.

Trying to convince the vendor to move away from modern software development practices with CI/CD pipelines and frequent changes, towards a more formal process with requirements, risk assessment spreadsheets, and change acceptance boards, is not likely to be a successful approach, although it may seem like the most natural response to a new “black box” in the OT network for many engineers. At the same time, expecting OT practitioners to embrace a “move fast and break things, then fix them” mentality is, fortunately, also not going to work. A workable middle ground requires a few things from both sides:

  • SaaS vendors should be transparent with OT customers about which services are used and how they are secured, as well as how they can affect the OT network. This overview should preferably be available to the asset owner dynamically, not as a static report; a sketch of what such a dynamic inventory could look like follows this list.
  • Asset owners should remain in control of which features will be used.
  • A sufficient level of observability should be provided across the OT/cloud interface, to allow a joint situational understanding of the attack surface, cyber risk, and incident management.
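
Below is a minimal sketch of a machine-readable service inventory a SaaS vendor might expose to the asset owner, together with a simple diff so that changes since the last review stand out. The field names, services, and values are hypothetical examples, not any vendor's actual schema.

```python
# Minimal sketch of a vendor-exposed service inventory and a change diff.
# Field names, services and values are hypothetical examples.
previous_inventory = {
    "barrier-api":      {"exposure": "internet", "auth": "oauth2+mfa", "version": "2.3.0"},
    "analytics-engine": {"exposure": "internal", "auth": "service-account", "version": "1.9.0"},
}

current_inventory = {
    "barrier-api":      {"exposure": "internet", "auth": "oauth2+mfa", "version": "2.4.1"},
    "analytics-engine": {"exposure": "internal", "auth": "service-account", "version": "1.9.0"},
    "ot-gateway-mgmt":  {"exposure": "vpn",      "auth": "oauth2+mfa", "version": "0.8.3"},
}

def diff_inventory(previous, current):
    """List what changed since the asset owner's last review."""
    changes = [f"new service: {name}" for name in current.keys() - previous.keys()]
    changes += [f"removed service: {name}" for name in previous.keys() - current.keys()]
    for name in current.keys() & previous.keys():
        for field, value in current[name].items():
            if previous[name].get(field) != value:
                changes.append(f"{name}: {field} changed to {value}")
    return changes

print(diff_inventory(previous_inventory, current_inventory))
# ['new service: ot-gateway-mgmt', 'barrier-api: version changed to 2.4.1']
```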

Question 2: Is the security posture of the cloud environment aligned with my OT security needs?

A key worry among asset owners is the security of the cloud solution, which is understandable given the number of data breaches we can read about in the news. Some newer OT/cloud integrations also challenge the traditional network-based security model with a push/pull DMZ for all data exchange. Newer systems sometimes include direct streaming to the cloud over the internet, point-to-point VPNs, and other alternative data flows. Say you have a crane operating in a factory, and this crane has been given a certain security level (SL2) with corresponding security requirements. The basis for this assessment has been that the crane is well protected by a DMZ and double firewalls. Now an upgrade of the crane introduces a new remote access feature and a direct cloud integration via a 5G gateway delivered by the vendor. This has many benefits, but challenges the traditional security model. The gateway itself is certified and well hardened, but the new system allows traffic from the cloud into the crane network, including remote management of the crane controllers. On the surface, the security of the SaaS seems fine, but the OT engineer finds it hard to trust the vendor here.

One way the vendor can help create the necessary trust here is to allow the asset owner to see the overall security posture generated by automated tools, for example a CSPM solution. This information can be hard for the customer to interpret, so a selection of data and contextual explanations will be needed. An AI agent can assist with this, for example by mapping the infrastructure and security posture metrics to the services in use by the customer.
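
As a sketch of what this could look like in practice, the snippet below filters raw CSPM-style findings down to the resources that back the services the customer actually consumes, producing a short customer-facing summary that an AI agent could then explain in context. The finding structure, tags, and service names are hypothetical; real CSPM tools have their own schemas.

```python
# Minimal sketch: reduce raw posture findings to a customer-relevant view.
# Finding format, tags and service names are hypothetical examples.
findings = [
    {"resource": "vm-liftalytics-web", "severity": "high",
     "issue": "service account has owner role", "tags": {"service": "liftalytics"}},
    {"resource": "storage-archive-01", "severity": "low",
     "issue": "access logging disabled", "tags": {"service": "internal-billing"}},
]

customer_services = {"liftalytics", "remote-access"}   # what this customer uses

def customer_view(all_findings, services):
    """Keep only findings for resources backing the customer's services."""
    relevant = [f for f in all_findings if f["tags"].get("service") in services]
    return [f"[{f['severity'].upper()}] {f['resource']}: {f['issue']}" for f in relevant]

for line in customer_view(findings, customer_services):
    print(line)
# [HIGH] vm-liftalytics-web: service account has owner role
```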

Question 3: How can we change the OT security model to adapt to new cloud capabilities?

The OT security model has for a long time been built on network segmentation, but with very static resources and security needs. When we connect these assets into a cloud environment that is undergoing more rapid changes, it can challenge the local security needs in the OT network. Consider the following fictitious crane control system.

Crane with cloud integrations via 5G

In the crane example, the items in the blue box are likely to be quite static. The applications in the cloud are likely to see more rapid change, such as more integrations, AI assistants, and so on. A question that will have a large impact on the attack surface exposure of the on-prem crane system here is the separation between components in the cloud. Imagine that the web application “Liftalytics” is running on a VM with a service account that has too many privileges. Then an attacker who exploits a vulnerability to get a shell on this web application VM may be able to move laterally to other cloud resources, even with network segregation in place. These types of security issues are generally invisible to the asset owner and OT practitioners.

If we start the cloud integration without any lateral movement path between a remote access system used by support engineers, and the exposed web application, we may have an acceptable situation. But imagine now that a need appears that makes the vendor connect the web app and the remote access console, creating a lateral movement path in the cloud. This must be made visible, and then the OT owner should:

  1. Explicitly accept the change before it takes effect
  2. Be informed of the resulting change in security posture and attack surface, so that compensating measures can be taken in the on-prem environment

For example, if a new lateral movement path is created and this exposes the system to unacceptable risk, local changes can be made, such as disabling protocols at the server level, adding extra monitoring, and so on.
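
One way to make such a change visible is to treat the allowed connections in the cloud as a small graph and check whether an exposed component can now reach the component that talks to the OT network. The sketch below uses names from the fictitious crane example; it illustrates the idea, not an existing product feature.

```python
# Minimal sketch: detect a new lateral movement path from an internet-exposed
# web app to the OT gateway after a change. Names are illustrative only.
from collections import deque

def reachable(edges, start, target):
    """Breadth-first search over allowed connections between cloud resources."""
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        if node == target:
            return True
        for nxt in edges.get(node, set()) - seen:
            seen.add(nxt)
            queue.append(nxt)
    return False

before = {"liftalytics-web": {"analytics-db"},
          "remote-access":   {"ot-gateway"}}
after  = {"liftalytics-web": {"analytics-db", "remote-access"},  # new integration
          "remote-access":   {"ot-gateway"}}

for label, edges in (("before", before), ("after", after)):
    print(label, "- exposed web app can reach OT gateway:",
          reachable(edges, "liftalytics-web", "ot-gateway"))
# If this flips from False to True, the change should require explicit
# acceptance and trigger compensating measures on-prem, as described above.
```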

The tool we have at our disposal to make better security architectures is threat modeling. By using not only insights into the attack surface from automated cloud posture management tools, but also cloud security automation capabilities, together with required changes in protection, detection and isolation capabilities on-prem, we can build a living holistic security architecture that allows for change when needed.

Key points

Connecting OT systems to the cloud creates complexity, and sometimes it is hidden. We set up 3 questions to ask to start the dialog between the OT engineers managing the typically static OT environment and the cloud engineers managing the more fluid cloud environments.

  1. How can I keep track of changes in the cloud environment? – The vendor must expose service inventory and security posture dynamically to the consumer.
  2. Is the security posture of the cloud environment aligned with my security level requirements? – The vendor must expose security posture dynamically, including providing the required context to see what the on-prem OT impact can be. AI can help.
  3. How can we change the OT security model to adapt to new cloud capabilities? – We can leverage data across on-prem and cloud, combined with threat modeling, to find holistic security architectures.

Do you prefer a podcast instead? Here’s an AI generated one (with NotebookLM):


Doing cloud experiments and hosting this blog costs money – if you like it, a small contribution would be much appreciated: coff.ee/cyberdonkey

Secure Multihomed Devices in Control Networks: Managing Risks and Enhancing Resilience

In control networks, where ensuring constant communication and reliable operation is critical, devices are frequently configured to be multihomed. This means they possess connections to multiple separate networks. This approach is favored over traditional routing methods where traffic is passed between networks. The advantage lies in the redundancy and potential performance boost multihoming offers. If one connection malfunctions, the device can seamlessly switch to another, maintaining vital communication within the control network. Additionally, multihoming allows for the possibility of utilizing different networks for specific traffic types, potentially optimizing overall control network performance.

While multihoming offers redundancy and performance benefits in control networks, it introduces security risks if the connected networks are meant to be entirely separate. Here’s why:

  1. Bridging Separate Networks: A multihomed device acts like a bridge between the networks it’s connected to. If these networks should be isolated for security reasons (e.g., a control signal network and a configuration network), the multihomed device can unintentionally create a pathway for unauthorized access. A malicious actor on one network could potentially exploit vulnerabilities on the device to gain access to the otherwise isolated network.
  2. Policy Bypass: Firewalls and other security measures are typically implemented at network borders to control traffic flow. With a multihomed device, traffic can potentially bypass these security controls altogether. This is because the device itself can become a point of entry, allowing unauthorized traffic or data to flow between the networks, even if the network firewalls have proper rules in place.
  3. Increased Attack Surface: Each additional connection point represents a potential vulnerability. With a multihomed device, attackers have more opportunities to exploit weaknesses in the device’s security or configuration to infiltrate one or both networks.

Bypassing firewalls: an example

Consider a system with two networks, where traffic is routed through a firewall. Network B is considered critical for real-time operations and has primarily control system protocols such as Modbus. This network is not encrypted. Network A is primarily used for configuring systems and reprogramming controllers. Most of the traffic is encrypted. Remote access is accepted into Network A, but not Network B.

On the firewall, all traffic between A and B is blocked during normal operation. When a controller in network B needs to be updated, a temporary firewall rule to allow the traffic is added.

Computer 2 is multi-homed and can be used to bypass the firewall

Along comes the adversary, who manages to use remote access to compromise Computer 1 and take over a local administrator account. The attacker then moves laterally to Computer 2 using the Network A interface, managing to secure an SSH shell on Computer 2. From this shell, the attacker now has access to the control network over the second network interface, and executes a network scan from Computer 2 to identify the devices in Network B. Moving on from there, the attacker is able to manipulate devices and network traffic to cause physical disruption, and the plant shuts down.

What are the options?

Your options to reduce the risk from multihomed devices may be limited, but keeping it like the example above is definitely risky.

  • The ideal solution: Remove any multi-homed setups, and route all traffic through the firewall. This way you have full control of what traffic is allowed. It may not be possible if the added latency is too high, but that is a rare constraint.
  • The micro-segmented solution: Keep the network interfaces but add stateless firewalls on each network card to limit the traffic. Then the multi-homed device becomes its own network segment. Using this to implement a default deny policy will greatly improve the security of the solution.
  • Device hardening: This should be done for all the solutions, but can also be a solution in its own right. Keep the multi-homed behavior in place, but harden the device so that taking it over becomes really difficult. Disable all unused services, run all applications with minimal privileges, and use the host-based firewall to limit the traffic allowed (both ingress and egress). A minimal sketch of what such a default-deny policy could look like follows below.
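
To illustrate the default-deny idea from the micro-segmentation and hardening options above, here is a minimal sketch of the policy logic. In practice this would be expressed as host firewall rules (nftables, Windows Firewall, or similar); the interface names, peers, and ports are assumptions based on the example networks.

```python
# Minimal sketch of a default-deny policy on a multi-homed device: only the
# explicitly needed flows are allowed. Interfaces, peers and ports are examples.
from dataclasses import dataclass

@dataclass(frozen=True)
class Flow:
    interface: str   # which NIC ("net_a" or "net_b")
    direction: str   # "in" or "out"
    peer: str        # remote host or subnet label
    port: int
    proto: str

ALLOWED = {
    Flow("net_a", "in",  "jump-host",  22,  "tcp"),   # managed remote access only
    Flow("net_b", "out", "plc-subnet", 502, "tcp"),   # Modbus towards controllers
}

def permit(flow: Flow) -> bool:
    """Default deny: anything not on the allow list is dropped."""
    return flow in ALLOWED

# The attacker's scan and SSH pivot traffic towards Network B does not match
# any allowed flow and is dropped, breaking the bridge between the networks.
print(permit(Flow("net_b", "out", "plc-subnet", 22, "tcp")))   # False
```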