People tend to rely more on numbers than on other types of «proof» of goodness. Where those numbers come from seems to play less of a role. Of course, a number picked out of thin air is just as worthless as a Greek government bond – but why do we then seem to trust a promise, as long as someone has put a number on it? Many people have discussed this before, in many settings; one of my favorites is this blog post at the American Mathematical Society from 2012 by Jean Joseph.
This “everything is fine because the numbers say so” thinking is very much present in functional safety, where overfocus on probability calculations is common. I believe there are several reasons for this. First, engineers like quantitative measures – and there are good and sound methodologies for performing reliability calculations. We tend to trust what numbers say more than qualitative information that we perceive to be less accurate.
A safety integrity level (SIL) requirement consists of four types of requirements, and the practical implications of each depend on the integrity level sought. The four types of requirements (quantitative, semi-quantitative, software, and qualitative) are illustrated below.
The quantitative requirements are probability calculations. We tend to overfocus on these at the expense of the others. The quality of these calculations depends on the quality of the input data (failure rates) – and the quality of such data can be very hard to verify.
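To make the dependence on input data concrete, here is a minimal sketch in Python. It assumes the common simplified formula for a single (1oo1) component with periodic proof testing, PFDavg ≈ λDU · τ / 2, and the low-demand SIL bands from IEC 61508. The function names, the failure rates and the one-year test interval are illustrative assumptions, not data from any real device.

```python
def pfd_avg_1oo1(lambda_du_per_hour, test_interval_hours):
    """Simplified average PFD for one component (1oo1), proof tested periodically.

    Assumes a constant dangerous undetected failure rate and ignores repair
    time, diagnostics and common cause failures.
    """
    return lambda_du_per_hour * test_interval_hours / 2.0


def sil_band(pfd_avg):
    """Map an average PFD to the low-demand SIL band it falls in.

    Returns 0 when the result is too poor for any SIL; results better than
    the SIL 4 band are capped at 4.
    """
    if pfd_avg >= 1e-1:
        return 0
    if pfd_avg >= 1e-2:
        return 1
    if pfd_avg >= 1e-3:
        return 2
    if pfd_avg >= 1e-4:
        return 3
    return 4


TEST_INTERVAL_HOURS = 8760  # assumed: proof test once a year

# Hypothetical failure rates for the "same" component from different sources.
for lambda_du in (1e-7, 2e-7, 5e-7, 3e-6):
    pfd = pfd_avg_1oo1(lambda_du, TEST_INTERVAL_HOURS)
    print(f"lambda_DU = {lambda_du:.1e}/h -> PFDavg = {pfd:.2e} -> SIL {sil_band(pfd)}")
```

With these assumed inputs, a factor of five in the failure rate is enough to move the result from one SIL band to the next, which is well within the spread one can see between data sources. The calculation itself is trivial; the hard part is knowing whether the failure rate deserves our trust.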
Semi-quantitative requirements are in most cases expressed as the required redundancy (hardware fault tolerance, HFT) and the safe failure fraction (SFF). To build the necessary robustness into a safety function, redundancy is required so that a single failure does not lead to a dangerous failure of the safety function. The required redundancy depends on the SIL of the function, as well as on the fraction of failures that are either safe or dangerous but detected by diagnostics (the so-called safe failure fraction). In practice, we see somewhat less focus on this than on the probability calculations themselves (the probability of failure on demand, PFD).
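As a rough illustration of how the SFF and the hardware fault tolerance together limit the SIL that can be claimed, here is a small Python sketch. The SFF formula follows the IEC 61508 definition. The lookup table is my transcription of the architectural constraints for complex (Type B) elements in IEC 61508-2 and should be checked against the standard before being used for anything real. The function names and failure rates are made up for the example.

```python
def safe_failure_fraction(lambda_s, lambda_dd, lambda_du):
    """SFF = (safe + dangerous detected) / (all failures)."""
    total = lambda_s + lambda_dd + lambda_du
    if total <= 0:
        raise ValueError("total failure rate must be positive")
    return (lambda_s + lambda_dd) / total


# Assumed transcription of the Type B architectural constraints table.
# Each row: (upper SFF limit, max SIL for HFT 0, 1, 2); 0 means not allowed.
TYPE_B_MAX_SIL = [
    (0.60, (0, 1, 2)),          # SFF < 60 %
    (0.90, (1, 2, 3)),          # 60 % <= SFF < 90 %
    (0.99, (2, 3, 4)),          # 90 % <= SFF < 99 %
    (float("inf"), (3, 4, 4)),  # SFF >= 99 %
]


def max_sil_type_b(sff, hft):
    """Highest SIL the architecture supports, independent of the PFD."""
    if hft not in (0, 1, 2):
        raise ValueError("this sketch only covers HFT 0, 1 and 2")
    for upper_limit, max_sils in TYPE_B_MAX_SIL:
        if sff < upper_limit:
            return max_sils[hft]


# Hypothetical failure rates per hour for a single element.
sff = safe_failure_fraction(lambda_s=3e-7, lambda_dd=5e-7, lambda_du=2e-7)
for hft in (0, 1, 2):
    print(f"SFF = {sff:.0%}, HFT = {hft} -> max SIL {max_sil_type_b(sff, hft)}")
```

The point of the sketch is that this check is separate from the PFD calculation: a function can have a calculated PFD well inside the SIL 3 band and still be limited to SIL 1 by its architecture.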
Software requirements depend on the required SIL and the type of software development involved. Software competence among system users and system integrators is typically lower than their hardware competence. As a result, software requirement setting and compliance assessment are often delegated to the software vendor without much oversight from the integrator or user. In such cases this is a competence-based weakness in the lifecycle, and one that the numbers we calculate cannot capture.
Qualitative requirements cover how we work with the safety instrumented system (SIS) development process itself, including managing changes and ensuring that systematic errors are not introduced. An important part of this work, and of the requirements we need to meet, is ensuring that all activities are performed by personnel who are competent for their roles.
If we are going to trust the calculated probabilities, we need to trust that the right level of redundancy exists. We need to trust that software developers create their code in a way that makes bugs with potentially dangerous outcomes very unlikely. We need to trust that everybody involved in the SIS development has the right level of competence and experience, and that the organizations involved have systems in place to properly manage the development process and all its requirements. A simple probability estimate does not tell us much unless it is born in the context of a properly managed SIS development process.