Fault-Space Approximation using Call-Region Fault Injection

Due to shrinking transistor sizes and operating voltages, transient hardware faults caused by single event upsets (SEUs), also called soft errors, are becoming a growing challenge for safety-critical systems. SEUs can be caused by radiation, for example, and can be mitigated by fault tolerance mechanisms. Testing such mechanisms under realistic conditions (such as radiation cannons) is nearly impossible because radiation is generally non-deterministic.

Fault tolerance mechanisms are commonly tested by performing extensive fault injection (FI) experiments that try to mimic either the physical causes of SEUs or their effects, and then observing the system’s behaviour. The biggest advantage of such FI experiments is that their results are repeatable.

The space of possible fault injections has two dimensions, time and location: in principle, every bit can be injected in every cycle. When and where an FI is useful for testing the safety and robustness of a system is one of the main questions of FI. Evaluating all possible injections in this huge fault space is effectively impossible.
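To give a feeling for the size of this fault space, the following minimal sketch counts the single-bit injection points as the product of cycles and bits. The function name and the example numbers (16 registers of 64 bits, one million cycles) are assumptions chosen purely for illustration; they do not describe any particular system or benchmark.

```python
def fault_space_size(cycles: int, bits: int) -> int:
    """Number of single-bit injection points: one per (cycle, bit) pair."""
    return cycles * bits


# Hypothetical target: 16 registers of 64 bits each, run for one million cycles.
REGISTER_BITS = 16 * 64
CYCLES = 1_000_000

print(fault_space_size(CYCLES, REGISTER_BITS))  # 1024000000
```

Even this small, register-only example yields roughly a billion possible injections, which is why an approximation of the fault space is needed.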

While a function executes, faults can be injected into registers within that function's part of the fault space. Again there are many opportunities for FIs, and their number grows with the number of executed instructions. One possible approximation is to generalise register injection and partition the code into self-defined "regions". One kind of region, based on basic blocks, was evaluated in a previous thesis, "Fault-Space Approximation using Basic Blocks", and appears to be a good approximation. The next idea is to set new, wider borders for such a region.

The wider regions considered here are "call regions", which begin at the start of a function or at a call and end with a call or at the end of a function. The thesis should compare whether the different techniques (register, basic-block and call-region injection) lead to the same behaviour, and whether this kind of approximation is useful for reducing the number of required FIs.
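The region borders described above can be sketched as follows. This is only an illustration under simplifying assumptions: the function is given as a straight-line list of instruction mnemonics, "call" and "ret" are the only border instructions, and the border instruction itself is counted as the last instruction of its region. The name call_regions is hypothetical and not part of FAIL*.

```python
from typing import List, Tuple


def call_regions(mnemonics: List[str]) -> List[Tuple[int, int]]:
    """Split one function's instructions into call regions.

    Returns inclusive (start, end) index pairs. A region is closed by a
    "call" or by the "ret" that ends the function (assumed convention).
    """
    regions = []
    start = 0
    for i, mnemonic in enumerate(mnemonics):
        if mnemonic in ("call", "ret"):
            regions.append((start, i))
            start = i + 1
    # Trailing instructions without a closing call/ret form a final region.
    if start < len(mnemonics):
        regions.append((start, len(mnemonics) - 1))
    return regions


# Hypothetical function body with two calls.
body = ["push", "mov", "call", "add", "mov", "call", "pop", "ret"]
print(call_regions(body))  # [(0, 2), (3, 5), (6, 7)]
```

In this example, eight instructions collapse into three regions. Whether injecting once per region is representative of injecting at every instruction is exactly the question the comparison is meant to answer.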

FIs can be performed with FAIL*, a fault injection tool written in C++. It can simulate fault injections in x86 processors and should be used and extended to implement the idea above.

Further Reading