CLASSY-FI: Cross-Layer Application-Specific Synthesis and Analysis of Fault Injection
Since the physical causes of soft errors were identified in the 1970s, the sensitivity of circuits to soft errors has been increasing as voltages and structure sizes shrink. Functional safety standards, such as ISO 26262-9, demand explicit measures to assess (and, if necessary, mitigate) the effect of soft errors on safety and robustness. This is commonly done by performing extensive fault injection (FI) experiments on the target system, which mimic the effects of transient faults by changing logic signals, and then observing the system’s behavior with respect to its functional specification.
Logic faults can be injected on the pin, flip-flop, ISA, or even program level, and it is an open question which level is “best” for assessing a system’s robustness: Higher levels provide higher fault-injection efficiency, but lower levels are closer to the physical reality, and various researchers have shown that the level at which the commonly assumed single event upsets (SEUs) are injected can have a considerable impact on the results. In general, the lower the injection level (e.g., flip-flop level vs. ISA level), the more precisely we mimic the effects of real SEUs in the hardware.
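To make the notion of an injected logic fault concrete, the following minimal sketch models a single event upset as one flipped bit in a stored value and counts the raw fault space as time steps times bits. The helper names `flip_bit` and `fault_space_size` and the 8-bit default width are assumptions made for illustration; they are not part of the CLASSY-FI tooling.

```python
def flip_bit(value, bit, width=8):
    """Model a single event upset: invert exactly one bit of a stored value.

    `width` is the assumed size of the storage element in bits (hypothetical default).
    """
    if not 0 <= bit < width:
        raise ValueError("bit index outside the storage element")
    mask = (1 << width) - 1
    return (value ^ (1 << bit)) & mask


def fault_space_size(time_steps, bits):
    """Raw single-bit fault space: every bit can be hit at every time step."""
    return time_steps * bits


# A fault in bit 0 turns 0b00001111 into 0b00001110.
faulty = flip_bit(0b00001111, 0)

# A run of 1000 time steps over 32 eight-bit registers.
size = fault_space_size(1000, 32 * 8)
```

Even this small configuration yields several hundred thousand candidate injections, which is why the choice of injection level and the pruning of the fault space matter.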
The goal of the CLASSY-FI project is to derive constructive methods and techniques for scalable, yet precise and complete FI to experimentally assess the robustness of safety-critical embedded control systems against soft errors. The key idea behind the CLASSY-FI method is an application-specific cross-layer data-flow analysis, by which we consider the program–hardware-specific fault-propagation structure systematically on different levels. This fosters a hybrid approach for FI on multiple levels: Virtually cover all faults in the lower-level fault space (→ precision and completeness), but do the actual injections in the higher-level fault space (→ scalability) whenever possible and semantically equivalent. Otherwise inject on the lower level, supported by problem-specific hardware-assisted fault injection (HAFI) techniques.
Technically, CLASSY-FI aims at a methodology for the automatic generation of application–hardware-specific fault space (FS) construction and FI implementations that are highly tailored towards the actual system under test. Scientifically, we thereby provide new insights into the following questions: (1) How well (quantitatively) does precise ISA-level FI cover precise lower-level FI, also with respect to single flip-flop (FF) faults that evolve to ISA-level multi-bit errors? (2) Is it feasible to reach full fault-space coverage on FF level (for reasonably sized systems) by an application-specific multi-layer fault-space analysis and automatic derivation of campaign-tailored hybrid FI platforms? (3) What is the influence of the μ-architecture on ISA-level FI coverage, and to what degree can the results be reused when the μ-architecture evolves incrementally?
People
Latest News
Tim-Marek Thomas presents Checkpoint Placement for Systematic Fault-Injection Campaigns at the 42nd International Conference on Computer-Aided Design (ICCAD '23) in San Francisco, CA, USA. In the paper we present a new approach to reduce the forwarding phase in fault-injection campaigns by the clever placement of checkpoints. Compared to the classical static placement of checkpoints, this reduces the forwarding time by 88–99 percent. The paper is related to our CLASSY-FI project.
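The forwarding phase mentioned above is the time spent re-executing the program from the nearest earlier checkpoint up to the injection point. The sketch below only illustrates why checkpoint positions matter; it is not the placement algorithm from the paper. The function `forwarding_cost`, the injection times, and the two placements are hypothetical values chosen for the example.

```python
from bisect import bisect_right


def forwarding_cost(checkpoints, injection_times):
    """Sum the time steps simulated from the nearest earlier checkpoint
    to each injection point."""
    # The initial state (time 0) is always available as a checkpoint.
    cps = sorted(set(checkpoints) | {0})
    total = 0
    for t in injection_times:
        nearest = cps[bisect_right(cps, t) - 1]  # latest checkpoint at or before t
        total += t - nearest
    return total


# Hypothetical campaign: injections cluster shortly before time 100 and 200.
injections = [90, 95, 99, 190, 195, 199]

# Static placement: one checkpoint every 100 time steps.
static_cost = forwarding_cost(range(0, 300, 100), injections)

# Same number of checkpoints, but placed right before the clusters.
informed_cost = forwarding_cost([90, 190], injections)
```

With these made-up numbers, the static placement forwards 568 time steps in total, while the placement aligned with the injection points forwards 28.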
Oskar Pusz presents Data-Flow–Sensitive Fault-Space Pruning for the Injection of Transient Hardware Faults at the Conference on Languages, Compilers and Tools for Embedded Systems (LCTES '21).
In the paper, we describe Data-Flow–Sensitive Fault-Space Pruning (DFP), a new precise and fault-space–complete pruning method that extends def/use pruning by also considering the instructions’ semantics when deriving fault-equivalence sets. In our experimental evaluation, this already reduces the number of necessary injections by up to 18 percent compared to def/use pruning.
DFP is the core element on the ISA level of our research project CLASSY-FI.
The source code and evaluation artifacts are published as a separate artifact (listed under Publications): Source Code and Evaluation Data for the Paper: Data-Flow–Sensitive Fault-Space Pruning for the Injection of Transient Hardware Faults.
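Def/use pruning, the baseline that DFP extends, can be sketched as follows. This is a simplified illustration of the general technique and not the DFP implementation: all faults that hit a location between two accesses and are first observed by the same read form one equivalence set, so a single pilot injection with a weight suffices, while faults that are overwritten or never read again are known to be benign. The trace format, the name `def_use_prune`, and the convention that a fault in time slot t strikes before instruction t executes are assumptions for this example.

```python
from collections import namedtuple

# One access of the fault-free run: time step, location, kind ("R" or "W").
Access = namedtuple("Access", "time location kind")


def def_use_prune(trace, locations, end_time):
    """Return (pilots, benign) for a trace over time slots 1..end_time.

    pilots: list of (location, injection_time, weight), one per equivalence set
    benign: number of (time, location) faults that are provably without effect
    """
    last = {loc: 0 for loc in locations}  # time of the previous access
    pilots, benign = [], 0
    for acc in sorted(trace, key=lambda a: a.time):
        span = acc.time - last[acc.location]  # slots since the previous access
        if acc.kind == "R":
            if span > 0:
                # All faults in this interval are first seen by this read.
                pilots.append((acc.location, acc.time, span))
        else:
            # The write overwrites any fault from this interval.
            benign += span
        last[acc.location] = acc.time
    # Faults after the last access of a location are never read.
    for loc in locations:
        benign += end_time - last[loc]
    return pilots, benign


trace = [Access(2, "r1", "W"), Access(3, "r2", "R"), Access(5, "r1", "R")]
pilots, benign = def_use_prune(trace, ["r1", "r2"], end_time=6)
```

In this toy trace, the 12 points of the fault space (2 locations times 6 time slots) collapse to two pilot injections of weight 3 each plus 6 benign faults, so the pruned campaign still accounts for every fault.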
Publications
-
Thesis
Program-Structure–Guided Reduction of the Execution Time of Fault-Injection Campaigns on the ISA Layer
PhD thesis, Leibniz Universität Hannover, 2024.
DOI: 10.15488/17924
-
SAFECOMP
Conference
B
ACTOR: Accelerating Fault Injection Campaigns using Timeout Detection based on Autocorrelation
41st International Conference on Computer Safety, Reliability and Security (SAFECOMP 2022), Springer-Verlag, 2022.
DOI: 10.1007/978-3-031-14835-4_17
-
SAFECOMP
Conference
B
SailFAIL: Model-Derived Simulation-Assisted ISA-Level Fault-Injection Platforms
41st International Conference on Computer Safety, Reliability and Security (SAFECOMP 2022), Springer-Verlag, 2022.
DOI: 10.1007/978-3-031-14835-4_14
-
LCTES
Conference
A
Data-Flow–Sensitive Fault-Space Pruning for the Injection of Transient Hardware Faults
Proceedings of the 2021 ACM SIGPLAN/SIGBED Conference on Languages, Compilers and Tools for Embedded Systems (LCTES '21), ACM Press, 2021.
DOI: 10.1145/3461648.3463851
-
LCTES
Artifact
A
Source Code and Evaluation Data for the Paper: Data-Flow–Sensitive Fault-Space Pruning for the Injection of Transient Hardware Faults
Proceedings of the 2021 ACM SIGPLAN/SIGBED Conference on Languages, Compilers and Tools for Embedded Systems (LCTES '21), ACM Press, 2021.
DOI: 10.5281/zenodo.4698901
-
PRDC
Conference
B
Program-Structure–Guided Approximation of Large Fault Spaces
2019 24th Pacific Rim International Symposium on Dependable Computing (PRDC'19), IEEE Computer Society Press, 2019.
DOI: 10.1109/PRDC47002.2019.00044
-
DAC
Conference
A
Cross-Layer Fault-Space Pruning for Hardware-Assisted Fault Injection
Proceedings of the 55th Annual Design Automation Conference 2018 (DAC '18), ACM Press, 2018.
DOI: 10.1145/3195970.3196019
Theses
Finished Theses
Data-Flow Analysis for Fault-Equivalence Set Forming on the ISA Layer
- Type: Bachelor's thesis
- Status: completed
- Supervisors: Oskar Pusz, Christian Dietrich, Daniel Lohmann
- Author: Zena Obeidi (submitted: 01 Mar 2019)
Schotbruch: Automated Derivation of Injection Platforms for Transient Hardware Faults from Formal Processor Models
- Type: Master's thesis
- Status: completed
- Supervisors: Christian Dietrich, Daniel Lohmann
- Author: Marcel Budoj (submitted: 08 May 2019)
Acceleration of Fault-Injection Campaigns through Early Timeout Detection
- Type: Master's thesis
- Status: completed
- Supervisors: Oskar Pusz, Daniel Lohmann
- Author: Felix Siegel (submitted: 22 May 2020)
Formalizing the Execution Semantics of the AVR Instruction Set with the Description Language SAIL
- Type: Bachelor's thesis
- Status: completed
- Supervisors: Christian Dietrich, Oskar Pusz, Daniel Lohmann
- Author: Luca Nedaskovskij (submitted: 16 Oct 2020)
Transient-Fault Resilience of a Capability-enabled Processor Platform
- Type: Master's thesis
- Status: completed
- Supervisors: Christian Dietrich, Daniel Lohmann
- Author: Malte Bargholz (submitted: 01 Nov 2020)
Design and Implementation of Benchmarks for Systematic Fault Injection
- Type: Bachelor's thesis
- Status: completed
- Supervisors: Oskar Pusz, Daniel Lohmann
- Author: Jannis Bujak (submitted: 02 Mar 2021)
Pruning of Soft-Error Fault Spaces by Dynamic Register-Usage Tracing in a Formal Instruction-Set Model
- Type: Master's thesis
- Status: completed
- Supervisors: Christian Dietrich, Daniel Lohmann
- Author: Yannick Loeck (submitted: 26 May 2021)
Design and Implementation of an Early Timeout-Detection Mechanism for Systematic Fault-Injection Campaigns
- Type: Master's thesis
- Status: completed
- Supervisors: Oskar Pusz, Daniel Lohmann
- Author: Tim-Marek Thomas (submitted: 22 Oct 2021)
Leveraging Application-Specific Knowledge to Guide Statistical Fault Injection
- Type: Bachelor's thesis
- Status: completed
- Supervisors: Tim-Marek Thomas, Daniel Lohmann