W03 Reconciling Implementation Performance and Confidence in Machine Learning

Start
End
Organiser
Eric Jenn, IRT Saint Exupéry, France
Organiser
Pierre Gaillard, CEA, France
Organiser
Filipo Perotto, ONERA, France

This workshop addresses the challenge of reconciling performance and confidence in the implementation of machine learning algorithms for safety-critical domains. While optimized hardware platforms (GPU, FPGA, accelerators) and advanced graph transformations enable high performance, they raise concerns regarding traceability, verification, and certification. The focus will be on bridging the gap between training and implementation models, specifying design models, handling numerical accuracy, and mastering optimization techniques. Presentations will highlight industrial and academic perspectives, with emphasis on certification constraints (e.g., DO-178) and assurance arguments. The format combines a keynote with a series of short technical talks (15 min) to stimulate discussions (15 min).

The ultimate goal is to foster a common understanding of how to implement ML efficiently while ensuring reliability and safety.

W03.1 Context and Challenges

Session Start
Session End
Session chair
Eric Jenn, IRT Saint Exupéry, France
Presentations

W03.1.1 Workshop welcome

Start
End
Speaker
Eric Jenn, IRT Saint Exupéry, France

W03.1.2 Assurance of Machine Learning in Aviation: Challenges, Solutions, and Emerging Guidance

Start
End
Keynote Speaker
Dmitrii Kirov, Collins Aerospace, Italy

Suppose we want to use Machine Learning (ML) technologies onboard aircraft. The reasons for doing so range from increasing autonomy and easing pilot workload to simply reducing the computing resources required by avionics. Can these benefits be realized while maintaining or even improving aircraft safety? What are the barriers to ML assurance and certification, and how might they be overcome? Researchers, regulators, and the aviation industry have been working to answer these questions, developing new guidance for the certification of ML-based airborne systems, such as the ED-324 / ARP6983 standard currently being developed by the EUROCAE WG-114 / SAE G-34 joint working group. This keynote will discuss several key challenges in the assurance of safety-critical ML-enabled components that are addressed by the working group. We will then highlight new technologies and processes that are being created to meet the corresponding certification objectives. Specifically, formal methods will play a large role in the safety assurance and deployment of ML onboard aircraft.

W03.2 Advanced vs. embedded ML - bridging the gap

Session Start
Session End
Speaker
Dumitru Potop-Butucaru, INRIA, France
Session chair
Eric Jenn, IRT Saint Exupéry, France

Many industrial actors today have not one, but two distinct ML departments: one mostly dedicated to R&D into advanced ML, and the other to the actual embedded implementation of ML algorithms. The embedded ML department usually considers rather simple (e.g. feedforward) networks in inference mode and focuses on providing real-time and numerical precision guarantees, using compilers and run-times with limited control over optimization and resource allocation. But advanced ML algorithms require complex control involving back-propagation training (in on-device training and reinforcement learning contexts), or more generally stateful (e.g. recurrent) and conditional (e.g. gated mixture of experts) behaviors. Such advanced control is not readily covered by existing embedded back-ends, and is also often hidden in layers of Python code. We propose an approach to address and

W03.3 SONNX - Towards a standardized ML format model for safe systems

Session Start
Session End
Session chair
Pierre Gaillard, CEA, France
Speaker
Eric Jenn, IRT Saint Exupéry, France

Transforming a trained model into an executable implementation must preserve the model’s intended semantics. Focusing on models specified using the ONNX standard, this presentation examines the requirements for precise and unambiguous syntax and semantics when models are interpreted or translated into lower-level representations. It highlights the limitations of ONNX in certification-oriented contexts and discusses the need for a safety-related ONNX profile. The presentation also introduces the objectives, methodology, and initial results of the SONNX Working Group, which aims to align the use of ONNX with industrial and safety-critical requirements without diverging from the standard.

W03.4 From Models to Implementations

Session Start
Session End
Session chair
Eric Jenn, IRT Saint Exupéry, France
Presentations

W03.4.1 Optimizing tensor operations for performance

Start
End
Speaker
Guillaume Iooss, INRIA, France

This presentation focuses on compiling tensor operations described, for instance, in an ONNX model, through transformation passes such as loop fusion, tiling, interchange, and related optimizations.

In the context of critical embedded systems, several objectives must be met:

  • Minimizing latency and memory usage to satisfy performance requirements and platform constraints, by exploiting the capabilities of the target architecture and exploring the available optimization space;
  • Ensuring compliance with timing requirements by performing WCET analyses;
  • Ensuring conformance to the original model, by relying on mathematical formalization and proofs.

The compilation techniques needed to achieve these objectives differ significantly, and bridging this gap is essential to deliver both safety and performance.

In this talk, we focus on the “compiling for performance” side and review the usual program transformations and compilation passes used to extract the best performance from tensor operators. The goal is to provide enough background to support discussions on how to connect these two worlds.
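As an illustration of the kind of transformation mentioned above, the following minimal sketch applies loop tiling and loop interchange to a matrix multiplication and checks that the transformed loop nest computes the same result as the naive one. It is not taken from the talk: the function names, the tile size, and the sample matrices are assumptions chosen for illustration, and integer data is used so that the comparison is exact.

```python
def matmul_naive(a, b):
    """Reference triple loop: c[i][j] = sum over p of a[i][p] * b[p][j]."""
    n, k, m = len(a), len(b), len(b[0])
    c = [[0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            for p in range(k):
                c[i][j] += a[i][p] * b[p][j]
    return c


def matmul_tiled(a, b, tile=2):
    """Same computation after tiling (i, p, j) and interchanging p and j.

    The three outer loops walk over tiles; the three inner loops walk
    inside one tile. The min() bounds handle sizes that are not a
    multiple of the (hypothetical) tile size.
    """
    n, k, m = len(a), len(b), len(b[0])
    c = [[0] * m for _ in range(n)]
    for i0 in range(0, n, tile):
        for p0 in range(0, k, tile):
            for j0 in range(0, m, tile):
                for i in range(i0, min(i0 + tile, n)):
                    for p in range(p0, min(p0 + tile, k)):
                        a_ip = a[i][p]  # hoisted: invariant in the j loop
                        for j in range(j0, min(j0 + tile, m)):
                            c[i][j] += a_ip * b[p][j]
    return c


A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
B = [[9, 8], [7, 6], [5, 4]]

reference = matmul_naive(A, B)
optimized = matmul_tiled(A, B, tile=2)
print(reference == optimized)
```

In this particular sketch each output element still accumulates its products in increasing order of p, so the result would also match for floating-point data. Transformations that reorder the reduction itself (for example, splitting it across parallel workers) do not offer that guarantee, which is one place where the performance and conformance objectives meet.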

W03.4.2 Determinism Is Optional, Predictability Is Not: Numerical Approximation as a First-Class Citizen in Modern Machine Learning

Start
End
Speaker
David Defour, Université de Perpignan via Domitia, France

Modern deep learning systems rely heavily on numerical approximations, including reduced precision, quantization, non-deterministic execution orderings, and parallel computation. These choices do not merely introduce “negligible noise”; they can fundamentally alter optimization dynamics, learned representations, training behavior, stability, and, in some cases, the functional behavior of neural networks. This presentation sheds light on the tensions between determinism, reproducibility, and predictability, and questions the actual role of numerical precision in the design, analysis, and certification of modern machine learning models.
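The effect of execution ordering on numerical results can be seen in a very small example. The sketch below is an illustration, not material from the talk: the helper name and the sample values are assumptions. It sums the same four double-precision numbers in two different orders and compares both with Python's correctly rounded `math.fsum`.

```python
import math


def sum_in_order(xs):
    """Plain left-to-right floating-point accumulation."""
    total = 0.0
    for x in xs:
        total += x
    return total


# Hypothetical values: the 1.0 terms are below the spacing between
# adjacent doubles near 1e16, so they are lost when added to 1e16.
values = [1e16, 1.0, -1e16, 1.0]
reordered = [1e16, -1e16, 1.0, 1.0]  # same numbers, different order

print(sum_in_order(values))     # 1.0
print(sum_in_order(reordered))  # 2.0
print(math.fsum(values))        # 2.0 (correctly rounded sum)
```

A parallel reduction that changes its grouping from one run to the next can switch between these two results, which is why bounding the possible outputs can matter more than insisting on a single bit-exact value.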

W03.4.3 Impact of Optimizations on the Reliability of DNNs on GPUs and FPGAs

Start
End
Speaker
Fernando Fernandes Dos Santos, INRIA, France

W03.5 Performance first, trust later? Rethinking edge AI deployment with Eclipse Aidge

Session Start
Session End
Session chair
Eric Jenn, IRT Saint Exupéry, France
Speaker
Pierre Gaillard, CEA, France
Speaker
Filipo Perotto, ONERA, France

The deployment of AI at the edge is often driven by performance constraints, with safety considerations addressed only at later stages. Aidge challenges this paradigm by providing an open-source framework in which confidence and traceability are first-class design objectives alongside performance optimization.
It offers a transparent and auditable toolchain for deploying inference on edge platforms, relying on explicit intermediate representations and controlled graph transformations to ensure traceability from training models to implementation artifacts. Through the integration of the ACETONE approach, enabling traceability and worst-case execution time analysis, Aidge is well suited to aeronautical certification standards such as DO-178C and the forthcoming ML-specific standard ARP6983. In addition, Aidge supports inference testing under hardware fault conditions and pioneers the adoption of the Safety ONNX standard, contributing to robustness assessment, standardized model exchange, and strengthened assurance arguments for safety-critical AI systems.
 

W03.6 Closure

Session Start
Session End
Session chair
Eric Jenn, IRT Saint Exupéry, France
Chair
Pierre Gaillard, CEA, France
Chair
Filipo Perotto, ONERA, France