41st Workshop on Sustained Simulation Performance
April 23 - April 24
Agenda
All times are given in Central European Summer Time (CEST).
Thursday, April 23rd, 2026
| 09:00 | Registration Desk Opens |
| 09:45 – 10:00 | Welcome & Introduction Michael Resch, High-Performance Computing Center Stuttgart, University of Stuttgart |
| 10:00 – 10:30 | A brief case study of “AI for Science” at the Cyberscience Center Hiroyuki Takizawa, Cyberscience Center, Tohoku University |
| 10:30 – 11:00 | HLRS – Status and Outlook Michael Resch, High-Performance Computing Center Stuttgart, University of Stuttgart |
| 11:00 – 11:30 | Coffee Break |
| 11:30 – 12:00 | One Year of HUNTER from a User Perspective: Progress, Performance, and Patience with NS3D Christoph Wenzel, Institute of Aerodynamics and Gas Dynamics, University of Stuttgart |
| 12:00 – 12:30 | D3 Center’s strategy and project for supporting academic research in the “AI for Science” era Susumu Date, The University of Osaka |
| 12:30 – 13:30 | Lunch Break |
| 13:30 – 14:00 | Impact of GPU Virtualization on LLM Inference Performance: A Comparative Study of Bare Metal, vGPU Qifeng Pan, High-Performance Computing Center Stuttgart, University of Stuttgart |
| 14:00 – 14:30 | Thoughts on current and future HPC&AI installations Sabine Roller, Institute of Software Methods for Product Virtualization, German Aerospace Center (DLR e.V.) |
| 14:30 – 15:00 | Aeroacoustic Optimization of a Chevron Nozzle on the Hunter HPC System Matthias Meinke, Institute of Aerodynamics, RWTH Aachen University |
| 15:00 – 15:30 | Coffee Break |
| 15:30 – 16:00 | Evaluating a Real-Time Lossy Array Compression Algorithm for a Lattice Boltzmann Solver Darjan Krijan, High-Performance Computing Center Stuttgart, University of Stuttgart Computer simulations that were previously regarded as CPU-bound are gradually becoming memory-bound, because growth in memory bandwidth cannot keep up with the much faster growth in raw computing power. This imbalance has grown by a factor of approximately 5.1 per decade since the 1990s; that is, in each decade computing power has increased roughly 5.1 times more than memory bandwidth. In practical terms, comparing an NEC SX-4 from 1994, which operated at a balanced arithmetic intensity of 0.125 FLOP/Byte, with an Intel Ponte Vecchio accelerator from 2023, which operates at 15.9 FLOP/Byte, shows that the imbalance has grown by a factor of 127. Mixed-precision approaches, traditionally used to increase calculation throughput at the CPU core level, now provide speedup by reducing the memory bandwidth required. Approaches that compress arrays in a lossless or lossy manner to reduce memory bandwidth were implemented in LLNL’s zfp library, although it is not able to process the data in real time. In this work, a similar approach, a real-time lossy array compression (RTLAC) algorithm that exploits known value ranges of variables, was developed and applied to a Lattice Boltzmann method for CFD simulations. Here, the main simulation variables lie within certain bounds, making them ideal candidates for the RTLAC approach. Accuracy and performance results of the algorithm implemented in the m-AIA solver framework with multiple compression sizes will be presented. |
| 16:00 – 16:30 | FlowSimulator: A framework for multidisciplinary simulations for virtual aircraft Julian Braun, Institute of Software Methods for Product Virtualization, German Aerospace Center (DLR e.V.) Multidisciplinary simulations are essential for modern aerospace engineering. Applications include static aeroelastic coupling, flutter analysis, time-resolved aeroelastic simulations, and gradient-based shape optimization. Highly specialized codes are able to compute high-fidelity solutions in their respective domains, but external frameworks are needed to establish the coupling between them. This talk presents FlowSimulator and its approach to flexible and performant in-memory coupling. FlowSimulator is an environment that contains a variety of independent codes, such as CFD for ONERA, DLR and Airbus (CODA) and the structure solver b2000++pro. It follows a layered approach: users orchestrate their own workflows in Python, while data exchange between simulation codes is realized on a C++ layer through MPI-parallel data structures for grids and simulation data. This allows for rapid prototyping as well as for performant, parallel data exchange. Following a discussion of the design, an outlook on fluid–structure interaction between high-order codes will be provided. |
| 16:30 – 17:00 | MPPI – Type safe C++ Datatypes for MPI Mike Söhner, High-Performance Computing Center Stuttgart, University of Stuttgart MPI provides a flexible C-API to communicate data of various types between a set of distributed processes over high-speed interconnects in HPC systems. Data buffers are described using MPI-Datatypes, which specify the type and layout of the data to be transmitted. To construct these datatypes, users must manually describe the memory layout of buffer elements via the MPI-API. However, modern applications are typically written in object-oriented C++, which offers significant advantages over C, including type safety and metaprogramming capabilities. In this work, we introduce a new C++-API and datatype engine that leverage C++ language features such as concepts, ranges, and the upcoming reflection to extract the necessary datatype information for the user at compile time. This approach simplifies the user’s work, enhances code safety by eliminating manual datatype construction, and offers previously unavailable possibilities. Our measurements demonstrate that this interface introduces no performance overhead and, in some cases, even improves performance. |
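As a rough check of the machine-balance figures quoted in the 15:30 abstract (Krijan), the arithmetic can be sketched as follows. The helper names and the use of the 1994 and 2023 release years as the time span are assumptions made for illustration; only the two FLOP/Byte values come from the abstract.

```python
# Illustrative arithmetic only: the two balance values are quoted in the
# abstract, the helper names and the 29-year span are assumptions.

def imbalance_factor(old_balance: float, new_balance: float) -> float:
    """Ratio of two machine balances, both given in FLOP/Byte."""
    return new_balance / old_balance


def implied_growth_per_decade(factor: float, years: float) -> float:
    """Constant per-decade growth rate that yields `factor` over `years`."""
    return factor ** (10.0 / years)


sx4_balance = 0.125  # NEC SX-4 (1994), FLOP/Byte
pvc_balance = 15.9   # Intel Ponte Vecchio (2023), FLOP/Byte

factor = imbalance_factor(sx4_balance, pvc_balance)
per_decade = implied_growth_per_decade(factor, years=2023 - 1994)

print(f"imbalance factor: {factor:.1f}")
print(f"implied growth per decade: {per_decade:.2f}")
```

The ratio of the two balances reproduces the factor of about 127 stated in the abstract. The two-point per-decade estimate lands slightly above 5, in the same range as the quoted 5.1 per decade; the two need not agree exactly, since the abstract does not say how its figure was derived.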
Friday, April 24th, 2026
| 09:00 | Registration Desk Opens |
| 09:30 – 10:00 | Toward Anomaly Prediction in HPC Systems Ryusuke Egawa, School of Engineering, Tokyo Denki University |
| 10:00 – 10:30 | RISC-V Vector Architecture & Programming Model Fredrik Unger, Openchip & Software Technologies An introduction to the RISC-V Vector architecture and programming model: how the vector architecture differs from traditional scalar processors and GPU-based accelerators, what the RISC-V open ISA with vector extensions brings to the HPC ecosystem, and the basic differences in the programming model. |
| 10:30 – 11:00 | Coffee Break |
| 11:00 – 11:30 | Future Computing: Researching and Working with the Cerebras Wafer Scale Engine Jonathan Schäfer, High-Performance Computing Center Stuttgart, University of Stuttgart We present general remarks on researching and working with the Cerebras Wafer-Scale Engine (WSE). We discuss the programming model, current efforts worldwide and in our group to implement non-AI workloads on the WSE, and some preliminary and submitted results. |
| 11:30 – 12:00 | Patient-Specific Hemodynamic Simulations with SPH: Challenges and Implementation Pipeline Niklas Neher, High-Performance Computing Center Stuttgart, University of Stuttgart |
| 12:00 – 12:30 | Towards Energy‑Aware HPC: Measuring Efficiency Across Heterogeneous Hardware with HWS Dirk Pflüger, Scientific Computing – Institute of Parallel and Distributed Systems, University of Stuttgart Energy efficiency has become a critical aspect of sustained high‑performance computing. Yet for developers, obtaining reliable energy metrics remains challenging due to mixed hardware environments, varying vendor interfaces, and limited portability of measurement tools. In this talk, we present HWS, our HardWare Sampling library, which offers uniform, low‑overhead access to performance and power data across CPUs and GPUs. HWS enables detailed energy efficiency analysis in heterogeneous HPC systems. We further show comparative results from benchmark applications implemented in SYCL, illustrating how the combination of HWS and SYCL supports energy‑aware optimizations and fair cross‑architecture performance evaluations. |
| 12:30 – 13:30 | Lunch Break |
| 13:30 – 14:00 | Tools for Rapid and Efficient I/O for Exascale Patrick Vogler, High-Performance Computing Center Stuttgart, University of Stuttgart |
| 14:00 – 14:30 | Enhancing Research with Provenance Management for HPC and AI Systems Yosuke Taira, NEC Corporation |
| 14:30 | Farewell Michael Resch, High-Performance Computing Center Stuttgart, University of Stuttgart |
Committee
Program Committee
- Prof. Michael Resch, Stuttgart University, HLRS
- Prof. Hiroaki Kobayashi, Tohoku University
- Dr. Wolfgang Bez, NEC Deutschland GmbH, Division HPCE
- Prof. Sabine Roller, German Research School for Simulation Sciences GmbH
Organizing Committee
- Prof. Michael Resch, Stuttgart University, HLRS
- Johannes Gebert, Stuttgart University, HLRS
- Prof. Hiroaki Kobayashi, Tohoku University
- Prof. Hiroyuki Takizawa, Cyberscience Center, Tohoku University