Code optimisation is the practice of improving the efficiency of a computer code, a program or a software library. These improvements generally allow the resulting program to execute more quickly, use less memory, limit its consumption of resources (for example, file input/output), or consume less energy.
As the power of supercomputers continues to grow, the next step is exascale: the ability to perform a billion billion (10^18) calculations every second. Reaching it is no longer just a matter of adding raw power by multiplying the number of compute servers. Close collaboration is needed between application developers and high-performance computing (HPC) solution providers. Hardware/software synchronisation is the key to success. Without it, HPC will not be able to reach its full potential.
The need to adapt the numerical simulation codes used in the energy industry to HPC has led ENERXICO to take on the task of exascale enabling in the areas of renewable energies, oil and gas, and biofuels for transportation. ENERXICO partners will analyse the codes' current issues (e.g. bottlenecks that impede parallelisation, unnecessary data accesses, particular library use) that cause scalability weaknesses in multi-node computing.
ENERXICO partners have identified the issues and bottlenecks of each of the seven codes used in the project, and are working to improve them. These codes are presented below:
WRF (Weather Research and Forecasting Model) is a mesoscale numerical weather prediction system designed for both atmospheric research and operational forecasting applications. It features two dynamical cores, a data assimilation system, and a software architecture supporting parallel computation and system extensibility. The model serves a wide range of meteorological applications across scales from tens of metres to thousands of kilometres.
WRF is needed to identify the most relevant coupling parameters between mesoscale and microscale models, so that results can be compared with the high-resolution climatology obtained with WRF. In addition, developing a dynamically coupled WRF and LES model, using a novel approach based on forcing RANS models with 3D tendencies coming from WRF, will make comparison with the field data available in Mexico and the EU possible. For this purpose, the WRF code must be optimised.
ENERXICO's efficiency and scalability analysis of WRF identified communications as the main bottleneck in the instrumented runs. The loss of efficiency is due to serialization and to dependencies between processes, both of which need to be reconsidered.
BSIT (Barcelona Subsurface Imaging Tools) is a production geophysical imaging application that implements Full Waveform Inversion (FWI) and can run on extremely large systems. FWI is a cutting-edge technique that aims to recover the physical properties of the subsoil from a set of seismic measurements. Starting from a guess (initial model) of the variables being inverted (e.g., sound transmission velocity), the stimulus introduced and the recorded signals, FWI performs several phases of iterative computations until the variables being inverted approach their true values within an acceptable error threshold.
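The iterative structure described above (initial guess, simulated data, misfit against recordings, model update, stop at an error threshold) can be illustrated with a deliberately tiny sketch. This is not FWI and not BSIT's algorithm: it assumes a single unknown velocity, straight-ray travel times instead of a wave equation, and plain gradient descent. All names (`invert_slowness`, `misfit_and_gradient`) and all numbers are hypothetical.

```python
def misfit_and_gradient(s, distances, observed_times):
    """Least-squares misfit between predicted (d * s) and observed travel times."""
    residuals = [d * s - t for d, t in zip(distances, observed_times)]
    misfit = 0.5 * sum(r * r for r in residuals)
    gradient = sum(r * d for r, d in zip(residuals, distances))
    return misfit, gradient


def invert_slowness(distances, observed_times, initial_slowness,
                    tolerance=1e-12, max_iterations=1000):
    """Toy inversion: update slowness s = 1/velocity until the misfit is small."""
    s = initial_slowness
    # Conservative fixed step (half the exact step for this quadratic misfit),
    # so the loop really does take several iterations.
    step = 0.5 / sum(d * d for d in distances)
    misfit, gradient = misfit_and_gradient(s, distances, observed_times)
    for _ in range(max_iterations):
        if misfit < tolerance:  # acceptable error threshold reached
            break
        s -= step * gradient
        misfit, gradient = misfit_and_gradient(s, distances, observed_times)
    return s, misfit


if __name__ == "__main__":
    # Made-up data: receivers at 1, 2 and 3 km, true velocity 2000 m/s.
    distances = [1000.0, 2000.0, 3000.0]
    observed = [d / 2000.0 for d in distances]
    s, misfit = invert_slowness(distances, observed, initial_slowness=1.0 / 1500.0)
    print(f"recovered velocity: {1.0 / s:.3f} m/s, final misfit: {misfit:.2e}")
```

In a real FWI code the forward model is a full wave simulation and the unknown is a 3D field, which is why each iteration is so expensive.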
From a purely computational point of view, this application is well suited to an exascale computer because the initial field is split among a huge number of disjoint datasets, which can be processed independently. Once a set of signal sources and receivers has been selected, the field can be decomposed and then computed using traditional means (e.g. OpenMP, CUDA, MPI). This makes the problem an embarrassingly parallel one.
The performance analysis of FWI identified communications between adjacent domains and host/device data transfers as the main bottlenecks in the instrumented runs. The analysis also reports that the main loss of efficiency is due to the serialization of communications. Furthermore, GPU execution profiles show poor device utilization.
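As a minimal sketch of what "embarrassingly parallel" means here, the following example maps a per-dataset kernel over independent datasets with no communication between tasks. It is an illustration only: `process_shot` is a hypothetical stand-in for the real per-dataset computation, and a thread pool from the Python standard library is used purely to keep the sketch portable, where a production code would use MPI ranks, OpenMP threads or GPUs.

```python
from concurrent.futures import ThreadPoolExecutor


def process_shot(shot):
    """Hypothetical per-dataset kernel: here, just the energy of one trace."""
    shot_id, samples = shot
    return shot_id, sum(x * x for x in samples)


def process_all(shots, max_workers=4):
    """Process disjoint datasets independently and collect results by id."""
    # No task reads or writes another task's data, so the tasks can run
    # in any order and on any worker without synchronisation.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(process_shot, shots))


if __name__ == "__main__":
    shots = [(0, [1.0, 2.0]), (1, [3.0]), (2, [0.5, 0.5])]
    print(process_all(shots))
```

The only coordination point is the final gathering of results, which is what makes this pattern scale so easily.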
Alya, a code developed by the Barcelona Supercomputing Center (BSC), is a parallel multi-physics CFD code that is part of the PRACE Benchmark Suite for HPC. The numerical discretization is based on a second-order spatial low-dissipation finite element scheme with an explicit third-order Runge-Kutta method in time for momentum and scalar transport. Alya has been designed to run on leading HPC systems and includes both the MPI and OpenMP models to take advantage of the distributed and shared memory paradigms. Accelerators such as GPUs are also exploited to further enhance the performance of the code.
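To make the idea of an explicit third-order Runge-Kutta time integrator concrete, here is a small sketch of one common variant, the Shu-Osher strong-stability-preserving RK3 scheme, applied to a scalar ODE. The text does not say which third-order scheme Alya uses, so this particular choice, the function names and the test problem are all assumptions made for illustration.

```python
def ssp_rk3_step(f, u, dt):
    """Advance du/dt = f(u) by one step of the Shu-Osher SSP RK3 scheme."""
    u1 = u + dt * f(u)                              # first Euler stage
    u2 = 0.75 * u + 0.25 * (u1 + dt * f(u1))        # second stage
    return u / 3.0 + 2.0 / 3.0 * (u2 + dt * f(u2))  # final combination


def integrate(f, u0, dt, steps):
    """Apply a fixed number of explicit RK3 steps starting from u0."""
    u = u0
    for _ in range(steps):
        u = ssp_rk3_step(f, u, dt)
    return u


if __name__ == "__main__":
    # Test problem du/dt = -u with u(0) = 1; the exact solution is exp(-t).
    print(integrate(lambda u: -u, 1.0, dt=0.01, steps=100))
```

Being explicit, each step needs only evaluations of `f` on known values, which is convenient for parallel codes but restricts the stable time-step size.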
The combustion problem is solved using a flamelet approach, in which transport equations for user-defined controlling variables (i.e. mixture fraction and progress variable) and their corresponding variances are solved using a presumed-shape probability density function that accounts for turbulence-chemistry interactions at the subgrid scale. This model is coupled to a Lagrangian solver to account for the liquid phase of the spray with a two-way coupling approach. A Newmark/Newton–Raphson scheme is used to solve the kinematic equations, and it is coupled to heat and mass transfer models describing droplet heating and evaporation. Various combinations of heat and mass transfer models are available, of increasing complexity, accounting for the interaction of the two phenomena and for non-equilibrium effects.
Alya is also extensively used by the BSC wind-energy team. Alya's RANS k-epsilon model is being used for wind resource assessment of wind farms within the framework of the SEDAR (High Resolution Wind Energy Software) project between BSC and Iberdrola Renovables. Alya has also been used for wind resource assessment accounting for thermal effects, solving the atmospheric boundary layer (ABL) over very complex terrain along entire diurnal cycles, and showing that modelling thermal stratification significantly improves the prediction of annual wind energy production.
The LES implementation in Alya has been validated for the simulation of wind over complex terrain, obtaining accurate results over very steep terrain when compared against other LES codes. In addition, Alya has shown a parallel efficiency of around 90% on up to 100 000 cores and 3.2 x 10^9 elements, which makes it possible to run LES of ABL flows over complex and heterogeneous terrain with sufficient mesh resolution and domain size.
However, one of the major difficulties when running high-fidelity reacting-spray simulations is achieving high parallel performance with Eulerian-Lagrangian frameworks. This is one of the aspects that will be tested in the ENERXICO project. Optimisation strategies will be pursued to increase both node-level and system-level performance.
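For readers unfamiliar with the metric, the sketch below shows how a parallel efficiency figure such as the one quoted above is commonly computed, assuming the usual strong-scaling definition (measured speedup divided by ideal speedup relative to a reference run). The text does not state which definition or reference run was used for Alya, and the timings in the example are invented.

```python
def parallel_efficiency(t_ref, cores_ref, t_run, cores_run):
    """Strong-scaling efficiency of a run relative to a reference run.

    t_ref, t_run: wall-clock times of the reference and scaled runs.
    cores_ref, cores_run: number of cores used by each run.
    """
    speedup = t_ref / t_run                # how much faster the run actually is
    ideal_speedup = cores_run / cores_ref  # how much faster it could be at best
    return speedup / ideal_speedup


if __name__ == "__main__":
    # Invented timings: 100 s on 10 cores versus 12.5 s on 100 cores.
    print(parallel_efficiency(100.0, 10, 12.5, 100))
```

An efficiency of 1.0 would mean perfect scaling; communication, serialization and load imbalance push the value below that.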
ExaHyPE (An Exascale Hyperbolic PDE Engine) is an engine for solving systems of first-order hyperbolic partial differential equations. Due to the robustness and shock capturing abilities of ExaHyPE’s numerical methods, both linear and non-linear hyperbolic PDEs can be simulated with very high accuracy.
Applications powered by ExaHyPE can be run on a simple laptop but are also able to exploit thousands of CPU cores on state-of-the-art supercomputers.
The ENERXICO project aims to further improve the distributed-memory scaling and performance of ExaHyPE by moving to a more performant AMR framework. The project also plans to investigate the performance issues caused by ensemble runs. Using ExaHyPE's multisolver capabilities, the efficiency of many uncertainty quantification algorithms can be improved too.
The DualSPHysics code is being used to develop an extension that will be the first SPH-based code for the numerical simulation of oil reservoirs, and which also has important benefits over commercial codes based on other numerical techniques.
This BH code (for Black Hole) is a large-scale, massively parallel reservoir simulator capable of performing simulations with billions of “particles” or fluid elements that represent the system under study. It contains improved multi-physics modules that automatically combine the effects of interrelated physical and chemical phenomena to accurately simulate in-situ recovery processes. This work leads to the development of a multi-platform graphical user interface application for code execution and visualization, for carrying out simulations with data provided by industrial partners, and for performing comparisons with available commercial packages.
Furthermore, a large effort is being made to simplify the process of setting up the input for reservoir simulations from exploration data by means of a workflow fully integrated in our industrial partners’ software environment.
Using geophysical forward and inverse numerical techniques, the ENERXICO project will evaluate novel, high-performance simulation packages for challenging seismic exploration cases that are characterized by extreme geometric complexity. In particular, high-order methods based on fully unstructured tetrahedral meshes, and also on tree-structured Cartesian meshes with adaptive mesh refinement (AMR) for better spatial resolution, need to be explored. The ENERXICO team is awaiting the performance analysis to decide where to focus and which optimisations to prioritise.
SEM46 is a seismic full waveform modelling and inversion code developed at Université Grenoble Alpes. It aims to provide high-resolution 3D models of the subsurface physical properties from active seismic recordings. Such models can then be interpreted in terms of geological formations, rock types, reservoirs and/or fluid content, in order to evaluate resource quantities and monitor production.
SEM46 is based on a spectral finite-element method, which naturally handles complex topography and bathymetry and so helps to tackle the difficulties of subsurface characterization for complex onshore, shallow-offshore and deep-offshore targets. Within the ENERXICO project, SEM46 will be used in geophysical forward modelling to simulate seismic waves in complex settings, in comparison with ExaHyPE, SeisSol and BSIT. SEM46 will also be compared with BSIT for seismic inversion.
The audit of SEM46 shows very good efficiencies for both input cases (small and big). The analysis identified a relatively low IPC (instructions per cycle) as the weakest factor, which should therefore be the main target for optimizations of the code. For large-scale executions, MPI synchronisation should also be improved by reducing unnecessary MPI barrier calls. ENERXICO has started to optimize the code.
SeisSol is a software package for simulating wave propagation and dynamic rupture based on the arbitrary high-order accurate derivative discontinuous Galerkin method (ADER-DG). Characteristics of the SeisSol simulation software are:
• use of tetrahedral meshes to approximate complex 3D model geometries (faults & topography) and rapid model generation.
• use of elastic, viscoelastic and viscoplastic material to approximate realistic geological subsurface properties.
SeisSol is already well optimised for current Intel architectures. However, it is important to know how it behaves on AMD and ARM processors. Optimisations may need to be planned to better suit these new HPC architectures now on the market, so that the code can continue to be used with optimal performance. Optimisations are already planned because SeisSol's features will be extended to include more sophisticated and realistic material models; the aim is to preserve its high performance and scaling properties.
Initial results of ENERXICO's optimisation tasks will be presented in a report on intranode and multi-node optimisations for HPC codes. This intermediate milestone will make it possible to share the partners' methodology and the improvements (speedup) achieved.
A final report on enabling computational and energy efficient codes for the exascale is also planned. It will gather all the results, review performance measurements and improvements related to computational and energy efficiency bottlenecks in HPC codes to exploit exascale capabilities.