Hybrid Parallel Computing in CMPS

Hybrid Parallel Computing and HPC Methods in the CMPS Solver: Distributed Memory, Shared Memory, Hybrid, and GPU Computing 

Computational Fluid Dynamics (CFD) solvers help us understand and simulate complex flow phenomena in engineering and scientific research. The CMPS CFD solver offers an effective solution in this field with its physics-based modeling and parallel computing approaches.

Scalability Challenges in CFD 

CFD simulations must resolve a wide range of spatial and temporal scales, which drives up the demand for computational resources. Scalability, the ability of a solver to make efficient use of a growing number of processors, is therefore crucial for solving complex problems. CMPS addresses this need with parallelization strategies optimized for HPC environments.
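As a rough illustration of why scalability matters, the sketch below evaluates Amdahl's law, a standard textbook model rather than anything specific to CMPS. The function names and the 1% serial fraction are assumptions chosen only for the example.

```python
def amdahl_speedup(serial_fraction: float, n_procs: int) -> float:
    """Ideal speedup on n_procs when serial_fraction of the work stays serial."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_procs)


def parallel_efficiency(serial_fraction: float, n_procs: int) -> float:
    """Speedup divided by processor count; 1.0 means perfect scaling."""
    return amdahl_speedup(serial_fraction, n_procs) / n_procs


if __name__ == "__main__":
    # Hypothetical case: 1% of the run time cannot be parallelized.
    for procs in (1, 16, 256):
        s = amdahl_speedup(0.01, procs)
        e = parallel_efficiency(0.01, procs)
        print(f"{procs:4d} processors: speedup {s:6.1f}, efficiency {e:.2f}")
```

Even a small serial share limits the achievable speedup at high processor counts, which is why the parallel portion of a solver needs to be as large and as well balanced as possible.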

[Figure: Domain Decomposition in CMPS]

Parallel Computing in CMPS 

CMPS is designed to be compatible with various platforms, including current supercomputer clusters, desktop systems, shared memory systems, and multi-core architectures. It also integrates effectively with high-speed networks like InfiniBand and Myrinet. During its development, both cluster systems and shared memory architectures were supported. 

[Figure: Distributed Architecture]

Distributed Memory Methods in CMPS 

Scaling a single shared memory system to a very large number of CPU cores is difficult. As a result, modern supercomputers adopt a distributed memory strategy in which many compute nodes, each with its own memory, are connected via high-speed networks.

Launching a distributed solver run in CMPS is straightforward: users enter the names of the nodes through the GUI. CMPS then balances the load across these nodes and automatically sends each node the data it needs for its share of the computation. This splitting of the problem across nodes is known as domain decomposition.

CMPS employs algorithms for domain decomposition and parallel solution strategies to ensure efficient computation. 
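The idea of domain decomposition can be sketched with a simple 1D block partition. This is a minimal illustration under stated assumptions, not the algorithm CMPS uses: the functions `block_partition` and `halo_cells` are hypothetical, and real CFD meshes are usually partitioned with graph-based methods.

```python
def block_partition(n_cells: int, n_ranks: int) -> list[range]:
    """Split cell indices 0..n_cells-1 into contiguous blocks, one per rank.

    Block sizes differ by at most one cell, so the load is balanced.
    """
    base, extra = divmod(n_cells, n_ranks)
    blocks = []
    start = 0
    for rank in range(n_ranks):
        size = base + (1 if rank < extra else 0)
        blocks.append(range(start, start + size))
        start += size
    return blocks


def halo_cells(block: range, n_cells: int) -> list[int]:
    """Cells owned by neighboring ranks that a 1D nearest-neighbor stencil needs."""
    halo = []
    if len(block) == 0:
        return halo
    if block.start > 0:
        halo.append(block.start - 1)   # left neighbor's last cell
    if block.stop < n_cells:
        halo.append(block.stop)        # right neighbor's first cell
    return halo


if __name__ == "__main__":
    for rank, block in enumerate(block_partition(10, 3)):
        print(rank, list(block), "halo:", halo_cells(block, 10))
```

Each rank computes on its own block and only needs the halo cells from its neighbors, which is the data a distributed solver has to exchange between nodes at every step.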

Hybrid Parallel and Shared Memory Methods in CMPS 

Shared memory systems allow multiple processors to use a common memory pool, which accelerates data access and inter-processor communication. However, mechanisms like locks or barriers may be required for processors attempting to access or modify the same data. 
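A minimal sketch of the two mechanisms mentioned above, using Python threads as a stand-in for processors that share memory. The function `run_shared_sum` and its layout are assumptions made for illustration; they do not describe CMPS internals.

```python
import threading


def run_shared_sum(values, n_threads=4):
    """Sum `values` with several threads that update one shared accumulator."""
    total = {"value": 0}                    # state shared by all threads
    lock = threading.Lock()                 # protects updates to total
    barrier = threading.Barrier(n_threads)  # waits until every thread has contributed
    seen = [None] * n_threads

    def worker(tid):
        partial = sum(values[tid::n_threads])  # each thread reduces its own slice
        with lock:                             # only one thread modifies total at a time
            total["value"] += partial
        barrier.wait()                         # after this, all partial sums are in
        seen[tid] = total["value"]             # every thread now reads the same result

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total["value"], seen


if __name__ == "__main__":
    print(run_shared_sum(list(range(100)), n_threads=4))
```

The lock prevents two threads from modifying the accumulator at the same time, and the barrier ensures no thread reads the result before all contributions have been added.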

[Figure: Solution Algorithm]

[Figure: Shared Memory Method]

In MPI-based models, each process owns its data and exchanges it through explicit messages. Compared to shared memory systems, this requires less synchronization on shared data and reduces errors during data transfers, offering a more reliable solution in complex systems.

CMPS adopts a hybrid approach that combines both models: processors within a shared memory system communicate through the common memory, while separate physical machines are connected through high-speed networks. This improves both scalability and performance.
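The two-level structure of a hybrid scheme can be sketched as follows. This is only an illustration under loud assumptions: the "nodes" are simulated by a plain loop in a single process, the threads stand in for cores sharing memory, and the names `split`, `node_partial_sum`, and `hybrid_sum` are hypothetical. In a real hybrid code the outer level would run as separate processes that combine results through message passing.

```python
from concurrent.futures import ThreadPoolExecutor


def split(n_items, n_parts):
    """Return (start, stop) bounds of n_parts balanced contiguous blocks."""
    base, extra = divmod(n_items, n_parts)
    bounds, start = [], 0
    for part in range(n_parts):
        stop = start + base + (1 if part < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


def node_partial_sum(data, start, stop, n_threads):
    """Inner level: threads on one 'node' share `data` and reduce their own slices."""
    slices = [(start + a, start + b) for a, b in split(stop - start, n_threads)]
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        return sum(pool.map(lambda ab: sum(data[ab[0]:ab[1]]), slices))


def hybrid_sum(data, n_nodes, threads_per_node):
    """Outer level: one block per node; the final sum stands in for a reduction
    that a distributed code would perform over the network."""
    partials = [
        node_partial_sum(data, a, b, threads_per_node)
        for a, b in split(len(data), n_nodes)
    ]
    return sum(partials)


if __name__ == "__main__":
    print(hybrid_sum(list(range(1000)), n_nodes=3, threads_per_node=4))
```

Only one partial result per node crosses the outer level, so the amount of inter-node traffic depends on the number of nodes rather than the total number of cores.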

MPI Buffer Optimization in CMPS 

When non-basic data types are communicated with MPI, the data must be packed into buffer memory before sending and unpacked on the receiving node, which adds work to every transfer. CMPS optimizes this process by keeping buffer memory allocations persistent and minimizing repetitive operations. This accelerates data transfers and enhances overall performance.
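The benefit of a persistent buffer can be illustrated without MPI by packing structured records into a byte buffer that is allocated once and reused for every transfer. This is a sketch under stated assumptions: the class `PersistentPackBuffer` and the record layout (one integer cell id plus three doubles) are hypothetical and do not describe the actual CMPS data format.

```python
import struct


class PersistentPackBuffer:
    """Packs (cell_id, (u, v, w)) records into a buffer that is allocated once."""

    RECORD = struct.Struct("<i3d")  # assumed layout: int32 id + three float64 values

    def __init__(self, max_records):
        self._max = max_records
        self._buf = bytearray(self.RECORD.size * max_records)  # persistent allocation

    def pack(self, records):
        """Write records into the reused buffer and return a view of the used bytes."""
        if len(records) > self._max:
            raise ValueError("too many records for the preallocated buffer")
        offset = 0
        for cell_id, (u, v, w) in records:
            self.RECORD.pack_into(self._buf, offset, cell_id, u, v, w)
            offset += self.RECORD.size
        return memoryview(self._buf)[:offset]  # no copy, no new allocation

    @classmethod
    def unpack(cls, payload):
        """Receiver side: rebuild the records from the packed bytes."""
        return [(r[0], (r[1], r[2], r[3])) for r in cls.RECORD.iter_unpack(payload)]


if __name__ == "__main__":
    buffer = PersistentPackBuffer(max_records=8)
    payload = buffer.pack([(1, (0.5, 1.0, -2.0)), (7, (3.0, 0.0, 4.25))])
    print(len(payload), PersistentPackBuffer.unpack(payload))
```

Because the same buffer is reused for each exchange, the per-transfer cost is limited to the packing itself rather than repeated allocation and release of memory.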

GPU Computing 

GPU (Graphics Processing Unit) computing is used experimentally in the CMPS solver for specific matrix solver routines and other computation-intensive operations, but it will not be available in the initial release. Thanks to their many parallel cores, GPUs can perform highly parallel, computation-heavy tasks faster than CPUs.

With GPU optimization, CMPS reduces computation times for large-scale problems while improving energy efficiency.