Toine van Wonderen & Leco Arrindell

MPI Wave Simulation



Introduction

MPI (Message Passing Interface) is a standard for writing parallel programs in C. Multiple processes each execute a part of the program simultaneously. The processes do not share memory, so they communicate and share data by sending messages to each other.
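As a minimal illustration of this message-passing model (our own generic example, not part of the assignment code), the sketch below shows two processes exchanging a single value with MPI_Send and MPI_Recv; the variable names and the tag value are arbitrary choices for the example.

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* id of this process */
    MPI_Comm_size(MPI_COMM_WORLD, &size);  /* total number of processes */

    double value = 0.0;
    if (rank == 0 && size > 1) {
        value = 3.14;
        /* send one double to process 1; tag 0 identifies the message */
        MPI_Send(&value, 1, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        /* receive the double sent by process 0 */
        MPI_Recv(&value, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("process %d received %f\n", rank, value);
    }

    MPI_Finalize();
    return 0;
}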

In this article we will look into running a wave equation simulation defined by:
A_{i,t+1} = 2*A_{i,t} - A_{i,t-1} + c * (A_{i-1,t} - (2*A_{i,t} - A_{i+1,t}))
Each process performs its part of the computation in every iteration, until the full wave simulation is complete. Between iterations the processes exchange messages to pass the current state of the wave at their boundaries to their neighbours, so that in each iteration every process updates the wave equation with the latest calculations of the neighbouring processes. A generic sketch of such an exchange follows below.
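To make the per-iteration exchange concrete, here is a textbook-style halo-exchange sketch (our own illustrative code, not the assignment implementation). Each process owns a contiguous slice of the wave, trades its boundary values with its left and right neighbours using MPI_Sendrecv, and then applies the update rule above. The array names, the local size n_local, and the constant c are assumptions made for this example.

#include <mpi.h>

/* One simulation step on a local slice old[1..n_local] / cur[1..n_local],
 * where index 0 and n_local+1 act as halo cells holding neighbour values.
 * left and right are neighbour ranks (MPI_PROC_NULL at the domain edges). */
void step(double *old, double *cur, double *next, int n_local,
          double c, int left, int right) {
    /* send our leftmost value to the left neighbour,
     * receive the right neighbour's leftmost value into our right halo */
    MPI_Sendrecv(&cur[1], 1, MPI_DOUBLE, left, 0,
                 &cur[n_local + 1], 1, MPI_DOUBLE, right, 0,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    /* send our rightmost value to the right neighbour,
     * receive the left neighbour's rightmost value into our left halo */
    MPI_Sendrecv(&cur[n_local], 1, MPI_DOUBLE, right, 1,
                 &cur[0], 1, MPI_DOUBLE, left, 1,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);

    /* apply the wave update to every interior point of the local slice */
    for (int i = 1; i <= n_local; i++) {
        next[i] = 2.0 * cur[i] - old[i]
                + c * (cur[i - 1] - (2.0 * cur[i] - cur[i + 1]));
    }
}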

Code

This article is part of a university assignment, so we cannot share our code or other implementation specifics: doing so could lead to plagiarism, as it would expose our specific implementation. Independent problem-solving and coding skills are crucial to the educational process, and sharing our code publicly might undermine that objective. (For the same reason, this page will not be indexed by bona fide search engines.)

Test results

These results are based on tests performed on the DAS-5 (Distributed ASCI Supercomputer) located at the University of Amsterdam. The tests were run automatically through a custom Python script. All times are reported in seconds and cover only the simulation itself. After all simulations, the speedup was calculated relative to running the test sequentially with one process on one node.
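For reference, the speedup reported here is the usual ratio of sequential to parallel run time: with p processes, S(p) = T_1 / T_p, where T_1 is the measured time of the single-node, single-process run and T_p the measured time of the parallel run.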

Plot: speedup of all combined tests relative to 1 node with 1 process, where 2x4 means 2 nodes with 4 processes per node.

The above plot shows all tests performed for the different combinations of nodes and processes per node, where every point represents the average of 5 individual runs. The speedup increases considerably as more nodes and threads are added, but eventually, when the task reaches around 10M data points, all configurations become slightly to considerably slower than the benchmark. This could be the result of factors such as parallelization overhead, communication, or load balancing impacting the efficiency.

These results show that using MPI for parallel code execution brings major performance improvements compared to running the simulation sequentially, with higher gains as more nodes and processes per node are used.