Results April 21, 2006
Dense Matrix Multiplication Using MPI

The code used in this test solves the system A*x = b when the matrix A is dense (full of values). The parallel programming standard used was MPI. The code was developed by Paul Sexton (see the people section). The results obtained are shown in Table 2, with the most interesting numbers highlighted in red. The analysis is at the bottom of this page.
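To make the problem concrete, here is a minimal serial sketch of solving a dense system A*x = b by Gaussian elimination with partial pivoting. It is not the benchmark code, and the page does not say which algorithm that code uses or how it distributes work over MPI processes. The function name `solve_dense` is hypothetical, and reading NEQ as the number of equations is an assumption.

```python
def solve_dense(a, b):
    """Solve a*x = b for a dense square matrix by Gaussian elimination.

    Illustrative serial sketch only; the algorithm used by the
    benchmark code is not stated in the source.
    """
    neq = len(b)  # assumed meaning of NEQ: number of equations

    # Work on copies so the caller's data is not modified.
    a = [row[:] for row in a]
    b = b[:]

    for k in range(neq):
        # Partial pivoting: choose the row with the largest entry in column k.
        pivot = max(range(k, neq), key=lambda r: abs(a[r][k]))
        if a[pivot][k] == 0.0:
            raise ValueError("matrix is singular")
        a[k], a[pivot] = a[pivot], a[k]
        b[k], b[pivot] = b[pivot], b[k]

        # Eliminate column k from every row below the pivot row.
        for i in range(k + 1, neq):
            factor = a[i][k] / a[k][k]
            for j in range(k, neq):
                a[i][j] -= factor * a[k][j]
            b[i] -= factor * b[k]

    # Back substitution on the resulting upper-triangular system.
    x = [0.0] * neq
    for i in range(neq - 1, -1, -1):
        s = sum(a[i][j] * x[j] for j in range(i + 1, neq))
        x[i] = (b[i] - s) / a[i][i]
    return x


# Small usage example: 2x + y = 3 and x + 3y = 5.
solution = solve_dense([[2.0, 1.0], [1.0, 3.0]], [3.0, 5.0])
print(solution)
```

The elimination loops do on the order of NEQ^3 arithmetic operations, so the work grows quickly with the problem size.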
Analysis

The performance of the system scales up to 32 processes, which shows that the system behaves like a parallel machine that supports multiple processes. The maximum performance, 1236.56 MIPOS, is reached when the number of processes is 32 and NEQ is equal to 1024. At that point the speedup is 20: the code runs 20 times faster than its serial version. Another remarkable aspect is that, with NEQ fixed at 1024, the speedup grows almost linearly with the number of parallel tasks as the number of processes increases. The maximum SWaP was also obtained at 32 processes and NEQ equal to 1024, considering a performance of 1236.56 MIPOS, a space of 2 RU (rack units), and a power consumption of 300 watts.
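The figures in this analysis can be checked with a few lines of arithmetic. The sketch below computes the parallel efficiency (speedup divided by number of processes) and a SWaP value. The SWaP formula used here, performance / (space × power), is an assumption because the page does not spell out the definition or the resulting value. The function names are hypothetical.

```python
def parallel_efficiency(speedup, processes):
    """Fraction of ideal linear speedup actually achieved."""
    return speedup / processes


def swap_metric(performance, space_ru, power_watts):
    """SWaP under the ASSUMED definition performance / (space * power)."""
    return performance / (space_ru * power_watts)


# Values reported in the analysis: 32 processes, NEQ = 1024.
efficiency = parallel_efficiency(speedup=20, processes=32)
swap = swap_metric(performance=1236.56, space_ru=2, power_watts=300)

print(f"parallel efficiency: {efficiency:.3f}")  # 20 / 32 = 0.625
print(f"SWaP (assumed formula): {swap:.3f} MIPOS per RU per watt")
```

An efficiency of 0.625 means that each of the 32 processes contributes, on average, about five eighths of what ideal linear scaling would give.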