Dense Matrix Multiplication Using OpenMP
The code used in this test solves the system A*x = b when the matrix A is
dense, that is, full of nonzero values. The parallel programming standard
used was OpenMP, and the code came from the Teuchos package from
Sandia National Laboratories. The results are shown in Table 3.
The most interesting numbers are shown in red. The analysis is at the
bottom of this page.
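As a rough illustration of the kind of kernel such a test exercises, here is a minimal sketch of a dense matrix product parallelized with OpenMP, plus a timing helper. It is not the Teuchos code: the names `dense_multiply` and `measure_miops` are hypothetical, and reading MIOPS as "millions of operations per second" with 2*n^3 operations per product is an assumption, since the page does not define how the figure was counted.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helper, not taken from Teuchos: computes C = A * B for
// square n x n matrices stored row-major in flat vectors.
void dense_multiply(const std::vector<double>& a,
                    const std::vector<double>& b,
                    std::vector<double>& c,
                    int n, int num_threads) {
    assert(n > 0 && num_threads > 0);
    const std::size_t size =
        static_cast<std::size_t>(n) * static_cast<std::size_t>(n);
    assert(a.size() == size && b.size() == size);
    c.assign(size, 0.0);

    // Each thread owns whole rows of C, so no two threads write the same
    // element. When OpenMP is not enabled the loop simply runs serially.
#ifdef _OPENMP
    #pragma omp parallel for num_threads(num_threads)
#endif
    for (int i = 0; i < n; ++i) {
        const std::size_t row = static_cast<std::size_t>(i) * n;
        for (int k = 0; k < n; ++k) {
            const double aik = a[row + k];
            const std::size_t brow = static_cast<std::size_t>(k) * n;
            for (int j = 0; j < n; ++j) {
                c[row + j] += aik * b[brow + j];
            }
        }
    }
}

// Times one product and returns millions of operations per second,
// assuming 2*n^3 floating-point operations per product (an assumption;
// the source does not say how MIOPS was counted).
double measure_miops(int n, int num_threads) {
    assert(n > 0);
    const std::size_t size =
        static_cast<std::size_t>(n) * static_cast<std::size_t>(n);
    std::vector<double> a(size, 1.0), b(size, 2.0), c;
    const auto start = std::chrono::steady_clock::now();
    dense_multiply(a, b, c, n, num_threads);
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    const double ops = 2.0 * n * n * static_cast<double>(n);
    // Guard against a zero reading from a coarse clock on tiny problems.
    return elapsed.count() > 0.0 ? ops / elapsed.count() / 1.0e6 : 0.0;
}
```

Calling `measure_miops(n, p)` for each matrix size and thread count would fill a grid shaped like the one reported here, although the absolute numbers depend on the machine, the compiler flags, and the kernel actually used.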
MIOPS, by number of threads (columns)

| NEQ | 1 | 2 | 4 | 8 | 16 | 32 | 64 | 128 | 256 | 512 | 1024 | 2048 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 32 | 70.7039 | 129.444 | 201.283 | 244.909 | 187.967 | 27.8215 | 25.5386 | 25.2984 | 25.4565 | 28.265 | 27.1987 | 26.3786 |
| 64 | 64.4648 | 126.876 | 242.889 | 444.411 | 553.656 | 175.209 | 175.339 | 170.509 | 165.263 | 178.959 | 192.797 | 163.113 |
| 128 | 63.8348 | 127.19 | 252.011 | 477.465 | 741.955 | 545.474 | 702.764 | 822.704 | 749.806 | 595.638 | 682.043 | 964.883 |
| 256 | 63.7332 | 127.153 | 247.775 | 493.825 | 867.528 | 1020.25 | 1013.72 | 1025.6 | 1152.31 | 779.991 | 849.307 | 1023.03 |
| 512 | 62.8613 | 125.557 | 250.251 | 494.066 | 857.547 | 1222.63 | 1222.51 | 1203.83 | 1221.83 | 1152.75 | 1220.77 | 1221.63 |
| 1024 | 60.2081 | 120.488 | 240.937 | 480.051 | 832.174 | 1242.68 | 1252.92 | 1251.34 | 1253.52 | 1233.78 | 1212.34 | 1250.67 |

Speedup, by number of threads (columns)

| NEQ | 2 | 4 | 8 | 16 | 32 | 64 | 128 | 256 | 512 | 1024 | 2048 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 32 | 1.83 | 2.85 | 3.46 | 2.66 | 0.39 | 0.36 | 0.36 | 0.36 | 0.40 | 0.38 | 0.37 |
| 64 | 1.97 | 3.77 | 6.89 | 8.59 | 2.72 | 2.72 | 2.64 | 2.56 | 2.78 | 2.99 | 2.53 |
| 128 | 1.99 | 3.95 | 7.48 | 11.62 | 8.55 | 11.01 | 12.89 | 11.75 | 9.33 | 10.68 | 15.12 |
| 256 | 2.00 | 3.89 | 7.75 | 13.61 | 16.01 | 15.91 | 16.09 | 18.08 | 12.24 | 13.33 | 16.05 |
| 512 | 2.00 | 3.98 | 7.86 | 13.64 | 19.45 | 19.45 | 19.15 | 19.44 | 18.34 | 19.42 | 19.43 |
| 1024 | 2.00 | 4.00 | 7.97 | 13.82 | 20.64 | 20.81 | 20.78 | 20.82 | 20.49 | 20.14 | 20.77 |

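The speedup figures are consistent with dividing each MIOPS value by the single-thread value in the same NEQ row. The sketch below shows that arithmetic. The names `ScalingPoint` and `scaling_from_rates` are hypothetical, and efficiency is computed with the conventional definition (speedup divided by thread count), which is an assumption about how the page derived it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One derived data point for a given thread count (hypothetical type).
struct ScalingPoint {
    int threads;
    double speedup;     // rate(p) / rate(1)
    double efficiency;  // speedup / p (assumed, conventional definition)
};

// Derives speedup and efficiency from measured rates for one NEQ row.
// threads[0] must be 1 so that rates[0] is the single-thread baseline.
std::vector<ScalingPoint> scaling_from_rates(const std::vector<int>& threads,
                                             const std::vector<double>& rates) {
    assert(!threads.empty() && threads.size() == rates.size());
    assert(threads[0] == 1 && rates[0] > 0.0);
    std::vector<ScalingPoint> out;
    out.reserve(threads.size());
    for (std::size_t i = 0; i < threads.size(); ++i) {
        const double speedup = rates[i] / rates[0];
        out.push_back({threads[i], speedup, speedup / threads[i]});
    }
    return out;
}
```

For the NEQ = 1024 row, the rates 60.2081 (1 thread) and 1253.52 (256 threads) give 1253.52 / 60.2081 ≈ 20.82, which matches the tabulated speedup. Under the assumed definition, the efficiency at 256 threads would be about 20.82 / 256 ≈ 0.08.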
Efficiency |
Number of Threads |
|
|
|