Infiniband performance/CentOS and openMPI
From Teknologisk videncenter
Latest revision as of 05:24, 28 August 2012
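Moving from 1Gbps Ethernet to 4xQDR Infiniband (still using the TCP transport, on interface ib0) raised the maximum rate from approx. 114 MB/sec (23 us latency) to approx. 676 MB/sec (16 us latency). That is not quite good enough; work continues with Infiniband PSM.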
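To put the two headline figures in proportion, the short sketch below works out the speedup and compares each result with the nominal link rate. The measured values are the ones quoted above; the nominal rates (125 MB/sec for 1 Gbit/s Ethernet, and 4000 MB/sec for 4x QDR, whose 40 Gbit/s signalling rate carries 32 Gbit/s of data after 8b/10b encoding) are assumptions added for comparison and were not measured here. The function and constant names are illustrative only.

```python
# Headline figures quoted in the summary (approximate, MB = 10^6 bytes).
ETH_RATE_MB_S = 114.0    # max rate over 1 Gbps Ethernet
IB_RATE_MB_S = 676.0     # max rate over ib0 with the TCP transport
ETH_LATENCY_US = 23.0
IB_LATENCY_US = 16.0

# Assumed nominal link rates, added for comparison only (not measured):
# 1 Gbit/s Ethernet = 125 MB/s; 4x QDR = 32 Gbit/s data rate = 4000 MB/s.
ETH_LINE_MB_S = 1e9 / 8 / 1e6
QDR_DATA_MB_S = 32e9 / 8 / 1e6


def summarize():
    """Return the ratios implied by the headline figures."""
    return {
        "speedup": IB_RATE_MB_S / ETH_RATE_MB_S,
        "latency_ratio": IB_LATENCY_US / ETH_LATENCY_US,
        "eth_utilisation": ETH_RATE_MB_S / ETH_LINE_MB_S,
        "ib_utilisation": IB_RATE_MB_S / QDR_DATA_MB_S,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:.3f}")
```

Under these assumptions the Ethernet run sits close to its nominal line rate, while the ib0 run uses only a small fraction of what the link could carry, which is consistent with the "not quite good enough" verdict.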
Over 1Gbps Ethernet
[root@centos1 bin]# mpiexec --mca btl ^openib --mca btl_tcp_if_include eth0 -H 10.0.1.102,10.0.1.101 /home/pong
Hello from 1 of 2
Hello from 0 of 2
Timer accuracy of ~0.953674 usecs
8 bytes took 121 usec ( 0.132 MB/sec)
16 bytes took 58 usec ( 0.552 MB/sec)
32 bytes took 60 usec ( 1.065 MB/sec)
64 bytes took 239 usec ( 0.536 MB/sec)
128 bytes took 256 usec ( 1.000 MB/sec)
256 bytes took 94 usec ( 5.450 MB/sec)
512 bytes took 268 usec ( 3.818 MB/sec)
1024 bytes took 151 usec ( 13.570 MB/sec)
2048 bytes took 311 usec ( 13.165 MB/sec)
4096 bytes took 444 usec ( 18.443 MB/sec)
8192 bytes took 272 usec ( 60.227 MB/sec)
16384 bytes took 424 usec ( 77.256 MB/sec)
32768 bytes took 687 usec ( 95.377 MB/sec)
65536 bytes took 1693 usec ( 77.419 MB/sec)
131072 bytes took 2788 usec ( 94.024 MB/sec)
262144 bytes took 5031 usec ( 104.209 MB/sec)
524288 bytes took 9545 usec ( 109.855 MB/sec)
1048576 bytes took 18478 usec ( 113.495 MB/sec)
Asynchronous ping-pong
8 bytes took 56 usec ( 0.286 MB/sec)
16 bytes took 49 usec ( 0.652 MB/sec)
32 bytes took 54 usec ( 1.183 MB/sec)
64 bytes took 258 usec ( 0.496 MB/sec)
128 bytes took 243 usec ( 1.054 MB/sec)
256 bytes took 246 usec ( 2.081 MB/sec)
512 bytes took 293 usec ( 3.495 MB/sec)
1024 bytes took 292 usec ( 7.012 MB/sec)
2048 bytes took 227 usec ( 18.046 MB/sec)
4096 bytes took 381 usec ( 21.502 MB/sec)
8192 bytes took 252 usec ( 65.014 MB/sec)
16384 bytes took 405 usec ( 80.894 MB/sec)
32768 bytes took 682 usec ( 96.111 MB/sec)
65536 bytes took 1555 usec ( 84.293 MB/sec)
131072 bytes took 2936 usec ( 89.282 MB/sec)
262144 bytes took 4778 usec ( 109.732 MB/sec)
524288 bytes took 9541 usec ( 109.902 MB/sec)
1048576 bytes took 18320 usec ( 114.473 MB/sec)
Bi-directional asynchronous ping-pong
8 bytes took 47 usec ( 0.339 MB/sec)
16 bytes took 51 usec ( 0.627 MB/sec)
32 bytes took 48 usec ( 1.335 MB/sec)
64 bytes took 219 usec ( 0.584 MB/sec)
128 bytes took 201 usec ( 1.274 MB/sec)
256 bytes took 255 usec ( 2.007 MB/sec)
512 bytes took 265 usec ( 3.866 MB/sec)
1024 bytes took 318 usec ( 6.439 MB/sec)
2048 bytes took 381 usec ( 10.751 MB/sec)
4096 bytes took 200 usec ( 40.953 MB/sec)
8192 bytes took 254 usec ( 64.525 MB/sec)
16384 bytes took 421 usec ( 77.825 MB/sec)
32768 bytes took 735 usec ( 89.159 MB/sec)
65536 bytes took 1538 usec ( 85.220 MB/sec)
131072 bytes took 3177 usec ( 82.509 MB/sec)
262144 bytes took 5665 usec ( 92.551 MB/sec)
524288 bytes took 11319 usec ( 92.639 MB/sec)
1048576 bytes took 28445 usec ( 73.727 MB/sec)
Max rate = 114.472840 MB/sec Min latency = 23.603439 usec
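The MB/sec column in the output above appears to count the message once in each direction of the round trip, i.e. rate = 2 x bytes / elapsed usec, with 1 MB = 10^6 bytes. This is inferred from the printed numbers only; the source of the pong program is not shown here, so treat the formula as an assumption. The sketch below checks it against a few rows copied from the output (the helper name is illustrative).

```python
def pingpong_rate_mb_s(nbytes, usec):
    """Assumed formula: message sent there and back, bytes/usec == MB/sec."""
    return 2 * nbytes / usec


# (bytes, usec, reported MB/sec) rows copied from the Ethernet run above.
SAMPLES = [
    (1048576, 18478, 113.495),
    (1048576, 18320, 114.473),
    (8192, 252, 65.014),
]


def max_relative_error(samples):
    """Largest relative gap between the assumed formula and the printed rate."""
    return max(
        abs(pingpong_rate_mb_s(nbytes, usec) - reported) / reported
        for nbytes, usec, reported in samples
    )


if __name__ == "__main__":
    for nbytes, usec, reported in SAMPLES:
        computed = pingpong_rate_mb_s(nbytes, usec)
        print(f"{nbytes:8d} bytes: computed {computed:8.3f}, reported {reported:8.3f}")
```

Small differences are expected, since the printed usec values are rounded to whole microseconds.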
Over 4xQDR
[root@centos1 bin]# mpiexec --mca btl ^openib \
    --mca btl_tcp_if_include ib0 -H 10.0.1.102,10.0.1.101 /home/pong
Hello from 1 of 2
Hello from 0 of 2
Timer accuracy of ~1.192093 usecs
8 bytes took 129 usec ( 0.124 MB/sec)
16 bytes took 79 usec ( 0.405 MB/sec)
32 bytes took 97 usec ( 0.660 MB/sec)
64 bytes took 104 usec ( 1.231 MB/sec)
128 bytes took 80 usec ( 3.196 MB/sec)
256 bytes took 72 usec ( 7.111 MB/sec)
512 bytes took 53 usec ( 19.347 MB/sec)
1024 bytes took 91 usec ( 22.487 MB/sec)
2048 bytes took 128 usec ( 31.992 MB/sec)
4096 bytes took 134 usec ( 61.138 MB/sec)
8192 bytes took 94 usec ( 174.415 MB/sec)
16384 bytes took 152 usec ( 215.422 MB/sec)
32768 bytes took 365 usec ( 179.541 MB/sec)
65536 bytes took 581 usec ( 225.587 MB/sec)
131072 bytes took 1296 usec ( 202.265 MB/sec)
262144 bytes took 2255 usec ( 232.504 MB/sec)
524288 bytes took 3513 usec ( 298.476 MB/sec)
1048576 bytes took 4194 usec ( 500.034 MB/sec)
Asynchronous ping-pong
8 bytes took 44 usec ( 0.363 MB/sec)
16 bytes took 34 usec ( 0.945 MB/sec)
32 bytes took 38 usec ( 1.688 MB/sec)
64 bytes took 34 usec ( 3.754 MB/sec)
128 bytes took 37 usec ( 6.927 MB/sec)
256 bytes took 42 usec ( 12.202 MB/sec)
512 bytes took 33 usec ( 31.123 MB/sec)
1024 bytes took 42 usec ( 48.806 MB/sec)
2048 bytes took 47 usec ( 86.767 MB/sec)
4096 bytes took 54 usec ( 152.034 MB/sec)
8192 bytes took 82 usec ( 199.766 MB/sec)
16384 bytes took 110 usec ( 298.132 MB/sec)
32768 bytes took 200 usec ( 327.626 MB/sec)
65536 bytes took 322 usec ( 406.925 MB/sec)
131072 bytes took 515 usec ( 509.033 MB/sec)
262144 bytes took 894 usec ( 586.563 MB/sec)
524288 bytes took 1666 usec ( 629.371 MB/sec)
1048576 bytes took 3102 usec ( 676.050 MB/sec)
Bi-directional asynchronous ping-pong
8 bytes took 39 usec ( 0.409 MB/sec)
16 bytes took 41 usec ( 0.780 MB/sec)
32 bytes took 35 usec ( 1.839 MB/sec)
64 bytes took 34 usec ( 3.781 MB/sec)
128 bytes took 38 usec ( 6.711 MB/sec)
256 bytes took 32 usec ( 16.026 MB/sec)
512 bytes took 36 usec ( 28.443 MB/sec)
1024 bytes took 36 usec ( 56.887 MB/sec)
2048 bytes took 55 usec ( 74.695 MB/sec)
4096 bytes took 98 usec ( 83.600 MB/sec)
8192 bytes took 135 usec ( 121.198 MB/sec)
16384 bytes took 145 usec ( 226.051 MB/sec)
32768 bytes took 241 usec ( 271.887 MB/sec)
65536 bytes took 563 usec ( 232.849 MB/sec)
131072 bytes took 1331 usec ( 196.974 MB/sec)
262144 bytes took 2535 usec ( 206.831 MB/sec)
524288 bytes took 4167 usec ( 251.648 MB/sec)
1048576 bytes took 6992 usec ( 299.932 MB/sec)
Max rate = 676.050497 MB/sec Min latency = 15.974045 usec
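When comparing runs like the two above, it can help to pull the peak rate of each test out of the saved output instead of reading the columns by eye. The following is a minimal sketch that assumes the output has been captured as text in the format shown on this page. The first test carries no header line in the output, so the label "(first test)" is a placeholder chosen here, and the function name is illustrative.

```python
import re

# Matches lines such as "1048576 bytes took 3102 usec ( 676.050 MB/sec)".
LINE_RE = re.compile(r"^\s*(\d+) bytes took\s+(\d+) usec \(\s*([\d.]+) MB/sec\)")


def peak_rates(text, first_section="(first test)"):
    """Return the highest reported MB/sec for each test section."""
    peaks = {}
    section = first_section  # placeholder label: the first test has no header
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            rate = float(match.group(3))
            peaks[section] = max(peaks.get(section, 0.0), rate)
        elif "ping-pong" in line:
            # Header lines such as "Asynchronous ping-pong" start a new section.
            section = line.strip()
    return peaks


# A few rows copied from the 4xQDR run above.
SAMPLE = """\
Timer accuracy of ~1.192093 usecs
524288 bytes took 3513 usec ( 298.476 MB/sec)
1048576 bytes took 4194 usec ( 500.034 MB/sec)
Asynchronous ping-pong
524288 bytes took 1666 usec ( 629.371 MB/sec)
1048576 bytes took 3102 usec ( 676.050 MB/sec)
Bi-directional asynchronous ping-pong
1048576 bytes took 6992 usec ( 299.932 MB/sec)
"""

if __name__ == "__main__":
    for name, rate in peak_rates(SAMPLE).items():
        print(f"{name}: {rate:.3f} MB/sec")
```

On the rows shown, the highest rate comes from the asynchronous test, which matches the "Max rate" line printed at the end of the run.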