Difference between revisions of "Beowulf Aug2010/Installation/Test Applicationer"

From Teknologisk videncenter

Latest revision as of 14:10, 5 April 2011


Use 100% CPU

The loop below is single-threaded, so it may need to be started several times to load both cores.

#!/bin/bash

while : ; do
true
done
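
Starting the loop once per core can be scripted. The following is a minimal sketch, not part of the original page: the function name `burn_cpus` and its two arguments (number of loops, run time in seconds) are hypothetical, and it assumes bash.

```shell
#!/bin/bash
# burn_cpus CORES SECONDS  (hypothetical helper, not from the original page)
# Starts CORES copies of the busy loop in the background,
# lets them run for SECONDS, then stops them again.
burn_cpus() {
    local cores=$1 seconds=$2 i
    local pids=()
    for ((i = 0; i < cores; i++)); do
        while : ; do true; done &    # same busy loop as above
        pids+=("$!")
    done
    echo "started ${#pids[@]} busy loops"
    sleep "$seconds"
    kill "${pids[@]}"
    # wait reports the signal status of the killed loops; ignore it
    wait "${pids[@]}" 2>/dev/null || true
    echo "stopped"
}
```

For example, `burn_cpus "$(nproc)" 60` would load every core that `nproc` reports for one minute.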

Testing MPI bandwidth

The test was run on our Dell PowerEdge 1750 servers using the onboard network cards, connected through a 3550 100 Mbit/s switch.

Hello from 0 of 2
Timer accuracy of ~0.953674 usecs

Hello from 1 of 2
       8 bytes took       331 usec (   0.048 MB/sec)
      16 bytes took       206 usec (   0.155 MB/sec)
      32 bytes took       213 usec (   0.301 MB/sec)
      64 bytes took       229 usec (   0.559 MB/sec)
     128 bytes took       266 usec (   0.962 MB/sec)
     256 bytes took       334 usec (   1.533 MB/sec)
     512 bytes took       472 usec (   2.170 MB/sec)
    1024 bytes took       758 usec (   2.701 MB/sec)
    2048 bytes took      1112 usec (   3.683 MB/sec)
    4096 bytes took      1402 usec (   5.843 MB/sec)
    8192 bytes took      2138 usec (   7.664 MB/sec)
   16384 bytes took      3555 usec (   9.218 MB/sec)
   32768 bytes took      6403 usec (  10.235 MB/sec)
   65536 bytes took     12508 usec (  10.479 MB/sec)
  131072 bytes took     23635 usec (  11.091 MB/sec)
  262144 bytes took     46029 usec (  11.390 MB/sec)
  524288 bytes took     90590 usec (  11.575 MB/sec)
 1048576 bytes took    179719 usec (  11.669 MB/sec)

  Asynchronous ping-pong

       8 bytes took       212 usec (   0.075 MB/sec)
      16 bytes took       205 usec (   0.156 MB/sec)
      32 bytes took       211 usec (   0.303 MB/sec)
      64 bytes took       237 usec (   0.540 MB/sec)
     128 bytes took       264 usec (   0.969 MB/sec)
     256 bytes took       336 usec (   1.524 MB/sec)
     512 bytes took       470 usec (   2.179 MB/sec)
    1024 bytes took       753 usec (   2.720 MB/sec)
    2048 bytes took      1076 usec (   3.807 MB/sec)
    4096 bytes took      1407 usec (   5.822 MB/sec)
    8192 bytes took      2141 usec (   7.653 MB/sec)
   16384 bytes took      3661 usec (   8.950 MB/sec)
   32768 bytes took      6428 usec (  10.195 MB/sec)
   65536 bytes took     12507 usec (  10.480 MB/sec)
  131072 bytes took     23708 usec (  11.057 MB/sec)
  262144 bytes took     46073 usec (  11.380 MB/sec)
  524288 bytes took     90696 usec (  11.561 MB/sec)
 1048576 bytes took    179756 usec (  11.667 MB/sec)

  Bi-directional asynchronous ping-pong

       8 bytes took       197 usec (   0.081 MB/sec)
      16 bytes took       189 usec (   0.169 MB/sec)
      32 bytes took       196 usec (   0.327 MB/sec)
      64 bytes took       200 usec (   0.640 MB/sec)
     128 bytes took       230 usec (   1.114 MB/sec)
     256 bytes took       324 usec (   1.580 MB/sec)
     512 bytes took       454 usec (   2.256 MB/sec)
    1024 bytes took       735 usec (   2.786 MB/sec)
    2048 bytes took      1075 usec (   3.810 MB/sec)
    4096 bytes took      1421 usec (   5.765 MB/sec)
    8192 bytes took      2187 usec (   7.491 MB/sec)
   16384 bytes took      3630 usec (   9.027 MB/sec)
   32768 bytes took      6534 usec (  10.030 MB/sec)
   65536 bytes took     13344 usec (   9.823 MB/sec)
  131072 bytes took     25400 usec (  10.321 MB/sec)
  262144 bytes took     49689 usec (  10.551 MB/sec)
  524288 bytes took    124407 usec (   8.429 MB/sec)
 1048576 bytes took    261184 usec (   8.029 MB/sec)

 Max rate = 11.669048 MB/sec  Min latency = 94.413757 usec

These results do not differ much from those in the Test Cluster.
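
The printed rates appear consistent with counting the message twice per timed round trip, that is rate = 2 * bytes / usec. This is inferred from the numbers in the listing, not from the benchmark's source. The sketch below recomputes one rate and compares it with the nominal 12.5 MB/sec of a 100 Mbit/s link. The function names are illustrative.

```shell
#!/bin/bash
# rate_mb_per_sec BYTES USEC
# Assumed formula: the message crosses the link twice per timed round trip.
rate_mb_per_sec() {
    LC_ALL=C awk -v b="$1" -v t="$2" 'BEGIN { printf "%.3f\n", 2 * b / t }'
}

# link_share RATE_MB LINK_MBIT
# Percentage of the nominal link rate (LINK_MBIT / 8 MB/sec) that RATE_MB uses.
link_share() {
    LC_ALL=C awk -v r="$1" -v l="$2" 'BEGIN { printf "%.1f\n", 100 * r / (l / 8) }'
}

rate_mb_per_sec 1048576 179719   # largest message of the first run
link_share 11.669 100            # share of a 100 Mbit/s link
```

With the values from the first run this prints 11.669 and 93.4, so the largest messages use roughly 93% of the nominal link rate.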

Here are the results when using the Intel network card through a 100 Mbit/s switch:

Hello from 0 of 2
Timer accuracy of ~0.953674 usecs

       8 bytes took       504 usec (   0.032 MB/sec)
      16 bytes took       266 usec (   0.120 MB/sec)
      32 bytes took       199 usec (   0.322 MB/sec)
Hello from 1 of 2
      64 bytes took       262 usec (   0.489 MB/sec)
     128 bytes took       262 usec (   0.977 MB/sec)
     256 bytes took       491 usec (   1.042 MB/sec)
     512 bytes took       460 usec (   2.227 MB/sec)
    1024 bytes took      1012 usec (   2.024 MB/sec)
    2048 bytes took      1084 usec (   3.778 MB/sec)
    4096 bytes took      1780 usec (   4.602 MB/sec)
    8192 bytes took      2179 usec (   7.519 MB/sec)
   16384 bytes took      3735 usec (   8.774 MB/sec)
   32768 bytes took      6395 usec (  10.248 MB/sec)
   65536 bytes took     12623 usec (  10.384 MB/sec)
  131072 bytes took     13000 usec (  20.165 MB/sec)
  262144 bytes took     24516 usec (  21.386 MB/sec)
  524288 bytes took     47452 usec (  22.098 MB/sec)
 1048576 bytes took     91873 usec (  22.827 MB/sec)

  Asynchronous ping-pong

       8 bytes took       434 usec (   0.037 MB/sec)
      16 bytes took       189 usec (   0.169 MB/sec)
      32 bytes took       234 usec (   0.273 MB/sec)
      64 bytes took       221 usec (   0.579 MB/sec)
     128 bytes took       482 usec (   0.531 MB/sec)
     256 bytes took       325 usec (   1.576 MB/sec)
     512 bytes took       714 usec (   1.435 MB/sec)
    1024 bytes took       743 usec (   2.756 MB/sec)
    2048 bytes took      1255 usec (   3.264 MB/sec)
    4096 bytes took      1419 usec (   5.773 MB/sec)
    8192 bytes took      2247 usec (   7.292 MB/sec)
   16384 bytes took      3574 usec (   9.168 MB/sec)
   32768 bytes took      6691 usec (   9.794 MB/sec)
   65536 bytes took     13006 usec (  10.078 MB/sec)
  131072 bytes took     13254 usec (  19.779 MB/sec)
  262144 bytes took     24478 usec (  21.419 MB/sec)
  524288 bytes took     47442 usec (  22.102 MB/sec)
 1048576 bytes took     91929 usec (  22.813 MB/sec)

  Bi-directional asynchronous ping-pong

       8 bytes took       490 usec (   0.033 MB/sec)
      16 bytes took       247 usec (   0.130 MB/sec)
      32 bytes took       248 usec (   0.258 MB/sec)
      64 bytes took       247 usec (   0.518 MB/sec)
     128 bytes took       248 usec (   1.032 MB/sec)
     256 bytes took       498 usec (   1.028 MB/sec)
     512 bytes took       528 usec (   1.939 MB/sec)
    1024 bytes took       748 usec (   2.738 MB/sec)
    2048 bytes took      1261 usec (   3.248 MB/sec)
    4096 bytes took      1521 usec (   5.386 MB/sec)
    8192 bytes took      2270 usec (   7.218 MB/sec)
   16384 bytes took      3785 usec (   8.658 MB/sec)
   32768 bytes took      6564 usec (   9.984 MB/sec)
   65536 bytes took     13499 usec (   9.710 MB/sec)
  131072 bytes took     19491 usec (  13.450 MB/sec)
  262144 bytes took     36962 usec (  14.185 MB/sec)
  524288 bytes took     71936 usec (  14.577 MB/sec)
 1048576 bytes took    160475 usec (  13.068 MB/sec)

 Max rate = 22.826658 MB/sec  Min latency = 94.413757 usec
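
When comparing several runs it can help to pull the summary figures out of a saved listing. The sketch below is an assumption-laden example: the function name `summarize` is hypothetical, it expects lines in the exact "N bytes took T usec ( R MB/sec)" layout shown above, and it estimates latency as half of the shortest round trip, which only approximates the value the benchmark prints from its unrounded timings.

```shell
#!/bin/bash
# summarize: read benchmark output on stdin, print the highest rate
# and half of the shortest round-trip time (an approximate latency).
summarize() {
    LC_ALL=C awk '
        $2 == "bytes" && $3 == "took" {
            if ($7 + 0 > max) max = $7 + 0
            if (min == "" || $4 + 0 < min) min = $4 + 0
        }
        END { printf "max rate %.3f MB/sec, min latency ~%.1f usec\n", max, min / 2 }
    '
}

# Three lines copied from the second run above
summarize <<'EOF'
       8 bytes took       434 usec (   0.037 MB/sec)
      16 bytes took       189 usec (   0.169 MB/sec)
 1048576 bytes took     91873 usec (  22.827 MB/sec)
EOF
```

For the three sample lines this prints "max rate 22.827 MB/sec, min latency ~94.5 usec".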

HDD test

http://www.linux-mag.com/id/7906/2/

iostat -x -m /dev/sda1 1

*************************************
root@NewClusterH:~# hdparm -Tt /dev/sdb

/dev/sdb:
 Timing cached reads:   19890 MB in  2.00 seconds = 9969.08 MB/sec
 Timing buffered disk reads:  2318 MB in  3.00 seconds = 772.65 MB/sec
*************************************

Write test on SSD

********************************
root@NewClusterH:/ssd# dd count=500 bs=100M if=/dev/zero of=/ssd/test.img
500+0 records in
500+0 records out
52428800000 bytes (52 GB) copied, 68,0062 s, 771 MB/s
********************************

Read test on SSD

********************************
root@NewClusterH:/ssd# dd count=500 bs=100M if=/ssd/test.img of=/dev/null
500+0 records in
500+0 records out
52428800000 bytes (52 GB) copied, 63,5971 s, 824 MB/s
********************************

Write test on WD

********************************
root@NewClusterH:/raid5# dd count=500 bs=100M if=/dev/zero of=/raid5/test.img
500+0 records in
500+0 records out
52428800000 bytes (52 GB) copied, 150,079 s, 349 MB/s
********************************

Read test on WD

********************************
root@NewClusterH:/raid5# dd count=500 bs=100M of=/dev/null if=/raid5/test.img
500+0 records in
500+0 records out
52428800000 bytes (52 GB) copied, 156,591 s, 335 MB/s
********************************
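
The dd runs above do not ask dd to flush data to disk, so part of a write may still sit in the page cache when the time is reported. A possible variant, assuming GNU dd, uses `conv=fdatasync` so that the flush is included in the timing. The function names and the small sizes are illustrative only.

```shell
#!/bin/bash
# dd_write_test FILE COUNT_MIB  (hypothetical helper)
# Writes COUNT_MIB MiB of zeros and flushes them to disk before dd
# reports its timing; prints only dd's summary line.
dd_write_test() {
    dd if=/dev/zero of="$1" bs=1M count="$2" conv=fdatasync 2>&1 | tail -n 1
}

# dd_read_test FILE  (hypothetical helper)
# Reads the file back and prints dd's summary line. A file that was
# just written may be served from the page cache rather than the disk.
dd_read_test() {
    dd if="$1" of=/dev/null bs=1M 2>&1 | tail -n 1
}
```

For example, `dd_write_test /ssd/test.img 1024` followed by `dd_read_test /ssd/test.img` would repeat the SSD test with a 1 GiB file.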

Memory test

http://www.cs.virginia.edu/stream/ref.html#start

rael@newclusterh:~/stream$ gcc -O stream.c -o stream
rael@newclusterh:~/stream$ ./stream
-------------------------------------------------------------
STREAM version $Revision: 5.9 $
-------------------------------------------------------------
This system uses 8 bytes per DOUBLE PRECISION word.
-------------------------------------------------------------
Array size = 2000000, Offset = 0
Total memory required = 45.8 MB.
Each test is run 10 times, but only
the *best* time for each is used.
-------------------------------------------------------------
Printing one line per active thread....
-------------------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 2170 microseconds.
   (= 2170 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function      Rate (MB/s)   Avg time     Min time     Max time
Copy:       12071.0251       0.0027       0.0027       0.0027
Scale:      11632.6684       0.0028       0.0028       0.0028
Add:        13065.5196       0.0037       0.0037       0.0037
Triad:      13072.3065       0.0037       0.0037       0.0037
-------------------------------------------------------------
Solution Validates
-------------------------------------------------------------
rael@newclusterh:~/stream$
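
The figures in the STREAM output can be cross-checked by hand. STREAM uses three arrays of 8-byte words, counts 16 bytes per element for Copy and Scale and 24 bytes per element for Add and Triad, and reports rates in units of 10^6 bytes per second. The sketch below recomputes the memory requirement and the best times from the listing. The function names are illustrative.

```shell
#!/bin/bash
# stream_total_mb N
# Memory for three arrays of N 8-byte words, in units of 2^20 bytes.
stream_total_mb() {
    LC_ALL=C awk -v n="$1" 'BEGIN { printf "%.1f\n", 3 * 8 * n / 1048576 }'
}

# stream_min_time ARRAYS N RATE_MB
# Time in seconds needed to move ARRAYS * 8 * N bytes at RATE_MB (10^6 bytes/s).
stream_min_time() {
    LC_ALL=C awk -v a="$1" -v n="$2" -v r="$3" \
        'BEGIN { printf "%.4f\n", a * 8 * n / (r * 1e6) }'
}

stream_total_mb 2000000                 # 45.8, as in the listing
stream_min_time 2 2000000 12071.0251    # Copy: 0.0027
stream_min_time 3 2000000 13072.3065    # Triad: 0.0037
```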

Links

http://home.comcast.net/~fbui/bandwidth.html