Beowulf Aug2010/Installation/Test Applications

From Teknologisk videncenter

Use 100% CPU

The script may need to be started several times to load both cores.

#!/bin/bash
# Busy loop: keeps one core at 100% until the script is killed (Ctrl+C).
while : ; do
    true
done
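Instead of starting the script by hand once per core, the loop can be launched once for every core the system reports. The following is a minimal sketch, assuming GNU coreutils `nproc` is available; the function name `burn_all_cores` and its time limit are illustrative additions, not part of the original script.

```bash
#!/bin/bash
# burn_all_cores SECONDS
# Hypothetical helper: starts one busy loop per core reported by nproc,
# lets them run for SECONDS (default 10), then stops them again.
burn_all_cores() {
    local seconds="${1:-10}"
    local cores i
    local pids=()
    cores=$(nproc)
    for ((i = 0; i < cores; i++)); do
        # Same busy loop as above, run in the background.
        ( while : ; do true; done ) &
        pids+=("$!")
    done
    sleep "$seconds"
    kill "${pids[@]}"
    # Reap the killed loops; their non-zero exit status is expected.
    wait "${pids[@]}" 2>/dev/null || true
    echo "loaded $cores cores for $seconds s"
}
```

For example, `burn_all_cores 60` should keep every core busy for about a minute, which can be watched in `top`.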

Testing MPI bandwidth

The test was run on our Dell PowerEdge 1750 servers, using the onboard network cards connected to a 3550 100 Mbit/s switch.

Hello from 0 of 2
Timer accuracy of ~0.953674 usecs

Hello from 1 of 2
       8 bytes took       331 usec (   0.048 MB/sec)
      16 bytes took       206 usec (   0.155 MB/sec)
      32 bytes took       213 usec (   0.301 MB/sec)
      64 bytes took       229 usec (   0.559 MB/sec)
     128 bytes took       266 usec (   0.962 MB/sec)
     256 bytes took       334 usec (   1.533 MB/sec)
     512 bytes took       472 usec (   2.170 MB/sec)
    1024 bytes took       758 usec (   2.701 MB/sec)
    2048 bytes took      1112 usec (   3.683 MB/sec)
    4096 bytes took      1402 usec (   5.843 MB/sec)
    8192 bytes took      2138 usec (   7.664 MB/sec)
   16384 bytes took      3555 usec (   9.218 MB/sec)
   32768 bytes took      6403 usec (  10.235 MB/sec)
   65536 bytes took     12508 usec (  10.479 MB/sec)
  131072 bytes took     23635 usec (  11.091 MB/sec)
  262144 bytes took     46029 usec (  11.390 MB/sec)
  524288 bytes took     90590 usec (  11.575 MB/sec)
 1048576 bytes took    179719 usec (  11.669 MB/sec)

  Asynchronous ping-pong

       8 bytes took       212 usec (   0.075 MB/sec)
      16 bytes took       205 usec (   0.156 MB/sec)
      32 bytes took       211 usec (   0.303 MB/sec)
      64 bytes took       237 usec (   0.540 MB/sec)
     128 bytes took       264 usec (   0.969 MB/sec)
     256 bytes took       336 usec (   1.524 MB/sec)
     512 bytes took       470 usec (   2.179 MB/sec)
    1024 bytes took       753 usec (   2.720 MB/sec)
    2048 bytes took      1076 usec (   3.807 MB/sec)
    4096 bytes took      1407 usec (   5.822 MB/sec)
    8192 bytes took      2141 usec (   7.653 MB/sec)
   16384 bytes took      3661 usec (   8.950 MB/sec)
   32768 bytes took      6428 usec (  10.195 MB/sec)
   65536 bytes took     12507 usec (  10.480 MB/sec)
  131072 bytes took     23708 usec (  11.057 MB/sec)
  262144 bytes took     46073 usec (  11.380 MB/sec)
  524288 bytes took     90696 usec (  11.561 MB/sec)
 1048576 bytes took    179756 usec (  11.667 MB/sec)

  Bi-directional asynchronous ping-pong

       8 bytes took       197 usec (   0.081 MB/sec)
      16 bytes took       189 usec (   0.169 MB/sec)
      32 bytes took       196 usec (   0.327 MB/sec)
      64 bytes took       200 usec (   0.640 MB/sec)
     128 bytes took       230 usec (   1.114 MB/sec)
     256 bytes took       324 usec (   1.580 MB/sec)
     512 bytes took       454 usec (   2.256 MB/sec)
    1024 bytes took       735 usec (   2.786 MB/sec)
    2048 bytes took      1075 usec (   3.810 MB/sec)
    4096 bytes took      1421 usec (   5.765 MB/sec)
    8192 bytes took      2187 usec (   7.491 MB/sec)
   16384 bytes took      3630 usec (   9.027 MB/sec)
   32768 bytes took      6534 usec (  10.030 MB/sec)
   65536 bytes took     13344 usec (   9.823 MB/sec)
  131072 bytes took     25400 usec (  10.321 MB/sec)
  262144 bytes took     49689 usec (  10.551 MB/sec)
  524288 bytes took    124407 usec (   8.429 MB/sec)
 1048576 bytes took    261184 usec (   8.029 MB/sec)

 Max rate = 11.669048 MB/sec  Min latency = 94.413757 usec

These results do not differ much from those of the test cluster.

Here are the results when the Intel network card is used through a 100 Mbit/s switch:
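As a sanity check on the figures above: the MB/sec column appears to be 2 × bytes / time, with 1 MB = 10^6 bytes, so bytes per microsecond is already MB/sec. This is inferred from the printed numbers, not from the benchmark source. The sketch below recomputes a rate and compares it with the theoretical maximum of a link; the function names are made up for illustration.

```bash
#!/bin/bash
# rate_mb_s BYTES USEC
# Recomputes the MB/sec column, assuming rate = 2 * bytes / time
# (an assumption inferred from the output above).
rate_mb_s() {
    LC_ALL=C awk -v b="$1" -v t="$2" 'BEGIN { printf "%.3f\n", 2 * b / t }'
}

# percent_of_line_rate MB_PER_SEC LINK_MBIT
# Expresses a measured rate as a percentage of the link's raw capacity
# (LINK_MBIT / 8 MB/sec, ignoring protocol overhead).
percent_of_line_rate() {
    LC_ALL=C awk -v r="$1" -v l="$2" 'BEGIN { printf "%.1f\n", 100 * r / (l / 8) }'
}
```

`rate_mb_s 1048576 179719` prints 11.669, matching the last row of the first table, and `percent_of_line_rate 11.669 100` prints 93.4, i.e. roughly 93% of the 12.5 MB/sec a 100 Mbit/s link can carry.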

Hello from 0 of 2
Timer accuracy of ~0.953674 usecs

       8 bytes took       504 usec (   0.032 MB/sec)
      16 bytes took       266 usec (   0.120 MB/sec)
      32 bytes took       199 usec (   0.322 MB/sec)
Hello from 1 of 2
      64 bytes took       262 usec (   0.489 MB/sec)
     128 bytes took       262 usec (   0.977 MB/sec)
     256 bytes took       491 usec (   1.042 MB/sec)
     512 bytes took       460 usec (   2.227 MB/sec)
    1024 bytes took      1012 usec (   2.024 MB/sec)
    2048 bytes took      1084 usec (   3.778 MB/sec)
    4096 bytes took      1780 usec (   4.602 MB/sec)
    8192 bytes took      2179 usec (   7.519 MB/sec)
   16384 bytes took      3735 usec (   8.774 MB/sec)
   32768 bytes took      6395 usec (  10.248 MB/sec)
   65536 bytes took     12623 usec (  10.384 MB/sec)
  131072 bytes took     13000 usec (  20.165 MB/sec)
  262144 bytes took     24516 usec (  21.386 MB/sec)
  524288 bytes took     47452 usec (  22.098 MB/sec)
 1048576 bytes took     91873 usec (  22.827 MB/sec)

  Asynchronous ping-pong

       8 bytes took       434 usec (   0.037 MB/sec)
      16 bytes took       189 usec (   0.169 MB/sec)
      32 bytes took       234 usec (   0.273 MB/sec)
      64 bytes took       221 usec (   0.579 MB/sec)
     128 bytes took       482 usec (   0.531 MB/sec)
     256 bytes took       325 usec (   1.576 MB/sec)
     512 bytes took       714 usec (   1.435 MB/sec)
    1024 bytes took       743 usec (   2.756 MB/sec)
    2048 bytes took      1255 usec (   3.264 MB/sec)
    4096 bytes took      1419 usec (   5.773 MB/sec)
    8192 bytes took      2247 usec (   7.292 MB/sec)
   16384 bytes took      3574 usec (   9.168 MB/sec)
   32768 bytes took      6691 usec (   9.794 MB/sec)
   65536 bytes took     13006 usec (  10.078 MB/sec)
  131072 bytes took     13254 usec (  19.779 MB/sec)
  262144 bytes took     24478 usec (  21.419 MB/sec)
  524288 bytes took     47442 usec (  22.102 MB/sec)
 1048576 bytes took     91929 usec (  22.813 MB/sec)

  Bi-directional asynchronous ping-pong

       8 bytes took       490 usec (   0.033 MB/sec)
      16 bytes took       247 usec (   0.130 MB/sec)
      32 bytes took       248 usec (   0.258 MB/sec)
      64 bytes took       247 usec (   0.518 MB/sec)
     128 bytes took       248 usec (   1.032 MB/sec)
     256 bytes took       498 usec (   1.028 MB/sec)
     512 bytes took       528 usec (   1.939 MB/sec)
    1024 bytes took       748 usec (   2.738 MB/sec)
    2048 bytes took      1261 usec (   3.248 MB/sec)
    4096 bytes took      1521 usec (   5.386 MB/sec)
    8192 bytes took      2270 usec (   7.218 MB/sec)
   16384 bytes took      3785 usec (   8.658 MB/sec)
   32768 bytes took      6564 usec (   9.984 MB/sec)
   65536 bytes took     13499 usec (   9.710 MB/sec)
  131072 bytes took     19491 usec (  13.450 MB/sec)
  262144 bytes took     36962 usec (  14.185 MB/sec)
  524288 bytes took     71936 usec (  14.577 MB/sec)
 1048576 bytes took    160475 usec (  13.068 MB/sec)

 Max rate = 22.826658 MB/sec  Min latency = 94.413757 usec
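A 100 Mbit/s link can carry at most 100 / 8 = 12.5 MB/sec, so the rates above that value from 131072 bytes upward are worth double-checking, for example by verifying which link speed was actually negotiated. The page does not explain them. The sketch below flags such rows in saved benchmark output; the function name `flag_over_line_rate` is hypothetical, and the field layout is assumed to match the output shown above.

```bash
#!/bin/bash
# flag_over_line_rate LINK_MBIT
# Reads benchmark output on stdin and prints "SIZE RATE" for every row
# whose MB/sec value exceeds LINK_MBIT / 8.
flag_over_line_rate() {
    LC_ALL=C awk -v link="$1" '
        # Expected row: SIZE bytes took USEC usec ( RATE MB/sec)
        $2 == "bytes" && $3 == "took" {
            rate = $(NF - 1)
            sub(/^\(/, "", rate)   # tolerate "(RATE" without a space
            if (rate + 0 > link / 8) print $1, rate
        }'
}
```

Usage, assuming the output was saved to a file called `intel.txt` (a made-up name): `flag_over_line_rate 100 < intel.txt`.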

HDD test

http://www.linux-mag.com/id/7906/2/

iostat -x -m /dev/sda1 1
*************************************
root@NewClusterH:~# hdparm -Tt /dev/sdb

/dev/sdb:
 Timing cached reads:   19890 MB in  2.00 seconds = 9969.08 MB/sec
 Timing buffered disk reads:  2318 MB in  3.00 seconds = 772.65 MB/sec
*************************************

Write test on SSD

********************************
root@NewClusterH:/ssd# dd count=500 bs=100M if=/dev/zero of=/ssd/test.img
500+0 records in
500+0 records out
52428800000 bytes (52 GB) copied, 68,0062 s, 771 MB/s
********************************

Read test on SSD

********************************
root@NewClusterH:/ssd# dd count=500 bs=100M if=/ssd/test.img of=/dev/null
500+0 records in
500+0 records out
52428800000 bytes (52 GB) copied, 63,5971 s, 824 MB/s
********************************

Write test on WD

********************************
root@NewClusterH:/raid5# dd count=500 bs=100M if=/dev/zero of=/raid5/test.img
500+0 records in
500+0 records out
52428800000 bytes (52 GB) copied, 150,079 s, 349 MB/s
********************************

Read test on WD

********************************
root@NewClusterH:/raid5# dd count=500 bs=100M of=/dev/null if=/raid5/test.img
500+0 records in
500+0 records out
52428800000 bytes (52 GB) copied, 156,591 s, 335 MB/s
********************************
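The dd runs above go through the page cache, so dd may report completion before all data has reached the disk, and a read shortly after a write may be served partly from RAM. The following sketch repeats the same kind of test with GNU dd's `conv=fdatasync`, which makes dd flush the file before printing its timing. The function names and the 1 MiB block size are illustrative choices, not taken from the original tests.

```bash
#!/bin/bash
# dd_write_test FILE COUNT_MIB
# Writes COUNT_MIB blocks of 1 MiB of zeros to FILE and flushes them to disk
# before dd reports its timing. Prints dd's summary line.
dd_write_test() {
    local file="$1" count="$2"
    LC_ALL=C dd if=/dev/zero of="$file" bs=1M count="$count" conv=fdatasync 2>&1 | tail -n 1
}

# dd_read_test FILE
# Reads FILE back and discards the data. Prints dd's summary line.
# To measure the disk rather than RAM, drop the page cache first (needs root):
#   sync; echo 3 > /proc/sys/vm/drop_caches
dd_read_test() {
    local file="$1"
    LC_ALL=C dd if="$file" of=/dev/null bs=1M 2>&1 | tail -n 1
}
```

For a test comparable to the ones above, something like `dd_write_test /ssd/test.img 50000` followed by `dd_read_test /ssd/test.img` could be used. `LC_ALL=C` only makes dd print a decimal point instead of the decimal comma seen in the output above.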

Memory test

http://www.cs.virginia.edu/stream/ref.html#start

rael@newclusterh:~/stream$ gcc -O stream.c -o stream
rael@newclusterh:~/stream$ ./stream
-------------------------------------------------------------
STREAM version $Revision: 5.9 $
-------------------------------------------------------------
This system uses 8 bytes per DOUBLE PRECISION word.
-------------------------------------------------------------
Array size = 2000000, Offset = 0
Total memory required = 45.8 MB.
Each test is run 10 times, but only
the *best* time for each is used.
-------------------------------------------------------------
Printing one line per active thread....
-------------------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 2170 microseconds.
   (= 2170 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function      Rate (MB/s)   Avg time     Min time     Max time
Copy:       12071.0251       0.0027       0.0027       0.0027
Scale:      11632.6684       0.0028       0.0028       0.0028
Add:        13065.5196       0.0037       0.0037       0.0037
Triad:      13072.3065       0.0037       0.0037       0.0037
-------------------------------------------------------------
Solution Validates
-------------------------------------------------------------
rael@newclusterh:~/stream$
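The numbers in the STREAM report can be cross-checked by hand. STREAM works on three arrays of 8-byte doubles and counts two arrays per element for Copy and Scale and three for Add and Triad, with rates in units of 10^6 bytes per second. The sketch below redoes that arithmetic; the function names are hypothetical.

```bash
#!/bin/bash
# stream_memory_mib N
# Total memory for three arrays of N doubles, in MiB (2^20 bytes).
stream_memory_mib() {
    LC_ALL=C awk -v n="$1" 'BEGIN { printf "%.1f\n", 3 * 8 * n / (1024 * 1024) }'
}

# stream_min_time N ARRAYS RATE_MB_S
# Time in seconds implied by a reported rate: bytes moved / rate,
# where bytes moved = ARRAYS * 8 * N (ARRAYS is 2 for Copy/Scale, 3 for Add/Triad).
stream_min_time() {
    LC_ALL=C awk -v n="$1" -v w="$2" -v r="$3" 'BEGIN { printf "%.4f\n", w * 8 * n / (r * 1e6) }'
}
```

`stream_memory_mib 2000000` prints 45.8, matching "Total memory required", and `stream_min_time 2000000 2 12071.0251` prints 0.0027, matching the Min time reported for Copy.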

Links

http://home.comcast.net/~fbui/bandwidth.html