Linux Cluster for Center of Excellence/nVidia GPU
Cheap solution for getting started
Use an nVidia CUDA-compatible graphics card, e.g. a GeForce 9800GTX+ at DKK 1,361 incl. VAT.
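Whether a given card is actually usable for CUDA can be checked from the runtime itself. Below is a minimal sketch, assuming the C-based CUDA Toolkit and nvcc are installed (the file name devicequery.cu is just an example; build with: nvcc -o devicequery devicequery.cu). It only enumerates the CUDA-capable devices and prints their compute capability and memory size.

 // Minimal device query: lists CUDA-capable GPUs in the machine.
 #include <cstdio>
 #include <cuda_runtime.h>

 int main(void)
 {
     int count = 0;
     if (cudaGetDeviceCount(&count) != cudaSuccess || count == 0) {
         printf("No CUDA-capable GPU found.\n");
         return 1;
     }
     for (int i = 0; i < count; ++i) {
         cudaDeviceProp prop;
         cudaGetDeviceProperties(&prop, i);
         printf("Device %d: %s, compute capability %d.%d, %zu MB global memory\n",
                i, prop.name, prop.major, prop.minor,
                prop.totalGlobalMem / (1024 * 1024));
     }
     return 0;
 }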
nVidia Tesla C1060 GPU
- Price as of 14 May 2010: DKK 7,980
Form Factor | 10.5" x 4.376", Dual Slot
Number of Tesla GPUs | 1
Number of Streaming Processor Cores | 240
Frequency of processor cores | 1.3 GHz
Single Precision floating point performance (peak) | 933 GFlops
Double Precision floating point performance (peak) | 78 GFlops
Floating Point Precision | IEEE 754 single & double
Total Dedicated Memory | 4 GB GDDR3
Memory Speed | 800 MHz
Memory Interface | 512-bit
Memory Bandwidth | 102 GB/sec
Max Power Consumption | 187.8 W
System Interface | PCIe x16
Auxiliary Power Connectors | 6-pin & 8-pin
Thermal Solution | Active fan sink
Software Development Tools | C-based CUDA Toolkit
nVidia Tesla S1070
The Tesla S1070 is a 1U chassis with 4 GPUs split into two sections. The Tesla S1070 must be connected to one or two host PCs. Price approx. DKK 60,000 (does not appear to be available for purchase in Denmark).
Specifications
Number of Tesla GPUs | 4
Number of Streaming Processor Cores | 960 (240 per processor)
Frequency of processor cores | 1.296 to 1.44 GHz
Single Precision floating point performance (peak) | 3.73 to 4.14 TFlops
Double Precision floating point performance (peak) | 311 to 345 GFlops
Floating Point Precision | IEEE 754 single & double
Total Dedicated Memory | 16 GB GDDR3
Memory Interface | 512-bit
Memory Bandwidth | 408 GB/sec
Max Power Consumption | 800 W (typical)
System Interface | PCIe x16 or x8
Software Development Tools | C-based CUDA Toolkit
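As described above, a host PC connected to an S1070 simply sees two or four additional CUDA devices. The sketch below, assuming a current CUDA Toolkit, enumerates the devices and runs a job on each in turn; the work kernel, buffer size and launch parameters are only illustrative and not taken from the article.

 // Sketch: distributing work across the GPUs a host sees from a Tesla S1070.
 #include <cstdio>
 #include <cuda_runtime.h>

 __global__ void work(float *data, int n)
 {
     int i = blockIdx.x * blockDim.x + threadIdx.x;
     if (i < n) data[i] = data[i] * 2.0f;   // placeholder computation
 }

 int main(void)
 {
     int count = 0;
     cudaGetDeviceCount(&count);            // 2 or 4 with an S1070, depending on cabling
     printf("Host sees %d CUDA device(s)\n", count);

     const int n = 1 << 20;
     for (int dev = 0; dev < count; ++dev) {
         cudaSetDevice(dev);                // switch to this GPU
         float *d_data;
         cudaMalloc(&d_data, n * sizeof(float));
         work<<<(n + 255) / 256, 256>>>(d_data, n);
         cudaDeviceSynchronize();           // wait for the kernel before moving on
         cudaFree(d_data);
     }
     return 0;
 }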
Comparison of GPUs
CUDA
Compute Unified Device Architecture
- See nVidia's CUDA zone
- See nVidia's Online Seminars
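A minimal "first kernel" in the spirit of Part 2 of the article series linked below: each GPU thread adds one pair of vector elements. The vector size and launch configuration are arbitrary choices for the illustration.

 // Vector addition: the classic first CUDA kernel.
 #include <cstdio>
 #include <cuda_runtime.h>

 __global__ void add(const float *a, const float *b, float *c, int n)
 {
     int i = blockIdx.x * blockDim.x + threadIdx.x;   // global thread index
     if (i < n) c[i] = a[i] + b[i];
 }

 int main(void)
 {
     const int n = 1024;
     const size_t bytes = n * sizeof(float);
     float h_a[n], h_b[n], h_c[n];
     for (int i = 0; i < n; ++i) { h_a[i] = i; h_b[i] = 2 * i; }

     float *d_a, *d_b, *d_c;
     cudaMalloc(&d_a, bytes); cudaMalloc(&d_b, bytes); cudaMalloc(&d_c, bytes);
     cudaMemcpy(d_a, h_a, bytes, cudaMemcpyHostToDevice);
     cudaMemcpy(d_b, h_b, bytes, cudaMemcpyHostToDevice);

     add<<<(n + 255) / 256, 256>>>(d_a, d_b, d_c, n);   // 4 blocks of 256 threads
     cudaMemcpy(h_c, d_c, bytes, cudaMemcpyDeviceToHost);

     printf("c[10] = %.1f (expected 30.0)\n", h_c[10]);
     cudaFree(d_a); cudaFree(d_b); cudaFree(d_c);
     return 0;
 }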
Links to the article series by Rob Farber
- CUDA, Supercomputing for the Masses: Part 1 CUDA lets you work with familiar programming concepts while developing software that can run on a GPU
- CUDA, Supercomputing for the Masses: Part 2 A first kernel
- CUDA, Supercomputing for the Masses: Part 3 Error handling and global memory performance limitations
- CUDA, Supercomputing for the Masses: Part 4 Understanding and using shared memory (1)
- CUDA, Supercomputing for the Masses: Part 5 Understanding and using shared memory (2)
- CUDA, Supercomputing for the Masses: Part 6 Global memory and the CUDA profiler
- CUDA, Supercomputing for the Masses: Part 7 Double the fun with next-generation CUDA hardware
- CUDA, Supercomputing for the Masses: Part 8 Using libraries with CUDA
- CUDA, Supercomputing for the Masses: Part 9 Extending High-level Languages with CUDA
- CUDA, Supercomputing for the Masses: Part 10 CUDPP, a powerful data-parallel CUDA library
- CUDA, Supercomputing for the Masses: Part 11 Revisiting CUDA memory spaces
- CUDA, Supercomputing for the Masses: Part 12 CUDA 2.2 Changes the Data Movement Paradigm
- CUDA, Supercomputing for the Masses: Part 13 Using texture memory in CUDA
- CUDA, Supercomputing for the Masses: Part 14 Debugging CUDA and using CUDA-GDB
- CUDA, Supercomputing for the Masses: Part 15 Using Pixel Buffer Objects with CUDA and OpenGL
Links to course descriptions
- Applied Parallel Programming UNIVERSITY OF ILLINOIS (+++)
Software
Links
- Tsunami cluster Interesting article, covering power consumption among other things
- MatLAB and Tesla How-to example with Linux