Linux Cluster til Center of Excellence/nVidia GPU
Cheap solution to get started
Use an nVidia CUDA-compatible graphics card, e.g. a GeForce 9800GTX+ at DKK 1,361 incl. VAT.
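Before investing in dedicated Tesla hardware, a small test program can confirm that an existing card is CUDA-capable and report its properties. A minimal sketch using the CUDA runtime API (the file name is illustrative; build with nvcc):

// devicequery_min.cu - minimal sketch: list CUDA-capable GPUs and their properties.
#include <stdio.h>
#include <cuda_runtime.h>

int main(void)
{
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        printf("cudaGetDeviceCount failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("Found %d CUDA-capable device(s)\n", count);
    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, dev);
        printf("Device %d: %s, compute capability %d.%d, %d multiprocessors, %.0f MB global memory\n",
               dev, prop.name, prop.major, prop.minor, prop.multiProcessorCount,
               prop.totalGlobalMem / (1024.0 * 1024.0));
    }
    return 0;
}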
nVidia Tesla C1060 GPU
- Price as of 14 May 2010: DKK 7,980
Form Factor | 10.5" x 4.376", Dual Slot |
# of Tesla GPUs | 1 |
# of Streaming Processor Cores | 240 |
Frequency of processor cores | 1.3 GHz |
Single Precision floating point performance (peak) | 933 GFlops |
Double Precision floating point performance (peak) | 78 GFlops |
Floating Point Precision | IEEE 754 single & double |
Total Dedicated Memory | 4 GB GDDR3 |
Memory Speed | 800MHz |
Memory Interface | 512-bit |
Memory Bandwidth | 102 GB/sec |
Max Power Consumption | 187.8 W |
System Interface | PCIe x16 |
Auxiliary Power Connectors | 6-pin & 8-pin |
Thermal Solution | Active fan sink |
Software Development Tools | C-based CUDA Toolkit |
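As a rough cross-check of the peak figures above (assuming, as commonly cited for this GPU generation, 3 single-precision flops per core per clock from a dual-issued multiply-add plus multiply, and one double-precision unit on each of the 30 multiprocessors):

Peak single precision ≈ 240 cores × 1.296 GHz × 3 flops/clock ≈ 933 GFlops
Peak double precision ≈ 30 DP units × 1.296 GHz × 2 flops/clock ≈ 78 GFlops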
nVidia Tesla S1070
The Tesla S1070 is a 1U enclosure with 4 GPUs divided into two sections. The Tesla S1070 must be connected to one or two host PCs. Price approx. DKK 60,000 (does not appear to be available for purchase in Denmark).
Specifications
Number of Tesla GPUs | 4 |
Number of Streaming Processor Cores | 960 (240 per processor) |
Frequency of processor cores | 1.296 to 1.44 GHz |
Single Precision floating point performance (peak) | 3.73 to 4.14 TFlops |
Double Precision floating point performance (peak) | 311 to 345 GFlops |
Floating Point Precision | IEEE 754 single & double |
Total Dedicated Memory | 16 GB GDDR3 |
Memory Interface | 512-bit |
Memory Bandwidth | 408 GB/sec |
Max Power Consumption | 800 W (typical) |
System Interface | PCIe x16 or x8 |
Software Development Tools | C-based CUDA Toolkit |
Comparison of GPUs
Model | GFlops | Cores | Mem. bandwidth GB/s | Price (DKK) | DKK per GFlop |
GTS 250 | 470 | 128 | 70 | 915 | 1.95 |
CUDA
Compute Unified Device Architecture
- CUDA Developer homepage
- See nVidia's CUDA zone
- See nVidia's Online Seminars
Documentation
- GPU Programming Guide (G80 / GeForce 8 series)
- CUDA C best practices guide
- CUDA by example (Also available on Safari)
Links to article series by Rob Farber
- CUDA, Supercomputing for the Masses: Part 1 CUDA lets you work with familiar programming concepts while developing software that can run on a GPU
- CUDA, Supercomputing for the Masses: Part 2 A first kernel (a minimal kernel sketch follows after this list)
- CUDA, Supercomputing for the Masses: Part 3 Error handling and global memory performance limitations
- CUDA, Supercomputing for the Masses: Part 4 Understanding and using shared memory (1)
- CUDA, Supercomputing for the Masses: Part 5 Understanding and using shared memory (2)
- CUDA, Supercomputing for the Masses: Part 6 Global memory and the CUDA profiler
- CUDA, Supercomputing for the Masses: Part 7 Double the fun with next-generation CUDA hardware
- CUDA, Supercomputing for the Masses: Part 8 Using libraries with CUDA
- CUDA, Supercomputing for the Masses: Part 9 Extending High-level Languages with CUDA
- CUDA, Supercomputing for the Masses: Part 10 CUDPP, a powerful data-parallel CUDA library
- CUDA, Supercomputing for the Masses: Part 11 Revisiting CUDA memory spaces
- CUDA, Supercomputing for the Masses: Part 12 CUDA 2.2 Changes the Data Movement Paradigm
- CUDA, Supercomputing for the Masses: Part 13 Using texture memory in CUDA
- CUDA, Supercomputing for the Masses: Part 14 Debugging CUDA and using CUDA-GDB
- CUDA, Supercomputing for the Masses: Part 15 Using Pixel Buffer Objects with CUDA and OpenGL
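To complement Part 2 (a first kernel) and Part 3 (error handling) above, here is a minimal sketch of a first CUDA kernel: an element-wise vector addition with basic error checking. All names are illustrative; build with nvcc:

// vector_add.cu - minimal first-kernel sketch.
#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime.h>

// Abort on any CUDA runtime error (basic error handling in the spirit of Part 3).
#define CUDA_CHECK(call)                                              \
    do {                                                              \
        cudaError_t err = (call);                                     \
        if (err != cudaSuccess) {                                     \
            fprintf(stderr, "CUDA error %s at %s:%d\n",               \
                    cudaGetErrorString(err), __FILE__, __LINE__);     \
            exit(1);                                                  \
        }                                                             \
    } while (0)

// The kernel: each thread adds one pair of elements.
__global__ void vecAdd(const float *a, const float *b, float *c, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        c[i] = a[i] + b[i];
}

int main(void)
{
    const int n = 1 << 20;
    size_t bytes = n * sizeof(float);

    // Host buffers with some test data.
    float *h_a = (float *)malloc(bytes);
    float *h_b = (float *)malloc(bytes);
    float *h_c = (float *)malloc(bytes);
    for (int i = 0; i < n; ++i) { h_a[i] = (float)i; h_b[i] = 2.0f * i; }

    // Device buffers and host-to-device copies.
    float *d_a, *d_b, *d_c;
    CUDA_CHECK(cudaMalloc(&d_a, bytes));
    CUDA_CHECK(cudaMalloc(&d_b, bytes));
    CUDA_CHECK(cudaMalloc(&d_c, bytes));
    CUDA_CHECK(cudaMemcpy(d_a, h_a, bytes, cudaMemcpyHostToDevice));
    CUDA_CHECK(cudaMemcpy(d_b, h_b, bytes, cudaMemcpyHostToDevice));

    // Launch enough 256-thread blocks to cover all n elements.
    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    vecAdd<<<blocks, threads>>>(d_a, d_b, d_c, n);
    CUDA_CHECK(cudaGetLastError());
    CUDA_CHECK(cudaMemcpy(h_c, d_c, bytes, cudaMemcpyDeviceToHost));

    printf("c[12345] = %.1f (expected %.1f)\n", h_c[12345], 3.0f * 12345);

    CUDA_CHECK(cudaFree(d_a)); CUDA_CHECK(cudaFree(d_b)); CUDA_CHECK(cudaFree(d_c));
    free(h_a); free(h_b); free(h_c);
    return 0;
}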
Links to course descriptions
- Applied Parallel Programming UNIVERSITY OF ILLINOIS (+++)
- GPU Computing Online Seminars June/July 2010
Software
Perl
Links
- Tsunami cluster Interesting article, covering power consumption among other topics
- MatLAB and Tesla How-to example with Linux
- IEEE on GPUs Good HeTh