Linux Cluster til Center of Excelence/nVidia GPU
Inexpensive solution to get started
Use an nVidia CUDA-compatible graphics card, e.g. a GeForce 9800GTX+ at kr. 1,361 incl. VAT.
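Whether a given card is actually CUDA-capable, and which compute capability it has, can be checked by querying the driver through the CUDA runtime API. The sketch below is a minimal illustration, not part of the original page; it assumes the CUDA toolkit is installed, is compiled with nvcc, and the file name is arbitrary.

 // devicequery.cu - minimal sketch: list CUDA-capable devices and their properties.
 // Compile with: nvcc devicequery.cu -o devicequery
 #include <cstdio>
 #include <cuda_runtime.h>

 int main(void)
 {
     int count = 0;
     cudaError_t err = cudaGetDeviceCount(&count);
     if (err != cudaSuccess || count == 0) {
         printf("No CUDA-capable device found (%s)\n", cudaGetErrorString(err));
         return 1;
     }
     for (int dev = 0; dev < count; ++dev) {
         cudaDeviceProp prop;
         cudaGetDeviceProperties(&prop, dev);
         printf("Device %d: %s, compute capability %d.%d, %d multiprocessors, %.0f MB memory\n",
                dev, prop.name, prop.major, prop.minor,
                prop.multiProcessorCount, prop.totalGlobalMem / (1024.0 * 1024.0));
     }
     return 0;
 }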
nVidia Tesla C1060 GPU
- Price as of 14 May 2010: kr. 7,980

Specifications below are taken from nVidia's Tesla C1060 product page:
Form Factor | 10.5" x 4.376", Dual Slot |
# of Tesla GPUs | 1 |
# of Streaming Processor Cores | 240 |
Frequency of processor cores | 1.3 GHz |
Single Precision floating point performance (peak) | 933 GFlops |
Double Precision floating point performance (peak) | 78 GFlops |
Floating Point Precision | IEEE 754 single & double |
Total Dedicated Memory | 4 GB GDDR3 |
Memory Speed | 800 MHz |
Memory Interface | 512-bit |
Memory Bandwidth | 102 GB/sec |
Max Power Consumption | 187.8 W |
System Interface | PCIe x16 |
Auxiliary Power Connectors | 6-pin & 8-pin |
Thermal Solution | Active fan sink |
Software Development Tools | C-based CUDA Toolkit |
nVidia Tesla S1070
The Tesla S1070 is a 1U chassis containing four GPUs (4 × 240 cores = 960 cores in total), split into two sections, and it must be connected to one or two host PCs. Price approximately kr. 60,000 (does not appear to be available for purchase in Denmark).
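To the host, the four GPUs in the chassis simply show up as separate CUDA devices, and work is distributed by selecting a device before allocating memory and launching kernels. The sketch below enumerates the devices a host sees and reports their memory; it is an illustration under the same assumptions as the other examples on this page, not text from the original article.

 // multigpu.cu - sketch: enumerate the CUDA devices visible to the host
 // (e.g. the four GPUs of an attached Tesla S1070) and report their memory.
 #include <cstdio>
 #include <cuda_runtime.h>

 int main(void)
 {
     int count = 0;
     cudaGetDeviceCount(&count);
     printf("Host sees %d CUDA device(s)\n", count);
     for (int dev = 0; dev < count; ++dev) {
         cudaSetDevice(dev);   // subsequent CUDA calls in this thread target this GPU
         size_t free_bytes = 0, total_bytes = 0;
         cudaMemGetInfo(&free_bytes, &total_bytes);
         printf("Device %d: %.0f of %.0f MB free\n", dev,
                free_bytes / (1024.0 * 1024.0), total_bytes / (1024.0 * 1024.0));
     }
     return 0;
 }

In real applications each GPU would typically be driven by its own host thread or process.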
Specifications
Number of Tesla GPUs | 4 |
Number of Streaming Processor Cores | 960 (240 per processor) |
Frequency of processor cores | 1.296 to 1.44 GHz |
Single Precision floating point performance (peak) | 3.73 to 4.14 TFlops |
Double Precision floating point performance (peak) | 311 to 345 GFlops |
Floating Point Precision | IEEE 754 single & double |
Total Dedicated Memory | 16 GB |
Memory Interface | 512-bit |
Memory Bandwidth | 408 GB/sec |
Max Power Consumption | 800 W (typical) |
System Interface | PCIe x16 or x8 |
Software Development Tools | C-based CUDA Toolkit |
Comparison of GPUs
Model | GFlops | Cores | Memory bandwidth (GB/s) | Price (kr.) | kr. per GFlop |
GTS 250 | 470 | 128 | 70 | 915 | 1.95 |
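The last column is simply the price divided by the peak throughput: 915 kr / 470 GFlops ≈ 1.95 kr per GFlop.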
CUDA
CUDA is short for Compute Unified Device Architecture, nVidia's C-based platform for general-purpose GPU programming; a minimal kernel sketch follows the links below.
- CUDA Developer homepage
- See nVidia's CUDA zone
- See nVidia's Online Seminars
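As a first taste of the programming model described in the links above, the sketch below shows the canonical element-wise vector addition: a function marked __global__ (the kernel) runs across a grid of threads on the GPU, while the host code copies data to and from device memory. This is a minimal illustration under the assumption that the CUDA toolkit is installed; it is not code from the original page.

 // vectoradd.cu - minimal CUDA sketch: compute c = a + b element-wise on the GPU.
 #include <cstdio>
 #include <cstdlib>
 #include <cuda_runtime.h>

 // Kernel: each thread adds one element.
 __global__ void vecAdd(const float *a, const float *b, float *c, int n)
 {
     int i = blockIdx.x * blockDim.x + threadIdx.x;
     if (i < n)
         c[i] = a[i] + b[i];
 }

 int main(void)
 {
     const int n = 1 << 20;
     size_t bytes = n * sizeof(float);

     // Host buffers.
     float *h_a = (float *)malloc(bytes), *h_b = (float *)malloc(bytes), *h_c = (float *)malloc(bytes);
     for (int i = 0; i < n; ++i) { h_a[i] = 1.0f; h_b[i] = 2.0f; }

     // Device buffers and host-to-device copies.
     float *d_a, *d_b, *d_c;
     cudaMalloc((void **)&d_a, bytes);
     cudaMalloc((void **)&d_b, bytes);
     cudaMalloc((void **)&d_c, bytes);
     cudaMemcpy(d_a, h_a, bytes, cudaMemcpyHostToDevice);
     cudaMemcpy(d_b, h_b, bytes, cudaMemcpyHostToDevice);

     // Launch enough 256-thread blocks to cover all n elements.
     int threads = 256;
     int blocks = (n + threads - 1) / threads;
     vecAdd<<<blocks, threads>>>(d_a, d_b, d_c, n);

     // Copy the result back and check one element.
     cudaMemcpy(h_c, d_c, bytes, cudaMemcpyDeviceToHost);
     printf("c[0] = %f (expected 3.0)\n", h_c[0]);

     cudaFree(d_a); cudaFree(d_b); cudaFree(d_c);
     free(h_a); free(h_b); free(h_c);
     return 0;
 }

Compile and run with, for example, nvcc vectoradd.cu -o vectoradd && ./vectoradd.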
Documentation
- GPU Programming Guide (G80)
- CUDA C best practices guide
- CUDA by example (Also available on Safari)
Links to article series by Rob Farber
- CUDA, Supercomputing for the Masses: Part 1 CUDA lets you work with familiar programming concepts while developing software that can run on a GPU
- CUDA, Supercomputing for the Masses: Part 2 A first kernel
- CUDA, Supercomputing for the Masses: Part 3 Error handling and global memory performance limitations (the error-checking pattern is sketched after this list)
- CUDA, Supercomputing for the Masses: Part 4 Understanding and using shared memory (1)
- CUDA, Supercomputing for the Masses: Part 5 Understanding and using shared memory (2)
- CUDA, Supercomputing for the Masses: Part 6 Global memory and the CUDA profiler
- CUDA, Supercomputing for the Masses: Part 7 Double the fun with next-generation CUDA hardware
- CUDA, Supercomputing for the Masses: Part 8 Using libraries with CUDA
- CUDA, Supercomputing for the Masses: Part 9 Extending High-level Languages with CUDA
- CUDA, Supercomputing for the Masses: Part 10 CUDPP, a powerful data-parallel CUDA library
- CUDA, Supercomputing for the Masses: Part 11 Revisiting CUDA memory spaces
- CUDA, Supercomputing for the Masses: Part 12 CUDA 2.2 Changes the Data Movement Paradigm
- CUDA, Supercomputing for the Masses: Part 13 Using texture memory in CUDA
- CUDA, Supercomputing for the Masses: Part 14 Debugging CUDA and using CUDA-GDB
- CUDA, Supercomputing for the Masses: Part 15 Using Pixel Buffer Objects with CUDA and OpenGL
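Part 3 above covers error handling; because kernel launches are asynchronous and return no status directly, a common pattern is to wrap runtime calls in a checking macro and to inspect cudaGetLastError() after each launch. The sketch below illustrates that general pattern under the same assumptions as the other examples on this page; it is not code from the article series.

 // cudacheck.cu - sketch of a common CUDA error-checking pattern.
 #include <cstdio>
 #include <cstdlib>
 #include <cuda_runtime.h>

 // Wrap every runtime call; abort with file and line number on failure.
 #define CUDA_CHECK(call)                                              \
     do {                                                              \
         cudaError_t err_ = (call);                                    \
         if (err_ != cudaSuccess) {                                    \
             fprintf(stderr, "CUDA error %s at %s:%d\n",               \
                     cudaGetErrorString(err_), __FILE__, __LINE__);    \
             exit(EXIT_FAILURE);                                       \
         }                                                             \
     } while (0)

 __global__ void dummy(float *p) { p[threadIdx.x] = 1.0f; }

 int main(void)
 {
     float *d_p = NULL;
     CUDA_CHECK(cudaMalloc((void **)&d_p, 32 * sizeof(float)));

     dummy<<<1, 32>>>(d_p);
     // A kernel launch returns no error code, so check explicitly afterwards.
     CUDA_CHECK(cudaGetLastError());
     // cudaDeviceSynchronize (cudaThreadSynchronize in the CUDA 2.x/3.x
     // toolkits of the time) surfaces errors raised during kernel execution.
     CUDA_CHECK(cudaDeviceSynchronize());

     CUDA_CHECK(cudaFree(d_p));
     return 0;
 }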
Links to course descriptions
- Applied Parallel Programming, University of Illinois (+++)
- GPU Computing Online Seminars, June/July 2010
Software
- GetCUDA
- Getting Started in Windows environment

Perl
- Kappa CUDA made easier
- KappaCUDA Perl module
Links
- Tsunami cluster Interesting article, covering among other things power consumption
- MatLAB and Tesla How-to example with Linux
- IEEE on GPU's Good HeTh