Making a small GeForce GTX 750 Ti GPU cluster for less than $5000
K. Hoshina, Sep. 15, 2014, IceCube Collaboration Meeting
(Card image from NVIDIA's web site)
Motivation
All simulation production uses GPUs. We currently have ~200 GPU nodes, but:
- A dataset usually uses fewer than 30 GPUs at a time, because it has to compete with other datasets.
- If your job priority is lower than others', your jobs may be stuck at the GPU stage for weeks.
- It is essential to keep GPU tasks processing so no time is wasted: CPU jobs don't start until the GPU jobs are finished, even if we have plenty of CPU nodes.
Simulation chain: NuGen -> Photon Propagation (GPU) -> DetectorSim -> L1 processing -> L2 processing
We have $5000, then...
- It's a bit too little to buy a blade server.
- The main issue is the power supply: most GPU cards (e.g. GTX 680) require ~200W per card, so the power unit must supply more than ~800W for two cards.
- It's not easy to find a server with two PCIe Gen3 x16 slots and a high-wattage power unit. Only SunMicro (Oracle) provides them at a reasonable price, and they are still expensive in Japan.
- The ERI group has 6 old Dell PowerEdge T410 machines with 48 CPU cores in total. They are good enough for CPU jobs and each has one PCIe Gen3 x16 slot; however, the power unit supplies only 450W.
- Does having 10 private GPU cores improve the situation? Yes! 10 GPUs are already ~30% of the IceCube public GPUs you can typically get at a time, and they won't stop even if your priority is low.
So, can we use low-power GPUs in cheap PCs? (A rough power budget is sketched below.)
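As a worked example of the budget behind these numbers (the ~200W draw P_rest for the rest of the box and the ~1.3 headroom factor are illustrative assumptions, not measurements):

    P_{\mathrm{PSU}} \gtrsim f_{\mathrm{headroom}} \times \left( N_{\mathrm{GPU}} \cdot P_{\mathrm{GPU}} + P_{\mathrm{rest}} \right)

Two GTX 680s: 1.3 x (2 x 200 W + 200 W) ≈ 780 W, hence the ~800 W power unit. A T410 with its 450 W unit: 450 W / 1.3 - 200 W ≈ 150 W left for a GPU, so a 75 W card fits comfortably while a ~200 W card does not.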
GeForce GTX 750 Ti
- Uses only 75W; works without an extra power cable.
- ZOTAC GeForce GTX 750 Ti 2GB (~$140)
- EVGA GeForce GTX 750 Ti 2GB Superclocked (~$150)
Tested Machines
- DELL PowerEdge T420 & T410
- Homebuilt machine ($580), following NVIDIA's build guide: http://www.geforce.com/whats-new/guides/geforce-gtx-750-ti-mini-itx-pc-build-guide
Homebuilt PC
*1 It is now hard to obtain; you may buy the SG05B-Lite and add a power unit as an option.
*2 Since we already have a CPU cluster, we didn't pay much for the CPU. However, if you want to use the box as a CPU machine too, consider a quad-core CPU. Also increase the memory to ~4GB per core if you want to run photonics-table-based reconstructions (e.g. ~16GB for a quad-core). Note that the GPU occupies one CPU core.
Performance (PPC, 1e+11 photons)

Machine        System           GPU Device                                       Time [ms]   Ratio
CobaltGPU      SL6.4, CUDA 5.5  EVGA NVIDIA GeForce 680 (8 MPs x 1024 threads)   3,689,659   1.00
Homebuilt PC   SL6.5, CUDA 6.0  EVGA GeForce 750 Ti 2GB SC, 75W (5 MPs x 1024)   4,987,016   1.35
Homebuilt PC   SL6.5, CUDA 6.0  ZOTAC GeForce 750 Ti 2GB, 75W (5 MPs x 1024)     5,658,848   1.53
DELL T420      SL6.3, CUDA 6.0  EVGA GeForce 750 Ti 2GB SC, 75W (5 MPs x 1024)   4,986,203   1.35
DELL T420      SL6.3, CUDA 6.0  ZOTAC GeForce 750 Ti 2GB, 75W (5 MPs x 1024)     5,659,683   1.53
DELL T410      SL6.3, CUDA      EVGA GeForce 660, 140W (5 MPs x 1024 threads)    5,603,139   1.52

*NVIDIA rates the GeForce 750 Ti below the 660, but the two look comparable for our simulation, and the EVGA Superclocked card is even faster.
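The "N MPs x 1024 threads" figures are just the device's multiprocessor count times its per-block thread limit. A minimal CUDA sketch to print them for each installed card (the file name and output format are my own; the runtime calls are standard CUDA, not part of PPC):

    // query_mps.cu -- print the multiprocessor count and thread limit
    // behind "N MPs x 1024 threads" in the table above.
    // Build: nvcc query_mps.cu -o query_mps
    #include <cstdio>
    #include <cuda_runtime.h>

    int main() {
        int n = 0;
        cudaGetDeviceCount(&n);
        for (int i = 0; i < n; ++i) {
            cudaDeviceProp prop;
            cudaGetDeviceProperties(&prop, i);
            // e.g. GeForce GTX 680 -> 8 MPs, GTX 750 Ti -> 5 MPs
            printf("Device %d: %s, %d MPs, %d threads/block max\n",
                   i, prop.name, prop.multiProcessorCount,
                   prop.maxThreadsPerBlock);
        }
        return 0;
    }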
ERI01 CPU + GPU mini cluster
- 6 DELL T410 (8 CPU cores + 1 GPU core /machine)
- 6 homebuilt PCs (2 CPU cores + 1 GPU core /machine)
- On Condor: 45 CPU cores + 12 GPU cores, with IceProd support
- Total cost for upgrading the ERI01 cluster: ~$4500
ERI01 is primarily for EarthCore-related simulation, but feel free to use it when the grid is not busy :)
Summary
- We compared the performance of NVIDIA GTX 750 Ti cards. The EVGA GeForce 750 Ti 2GB SC (Superclocked) model showed remarkable speed in the PPC test.
- The EVGA GTX 750 Ti SC now costs ~$140, draws only 75W, and needs no additional power connector. If your PC has a PCIe x16 slot, it is the most economical way to get a GPU test machine.
- We ran a simple stress test (a sketch of the idea follows below). No fatal errors were observed, and the temperature stayed stable and low throughout, so power consumption and heat should not be an issue for the 750 Ti.
- A homebuilt PC costs less than $600 per 1 GPU core + 2 CPU cores. With $5000 you can build a test cluster with 8 CPU cores + 8 GPU cores. That is almost comparable to buying a blade server if you can pay another $1000+ (and the blade server with a high-end GPU card performs better). Still, the homebuilt cluster is a good fit for institutes without a computing specialist, because it is easy to fix or replace parts when technical problems happen.
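For reference, a minimal CUDA sketch of the kind of stress test meant above: a compute-bound kernel run repeatedly while temperature and power are watched with nvidia-smi in another terminal. This is an illustrative stand-in (kernel, loop counts, and launch sizes are my own choices), not the actual test we ran.

    // burn.cu -- keep the GPU busy with dependent floating-point math.
    // Build: nvcc burn.cu -o burn
    // Watch: nvidia-smi -l 5   (temperature / power, in another terminal)
    #include <cstdio>
    #include <cuda_runtime.h>

    __global__ void burn(float *out, int iters) {
        float x = 1.0f + 0.001f * threadIdx.x;
        for (int i = 0; i < iters; ++i)
            x = x * 1.000001f + 1.0e-6f;  // dependent FMAs, no memory traffic
        out[blockIdx.x * blockDim.x + threadIdx.x] = x;  // defeat dead-code elimination
    }

    int main() {
        const int blocks = 40, threads = 1024;  // plenty of work for a 5-MP 750 Ti
        float *d = NULL;
        cudaMalloc(&d, blocks * threads * sizeof(float));
        for (int pass = 0; pass < 600; ++pass) {  // long enough to reach steady temperature
            burn<<<blocks, threads>>>(d, 1 << 20);
            cudaDeviceSynchronize();
        }
        cudaFree(d);
        printf("stress loop finished\n");
        return 0;
    }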
Setup Procedures
(Go to http://www.icecube.wisc.edu/~hoshina/ and click "Blog".)
- How to install a GPU card: http://icecube.wisc.edu/~hoshina/blog/special_blog?cmd=post&id=8
- How to install Condor + IceProd with GPU settings: http://icecube.wisc.edu/~hoshina/blog/special_blog?cmd=post&id=9
- How to install Scientific Linux 6.5 on the homebuilt machine: http://icecube.wisc.edu/~hoshina/blog/special_blog?cmd=post&id=10