As machine learning becomes more advanced, we’re moving from our local computers to more powerful machines. An example is high-powered GPU machines for neural networks. These machines can be thousands of dollars to build, and difficult to maintain. An example is the machine shown in the cover photo for this story, which is one of the beefy GPU lab machines at Regis.
The GPU rigs are crucial for taking advantage of recent deep learning breakthroughs. This includes things like recognizing cancer cells in brain scans, detecting pneumonia from X-rays, or something more fun like classifying dog breeds.
In this post I’ll overview a GPU high-performance computing (HPC) cluster I created for student and faculty use at Regis, and show how only a few GPUS can be used by dozens of people to schedule and run neural network training jobs.