How does CPU differ from GPU (in terms of training ML models)?
Think of a CPU as a 4-lane highway with trucks delivering the computation, and a GPU as a 100-lane highway full of little shopping carts. GPUs are great at parallelism, but only for simpler tasks.
Deep learning specifically benefits from that since it's mainly batches of matrix multiplications, and those can be parallelized very easily. So training a neural network on a GPU can be 10x faster than on a CPU. But other kinds of models don't get that benefit at all.
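Here's a minimal sketch of that kind of comparison, assuming PyTorch is installed and a CUDA-capable GPU is available (the exact numbers depend entirely on your hardware):

```python
import time

import torch

# A batch of the matrix multiplications that dominate deep learning training.
a = torch.rand(32, 1024, 1024)
b = torch.rand(32, 1024, 1024)

# Time the batched multiply on the CPU.
t0 = time.perf_counter()
out_cpu = a @ b
cpu_s = time.perf_counter() - t0

if torch.cuda.is_available():
    # Move the data to the GPU, then time the same batched multiply there.
    a_gpu, b_gpu = a.cuda(), b.cuda()
    torch.cuda.synchronize()  # wait for the copies to finish
    t0 = time.perf_counter()
    out_gpu = a_gpu @ b_gpu
    torch.cuda.synchronize()  # wait for the kernel to finish before stopping the clock
    gpu_s = time.perf_counter() - t0
    print(f"CPU: {cpu_s:.3f}s  GPU: {gpu_s:.3f}s  speedup: ~{cpu_s / gpu_s:.0f}x")
else:
    print(f"CPU: {cpu_s:.3f}s (no CUDA device available)")
```

(The first GPU call also pays one-time CUDA startup costs, so in practice you'd warm up and average several runs; this just shows the shape of the comparison.)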
Thank you! This makes a lot more sense now.
I've got an analogy coming!
Let’s say that I am in a very large library, and my goal is to count all of the books.
There’s a librarian. That librarian is super smart and knowledgeable about where books are, how they’re organized, how the library works, and all that. The librarian is the boss! The librarian is perfectly capable of counting the books on their own and they’ll probably be very good and organized about it.
But what if there was a big team of people who could count the books with the librarian? We don’t need these people to be library experts — it’s not like you have to be a librarian to count books — we just need people who can count accurately.
A CPU is like a librarian.
A GPU is like a team of people counting books.
A GPU can usually accomplish certain tasks much faster than a CPU can.
Let’s say you have 100 shelves in your library and it takes 1 minute to count all of the books on 1 shelf.
The librarian working alone would need 100 minutes. But split the shelves among a team of 34 helpers, each counting about 3 shelves, and the whole library is counted in roughly 3 minutes. 3 minutes instead of 100 minutes - that's way better! Again: GPUs can usually accomplish certain tasks much faster than a CPU can.
There are some cases when a GPU probably isn’t needed:
Let’s look at a simple math example: calculating the average of a set of numbers. That's one quick pass over the data, so a single CPU core finishes it almost instantly, and just shipping the numbers over to a GPU would likely cost more time than it saves.
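To make that concrete, here's a tiny sketch (just NumPy, nothing else assumed):

```python
import numpy as np

# Ten million numbers: one pass on a single CPU core, done in milliseconds.
values = np.random.rand(10_000_000)
print(values.mean())
```

Copying those numbers over to a GPU would usually take longer than just averaging them where they already are.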
Let’s look at a more complicated machine learning example: a random forest is basically a large number of decision trees. Let’s say you want to build a random forest with 100 decision trees. Each tree gets built independently, so the work parallelizes nicely across ordinary CPU cores, and a GPU usually isn't needed.
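As a rough sketch (assuming scikit-learn, with a synthetic dataset purely for illustration), the 100 trees can simply be trained in parallel on your CPU cores:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data, just so the example runs end to end.
X, y = make_classification(n_samples=10_000, n_features=20, random_state=0)

# 100 trees, trained in parallel across all available CPU cores (n_jobs=-1).
forest = RandomForestClassifier(n_estimators=100, n_jobs=-1, random_state=0)
forest.fit(X, y)
print(f"training accuracy: {forest.score(X, y):.3f}")
```

Growing a tree is branchy, irregular work rather than big uniform matrix math, which is why the GPU's shopping carts don't help much here.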
There’s a good image from NVIDIA that helps to compare CPUs to GPUs.
So, in short: a CPU is a few powerful, flexible workers, while a GPU is a huge team of simple ones. When a job breaks into lots of small identical pieces (like the matrix multiplications at the heart of deep learning), the GPU team wins by a wide margin; when it doesn't, the CPU is often all you need.
Speaking of analogies, here is a video about, quite literally, analog: specifically, analog CPUs (as opposed to digital ones). This video is very interesting, very well presented, gives a full history of CPU and GPU usage with respect to AI, and explains why the next evolution could be analog computers. Well worth watching!!
Ah, I was hoping for a Robot 3 analogy, they are always fantastic 🙂
Thanks all who shared!
A simpler, more general comparison:
CPUs are designed to coordinate AND calculate a bunch of math - they have a lot of routing set up, and they have drivers [and operating systems] built to make that pathing and organizing as easy as the simple calculations. Because they're designed to be the "brain" of a computer, they're built to do it ALL.
GPUs are designed to be specialized for, well, graphics, hence the name. To quickly render video and 3D graphics, you want a bunch of very simple calculations performed all at once - instead of having one "thing" [a CPU core] calculating the color for every pixel of a 1920x1080 display [2,073,600 pixels in total], maybe you have 1920 "things" [GPU cores], each dedicated to one line of pixels, all running in parallel.
"Split this Hex code for this pixel's color into a separate R, G, and B value and send it to the screen's pixel matrix" is a much simpler task than, say, the "convert this video file into a series of frames, combine them with the current display frame of this other application, be prepared to interrupt this task to catch and respond to keyboard/mouse input, and keep this background process running the whole time..." tasks that a CPU might be doing. Because of this, a GPU can be slower and more limited than a CPU while still being useful, and it might have unique methods to complete its calculations so it can be specialized for X purpose [3d rendering takes more flexibility than "display to screen"]. Maybe it only knows very simple conversions or can't keep track of what it used to be doing - "history" isn't always useful for displaying graphics, especially if there's a CPU and a buffer [RAM] keeping track of history for you.
Since CPUs want to be usable for a lot of different things, there tend to be a lot of operating systems/drivers to translate between the higher-level code I might write and the machine's specific registers and routing. BUT since a GPU is made with the default assumption "this is going to make basic graphics work more scalable," GPUs often have more specialized machine functionality, and their drivers can be much more limited. It might be harder to find a translator that can tell the GPU how to do the very specific thing that would help in your use case, versus the multiple helpful translators ready to explain to your CPU how to do what you need.
Or: if your CPU is a teacher, your GPU is a classroom full of elementary school students.
Sometimes it might be worth having the teacher explain to the class how to help with a group project… but it depends on the cost of the teacher figuring out how to talk to each student in a way they'll understand and listen to, plus the energy the teacher now has to spend making sure the kids are doing what they're supposed to and getting the materials they need along the way. Meanwhile, your teacher came pre-trained and already knows how to do a bunch of more complicated tasks and organization!
If it's a project where a lot of unskilled but eager help can make things go faster, then it might be worth using a GPU. But before you can get the benefits, you need to make sure you know what languages each kid in the classroom speaks and what they already know how to do. Sometimes, it's just easier and more helpful to focus on making sure your teacher can do the task themselves before recruiting the kids.
[or your trained librarian vs a bunch of random people who just happen to be in the library at the moment - the same issue of "figure out how to talk to each person + what they can and cannot do" applies]