Why Do You Use GPUs Instead of CPUs for Machine Learning?

GPU vs CPU

Admittedly, discussing the differences between CPUs and GPUs is a rather elementary exercise for technologists, but it’s an important one that helps us better understand what drives modern Artificial Intelligence. Although GPUs are traditionally used to complement the tasks that CPUs execute, they are, in fact, the driving force behind your AI initiatives.

  • Central Processing Unit (CPU): A CPU, or the “brain of the computer,” is a microchip located on the motherboard that is responsible for receiving data, executing commands, and processing the majority of information sent by the other hardware and software components. By nature, a CPU is best at sequential processing (e.g. r1 + r2 = r3, r3 + r4 = r5, etc.) and at executing multiple different operations on the same piece of data through scalar processing.
  • Graphics Processing Unit (GPU): In traditional computer designs, a GPU is often integrated directly into the CPU package and handles what the CPU doesn’t: intensive graphics processing. Unlike the CPU, which can only process a few varied commands at once, the GPU can handle the thousands of tiny parallel calculations required for graphics processing because it performs the same operation on many pieces of data at once. Built on a Single Instruction, Multiple Data (SIMD) architecture, GPUs rely on vector processing to schedule inputs into streams of data that share the same sequence of operations and process them all together (a minimal sketch of the difference follows this list).
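
To make that distinction concrete, here is a minimal sketch in Python (NumPy is our choice for illustration, not something either architecture requires): the loop below processes one element at a time, the way a scalar CPU core works, while the vectorized call applies the same operation to every element of the array at once, which is the SIMD/vector model that GPUs take to an extreme.

    import numpy as np
    import time

    a = np.random.rand(1_000_000)
    b = np.random.rand(1_000_000)

    # Scalar-style processing: one element at a time, like a sequential CPU loop.
    start = time.perf_counter()
    out_scalar = np.empty_like(a)
    for i in range(len(a)):
        out_scalar[i] = a[i] + b[i]
    print("element-by-element:", time.perf_counter() - start, "seconds")

    # Vector-style processing: the same "add" operation applied to all
    # elements together, the SIMD model that GPUs are built around.
    start = time.perf_counter()
    out_vector = a + b
    print("vectorized:", time.perf_counter() - start, "seconds")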

So, we’ve established that GPUs and CPUs process data in fundamentally different ways, and although (technically speaking) CPUs are capable of handling graphics-related tasks, GPUs are far better optimized for the fast-paced compute requirements of those tasks.

Until recently, advanced GPUs were predominantly used for 3D game rendering. However, recent research and development has revealed a much broader range of applications than originally anticipated.

What Do Graphics Have to Do With AI and Machine Learning?

AI is a loaded word.

And it has drastically different meanings in business than it does in science fiction. We’re not talking Terminator here or HAL 9000. We’re talking business intelligence and analytics shortcuts born out of famous supercomputers like Watson and Deep Blue. In particular, Deep Blue was the first computer to beat world chess champion Garry Kasparov at his own game, winning their rematch in 1997. Deep Blue was a big machine for its time, delivering roughly 11 gigaflops of processing power and requiring several racks of server hardware. It took up a lot of floor space.

Today, a single NVIDIA graphics card packs far more power than Kasparov’s vanquisher, with 50-70 teraflops of compute built into it and, when used for compute, anywhere from 2,000-3,000 cores (a typical laptop CPU has 4). This means that this one GPU chip can process hundreds of times more data elements simultaneously than your typical x64 or x86 processor can handle.

CPUs and GPUs exist to augment our capabilities. They do things that humans could do, but they make these tasks easier and faster (e.g. you are perfectly capable of sending a letter, but email is faster and more efficient). Machine learning (the closest thing we have to AI), in the same vein, goes far beyond our human capabilities by performing tasks and calculations in a matter of days that would take us a lifetime, if not more.

GPUs and Machine Learning Use Cases

GPUs applied to AI are predominantly used for analytics and Big Data workloads built on genetic algorithms. Inspired by Darwin’s theory of Natural Selection, these genetic algorithms imitate the process of selecting only the “fittest” outcomes for future iterations.

For example, a local SMB wants to analyze a large set of heuristic data for future business solutions. At first, the computer won’t know what any of the data means, so it will begin to analyze and attribute meaning to each data stream (i.e. Thread 1 = Field X, Thread 2 = Field Y, etc.). These data sets will be run through 5,000-15,000 simulations with the computer comparing the best results after each generation.
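
As an illustration (not the SMB’s actual system), here is a toy genetic algorithm in Python: a population of candidate answers is scored, only the fittest survive, and mutated copies of the survivors seed the next generation. The fitness function and every parameter below are invented purely for the sketch.

    import random

    # Invented example: evolve a list of numbers toward a target sum of 100.
    TARGET = 100

    def fitness(candidate):
        # Fitter candidates are closer to the target (higher is better).
        return -abs(sum(candidate) - TARGET)

    def mutate(candidate):
        # Copy a survivor and randomly nudge one of its values.
        child = candidate[:]
        child[random.randrange(len(child))] += random.uniform(-5, 5)
        return child

    population = [[random.uniform(0, 20) for _ in range(10)] for _ in range(50)]

    for generation in range(1000):
        # "Natural selection": keep only the fittest 20% of this generation.
        population.sort(key=fitness, reverse=True)
        survivors = population[:10]
        # Refill the population with mutated copies of the survivors.
        population = survivors + [mutate(random.choice(survivors)) for _ in range(40)]

    print("best candidate sum:", sum(population[0]))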

We see this method at work on online streaming sites, too. For instance, the analytics and recommendation tools that YouTube puts in place determine (from historical and trend data) what you are most likely to watch next after viewing “Baby Foxes Jump on Trampoline.” And YouTube does this for every video posted to its site, taking your Google searches into account as well. If you do a Google search for a wood planer, YouTube is more likely to recommend woodworking tutorial videos in the future. The same applies to sites like Facebook and Amazon; if you view a few cycling websites and then purchase cycling shoes through Amazon Prime, you’re more likely to see other cycling-related gear when shopping and cycling-targeted ads on your Facebook feed.
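
The production systems behind YouTube and Amazon are far more elaborate, but the underlying idea can be sketched in a few lines of Python: represent each user’s history as a vector, find users similar to you, and recommend what they watched. The titles, users, and scores below are all made up for illustration.

    import numpy as np

    videos = ["Baby Foxes Jump on Trampoline", "Wood Planer Review",
              "Dovetail Joints 101", "Cycling Shoe Fit Guide"]

    # Rows are users, columns are videos; 1 means that user watched the video.
    history = np.array([
        [1, 1, 1, 0],   # user A: foxes + woodworking
        [0, 1, 1, 0],   # user B: woodworking
        [1, 0, 0, 1],   # user C: foxes + cycling
    ])

    you = np.array([1, 1, 0, 0])  # you watched the foxes video and the planer review

    # Score each user by how much their history overlaps with yours, then
    # recommend the most popular unwatched video among the similar users.
    similarity = history @ you
    weighted_votes = similarity @ history
    weighted_votes[you == 1] = 0  # don't recommend what you've already seen
    print("recommended next:", videos[int(np.argmax(weighted_votes))])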

Accelerated GPU analytics can pull from tens of thousands of data sources and analyze them concurrently across thousands of servers for massively impactful insights. It’s staggeringly wide in scale, but it’s also iterative and done in very short time frames. The data set is simply too large for a CPU to process in a timely (and cost-effective) manner.

What Configurations Work Best for Different Machine Learning Applications?

The main differentiator in GPU performance is the type of math that needs to be done.

It comes down to whether the company needs single-precision or double-precision floating point math (i.e. to how many places can this thing make calculations?).

Single-precision floating-point values occupy 32 bits in computer memory, while double-precision values occupy 64 bits. With double precision, the extra bits allow for increased precision as well as an increased range of magnitude, so far more demanding computations can be executed reliably. Double precision also tends to require a higher tier of card and runs more slowly, because each operation pushes twice as much data through the chip and most cards dedicate fewer execution units to 64-bit math.
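
A quick way to see the trade-off is to compare the two formats directly in Python with NumPy: single precision carries roughly 7 significant decimal digits in 32 bits, double precision roughly 15-16 digits in 64 bits, and every double-precision value costs twice the memory and bandwidth.

    import numpy as np

    x32 = np.float32(1.0) + np.float32(1e-8)   # increment is lost at single precision
    x64 = np.float64(1.0) + np.float64(1e-8)   # increment survives at double precision
    print(x32 == 1.0)   # True: 32 bits cannot represent the difference
    print(x64 == 1.0)   # False: 64 bits can

    a = np.ones(1_000_000, dtype=np.float32)
    b = np.ones(1_000_000, dtype=np.float64)
    print(a.nbytes, "bytes at single precision")   # 4,000,000
    print(b.nbytes, "bytes at double precision")   # 8,000,000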

Unless you have experienced developers in-house, think twice before trying to implement these highest-end technologies. There is no “out of the box” solution for this level of analysis yet; you have to have the system customized for the specific data sources your company collects and creates. And don’t forget power and cooling requirements: each GPU demands 200-300 watts. To keep these servers from affecting the environment for your other devices, you have to implement sufficient rack cooling and air-handling systems to counteract the heat they generate.
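
For planning purposes, the power arithmetic is straightforward. Here is a back-of-the-envelope sketch in Python in which every figure (GPU count, per-GPU wattage, the rest of the server’s draw) is an assumption rather than a vendor specification:

    # Back-of-the-envelope power estimate; every figure here is an assumption.
    gpus_per_server = 8
    watts_per_gpu = 250            # mid-point of the 200-300 W range above
    other_server_load_watts = 600  # CPUs, memory, storage, fans (assumed)

    total_watts = gpus_per_server * watts_per_gpu + other_server_load_watts
    kwh_per_month = total_watts / 1000 * 24 * 30

    print(f"continuous draw: {total_watts} W")          # 2600 W
    print(f"monthly energy:  {kwh_per_month:.0f} kWh")  # ~1872 kWh
    # Roughly the same amount of heat must be removed by the cooling system.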

How Do You Determine What Type of Rig Your Company Needs?

If your company is still looking for the right machine learning use case for your current infrastructure, you have some fundamental details to establish first.

  • What are our data sources?
  • How do we collect data?
  • What is it that we want to learn from that data?

Are you in the oil and gas industry? If so, you may have upwards of 10,000 wells spread across several regions, and you need to assess the efficiency and resource availability of each pump. You will need several thousand sensors at each site (each capable of transmitting 10KB of data per second) to report seismic mapping, oil-level, and pump-efficiency data back to a single location.
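
Those volumes add up quickly. A rough calculation in Python, using the figures above plus an assumed sensor count per well, shows the scale of data arriving at that central location:

    # Rough throughput estimate; the sensors-per-well figure is an assumption.
    wells = 10_000
    sensors_per_well = 2_000           # "several thousand sensors at each site"
    kb_per_sensor_per_second = 10      # 10 KB per sensor per second

    kb_per_second = wells * sensors_per_well * kb_per_sensor_per_second
    gb_per_second = kb_per_second / 1_000_000
    tb_per_day = gb_per_second * 86_400 / 1_000

    print(f"{gb_per_second:.0f} GB/s arriving at the central location")  # ~200 GB/s
    print(f"{tb_per_day:.0f} TB/day to store and analyze")               # ~17,280 TB/day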

If you manage a large-scale enterprise network, you may want to run analytics against all of your application log data to search for potential breaches and inefficiencies. Here, you are looking at fewer data sources than the thousands of sensors from the oil company, but these sources generate massive amounts of data over time. Once you export the data to a dedicated repository, your machine learning system can begin ingestion and analysis.
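
A first pass at that kind of log analysis can be as simple as counting suspicious patterns per source. The log format, file name, and threshold below are hypothetical, invented only to illustrate the idea:

    import re
    from collections import Counter

    # Hypothetical log lines: "<timestamp> FAILED_LOGIN user=<name> ip=<address>"
    failed_login = re.compile(r"FAILED_LOGIN\s+user=\S+\s+ip=(\S+)")

    counts = Counter()
    with open("application.log") as log:   # assumed log file
        for line in log:
            match = failed_login.search(line)
            if match:
                counts[match.group(1)] += 1

    # Flag any source IP with an unusually high number of failed logins.
    THRESHOLD = 50  # assumed cut-off
    for ip, count in counts.most_common():
        if count >= THRESHOLD:
            print(f"possible breach attempt from {ip}: {count} failed logins")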

Current Market Offerings

Machine learning has a variety of different use cases and just as many offerings in the market. Depending on the nature of your business and the data analysis requirements of your company, you will need to determine which mix of solutions is right for you:

  • ExtraHop places a box in your network to analyze every TCP/IP packet that is sent. The unit stores the data and runs single-threaded reports against it, because these calculations involve too much data to hold in memory and process in real time. ExtraHop also makes use of cloud-based machine learning engines to power its SaaS security product.
  • Intel Xeon Phi combines CPU- and GPU-style processing: a many-core x86 coprocessor (with dozens of cores) that is capable of running any x86 workload, which means you can use traditional CPU instructions against the accelerator. Phi can be used to analyze existing system memory and can scale up SMP-based queries (e.g. move from 16 threads to 60+ threads).
  • NVIDIA CUDA supports the most recent NVIDIA GPUs and adds parallel-computing extensions to many popular languages, drop-in accelerated libraries, and cloud-based compute appliances, allowing the compute platform to be applied across myriad industries and disciplines (a minimal Python sketch follows this list).
  • There are also many IaaS cloud instances that offer GPU-based processing, so, even if you don’t have the equipment in house, you can test and train algorithms with GPU efficiency.
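
To see what the last two options look like in practice, here is a minimal Python sketch using CuPy, a NumPy-compatible library that executes on NVIDIA CUDA GPUs; it assumes a CUDA-capable GPU (local or on a cloud IaaS instance) and the cupy package are available.

    import numpy as np
    import cupy as cp   # assumes a CUDA-capable GPU and the cupy package

    # Same computation, two processors: the CPU runs it with NumPy,
    # the GPU runs it with CuPy across thousands of cores in parallel.
    x_cpu = np.random.rand(10_000_000).astype(np.float32)
    x_gpu = cp.asarray(x_cpu)            # copy the data into GPU memory

    y_cpu = np.sqrt(x_cpu) * 2.0 + 1.0   # executed on the CPU
    y_gpu = cp.sqrt(x_gpu) * 2.0 + 1.0   # executed on the GPU

    # Bring the GPU result back and confirm both devices agree.
    print(np.allclose(y_cpu, cp.asnumpy(y_gpu), atol=1e-5))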

Your Untapped Data Potential

If your company already collects data and you believe that heuristic analysis of that historical record could deliver actionable insights for future business initiatives, then GPUs are your answer. But remember that AI and machine learning solutions are never “out of the box” implementations. You need an experienced team to help you decide on, install, and optimize your analytics solution so that it is properly tailored to your data sources and business goals.

Want to see what your data has to offer?

We can guide you toward the predictive insights you need to stay ahead.