What is Artificial Intelligence, Machine Learning, and Deep Learning?

There is hardly a day where there is no news on artificial intelligence in the media.  Below is a short collection of some news headlines from the past 24 hours only:

It is interesting that most of those articles have a skeptical, if not even negative tone.  This sentiment was also fueled with statements of Bill Gates, Elon Musk, or even Stephen Hawking.  With all due respect, but I would not stand in public talking nonsense about wormholes so we should all focus a bit more on the areas we are experts in.

This all underlines two things: artificial intelligence and machine learning finally became mainstream. And people know shockingly little about it.

There is also a high dose of hype around those topics.  We all heard about “Linear Regression” before. This should not come as a surprise since it was already invented more than 200 years ago by Legendre and Gauss.  And still this overdose of hype can lead to situations where people are a little bit carried away whenever they use this method.  Here is one of my favorite tweet exchanges which exemplifies this:

Anyway, there is high level of confusion around those terms. This post should help to understand the differences and relationships of those fields. Let’s get started with the following picture. It explains the three terms artificial intelligence, machine learning, and deep learning:

ai_ml_dl

Artificial Intelligence is covering anything which enables computers to behave like a human.  Think of the famous – although a bit outdated – Turing test to determine if this is the case or not.  If you talk to Siri on your phone and get an answer, this is close already.  Automatic trading systems using machine learning to be more adaptive would also already fall into this category.

Machine Learning is the subset of Artificial Intelligence which deals with the extraction of patterns from data sets. This means that the machine can find rules for optimal behavior but also can adapt to changes in the world. Many of the involved algorithms are known since decades and sometimes even centuries. But thanks to the advances in computer science as well as parallel computing they can now scale up to massive data volumes.

Deep Learning is a specific class of Machine Learning algorithms which are using complex neural networks.  In a sense, it is a group of related techniques like the group of “decision trees” or “support vector machines”.  But thanks to the advances in parallel computing they got quite a bit of hype recently which is why I broke them out here. As you can see, deep learning is a subset of methods from machine learning.  When somebody explains that deep learning is “radically different from machine learning“, they are wrong.  But if you would like to get a BS-free view on deep learning, check out this webinar I did some time ago.

But if Machine Learning is only a subset of Artificial Intelligence, what else is part of this field?  Below is a summary of the most important research areas and methods for each of the three groups:

  • Artificial Intelligence: Machine Learning (duh!), planning, natural language understanding, language synthesis, computer vision, robotics, sensor analysis, optimization & simulation, among others.
  • Machine Learning: Deep Learning (another duh!), support vector machines, decision trees, Bayes learning, k-means clustering, association rule learning, regression, and many more.
  • Deep Learning: artificial neural networks, convolutional neural networks, recursive neural networks, long short-term memory, deep belief networks, and many more.

As you can see, there are dozens of techniques in each of those fields. And researchers generate new algorithms on a weekly basis.  Those algorithms might be complex.  The conceptual differences like explained above are not.