What Artificial Intelligence and Machine Learning can do – and what not

I have written on Artificial Intelligence (AI) before.  Back then I focused on the technology side of it: what is part of an AI system and what isn’t.  But there is another question which might be even more important.  What are we DOING with AI?

Part of my job is to help investors with their due diligence.  I discuss companies with them in which they might want to invest. Here is a quick observation:  By now, every company pitch is full with stuff about how they are using AI to solve a given business problem.

Part of me loves this since some of those companies are on something and should get the chance.  But I also have a built-in “bullshit-meter”.  So, another part of me wants to cringe every time I listen to a founder making stuff up about how AI will help him.  I listened to many founders who do not know a lot about AI, but they sense that they can get millions of dollars of funding.  Just by adding those fluffy keywords to their pitch.  The bad news is that it sooner or later actually works.  Who am I to blame them?

I have seen situations where AI or at least machine learning (ML) has an incredible impact.  But I also have seen situations where this is not the case.  What was the difference?

In most of the cases where organizations fail with AI or ML, they used those techniques in the wrong context.  ML models are not very helpful if you have only one big decision you need to make.  Analytics still can help you in such cases by giving you easier access to the data you need to make this decision.  Or by presenting this data in a consumable fashion.  But at the end of the day, those single big decisions are often very strategic.  Building a machine learning model or an AI to help you making this decision is not worth doing it.  And often they also do not yield better results than just making the decision on your own.

Here is where ML and AI can help. Machine Learning and Artificial Intelligence deliver most value whenever you need to make lots of similar decisions quickly. Good examples for this are:

  • Defining the price of a product in markets with rapidly changing demands,
  • Making offers for cross-selling in an E-Commerce platform,
  • Approving a credit or not,
  • Detecting customers with a high risk for churn,
  • Stopping fraudulent transactions,
  • …among others.

You can see that a human being who would have access to all relevant data could make those decisions in a matter of seconds or minutes.  Only that they can’t without AI or ML, since they would need to make this type of decision millions of times, every day.  Like sifting through your customer base of 50 million clients every day to identify those with a high churn risk.  Impossible for any human being.  But no problem at all for an ML model.

So, the biggest value of artificial intelligence and machine learning is not to support us with those big strategic decisions.  Machine learning delivers most value when we operationalize models and automate millions of decisions.

The image below shows this spectrum of decisions and the times humans need to make those.  The blue boxes are situations where analytics can help, but it is not providing its full value. The orange boxes are situations where AI and ML show real value. And the interesting observation is: the more decisions you can automate, the higher this value will be (upper right end of this spectrum).


One of the shortest descriptions of this phenomenon comes from Andrew Ng, who is a well-known researcher in the field of AI.  Andrew described what AI can do as follows:

“If a typical person can do a mental task with less than one second of thought, we can probably automate it using AI either now or in the near future.”

I agree with him on this characterization. And I like that he puts the emphasis on automation and operationalization of those models – because this is where the biggest value is. The only thing I disagree with is the time unit he chose. It is safe to go already with a minute instead of a second.

What is Artificial Intelligence, Machine Learning, and Deep Learning?

There is hardly a day where there is no news on artificial intelligence in the media.  Below is a short collection of some news headlines from the past 24 hours only:

It is interesting that most of those articles have a skeptical, if not even negative tone.  This sentiment was also fueled with statements of Bill Gates, Elon Musk, or even Stephen Hawking.  With all due respect, but I would not stand in public talking nonsense about wormholes so we should all focus a bit more on the areas we are experts in.

This all underlines two things: artificial intelligence and machine learning finally became mainstream. And people know shockingly little about it.

There is also a high dose of hype around those topics.  We all heard about “Linear Regression” before. This should not come as a surprise since it was already invented more than 200 years ago by Legendre and Gauss.  And still this overdose of hype can lead to situations where people are a little bit carried away whenever they use this method.  Here is one of my favorite tweet exchanges which exemplifies this:

Anyway, there is high level of confusion around those terms. This post should help to understand the differences and relationships of those fields. Let’s get started with the following picture. It explains the three terms artificial intelligence, machine learning, and deep learning:


Artificial Intelligence is covering anything which enables computers to behave like a human.  Think of the famous – although a bit outdated – Turing test to determine if this is the case or not.  If you talk to Siri on your phone and get an answer, this is close already.  Automatic trading systems using machine learning to be more adaptive would also already fall into this category.

Machine Learning is the subset of Artificial Intelligence which deals with the extraction of patterns from data sets. This means that the machine can find rules for optimal behavior but also can adapt to changes in the world. Many of the involved algorithms are known since decades and sometimes even centuries. But thanks to the advances in computer science as well as parallel computing they can now scale up to massive data volumes.

Deep Learning is a specific class of Machine Learning algorithms which are using complex neural networks.  In a sense, it is a group of related techniques like the group of “decision trees” or “support vector machines”.  But thanks to the advances in parallel computing they got quite a bit of hype recently which is why I broke them out here. As you can see, deep learning is a subset of methods from machine learning.  When somebody explains that deep learning is “radically different from machine learning“, they are wrong.  But if you would like to get a BS-free view on deep learning, check out this webinar I did some time ago.

But if Machine Learning is only a subset of Artificial Intelligence, what else is part of this field?  Below is a summary of the most important research areas and methods for each of the three groups:

  • Artificial Intelligence: Machine Learning (duh!), planning, natural language understanding, language synthesis, computer vision, robotics, sensor analysis, optimization & simulation, among others.
  • Machine Learning: Deep Learning (another duh!), support vector machines, decision trees, Bayes learning, k-means clustering, association rule learning, regression, and many more.
  • Deep Learning: artificial neural networks, convolutional neural networks, recursive neural networks, long short-term memory, deep belief networks, and many more.

As you can see, there are dozens of techniques in each of those fields. And researchers generate new algorithms on a weekly basis.  Those algorithms might be complex.  The conceptual differences like explained above are not.