In many popular press articles, the terms “artificial intelligence,” “machine learning,” and “deep learning” seem to be used interchangeably. What are the distinctions between these different techniques?

Artificial Intelligence

Wikipedia defines AI as

intelligence demonstrated by machines, in contrast to the natural intelligence displayed by humans and animals

“AI” is thus a broad umbrella, encompassing all computational techniques that make machines look smart.

There is a further distinction between “strong AI” and “weak AI”:

  • Strong AI: a machine with sentience, consciousness, and/or mind, e.g. Data from Star Trek. This is still the domain of science fiction: nobody has ever constructed a strong AI system. There is a lot of debate about whether creating strong AI would be possible (see, e.g., the Chinese room argument) or advisable (e.g. the worry that AIs might take over the world).
  • Weak AI (aka narrow AI): “non-sentient” AI focused on a particular task, e.g. medical applications, face recognition, AI art. All of the AI being developed today is “narrow AI.”

Artificial intelligence includes machine learning as a sub-field. Artificial intelligence also includes non-machine-learning techniques like rule-based algorithms.

A rule-based algorithm for detecting birds might look something like this:
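
As a minimal sketch in Python, suppose some earlier step has already reduced a photo to a few hand-picked features. The ImageFeatures fields and the rules inside is_bird are invented for illustration and are not taken from any real system:

```python
from dataclasses import dataclass


@dataclass
class ImageFeatures:
    """Hypothetical features that something else has already extracted from a photo."""
    has_feathers: bool
    has_beak: bool
    wing_count: int
    leg_count: int


def is_bird(features: ImageFeatures) -> bool:
    """Return True only if every hand-written rule passes."""
    if not features.has_feathers:
        return False
    if not features.has_beak:
        return False
    # Brittle rule: assumes both wings and both legs are visible in the photo.
    if features.wing_count != 2 or features.leg_count != 2:
        return False
    return True


robin = ImageFeatures(has_feathers=True, has_beak=True, wing_count=2, leg_count=2)
flamingo_on_one_leg = ImageFeatures(has_feathers=True, has_beak=True, wing_count=2, leg_count=1)
print(is_bird(robin), is_bird(flamingo_on_one_leg))
```

Even in this toy version, a bird photographed standing on one leg fails the last rule, so another special case would have to be written by hand.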


Of course, you can see from this pseudocode example that rule-based algorithms are hard to get right. How many characteristics of a bird do you need to specify before you can be truly confident that you’re seeing a bird? How many shapes and colors are involved? (This is also glossing over the challenge of getting a computer to recognize particular shapes and colors.) How do you specify exactly what a “feather” looks like? You might need to write hundreds of different rules. Consequently, rule-based methods have fallen out of favor for the vast majority of AI tasks.

Machine Learning

Arthur Samuel, one of the early AI researchers, is credited with the following definition of machine learning:

an application of artificial intelligence that gives computers the ability to learn without being explicitly programmed

Machine learning algorithms are designed to learn from data. For example, if you wanted to build a machine learning algorithm to identify birds, you wouldn’t need to write down any particular characteristics of birds, or any rules. Instead, you would collect thousands of bird and non-bird photographs, and then feed them to your machine learning algorithm with the labels “bird” (1) and “not bird” (0). The machine learning algorithm would figure out on its own what characteristics are useful for distinguishing a bird from a non-bird.
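
To make the contrast with hand-written rules concrete, here is a minimal sketch of learning from labeled examples, using plain Python and a very simple model (logistic regression trained by gradient descent). Everything in it is an assumption for illustration: each "photo" is replaced by two made-up numbers, and a real system would use actual images, far more data, and an established library.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def score(weights, bias, x):
    """Weighted sum of the inputs plus a bias term."""
    return bias + sum(w * xi for w, xi in zip(weights, x))


def train(examples, labels, epochs=2000, lr=0.5):
    """Fit weights from labeled data; no bird-specific rules are written anywhere."""
    weights = [0.0] * len(examples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(examples, labels):
            error = sigmoid(score(weights, bias, x)) - y
            # Nudge each parameter in the direction that reduces the error.
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias


def predict(weights, bias, x):
    """Return 1 ("bird") or 0 ("not bird")."""
    return 1 if sigmoid(score(weights, bias, x)) >= 0.5 else 0


# Made-up stand-ins for photos, labeled "bird" (1) or "not bird" (0).
photos = [[0.9, 0.8], [0.8, 0.9], [0.7, 0.95], [0.1, 0.2], [0.2, 0.1], [0.15, 0.3]]
labels = [1, 1, 1, 0, 0, 0]

weights, bias = train(photos, labels)
print([predict(weights, bias, p) for p in photos])
```

The only bird-specific knowledge in this sketch lives in the labels; the weights that separate the two groups are found by the training loop rather than written by a person.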

A recent preprint, “This Looks Like That: Deep Learning for Interpretable Image Recognition,” gives an example of a machine learning algorithm that explains which parts of a bird photograph it is looking at in order to determine the bird species.

Deep Learning

Deep learning refers to a type of machine learning in which computers learn to understand the world as a hierarchy of concepts. A deep learning model is a specific kind of machine learning algorithm called a neural network, one that has been designed to have many layers, i.e., it is “deep.” In an image model, for example, the lower layers learn simple concepts like edges, and the higher layers learn complicated concepts like faces.
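
As a rough sketch of what "many layers" means structurally, the Python below passes an input through a stack of fully connected layers, each feeding the next. The layer sizes, the random untrained weights, and the helper names (make_layer, dense, forward) are all assumptions made for illustration; a real deep learning model would learn its weights from data.

```python
import random


def relu(values):
    """Simple nonlinearity applied between layers."""
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases):
    """One fully connected layer: each output is a weighted sum of all inputs."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def make_layer(n_in, n_out, rng):
    """Random, untrained weights; a trained model would learn these instead."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def forward(layers, x):
    """Feed x through every layer in order; each layer consumes the previous output."""
    for weights, biases in layers[:-1]:
        x = relu(dense(x, weights, biases))
    weights, biases = layers[-1]
    return dense(x, weights, biases)


rng = random.Random(0)
layer_sizes = [8, 16, 16, 16, 2]  # 8 inputs, three hidden layers, 2 outputs
layers = [make_layer(n_in, n_out, rng)
          for n_in, n_out in zip(layer_sizes, layer_sizes[1:])]
output = forward(layers, [0.5] * 8)
print(len(layers), len(output))
```

Adding more entries to layer_sizes makes the network deeper without changing any other line, which is the sense in which depth is a design choice.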

This earlier post about feedforward neural networks defines the “layers” of a neural network model, and this image shows a schematic diagram of a neural network with many layers.

Deep learning is responsible for much of the recent excitement around artificial intelligence. With bigger data sets and better computers than ever before, deep learning algorithms can demonstrate impressive performance that is useful in the real world. Deep learning has been successfully applied to speech recognition, speech synthesis, language translation, image captioning, and face recognition.


[Figure: AI vs. ML vs. DL]

As you can see, neural network models are a type of “machine learning” but they are also “deep learning” if they have many layers. Other methods, like support vector machines, are considered “machine learning” but not “deep learning”; rule-based systems are considered “artificial intelligence” but not “machine learning.”

About the Featured Image

The featured image is an artist’s rendering of trees from 66 million to 2.6 million years ago, “palms and Cycadeae of middle tertiary Europe.” I think they look a little like sunflowers, and certainly different from modern trees. What will artificial intelligence look like in 2.6 million years (if it’s still around)?

Also, there’s a kind of machine learning method called a “decision tree”; if you use many decision trees all at once, that’s another machine learning method called a “random forest.”