Types of Learning

As you can probably tell from what we've covered in this chapter so far, machine learning (ML) is a major part of the AI discipline. It's also the most advanced and widely applied area, and it's worth diving into in greater detail. Let's break it up into three (maybe four) types of learning and explore them further.

1. Supervised Learning

In supervised learning, the model is trained on a labeled dataset, meaning that each training example is paired with an output label. The goal is to learn a mapping from inputs to outputs that can be applied to new, unseen data.

Supervised learning requires large amounts of labeled data and predicts outcomes based on those labels. For example, suppose you have a table of data about customers, including their age, income, and education, and you want to predict whether each customer will purchase a new product you are developing. To make that prediction, you would also need a record of what each customer has purchased from you in the past. That past purchase history is the "label" for a supervised machine learning model.

There are many algorithms used for supervised machine learning, including linear regression, logistic regression, decision trees, support vector machines, and even neural networks. With the right dataset, supervised learning can be very accurate, and it makes it easy to understand which features (e.g., age, income, education) have the most or least impact on predicting the label (e.g., purchase likelihood).
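The customer-purchase example above can be sketched in a few lines of code. This is a minimal illustration using scikit-learn's logistic regression; the customer records and labels are synthetic numbers I've made up for the example, not data from the text.

```python
# A minimal sketch of the customer-purchase example, assuming scikit-learn.
# All feature values and labels below are synthetic, for illustration only.
from sklearn.linear_model import LogisticRegression

# Each row: [age, income in thousands, years of education]
X_train = [
    [25, 40, 12],
    [35, 65, 16],
    [45, 90, 18],
    [23, 30, 12],
    [52, 110, 20],
    [31, 50, 14],
]
# The label: 1 = purchased from us in the past, 0 = did not
y_train = [0, 1, 1, 0, 1, 0]

model = LogisticRegression()
model.fit(X_train, y_train)

# Apply the learned mapping to a new, unseen customer
new_customer = [[40, 80, 16]]
prediction = model.predict(new_customer)         # 0 or 1
probability = model.predict_proba(new_customer)  # class probabilities
```

The fitted model's coefficients (`model.coef_`) are what make supervised learning interpretable: each coefficient indicates how strongly a feature pushes the prediction toward or away from a purchase.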

The main limitation of supervised learning is that it is heavily dependent on high-quality labeled data. Some datasets simply do not include features that are very predictive of the label of interest. Supervised learning is also prone to what is called "overfitting," where the predictions are tied too closely to the training dataset and do not generalize well to new data.

2. Unsupervised Learning

In unsupervised learning, the model is trained on a dataset without labeled responses. The goal is to find hidden patterns or intrinsic structures in the input data.

Unsupervised learning does not require labeled data. Instead, you can use it to identify groups of customers that are very similar to each other. Once you have identified the common groups, you can add your own label to each group. For example, based on the demographics and purchase history of the customers, you could label each group as "frequent purchase" customers, "buy and return" customers, "buy as a gift" customers, "look but don't buy" customers, etc.

We often refer to unsupervised learning as "clustering" or "dimension reduction" and there are many algorithms for this including k-means, hierarchical, principal component analysis (PCA), and autoencoders. Besides customer segmentation, we also use unsupervised learning for anomaly/outlier detection, noise reduction, and feature extraction.
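The customer-segmentation idea can be sketched with k-means from the list above. This is an illustrative example, assuming scikit-learn; the two features and the choice of three clusters are my own assumptions for the sketch.

```python
# A minimal sketch of customer segmentation with k-means, assuming scikit-learn.
# The features and cluster count are illustrative assumptions, not from the text.
import numpy as np
from sklearn.cluster import KMeans

# Each row: [purchases per year, return rate]
customers = np.array([
    [24, 0.05], [30, 0.02], [22, 0.04],   # behave like frequent purchasers
    [ 6, 0.60], [ 8, 0.55], [ 5, 0.70],   # behave like buy-and-return
    [ 1, 0.00], [ 0, 0.00], [ 2, 0.05],   # behave like look-but-don't-buy
])

# No labels are provided; k-means finds the groups on its own
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = kmeans.fit_predict(customers)
```

Note that `labels` contains only arbitrary integers (0, 1, 2); it is up to a human to inspect each cluster and attach a meaningful name like "frequent purchase" customers.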

The main limitations of unsupervised learning are that it can be hard (but not impossible) to interpret which features contribute most to the groupings identified, and that it is difficult (but not impossible) to evaluate how strong the groupings are, that is, whether they are "tight" or "loose."

2b. Semi-Supervised Learning

Semi-supervised learning is really just a combination of the prior two types, used when the dataset includes both labeled and unlabeled data. It makes effective use of both kinds of data, which reduces the need for extensive labeled datasets.

Semi-supervised learning can also yield better performance than purely supervised or unsupervised approaches when labeled data is limited. However, it is not particularly distinct from the prior two types, so I've called it "2b." Algorithms for semi-supervised learning include self-training, co-training, and graph-based methods. It is commonly used for text and image classification.
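Self-training, the first algorithm mentioned above, can be sketched briefly: a classifier is fit on the few labeled points, then iteratively labels the unlabeled ones it is confident about. This is an illustrative sketch assuming scikit-learn, where the convention is to mark unlabeled samples with -1; the toy dataset is my own invention.

```python
# A minimal sketch of self-training, assuming scikit-learn.
# By convention, unlabeled samples are marked with the label -1.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.semi_supervised import SelfTrainingClassifier

# A toy one-feature dataset: two well-separated groups of points
X = np.array([[0.0], [0.2], [0.4], [0.6], [2.0], [2.2], [2.4], [2.6]])
# Only two points carry labels; -1 means "unlabeled"
y = np.array([0, -1, -1, -1, 1, -1, -1, -1])

# Fit on the labeled points, then pseudo-label confident unlabeled points
self_training = SelfTrainingClassifier(LogisticRegression())
self_training.fit(X, y)

pred = self_training.predict([[0.1], [2.5]])
```

The appeal here is data efficiency: only two of the eight training points were labeled by hand, yet the classifier can still separate the two groups.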

The limitations of semi-supervised learning are somewhat familiar already. There must be high-quality data with valid relationships. These methods can also be complex to understand and perform.

3. Reinforcement Learning

Reinforcement learning involves training an agent to make a sequence of decisions by interacting with an environment. The agent receives rewards or penalties based on its actions and aims to maximize cumulative rewards. This is quite distinct from the other types and has very specific use cases.

Common algorithms for reinforcement learning include Q-learning, deep Q-networks (DQN), policy gradient methods, and actor-critic methods. Reinforcement learning is used to create a "computer opponent" in games like chess, Go (famously, AlphaGo), and Atari video games. It is also used to train more advanced robots than those powered by RPA on an assembly line. For example, reinforcement learning is used for robots that actually walk around, grasp objects, and navigate around unexpected obstacles (e.g., robot vacuums).
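Tabular Q-learning, the first algorithm in the list above, can be sketched without any libraries. This is a toy illustration: the "corridor" environment (five states in a line, reward only at the far end) and all hyperparameters are my own assumptions for the sketch.

```python
# A minimal sketch of tabular Q-learning on a toy "corridor" environment:
# states 0..4 in a line, actions 0 (left) / 1 (right), reward +1 at state 4.
# The environment and hyperparameters are illustrative assumptions.
import random

N_STATES, N_ACTIONS, GOAL = 5, 2, 4
alpha, gamma, epsilon = 0.5, 0.9, 0.1   # learning rate, discount, exploration

def step(state, action):
    """Move one cell left or right; reaching the goal ends the episode."""
    next_state = max(0, state - 1) if action == 0 else min(GOAL, state + 1)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward, next_state == GOAL

random.seed(0)
Q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]  # Q[state][action]

for _ in range(500):                     # training episodes
    state, done = 0, False
    while not done:
        # Epsilon-greedy: mostly exploit the best-known action, sometimes explore
        if random.random() < epsilon:
            action = random.randrange(N_ACTIONS)
        else:
            action = 0 if Q[state][0] > Q[state][1] else 1
        next_state, reward, done = step(state, action)
        # Q-learning update: nudge Q toward reward + discounted best future value
        Q[state][action] += alpha * (
            reward + gamma * max(Q[next_state]) - Q[state][action]
        )
        state = next_state

# The learned greedy policy: which action each non-goal state prefers
policy = [0 if q[0] > q[1] else 1 for q in Q[:GOAL]]
```

After training, the greedy policy moves right in every state, because the agent has learned that cumulative reward is maximized by heading toward the goal. This trial-and-error loop, rewards propagating backward through the Q-table, is what separates reinforcement learning from the label-driven types above.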

The primary risk of reinforcement learning is that convergence can be poor. For example, what if the human opponent in a game keeps making contrary and unexpected decisions? That will make it very difficult for the agent to learn. Or, what if you keep moving furniture around in your home? It will be tough for the robot vacuum to build an internal map of the layout for future efficiency.

In Summary

Table 2.1
| Feature | Supervised Learning | Unsupervised Learning | Semi-Supervised Learning | Reinforcement Learning |
| --- | --- | --- | --- | --- |
| Data | Labeled | Unlabeled | Combination of labeled and unlabeled | Interaction-based |
| Output | Predicted outcomes | Identified patterns and structures | Predicted outcomes with improved accuracy when labeled data is limited | Policy that maximizes rewards |
| Example Algorithms | Linear regression, decision trees | K-means, PCA | Self-training, co-training | Q-learning, DQN, policy gradients |
| Applications | Classification, regression | Clustering, dimensionality reduction | Text and image classification | Game playing, robotics, autonomous vehicles |
| Strengths | High accuracy, interpretability | Exploratory analysis, flexibility | Data efficiency, improved performance | Sequential decision-making, adaptability |
| Limitations | Data dependency, overfitting | Interpretability, evaluation | Complexity, dependency on data quality | Complexity, risk of poor convergence |