Quick Answer
- Supervised: learn from input-output pairs (spam / not spam)
- Unsupervised: find structure in raw data (cluster users into segments)
- Self-supervised: invent labels from the data itself (predict the next word)
LLMs are primarily self-supervised with a supervised fine-tuning stage.
What Do These Terms Mean?
Supervised learning needs a human to label every example. Unsupervised learning runs on raw data, with no labels needed. Self-supervised learning is a clever subset of supervised learning in which the labels come from the data itself (Stanford CS229 lecture notes; Google AI blog on self-supervision, 2022).
How Each Works
Supervised
- Input: {image: cat.jpg, label: "cat"}
- Model learns to minimize prediction error on labels
- Needs thousands to millions of labeled examples
- Examples: image classification, fraud detection, spam filters
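A minimal sketch of that loop, assuming scikit-learn is installed; the spam features and labels below are invented for illustration:

```python
# Supervised learning in miniature: fit a classifier on labeled pairs.
# Feature values and labels are toy numbers invented for this sketch.
from sklearn.linear_model import LogisticRegression

X = [[5.0, 120], [1.2, 15], [4.8, 110], [0.9, 12]]  # e.g. [links_per_msg, caps_count]
y = [1, 0, 1, 0]                                    # human labels: 1 = spam, 0 = not spam

model = LogisticRegression().fit(X, y)  # minimizes prediction error on the labels
print(model.predict([[4.5, 100]]))      # expected: [1] (spam-like input)
```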
Unsupervised
- Input: raw data, no labels
- Model discovers clusters, reduced representations, anomalies
- Examples: customer segmentation, PCA, autoencoders, topic modeling
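The matching sketch on the unsupervised side, again assuming scikit-learn; the listening-habit numbers are made up:

```python
# Unsupervised learning in miniature: cluster raw data with no labels at all.
import numpy as np
from sklearn.cluster import KMeans

# Made-up listening features: [hours_per_week, pct_podcasts]
X = np.array([[2, 90], [3, 85], [1, 95], [20, 5], [25, 10], [22, 8]])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(kmeans.labels_)  # e.g. [0 0 0 1 1 1]: two discovered user segments
```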
Self-Supervised (inside supervised family)
- Input: "The cat sat on the ___" with target "mat"
- Labels fabricated from the data structure
- All modern LLMs start here
- Also: masked image modeling, contrastive learning
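A dependency-free sketch of how those labels get fabricated; real LLM pipelines tokenize into subwords rather than splitting on spaces, but the idea is the same:

```python
# Self-supervised labels in miniature: every next word becomes a target.
text = "the cat sat on the mat"
tokens = text.split()

# Fabricate (input, label) pairs directly from the data's own structure.
pairs = [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]
for context, target in pairs:
    print(context, "->", target)
# ['the'] -> cat
# ['the', 'cat'] -> sat
# ... and so on, with no human labeling anywhere
```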
Examples
- Supervised: predicting house prices from labeled sales data
- Unsupervised: grouping Spotify users by listening patterns
- Self-supervised: GPT-style LLMs pre-trained to predict the next token across trillions of tokens of text
- Unsupervised anomaly: flagging unusual credit card transactions (see the sketch after this list)
- Supervised fine-tuning: the instruction-tuning step that, together with RLHF, aligns LLMs to human preferences
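To make the anomaly example concrete, here is a minimal sketch assuming scikit-learn's IsolationForest; the transaction values are invented:

```python
# Unsupervised anomaly detection in miniature: no labeled fraud examples.
from sklearn.ensemble import IsolationForest

# Toy transactions: [amount_usd, hour_of_day]; the last row is the odd one out
X = [[12, 9], [15, 13], [9, 11], [14, 10], [11, 12], [950, 3]]

clf = IsolationForest(contamination=0.2, random_state=0).fit(X)
print(clf.predict(X))  # 1 = looks normal, -1 = flagged; expect -1 on [950, 3]
```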
Supervised vs Unsupervised
| Aspect | Supervised | Unsupervised |
| --- | --- | --- |
| Needs labels | Yes | No |
| Goal | Predict | Discover |
| Evaluation | Clear (accuracy, F1) | Subjective |
| Data cost | High | Low |
| Typical algorithms | Random forest, XGBoost, neural nets | K-means, PCA, DBSCAN |
When to Use Each
- Have labels + want predictions -> Supervised
- Have raw data + want exploration -> Unsupervised
- Have huge corpus + want a generalist -> Self-supervised pre-training
- Need human-aligned behavior on a base model -> Supervised fine-tuning + RLHF
FAQs
Is reinforcement learning a third type? Yes. RL learns from reward signals rather than from labels or raw data, and RLHF combines supervised fine-tuning with RL.
Are LLMs supervised? Yes — self-supervised during pre-training, then supervised during fine-tuning.
Which is easier? Unsupervised needs less data prep; supervised is easier to evaluate and tends to produce more predictable results.
Can I convert unsupervised into supervised? Sometimes — label a small sample, then use semi-supervised learning.
What is semi-supervised learning? Mixes a small labeled set with a large unlabeled one.
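A minimal sketch of that idea, assuming scikit-learn's SelfTrainingClassifier; the data points and the confident-prediction behavior here are toy assumptions, not a tuned pipeline:

```python
# Semi-supervised in miniature: four labeled points plus two unlabeled ones.
from sklearn.linear_model import LogisticRegression
from sklearn.semi_supervised import SelfTrainingClassifier

X = [[0.10], [0.20], [0.90], [1.00], [0.15], [0.95]]
y = [0, 0, 1, 1, -1, -1]  # -1 marks the unlabeled examples

# Self-training pseudo-labels confident unlabeled points, then refits.
clf = SelfTrainingClassifier(LogisticRegression()).fit(X, y)
print(clf.predict([[0.12], [0.97]]))  # expected: [0 1]
```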
Do I need unsupervised for embeddings? Not necessarily: modern embedding models are typically self-supervised, trained with contrastive objectives.
Which paradigm is most used commercially? Supervised (labeled classification) — but self-supervised pre-training enabled the LLM boom.
Conclusion
Self-supervised pre-training plus supervised fine-tuning is the recipe behind today's frontier LLMs. Most businesses still use plain supervised learning for targeted prediction. More ML primers on Misar Blog.