Quick Answer
- Algorithm: the recipe (e.g., stochastic gradient descent, backpropagation)
- Model: the cake (the trained network with specific weights)
The algorithm is the method; the model is the artifact.
What Do These Terms Mean?
An algorithm is a sequence of steps a computer follows. In ML specifically, it refers to the learning procedure: how weights are updated (Stanford CS229; MIT OpenCourseWare).
A model is the resulting function: architecture plus learned parameters. You can run the same algorithm on different data to get different models (Google AI Glossary, 2024).
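As a minimal sketch of that last point, the snippet below runs one unchanged learning procedure on two small datasets and ends up with two different sets of parameters. The function name `fit_line`, the toy datasets, and the learning-rate and step values are all assumptions made for illustration; plain gradient descent on a straight line stands in for whatever procedure a real project would use.

```python
def fit_line(points, lr=0.05, steps=2000):
    """The algorithm: gradient descent on mean squared error for y = w * x + b."""
    w, b = 0.0, 0.0
    n = len(points)
    for _ in range(steps):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in points) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in points) / n
        # Parameter update: the only place the "model" changes.
        w -= lr * grad_w
        b -= lr * grad_b
    return {"w": w, "b": b}  # the model: learned parameters


# Two hypothetical datasets (illustrative values only).
data_a = [(0, 1), (1, 3), (2, 5), (3, 7)]   # generated from y = 2x + 1
data_b = [(0, 4), (1, 3), (2, 2), (3, 1)]   # generated from y = -x + 4

model_a = fit_line(data_a)  # same algorithm...
model_b = fit_line(data_b)  # ...different data, different model

print(model_a)
print(model_b)
```

The code in `fit_line` never changes between the two calls; only the returned dictionaries differ, which is the algorithm/model split in miniature.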
How Each Works
Algorithm
- Written once in code (e.g., AdamW optimizer)
- Consumes data, produces gradients, updates parameters
- Does not "know" anything specific until trained
Model
- Data structure: architecture (layers) + weights (numbers)
- Runs inference: input -> output
- Serializable (saved to disk as .safetensors, .ckpt, .bin)
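The three model properties above can be sketched in a few lines. This is an illustrative toy, not how production checkpoints work: the class `TinyModel`, the helpers `save_model` and `load_model`, and the file name are hypothetical, and JSON is used only as a readable stand-in for binary formats such as .safetensors or .ckpt.

```python
import json
import os
import tempfile
from dataclasses import asdict, dataclass


@dataclass
class TinyModel:
    architecture: str   # description of the structure
    weights: list       # learned numbers
    bias: float

    def predict(self, features):
        """Inference: input -> output (a weighted sum plus bias)."""
        return sum(w * x for w, x in zip(self.weights, features)) + self.bias


def save_model(model, path):
    """Serialize architecture and weights to disk (JSON stand-in)."""
    with open(path, "w") as f:
        json.dump(asdict(model), f)


def load_model(path):
    """Rebuild the model from the saved file."""
    with open(path) as f:
        return TinyModel(**json.load(f))


model = TinyModel(architecture="linear(2 -> 1)", weights=[0.5, -1.0], bias=2.0)
path = os.path.join(tempfile.mkdtemp(), "tiny_model.json")
save_model(model, path)

restored = load_model(path)
print(restored.predict([4.0, 1.0]))  # same answer as the original model
```

Nothing in the saved file records how the weights were obtained; the training algorithm leaves no trace in the artifact beyond the numbers it produced.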
Examples
Algorithms
- Gradient descent
- Backpropagation
- Transformer (strictly an architecture, often grouped loosely with algorithms)
- K-means clustering
- Q-learning
Models
- GPT-4 (weights)
- Llama 3 70B
- Stable Diffusion XL
- BERT-base
- Your fine-tuned support classifier
Algorithm vs Model
| Aspect | Algorithm | Model |
| --- | --- | --- |
| Tangible? | No (pure instructions) | Yes (file on disk) |
| Changes during training | Usually fixed | Yes (weights update) |
| Reusable | Across datasets | Specific to one training run |
| Size | A few lines to a few thousand | MB to TB |
| Swapping | Easy | Hard (retrain) |
An architecture like "Transformer" is sometimes called a model family — the combination of architecture + weights is the specific model.
When the Distinction Matters
- Research papers propose new algorithms (attention, Mixture of Experts)
- Products ship specific models (GPT-4o, Claude Sonnet 4.5)
- Licensing: algorithms are rarely licensed; model weights are (Llama 3 license, Mistral license)
- Reproducibility: publishing the algorithm is not enough — sharing weights or training data may be needed
FAQs
Is "the transformer" a model or algorithm? Architecture (a type of algorithm). Specific instances (GPT-4) are models.
Can I use GPT-4's algorithm without the weights? The transformer architecture is public; GPT-4's specific training recipe and weights are not.
Are model weights copyrightable? Legally debated; most labs treat them as proprietary regardless.
Is a model the same as an AI system? No. An AI system is broader: it includes the model plus inference code, safety filters, and UI.
Do open-weight models share algorithms? Usually yes. The accompanying paper or technical report describes the training recipe and the weights are released, though training data and code are often withheld.
What about hyperparameters? Settings tuning the algorithm's behavior for a specific model (covered elsewhere).
Can I combine algorithms? Yes — modern training uses many: AdamW + warmup + mixed precision + gradient accumulation.
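To make the "many algorithms at once" answer concrete, here is a small sketch that combines two of the listed pieces, learning-rate warmup and gradient accumulation, in one training loop. It is deliberately simplified and rests on stated assumptions: plain gradient descent replaces AdamW, mixed precision is omitted, and the names `warmup_lr` and `train`, the toy data, and all numeric settings are hypothetical.

```python
def warmup_lr(step, base_lr=0.1, warmup_steps=5):
    """Linear warmup: ramp the learning rate up to base_lr over warmup_steps."""
    return base_lr * min(1.0, (step + 1) / warmup_steps)


def train(points, micro_batch_size=2, accum_steps=2, epochs=200):
    """Fit y = w * x using warmup plus gradient accumulation."""
    w = 0.0
    optimizer_step = 0
    for _ in range(epochs):
        grad_sum, seen = 0.0, 0  # leftover partial accumulations are dropped
        for i in range(0, len(points), micro_batch_size):
            batch = points[i:i + micro_batch_size]
            # Mean squared error gradient on this micro-batch only.
            grad = sum(2 * (w * x - y) * x for x, y in batch) / len(batch)
            grad_sum += grad
            seen += 1
            # Update only after accum_steps micro-batches have been summed.
            if seen == accum_steps:
                w -= warmup_lr(optimizer_step) * (grad_sum / accum_steps)
                optimizer_step += 1
                grad_sum, seen = 0.0, 0
    return w, optimizer_step


points = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0), (4.0, 12.0)]  # y = 3x
w, steps_taken = train(points)
print(w, steps_taken)
```

Each piece is a separate algorithm with its own settings, yet the output is still a single model: here, one number `w`.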
Conclusion
Algorithms are the craft; models are the artifacts. Knowing the difference clarifies licensing, reproducibility, and product discussions.