The Machines That Learn
Computers used to do what we told them, line by line. Now they figure things out from examples. That shift is what "machine learning" means, and it already powers much of the technology in your pocket.
- Give one concrete example of ML in your own life
- Explain in one sentence the difference between "programmed" and "learned" behavior
- Open Google Colab and run your first line of Python
Rules vs. examples — the central shift
This is the most important idea in the entire book. If you get it, everything else follows. If you don't, nothing else makes sense.
Imagine you're trying to teach a friend to recognize a cat.
The old way (rules-based programming): you'd write a list. "Has fur. Has four legs. Has whiskers. Says 'meow.' Tail under 14 inches. Eyes round and forward-facing…" But every rule has an exception. Hairless cats. Three-legged cats. Kittens that haven't learned to meow. You'd be writing rules forever and you'd still get it wrong.
The new way (machine learning): you show your friend ten thousand pictures of cats and ten thousand pictures of dogs, and just say "cat" or "dog" each time. After a while, they can tell new cats from new dogs without you ever having written down a single rule. They've learned the pattern from examples.
That second way is machine learning. The whole field is just figuring out clever ways to do that "show examples and let it figure out the pattern" trick on harder and harder problems.
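Here is a minimal sketch of the two approaches side by side. You don't need to run it yet. The features (weight and whether the animal meows) and every number in it are made up for illustration, and the "learning" method is a simple nearest-neighbor lookup chosen for brevity, not the method this book builds toward.

```python
# Toy illustration only: the features and numbers below are invented.

# Old way: a hand-written rule. Every exception needs yet another rule.
def classify_by_rule(weight_kg, says_meow):
    if says_meow and weight_kg < 8:
        return "cat"
    return "dog"

# New way: keep labeled examples and copy the label of the most similar one
# (a 1-nearest-neighbor classifier, one of the simplest learning methods).
examples = [
    ((4.0, 1), "cat"),
    ((3.5, 1), "cat"),
    ((5.0, 0), "cat"),    # a quiet cat: already breaks the "says meow" rule
    ((20.0, 0), "dog"),
    ((30.0, 0), "dog"),
    ((9.0, 0), "dog"),
]

def classify_by_examples(weight_kg, says_meow):
    def squared_distance(features):
        return (features[0] - weight_kg) ** 2 + (features[1] - says_meow) ** 2

    _, nearest_label = min(examples, key=lambda ex: squared_distance(ex[0]))
    return nearest_label

# A small kitten that hasn't learned to meow yet.
quiet_kitten = (2.0, 0)
rule_answer = classify_by_rule(*quiet_kitten)
learned_answer = classify_by_examples(*quiet_kitten)
print("rule says:", rule_answer)
print("examples say:", learned_answer)
```

The rule fails on the silent kitten because nobody wrote an exception for it. The example-based version gets it right because the kitten sits closest to the cats it has already seen, and improving it means adding examples, not rewriting rules.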
Where this is going. In Phase 2, you'll formalize this idea. "Show examples and figure out the pattern" will become: we have data points $(x_i, y_i)$, and we want to find a function $f$ such that $f(x_i)$ is close to $y_i$ for as many $i$ as possible. That sentence is most of machine learning. The rest of the book unpacks what "function," "close," and "as possible" mean precisely.
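To make "find a function that fits the data points" concrete, here is a small sketch that scores two candidate functions against the same data. The data points, both candidate functions, and the tolerance of 3 are assumptions invented for this example. Real ML searches over far more candidates and usually measures "close" more carefully.

```python
# Made-up data points (x, y): hours studied -> quiz score.
data = [(1, 52), (2, 61), (3, 68), (4, 81), (5, 90)]

def f_a(x):
    return 10 * x + 40    # candidate function A

def f_b(x):
    return 5 * x + 60     # candidate function B

def count_close(f, points, tolerance=3):
    """Count the points where f(x) lands within `tolerance` of y."""
    return sum(1 for x, y in points if abs(f(x) - y) <= tolerance)

close_a = count_close(f_a, data)
close_b = count_close(f_b, data)
print(f"f_a is close on {close_a} of {len(data)} points")
print(f"f_b is close on {close_b} of {len(data)} points")
```

Candidate A is close on all five points and candidate B on only one, so A is the better fit under this definition of "close." Choosing the function, the tolerance, and the search strategy is what later phases formalize.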
Where you've already seen ML
ML isn't some future thing. It's already running your life:
| Where you've seen it | What's being predicted |
|---|---|
| Netflix / Spotify / TikTok recommendations | What you'll watch, listen to, or scroll past |
| Gmail's spam folder | Whether an email is junk |
| Phone face unlock | Whether the camera is seeing you |
| Siri / Alexa / Google Assistant | What words you said |
| Google Maps ETA | How long the drive will take |
| Instagram / Snapchat filters | Where your face, eyes, and mouth are |
| ChatGPT / Claude / Gemini | What word should come next |
| Self-driving cars | Where the lane is, what other cars will do |
You've been training ML systems with your behavior for years. Every time you skipped a song, watched a video to the end, or scrolled past a post, that was a labeled example feeding a model.
Three flavors of ML
You'll see these names again and again. For now, just plant the flag — we unpack each one in the next chapter.
- Supervised learning. Learning from labeled examples ("this is a cat, this is a dog").
- Unsupervised learning. Finding patterns in unlabeled data.
- Reinforcement learning. Learning by trial and error with rewards.
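As a preview, the first two flavors can be sketched with scikit-learn on the iris flower dataset (introduced properly later in this chapter). The specific models here, `KNeighborsClassifier` and `KMeans`, are assumptions picked for brevity, and reinforcement learning is left out because it needs an environment to act in. Treat this as a glimpse, not something to memorize.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

# X holds the flower measurements, y holds the species labels (0, 1, 2).
X, y = load_iris(return_X_y=True)

# Supervised: the model sees the measurements AND the labels.
classifier = KNeighborsClassifier(n_neighbors=5)
classifier.fit(X, y)
predicted_species = classifier.predict(X[:3])

# Unsupervised: the model sees only the measurements and groups them itself.
clusterer = KMeans(n_clusters=3, n_init=10, random_state=0)
cluster_ids = clusterer.fit_predict(X)

print("supervised predictions:", predicted_species)
print("unsupervised cluster ids:", cluster_ids[:3])
```

The only difference in the calls is whether `y` is passed in. The cluster ids are arbitrary group numbers, so they need not match the species numbering even when the grouping itself is good.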
Your first code — hello, Colab
Open colab.research.google.com. Sign in with a Google account. Click "New notebook."
In the first cell, type:
print("Hello, ML.")
Hit Shift+Enter. The output appears below the cell.
Add a new cell. Now you'll do something slightly more ML-shaped: load a real dataset and look at it.
# Import pandas. Pandas is the standard library for working with
# spreadsheet-like data in Python. We rename it to "pd" because
# that is the near-universal convention in Python ML tutorials.
import pandas as pd
# Load the famous iris dataset. It ships with scikit-learn (sklearn),
# which Colab has preinstalled. It is a small table of flower
# measurements, often used as the "hello world" of machine learning.
from sklearn.datasets import load_iris
iris = load_iris(as_frame=True)
df = iris.frame
# Show the first 5 rows.
df.head()
Run it. You'll see a small table: 5 rows, 5 columns. Sepal length, sepal width, petal length, petal width, target (the species).
That table is a dataset. Every row is one example (one flower). Each of the first four columns is one feature (one measurement). The target column is the label: which species the flower is, stored as 0, 1, or 2. By the end of Phase 3, you will train a model on this exact dataset that can identify a flower's species from its measurements with better than 95% accuracy.
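If you'd like to check those terms against the data, a short sketch like this one splits the table into features and labels. The variable names `features` and `labels` are just illustrative choices.

```python
from sklearn.datasets import load_iris

df = load_iris(as_frame=True).frame

# Rows are examples, columns are measurements plus the label.
n_examples, n_columns = df.shape

# Features: every column except "target". Labels: the "target" column.
features = df.drop(columns="target")
labels = df["target"]

print("examples:", n_examples)
print("columns:", n_columns)
print("feature names:", list(features.columns))
print("distinct labels:", sorted(labels.unique()))
```

`df.head()` shows only the first 5 rows. The full table has 150 flowers, 50 of each species.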
Run one more cell to see why the dataset is so well-loved:
import matplotlib.pyplot as plt
# Scatter plot: petal length vs petal width, colored by species.
plt.scatter(df['petal length (cm)'], df['petal width (cm)'],
            c=df['target'], cmap='viridis')
plt.xlabel('petal length (cm)')
plt.ylabel('petal width (cm)')
plt.title('Iris dataset: three species are visibly separable')
plt.show()
The output shows three colored clusters in a 2D plot. Three different species form three visible groups, just from two measurements. Your job by the end of the year is to turn that visible separation into an algorithm a computer can use.
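As a first taste of turning a picture into an algorithm, here is a deliberately crude sketch: one hand-picked cutoff that separates a single species from the other two. The 2.5 cm cutoff is an assumption read off the plot by eye, not something the computer learned. Learning such boundaries automatically is what comes later.

```python
from sklearn.datasets import load_iris

df = load_iris(as_frame=True).frame

# Hand-picked cutoff, chosen by eye from the scatter plot (an assumption).
PETAL_LENGTH_CUTOFF_CM = 2.5

# Guess "setosa" whenever the petal is shorter than the cutoff.
is_setosa_guess = df["petal length (cm)"] < PETAL_LENGTH_CUTOFF_CM
is_setosa_truth = df["target"] == 0    # target 0 is setosa in this dataset

# Fraction of flowers where the guess matches the truth.
accuracy = (is_setosa_guess == is_setosa_truth).mean()
print(f"Setosa vs. not-setosa accuracy: {accuracy:.2f}")
```

One cutoff is enough for setosa because its cluster sits well apart from the others. The remaining two species overlap slightly, which is why a single hand-drawn line won't finish the job.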
You've seen what ML is. Next, the three big flavors of how computers actually learn from data — and you'll train your first real model from scratch in a browser tab.