Margin
Phase 1 · Session 02 · 45 min

Teaching Computers Like Teaching Kids

Big idea

Three fundamentally different ways a computer can learn from data: supervised (flashcards), unsupervised (exploration), and reinforcement (trial and error with rewards).

By the end, you'll be able to
  • Distinguish supervised from unsupervised learning by example
  • Train your own image classifier in Teachable Machine
  • Predict which "flavor" of ML powers a system you've used

Supervised learning — learning from flashcards

The metaphor: studying for a vocab quiz with flashcards. You see the front, guess, then flip to check. Adjust. Repeat. After enough cards, you don't need them anymore.

That's supervised learning. Each example has a known correct answer (a "label"). The model predicts, compares to the correct answer, and adjusts.

The math behind the metaphor. A supervised dataset is a set of pairs:

D = {(x₁, y₁), (x₂, y₂), …, (xₙ, yₙ)}

where each xᵢ is an input (a feature vector) and each yᵢ is the corresponding correct output (the label). Training is: find a function f such that f(xᵢ) ≈ yᵢ for as many i as possible.

What "≈" means precisely is the cost function (Chapter 5). What "find a function f" means precisely is gradient descent (Chapter 6). But the shape of the problem is right there: pairs of inputs and labels, find the function that maps one to the other.
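That shape fits in a few lines of plain Python. A toy sketch (the data and the threshold are invented for illustration):

```python
# A toy supervised dataset: pairs of (input, label).
# Inputs are hours studied; labels are pass (1) / fail (0).
dataset = [(1.0, 0), (2.0, 0), (3.0, 1), (4.0, 1), (5.0, 1)]

# A candidate function f: predict "pass" above a threshold.
def f(x, threshold=2.5):
    return 1 if x > threshold else 0

# How often does f(x_i) match y_i?
accuracy = sum(f(x) == y for x, y in dataset) / len(dataset)
print(accuracy)  # 1.0 -- this f gets every pair right
```

Real training is just the search for that threshold (and its many-dimensional cousins) done automatically.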

Two sub-types:

  • Classification. y is a category (spam / not spam). Output is a label.
  • Regression. y is a number (house price). Output is a value on a continuous scale.

You'll spend Phase 2 on regression and Phase 3 on classification.
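In code, the two sub-types look nearly identical; only the model and the label type change. A sketch with made-up housing data (the numbers are invented for illustration):

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

X = np.array([[500], [1000], [1500], [2000]])  # one feature: square footage

# Regression: y is a number (price in $1000s).
prices = np.array([150, 300, 450, 600])
reg = LinearRegression().fit(X, prices)
print(reg.predict([[1200]]))  # ~360 -- a value on a continuous scale

# Classification: y is a category (0 = apartment, 1 = house).
labels = np.array([0, 0, 1, 1])
clf = LogisticRegression(max_iter=1000).fit(X, labels)
print(clf.predict([[1200]]))  # a label: 0 or 1, nothing in between
```

Same inputs, same `.fit` / `.predict` pattern; the difference is entirely in what kind of y you're predicting.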

Real examples. Email spam classifiers (millions of emails labeled spam/not-spam). Medical image diagnosis (X-rays labeled by radiologists). Self-driving car perception (dashcam frames labeled with what's a pedestrian, car, stop sign). The CAPTCHAs you fill out are labeling training data for Google's self-driving cars. Free labor. Billions of labels.

Unsupervised learning — learning by exploring

The metaphor: you walk into a party with no name tags. After 20 minutes you've noticed clusters: engineers by the snacks, artists on the porch, sports kids in the kitchen. Nobody told you the categories existed. You found them.

The math behind the metaphor. An unsupervised dataset is just inputs:

D = {x₁, x₂, …, xₙ}

No labels. The job is to find structure: clusters, patterns, weird outliers. We'll formalize one approach (k-means clustering) in Chapter 14.
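You can watch the party metaphor happen in code. A minimal sketch using scikit-learn's KMeans on toy 2-D points (the points are invented; Chapter 14 builds the algorithm itself):

```python
import numpy as np
from sklearn.cluster import KMeans

# Unlabeled points: two obvious blobs, but we never say which point is which.
X = np.array([[1.0, 1.0], [1.5, 1.0], [1.0, 1.5],   # blob near (1, 1)
              [8.0, 8.0], [8.5, 8.0], [8.0, 8.5]])  # blob near (8, 8)

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(kmeans.labels_)  # e.g. [0 0 0 1 1 1] -- cluster IDs the model invented
```

Note that the cluster numbers mean nothing by themselves; the model found two groups, but we had to tell it how many to look for.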

Real examples. Spotify's "Discover Weekly" (groups songs by sonic similarity nobody hand-coded). Customer segmentation (marketing teams find natural groups of shoppers). Fraud detection (flag transactions that don't look like the rest). Topic discovery in news articles.

Reinforcement learning — trial and error

The metaphor: teaching a dog to sit. No flashcards, no explanation. The dog tries things; when it sits, it gets a treat. Over time it figures out: the thing I was doing right before the treat, do that more.

The math behind the metaphor. RL has a different shape. There's an agent (the model) interacting with an environment. The agent observes a state s, takes an action a, gets a reward r, and the environment transitions to a new state s'. The agent's goal: pick actions that maximize total reward over time.

There are no labeled examples of "right actions." The agent has to discover them by trying. RL is its own field; we don't go deep on it in this book.
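For a taste of the state–action–reward loop, here is a minimal sketch of an agent learning which of two slot machines pays out more, purely by trial and error (a two-armed bandit; all numbers are invented for illustration):

```python
import random

random.seed(0)  # reproducible run

true_payout = [0.3, 0.7]   # the environment's hidden reward probabilities
estimates = [0.0, 0.0]     # the agent's running estimate of each action's value
counts = [0, 0]

for step in range(1000):
    # Explore 10% of the time; otherwise exploit the best-looking action.
    if random.random() < 0.1:
        action = random.randrange(2)
    else:
        action = 0 if estimates[0] >= estimates[1] else 1

    # The environment returns a reward. Nobody labels the "right" action.
    reward = 1 if random.random() < true_payout[action] else 0

    # Update the running-average estimate for the action taken.
    counts[action] += 1
    estimates[action] += (reward - estimates[action]) / counts[action]

print(estimates)  # estimates[1] ends up higher: the agent found the better arm
```

No example ever said "pull arm 1." The agent discovered it from rewards alone, which is the whole shape of RL.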

Real examples. AlphaGo, AlphaStar, AlphaZero (DeepMind's game-playing agents). The robot hand that learned to solve a Rubik's cube. ChatGPT's final training stage (called RLHF, reinforcement learning from human feedback), where humans rank pairs of model outputs and the model learns to produce ones humans prefer.

Vocabulary

  • Label. The correct answer attached to a training example.
  • Dataset. The collection of all your training examples. Notation: D.
  • Feature. A single piece of input information. For predicting house price: square footage, bedrooms, neighborhood. The full input x for one example is a feature vector.
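The house-price vocabulary, made concrete (toy numbers, invented for illustration):

```python
import numpy as np

# One example's feature vector x: [square footage, bedrooms, neighborhood code].
# The neighborhood is encoded as a number -- a detail later chapters revisit.
x = np.array([1450.0, 3.0, 2.0])
y = 315_000  # the label: what this house actually sold for

print(x.shape)  # (3,) -- three features, one example
```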

Activity · Train your own model in Teachable Machine · 20 min

Open teachablemachine.withgoogle.com. Pick "Image Project, Standard image model."

  1. Create two classes. Name them whatever you want. "Thumbs up vs thumbs down." "Pen vs pencil." Anything that's visually distinct.
  2. Click record on Class 1, hold the pose, capture about 50 examples (10 seconds of webcam).
  3. Same for Class 2.
  4. Hit "Train Model." Wait about 20 seconds.
  5. Hold up your pose at the live preview. Watch the bar fill up for the predicted class.

Once that works, try the harder version: 5 classes. Try to fool your own model. What confuses it? Why?

The pedagogical question. What kind of learning was that, supervised or unsupervised? Supervised — you labeled every example by which class it belonged to. What would unsupervised look like? You'd dump all the webcam frames in without telling it which were which, and the model would have to figure out on its own that there were two distinct groups.

A tiny supervised pipeline in code

You won't write this yourself yet, but seeing it now grounds the experience you just had with Teachable Machine.

# A complete supervised learning pipeline, in 6 lines.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

# 1. Get a labeled dataset.
X, y = load_iris(return_X_y=True)

# 2. Split it: 80% to train on, 20% to test on.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

# 3. Pick a model.
model = LogisticRegression(max_iter=1000)

# 4. Train it.
model.fit(X_train, y_train)

# 5. Score it.
print("Test accuracy:", model.score(X_test, y_test))

Run it. Output: something like Test accuracy: 0.967.

Six lines. That's the whole shape of supervised ML. Every model you build this year will follow that pattern: get data, split it, pick a model, fit, evaluate. You'll spend Phase 2 understanding what fit is actually doing.

Questions you might have

Next up · Chapter 3 — Data is everything

You've seen how computers learn from data. But the data is everything. A model trained on bad data is bad, no matter how clever the algorithm. Next: how ML systems can fail when their data is wrong, and how to inspect a real dataset for problems.

Teaching Computers Like Teaching Kids · Lab · in development