Week 2: Data, ML, and How Models Learn  /  Lesson Preview

Classification, Regression, and Model Choice

Know when a classical model is the correct tool.

Difficulty: core
Duration: 55 min
Gate: ML Decision Boundary Gate

Objective

Differentiate classification from regression and choose a simple baseline model appropriately.

The lesson is public. The pressure loop lives inside the app where submissions, revision, and AI review happen.

Deliverable

A simple ML pipeline with evaluation and a leakage audit.

Each lesson contributes to a week-level artifact and eventually to the shipped AI-native SaaS.


What the machine covers in this lesson.

What This Is

This lesson teaches the decision boundary between common supervised learning tasks and the discipline of choosing the simplest model that fits the job.

Why This Matters in Production

Many of the most expensive AI mistakes begin with choosing the wrong class of tool. Using an LLM for structured prediction, or a complex model where a baseline would do, creates operational waste.

Mental Model

Start with the business question, map it to target type, then choose the least complex model that gives acceptable signal and explainability.
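The "map it to target type" step can be sketched as a small heuristic. This is a minimal illustration, not a rule from the lesson: the function name `infer_task` and the `max_classes` threshold are hypothetical, and a real decision should still come from the business question rather than from column types alone.

```python
from typing import Optional, Sequence


def infer_task(target: Optional[Sequence[object]], max_classes: int = 10) -> str:
    """Guess the task family from a target column (hypothetical helper).

    Assumption: a small set of whole-number codes means class labels.
    The max_classes cutoff is arbitrary and only for illustration.
    """
    if target is None:
        # No labels at all: only unsupervised grouping is possible.
        return "clustering"

    values = list(target)
    if not values:
        raise ValueError("target column is empty")

    # Booleans and strings are categorical by nature.
    if all(isinstance(v, (bool, str)) for v in values):
        return "classification"

    # Numeric targets: a few distinct whole numbers look like class codes,
    # anything else is treated as a continuous quantity.
    if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in values):
        whole_numbers = all(float(v).is_integer() for v in values)
        if whole_numbers and len(set(values)) <= max_classes:
            return "classification"
        return "regression"

    raise ValueError("mixed or unsupported target types; inspect the column by hand")


if __name__ == "__main__":
    print(infer_task([True, False, True]))    # churn flag
    print(infer_task([12.5, 80.0, 43.1]))     # next-month spend
    print(infer_task(None))                   # no target column
```

The heuristic is deliberately conservative: when the column types are mixed it refuses to guess, which pushes the framing decision back to a person.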

Deep Dive

Classification predicts categories. Regression predicts continuous values. Clustering groups unlabeled data, which makes it an unsupervised task rather than a supervised one. The deeper lesson is that problem framing governs everything after it. If you frame the problem badly, the metrics, model family, data prep, and deployment expectations all drift with it. A serious AI engineer slows down at framing time because that is where cost and interpretability are decided.

Worked Example

Predicting whether a user will churn next month is classification. Predicting next-month spend is regression. Segmenting learners into latent behavior groups is clustering. If you substitute one framing for another, your evaluation metric stops telling the truth: exact-match accuracy, for example, says almost nothing about a continuous spend forecast.
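A short sketch of how a metric fails when the framing is wrong. All numbers below are made up for illustration, and the two metric functions are plain-Python stand-ins written for this example rather than any library's API.

```python
def accuracy(y_true, y_pred):
    """Fraction of exact matches; meaningful for categorical targets."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_error(y_true, y_pred):
    """Average absolute gap; meaningful for continuous targets."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Illustrative data only (not from the lesson).
churn_true = [1, 0, 0, 1, 0]          # did the user churn next month?
churn_pred = [1, 0, 1, 1, 0]

spend_true = [40.0, 12.5, 99.0, 0.0, 63.0]   # next-month spend
spend_pred = [41.0, 12.0, 97.5, 1.0, 63.5]

# Correct pairings: accuracy for churn, absolute error for spend.
churn_accuracy = accuracy(churn_true, churn_pred)
spend_mae = mean_absolute_error(spend_true, spend_pred)

# Wrong pairing: accuracy on a continuous target counts only exact matches,
# so close forecasts are scored the same as wildly wrong ones.
spend_accuracy = accuracy(spend_true, spend_pred)

if __name__ == "__main__":
    print("churn accuracy:", churn_accuracy)
    print("spend MAE:", spend_mae)
    print("spend 'accuracy':", spend_accuracy)
```

In this toy data every spend forecast is within 1.5 of the true value, yet the misapplied accuracy score reports no correct predictions at all.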

Common Failure Modes

Common failures include optimizing for novelty, skipping baselines, and choosing a model because it sounds advanced instead of because it matches the target and operating constraints.
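One way to guard against skipped baselines is to compute a trivial predictor first and require any candidate model to beat it. The sketch below assumes a majority-class baseline for classification and a mean baseline for regression. The helper names and the `min_gain` parameter are hypothetical, and the labels are invented for illustration.

```python
from collections import Counter
from statistics import mean


def majority_class_baseline(y_train):
    """Return the most frequent training label; predict it for every row."""
    return Counter(y_train).most_common(1)[0][0]


def mean_baseline(y_train):
    """Return the training mean; predict it for every row."""
    return mean(y_train)


def beats_baseline(model_score, baseline_score, higher_is_better=True, min_gain=0.0):
    """True only if the model improves on the baseline by more than min_gain."""
    if higher_is_better:
        return model_score - baseline_score > min_gain
    return baseline_score - model_score > min_gain


# Illustrative churn labels (not from the lesson).
y_train = [0, 0, 0, 1, 0, 0, 1, 0]
y_test = [0, 0, 1, 0]

constant = majority_class_baseline(y_train)
baseline_accuracy = sum(y == constant for y in y_test) / len(y_test)

if __name__ == "__main__":
    print("baseline predicts:", constant)
    print("baseline accuracy:", baseline_accuracy)
    # A model scoring the same as the constant predictor has earned nothing.
    print("model at same accuracy passes?", beats_baseline(baseline_accuracy, baseline_accuracy))
```

For error metrics such as mean absolute error, pass `higher_is_better=False` so that a lower model score counts as the improvement.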

Further reading the machine expects you to use properly.

official-doc: scikit-learn Supervised Learning
Use this as the canonical task-family reference.

official-doc: ML Glossary
Useful for keeping terminology precise.

article: Why Baselines Matter
Tie model selection to observability and practical judgment.
