makeyourAI.work the machine teaches the human

Week 2: Data, ML, and How Models Learn

Classification, Regression, and Model Choice

Know when a classical model is the correct tool.

core · 55 minutes · ML Decision Boundary Gate

Objective

Differentiate classification from regression and choose a simple baseline model appropriately.

The lesson is public. The pressure loop lives inside the app where submissions, revision, and review happen.

Deliverable

A simple ML pipeline with evaluation and a leakage audit.

Each lesson contributes to a week-level artifact and eventually to the shipped AI-native SaaS.

What This Is

This lesson teaches the decision boundary between common supervised learning tasks and the discipline of choosing the simplest model that fits the job.

Why This Matters in Production

Many expensive AI mistakes begin with picking the wrong class of tool. Using an LLM for structured prediction, or a complex model where a baseline would do, creates operational waste: more cost, more latency, and more surface to debug, often without a matching gain in signal.

Mental Model

Start with the business question, map it to a target type (a category, a number, or no label at all), then choose the least complex model that gives acceptable signal and explainability.
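The mapping from target to task can be sketched in a few lines. This is a minimal illustration, not a rule from the lesson: `infer_task` and its `max_classes` threshold are hypothetical, and the heuristic only looks at the raw target values, so the business question still has the final say.

```python
from numbers import Real


def infer_task(target_values, max_classes=20):
    """Guess the task family from a sample of target values.

    Hypothetical heuristic for illustration only:
    - no usable labels            -> clustering (unsupervised)
    - strings or booleans         -> classification
    - few distinct whole numbers  -> classification (could also be a count)
    - any other numeric target    -> regression
    """
    values = [v for v in target_values if v is not None]
    if not values:
        return "clustering"
    # Strings and booleans are categories by construction.
    if all(isinstance(v, (bool, str)) for v in values):
        return "classification"
    if all(isinstance(v, Real) for v in values):
        distinct = set(values)
        whole_numbers = all(float(v).is_integer() for v in distinct)
        if whole_numbers and len(distinct) <= max_classes:
            # Ambiguous case: class codes and small counts look the same here.
            return "classification"
        return "regression"
    raise ValueError("mixed target types; clean the target column first")


print(infer_task([True, False, False, True]))   # will the user churn?
print(infer_task([19.99, 240.5, 0.0, 73.25]))   # next-month spend
print(infer_task([]))                           # no labels at all
```

The ambiguous branch is the reason the lesson starts from the business question: a column of small integers can be class codes or a count, and only the framing tells you which.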

Deep Dive

Classification predicts categories. Regression predicts continuous values. Clustering groups examples without labels, which makes it unsupervised and sets it apart from the other two. The deeper lesson is that problem framing governs everything after it. If you frame the problem badly, the metrics, model family, data prep, and deployment expectations all drift with it. A serious AI engineer slows down at framing time because that is where cost and interpretability are decided.

Worked Example

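Predicting whether a user will churn next month is classification. Predicting next-month spend is regression. Segmenting learners into latent behavior groups is clustering. If you use one where another belongs, your evaluation metric stops telling the truth: accuracy on a continuous spend target, for example, scores a forecast that misses by one cent the same as one that misses by a thousand dollars.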
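A small sketch of the metric mismatch, using made-up numbers and hand-written metric helpers (the data and function names are illustrative assumptions, not part of the lesson). The same spend forecasts look useless under accuracy and reasonable under mean absolute error.

```python
def accuracy(y_true, y_pred):
    """Fraction of exact matches; only meaningful for categorical targets."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_error(y_true, y_pred):
    """Average absolute gap; only meaningful for numeric targets."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up next-month spend in dollars: a regression target.
spend_true = [40.0, 55.0, 120.0, 80.0]
spend_pred = [42.0, 53.0, 118.0, 83.0]

# Made-up churn labels (1 = churned): a classification target.
churn_true = [0, 0, 1, 0, 1]
churn_pred = [0, 0, 1, 1, 1]

spend_accuracy = accuracy(spend_true, spend_pred)        # no exact matches
spend_mae = mean_absolute_error(spend_true, spend_pred)  # average miss in dollars
churn_accuracy = accuracy(churn_true, churn_pred)        # 4 of 5 labels correct

print(f"spend scored with accuracy: {spend_accuracy:.2f}")
print(f"spend scored with MAE:      {spend_mae:.2f}")
print(f"churn scored with accuracy: {churn_accuracy:.2f}")
```

Every spend forecast here is within three dollars of the truth, yet accuracy reports zero because it only counts exact matches.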

Common Failure Modes

Common failures include optimizing for novelty, skipping baselines, and choosing a model because it sounds advanced instead of because it matches the target and operating constraints.
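To make the cost of skipping baselines concrete, here is a minimal sketch of a majority-class baseline on made-up, imbalanced churn labels. The helper names and the data are illustrative assumptions; in scikit-learn, `DummyClassifier` and `DummyRegressor` serve the same purpose.

```python
from collections import Counter


def fit_majority_baseline(train_labels):
    """Return the most common training label; the 'model' always predicts it."""
    return Counter(train_labels).most_common(1)[0][0]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Share of actual positives that were predicted as positive."""
    predictions_for_positives = [p for t, p in zip(y_true, y_pred) if t == positive]
    return sum(p == positive for p in predictions_for_positives) / len(predictions_for_positives)


# Made-up labels: 10% of users churn (1), 90% stay (0).
train_labels = [0] * 180 + [1] * 20
test_labels = [0] * 90 + [1] * 10

majority_label = fit_majority_baseline(train_labels)
baseline_pred = [majority_label] * len(test_labels)

baseline_accuracy = accuracy(test_labels, baseline_pred)
baseline_recall = recall(test_labels, baseline_pred)

print(f"baseline accuracy:     {baseline_accuracy:.2f}")
print(f"baseline churn recall: {baseline_recall:.2f}")
```

A model that never predicts churn already reaches 90% accuracy on this data while catching no churners, so any candidate model should be judged against that floor and on a metric that reflects the business question.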

References

Further reading the machine expects you to use properly.

official-doc

scikit-learn Supervised Learning

Use this as the canonical task-family reference.

official-doc

ML Glossary

Useful for keeping terminology precise.

article

Why Baselines Matter

Tie model selection to observability and practical judgment.
