makeyourAI.work the machine teaches the human

Week 2: Data, ML, and How Models Learn

Data handling, feature thinking, evaluation, and classical ML before the LLM layer.

Checkpoint

ML Decision Boundary Gate

This week ends with a gated checkpoint. You progress by shipping a real artifact, not by reading passively.

Deliverable

A simple ML pipeline with evaluation and a leakage audit.

Each week leaves behind portfolio evidence that compounds into the final SaaS and its operating narrative.

Week Thesis

What the machine expects from you.

This week starts with data shaping as engineering work, not notebook theater. You are learning how raw tables become trustworthy model inputs.

Bad data silently poisons everything downstream. If your features are inconsistent, mislabeled, or leaky, your model quality and your product decisions become fiction.

Think of the dataset as an interface contract between the real world and the model. Every column carries assumptions about meaning, freshness, allowable values, and transformation history.

The week then moves to the decision boundary between common supervised learning tasks and the discipline of choosing the simplest model that fits the job.

Lesson Stack

Three dense lessons, one enforced deliverable.

Lesson Preview

Data Shaping With Pandas and NumPy

Clean data beats clever models.

This lesson is about data shaping as engineering work, not notebook theater. You are learning how raw tables become trustworthy model inputs.

Bad data silently poisons everything downstream. If your features are inconsistent, mislabeled, or leaky, your model quality and your product decisions become fiction.

Think of the dataset as an interface contract between the real world and the model. Every column carries assumptions about meaning, freshness, allowable values, and transformation history.
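One way to make the contract idea concrete is to write the column assumptions down and check a table against them. The sketch below is only an illustration: the column names, the `CONTRACT` dictionary, the rule keys, and the `check_contract` helper are all hypothetical, and a real project might reach for a dedicated validation library instead.

```python
import pandas as pd

# Hypothetical contract for an illustrative customer table.
# Each rule records an assumption about the column: nulls, uniqueness,
# allowable values, value range, or freshness.
CONTRACT = {
    "customer_id": {"nullable": False, "unique": True},
    "plan": {"nullable": False, "allowed": {"free", "pro", "team"}},
    "monthly_spend": {"nullable": True, "min": 0.0},
    "updated_at": {"nullable": False, "max_age_days": 7},
}


def check_contract(df, contract, as_of):
    """Return a list of contract violations; an empty list means the table passed."""
    problems = []
    for col, rules in contract.items():
        if col not in df.columns:
            problems.append(f"{col}: column is missing")
            continue
        series = df[col]
        if not rules.get("nullable", True) and series.isna().any():
            problems.append(f"{col}: contains nulls")
        if rules.get("unique") and series.duplicated().any():
            problems.append(f"{col}: contains duplicate values")
        if "allowed" in rules:
            unexpected = set(series.dropna().unique()) - rules["allowed"]
            if unexpected:
                problems.append(f"{col}: unexpected values {sorted(unexpected)}")
        if "min" in rules and (series.dropna() < rules["min"]).any():
            problems.append(f"{col}: values below {rules['min']}")
        if "max_age_days" in rules:
            # Freshness: how old is the newest row relative to the audit date?
            age = as_of - series.max()
            if age > pd.Timedelta(days=rules["max_age_days"]):
                problems.append(f"{col}: newest row is {age.days} days old")
    return problems


# Made-up rows that break several assumptions on purpose.
raw = pd.DataFrame(
    {
        "customer_id": [1, 2, 2, 4],
        "plan": ["free", "pro", "Pro", "team"],
        "monthly_spend": [0.0, 49.0, -5.0, None],
        "updated_at": pd.to_datetime(
            ["2024-01-01", "2024-01-02", "2024-01-02", "2024-01-03"]
        ),
    }
)

problems = check_contract(raw, CONTRACT, as_of=pd.Timestamp("2024-01-20"))
for problem in problems:
    print(problem)
```

Returning a list of violations instead of raising on the first one keeps the check usable as an audit: every broken assumption is visible in a single pass.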

Lesson Preview

Classification, Regression, and Model Choice

Know when a classical model is the correct tool.

This lesson teaches the decision boundary between common supervised learning tasks and the discipline of choosing the simplest model that fits the job.

Most expensive AI mistakes begin with using the wrong class of tool. Choosing an LLM for structured prediction or choosing a complex model when a baseline would do creates operational waste.

Start with the business question, map it to a target type (a category points to classification, a number points to regression), then choose the least complex model that gives acceptable signal and explainability.
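A minimal sketch of that ordering, using NumPy only: the data is synthetic, and the scenario (predicting spend from seat count), the variable names, and the choice of mean absolute error are assumptions made for illustration. The target is a number, so this is a regression problem. The baseline predicts the training mean, and the next step up is a one-feature least-squares line.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic, illustrative data: spend grows roughly linearly with seats.
seats = rng.integers(1, 50, size=200).astype(float)
spend = 12.0 * seats + 30.0 + rng.normal(0.0, 20.0, size=200)

# Hold out the last 50 rows for evaluation.
x_train, x_test = seats[:150], seats[150:]
y_train, y_test = spend[:150], spend[150:]


def mae(y_true, y_pred):
    """Mean absolute error, in the same units as the target."""
    return float(np.mean(np.abs(y_true - y_pred)))


# Baseline: always predict the training mean.
baseline_pred = np.full_like(y_test, y_train.mean())

# Simplest real model: ordinary least squares with one feature and an intercept.
design = np.column_stack([x_train, np.ones_like(x_train)])
(slope, intercept), *_ = np.linalg.lstsq(design, y_train, rcond=None)
linear_pred = slope * x_test + intercept

baseline_mae = mae(y_test, baseline_pred)
linear_mae = mae(y_test, linear_pred)
print(f"baseline MAE: {baseline_mae:.1f}")
print(f"linear MAE:   {linear_mae:.1f}")
```

The baseline number is the one to beat. A more complex model earns its place only if it improves on the simple one by a margin that matters for the business question.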

Lesson Preview

Evaluation, Leakage, and GDPR Boundaries

A bad evaluation pipeline can make a useless model look great.

This lesson teaches you how to distrust a flattering metric until the evaluation design has earned your trust.

Leaky evaluation produces false confidence, which is one of the fastest ways to launch a bad model with executive approval. Privacy mistakes add legal and reputational cost on top.

Evaluation is a claim about future usefulness. Leakage and privacy failures invalidate that claim by corrupting either the data boundary or the legal boundary.
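Two common ways the data boundary gets corrupted are the same entity appearing on both sides of a split, and preprocessing statistics computed over all rows before splitting. The sketch below shows one way to guard against both. The data is synthetic, and `user_ids`, the 80/20 split, and the standardization step are illustrative assumptions, not a prescribed audit procedure.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative data: 100 rows, 3 features, with a hypothetical user id per row.
features = rng.normal(size=(100, 3))
user_ids = rng.integers(0, 40, size=100)  # users repeat on purpose

# Split by user, not by row, so the same user never lands on both sides.
unique_users = np.unique(user_ids)
rng.shuffle(unique_users)
cutoff = int(0.8 * len(unique_users))
train_users = unique_users[:cutoff]
train_mask = np.isin(user_ids, train_users)

x_train, x_test = features[train_mask], features[~train_mask]

# Fit scaling statistics on the training rows only, then reuse them for test.
# Computing them on all rows would let test information shape the inputs.
mean = x_train.mean(axis=0)
std = x_train.std(axis=0)
x_train_scaled = (x_train - mean) / std
x_test_scaled = (x_test - mean) / std

# Audit: no user should be on both sides of the split.
shared_users = set(user_ids[train_mask]) & set(user_ids[~train_mask])
print(f"users in both train and test: {len(shared_users)}")
```

The same audit also touches the privacy side: a pseudonymous id is enough to group rows for splitting, so direct identifiers such as names or emails do not need to travel with the feature matrix.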

Portfolio Artifact

What survives the week.

ML Pipeline Audit

A structured audit describing data prep, evaluation design, leakage risks, and privacy boundaries.