Exercise 07

Testing as a Safety Net

Give the agent eyes. No change ships unchecked.

Duration: 45–75 minutes
Difficulty: Intermediate
Prerequisites: Exercise 05 (a deployed Travel Bucket List)

Overview

Your Travel Bucket List is deployed. It's secured. Everything works. Now change something. Add a feature. Tweak the layout. How do you know you didn't break anything? Right now, you don't.

Tests give the agent eyes. A test suite is a machine-readable specification of what your app is supposed to do. You'll write end-to-end tests with Playwright, experience TDD with an AI agent, and set up GitHub Actions CI — so no change ships unchecked.

Without tests, 'add sorting' is ambiguous. Sort what? By what criteria? In what direction? Where does the button go? The agent makes guesses. With a test, every question has an answer encoded in the assertions. The test is the spec.

— from the exercise
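To make "the test is the spec" concrete, here is a minimal sketch in plain TypeScript. It is not one of the exercise's Playwright tests, which drive a real browser. The `Destination` shape, the `sortDestinations` function, and the `assertEqual` helper are all hypothetical names invented for illustration; your app's fields and sort rules may differ. The point is that each assertion answers one of the open questions: sort what, by which field, in which direction.

```typescript
// Hypothetical shape of one bucket-list entry; the real app's fields may differ.
interface Destination {
  name: string;
  visited: boolean;
}

// Hypothetical implementation under test: sort by name, A to Z,
// ignoring letter case, without mutating the caller's array.
function sortDestinations(items: readonly Destination[]): Destination[] {
  return [...items].sort((a, b) => {
    const left = a.name.toLowerCase();
    const right = b.name.toLowerCase();
    if (left < right) return -1;
    if (left > right) return 1;
    return 0;
  });
}

// Minimal assertion helper: throws when the expectation is not met.
function assertEqual(actual: unknown, expected: unknown, label: string): void {
  const a = JSON.stringify(actual);
  const e = JSON.stringify(expected);
  if (a !== e) {
    throw new Error(`${label}: expected ${e}, got ${a}`);
  }
}

const input: Destination[] = [
  { name: "Tokyo", visited: false },
  { name: "cairo", visited: true },
  { name: "Lima", visited: false },
];

// Sort what? The destination list. By what? Name. Which direction? A to Z.
assertEqual(
  sortDestinations(input).map((d) => d.name),
  ["cairo", "Lima", "Tokyo"],
  "sorts by name, A to Z, ignoring case"
);

// Sorting must not reorder the list the caller passed in.
assertEqual(
  input.map((d) => d.name),
  ["Tokyo", "cairo", "Lima"],
  "leaves the input untouched"
);
```

An agent given only "add sorting" has to guess at all of this. An agent given these assertions has nothing left to guess.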

By the end of this exercise

Playwright installed and configured for browser testing
End-to-end tests covering every core feature of your app
Experience with TDD: the red-green-refactor cycle with an AI agent
Confidence from deliberately breaking things and watching tests catch it
A GitHub Actions CI pipeline that runs tests on every push
A green checkmark (or red X) on every pull request

Terms you'll learn

Term                What it means
End-to-end test     A test that uses a real browser to interact with your app like a user would
Headless browser    A browser that runs without a visible window: same rendering, no screen needed
TDD                 Test-Driven Development: write the test first, then make it pass, then clean up
Red-green-refactor  Red: the test fails. Green: make it pass. Refactor: clean up without breaking it
CI/CD               Continuous Integration / Continuous Deployment: automated testing and shipping
Assertion           A statement in a test that says "this should be true"; if it isn't, the test fails
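The red-green-refactor cycle and the role of an assertion can be sketched in a few lines of plain TypeScript. This is an illustration only, not Playwright code: `addDestination`-style functions, the `AddDestination` type, and the tiny `testAddDestination` runner are hypothetical names, and the trimming and blank-name rules are assumptions chosen for the example. The test is written first, then run against three versions of the implementation.

```typescript
// An assertion: a claim that must hold, or the test fails by throwing.
function assertEqual(actual: unknown, expected: unknown, label: string): void {
  const a = JSON.stringify(actual);
  const e = JSON.stringify(expected);
  if (a !== e) {
    throw new Error(`${label}: expected ${e}, got ${a}`);
  }
}

// Hypothetical feature under test: add a destination name to a list.
type AddDestination = (list: readonly string[], name: string) => string[];

// The test comes first. It takes the implementation as a parameter
// so the same assertions can be run against each version below.
function testAddDestination(add: AddDestination): "red" | "green" {
  try {
    assertEqual(add(["Lima"], "  Tokyo "), ["Lima", "Tokyo"], "trims and appends");
    assertEqual(add(["Lima"], "   "), ["Lima"], "ignores blank names");
    return "green";
  } catch {
    return "red";
  }
}

// Red: a placeholder that does nothing yet, so the first assertion fails.
const notYetImplemented: AddDestination = (list) => [...list];

// Green: the simplest code that makes both assertions pass.
const firstPass: AddDestination = (list, name) => {
  if (name.trim() === "") return [...list];
  return [...list, name.trim()];
};

// Refactor: same behavior, tidier code. The test guards the cleanup.
const refactored: AddDestination = (list, name) => {
  const trimmed = name.trim();
  return trimmed ? [...list, trimmed] : [...list];
};

console.log(testAddDestination(notYetImplemented)); // red
console.log(testAddDestination(firstPass)); // green
console.log(testAddDestination(refactored)); // green
```

The assertions never change between the three runs. Only the implementation does, which is what lets you refactor without fear.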

Sections in this exercise

01 What You'll Build
02 What You'll Need
03 The Blindfolded Agent
04 Your First Test
05 Build the Suite
06 Red-Green-Refactor with an Agent
07 Breaking Things on Purpose
08 Automate It: GitHub Actions
09 The Checkpoint

You broke your app on purpose and watched the tests catch it. The red X is a feature, not a failure — it caught a problem before it reached production. Your agent now has eyes: a machine-readable specification of what "working" means.