Exercise 07: Testing as a Safety Net — The Agentic Crew Hands-On Guide

Overview

Your Travel Bucket List is deployed. It's secured. Everything works. Now change something. Add a feature. Tweak the layout. How do you know you didn't break anything? Right now, you don't.

Tests give the agent eyes. A test suite is a machine-readable specification of what your app is supposed to do. You'll write end-to-end tests with Playwright, practice TDD with an AI agent, and set up GitHub Actions CI so that no change ships unchecked.

Without tests, 'add sorting' is ambiguous. Sort what? By what criteria? In what direction? Where does the button go? The agent makes guesses. With a test, every question has an answer encoded in the assertions. The test is the spec.
— from the exercise
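
To make the "test is the spec" idea concrete, here is a minimal sketch in plain TypeScript using Node's built-in assert module rather than Playwright. The Destination shape, the sortByName function, and the sample data are all hypothetical and are not taken from the exercise's app; the point is only that each assertion pins down one of the questions the quote raises.

```typescript
import { strict as assert } from "node:assert";

// Hypothetical shape of one bucket-list entry; the real app's data model may differ.
interface Destination {
  name: string;
  addedAt: string; // ISO date string, e.g. "2024-03-01"
}

// Sort what? The list of destinations. By what? Name. In what direction? A to Z.
// Returns a new array so the caller's list is left untouched.
function sortByName(destinations: readonly Destination[]): Destination[] {
  return [...destinations].sort((a, b) => a.name.localeCompare(b.name));
}

const input: Destination[] = [
  { name: "Tokyo", addedAt: "2024-03-01" },
  { name: "Lisbon", addedAt: "2024-01-15" },
  { name: "Cusco", addedAt: "2024-02-10" },
];

const sorted = sortByName(input);

// Each assertion answers one of the open questions.
// Criterion and direction: names in ascending alphabetical order.
assert.deepEqual(sorted.map((d) => d.name), ["Cusco", "Lisbon", "Tokyo"]);
// Nothing is added or dropped by sorting.
assert.equal(sorted.length, input.length);
// The original list keeps its order.
assert.deepEqual(input.map((d) => d.name), ["Tokyo", "Lisbon", "Cusco"]);
```

If an agent implemented descending order, or sorted the list in place, one of these assertions would fail and say exactly which expectation was missed. An end-to-end test expresses the same kind of expectation through the browser instead of a direct function call.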

What you'll build

By the end of this exercise

✓ Playwright installed and configured for browser testing

✓ End-to-end tests covering every core feature of your app

✓ Experience with TDD: the red-green-refactor cycle with an AI agent

✓ Confidence from deliberately breaking things and watching tests catch it

✓ A GitHub Actions CI pipeline that runs tests on every push

✓ A green checkmark (or red X) on every pull request

Key concepts

Terms you'll learn

Term                 What it means
End-to-end test      A test that uses a real browser to interact with your app like a user would
Headless browser     A browser that runs without a visible window: same rendering, no screen needed
TDD                  Test-Driven Development: write the test first, then make it pass, then clean up
Red-green-refactor   Red: test fails. Green: make it pass. Refactor: clean up without breaking it
CI/CD                Continuous Integration / Continuous Deployment: automated testing and shipping
Assertion            A statement in a test that says "this should be true"; if it is not, the test fails
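
For orientation, a Playwright configuration along the following lines would run end-to-end tests in a headless browser. This is a sketch under stated assumptions, not the exercise's actual file: the tests directory, the local base URL, and the single Chromium project are placeholders to adapt to your own app.

```typescript
// playwright.config.ts (hypothetical example)
import { defineConfig, devices } from "@playwright/test";

export default defineConfig({
  // Assumed location of the test files.
  testDir: "./tests",
  use: {
    // Assumed address of the locally running app.
    baseURL: "http://localhost:3000",
    // Run without a visible browser window.
    headless: true,
  },
  projects: [
    // One browser keeps the first runs fast; more can be added later.
    { name: "chromium", use: { ...devices["Desktop Chrome"] } },
  ],
});
```

Setting headless to false while writing a test lets you watch the browser perform each step; switching it back keeps runs suitable for machines without a screen, such as CI runners.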

What's covered

Sections in this exercise

01 What You'll Build

02 What You'll Need

03 The Blindfolded Agent

04 Your First Test

05 Build the Suite

06 Red-Green-Refactor with an Agent

07 Breaking Things on Purpose

08 Automate It: GitHub Actions

09 The Checkpoint

What just happened

You broke your app on purpose and watched the tests catch it. The red X is a feature, not a failure — it caught a problem before it reached production. Your agent now has eyes: a machine-readable specification of what "working" means.