the production readiness review

a checklist with teeth, run before ship.

the production readiness review (prr) is one of those rituals that sounds bureaucratic and turns out to be the cheapest insurance you’ll ever buy. google’s sre book covers a more elaborate version under “early engagement” and “launch coordination engineering” — the checklist below is the one-hour version, sized for teams that don’t have a dedicated sre org.

the format

one hour, one document, one team’s launch on the line. the document is a checklist. the items are specific — not “is the system observable” but “what’s the dashboard url for the new endpoint, and what alerts fire on it.” the team that’s launching presents. the team that’s reviewing asks the questions on the checklist, in order.
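a minimal sketch of what "specific" can look like when the checklist is kept as data. the two questions are the ones quoted above; the `ChecklistItem` class, the `unanswered` helper, and the dashboard url are made up for illustration and not part of any real tool.

```python
from dataclasses import dataclass


@dataclass
class ChecklistItem:
    """One line of the review document (hypothetical structure)."""
    question: str      # a specific question with a concrete answer
    answer: str = ""   # filled in by the launching team before the review


def unanswered(items):
    """Return the questions that still have no concrete answer."""
    return [item.question for item in items if not item.answer.strip()]


# the questions come from the text; the url is a placeholder, not a real dashboard
checklist = [
    ChecklistItem(
        "what's the dashboard url for the new endpoint?",
        "https://dashboards.example.com/new-endpoint",
    ),
    ChecklistItem("what alerts fire on the new endpoint?"),
]

open_questions = unanswered(checklist)
print(open_questions)
```

anything left in `open_questions` is what the hour gets spent on. a vague item like "is the system observable" would not fit this shape, because there is no single answer to write down.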

the categories that have earned their place

the prr is not a gate that stops launches. it’s a forcing function that surfaces the items the team didn’t think about. teams that go through prr a few times start writing the answers into the original design doc. the review gets shorter. that’s the point.

the failure mode of the prr: it becomes a rubber stamp. defenses: rotate the reviewers, allow nobody who built the thing to also review it, keep the document specific to this launch (not a generic template the team filled in).
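the first two defenses can be sketched as a small selection rule. this assumes you keep a list of who built the launch and who reviewed recently; `pick_reviewers`, its parameters, and the names are hypothetical.

```python
def pick_reviewers(candidates, builders, recent_reviewers, count=2):
    """Pick reviewers who did not build the launch, preferring people
    who have not reviewed recently (a simple form of rotation)."""
    builders = set(builders)
    recent = set(recent_reviewers)

    # rule 1: nobody who built the thing may review it
    eligible = [person for person in candidates if person not in builders]

    # rule 2: rotate. the sort is stable, so people who have not reviewed
    # recently (key False) come first, in their original order
    eligible.sort(key=lambda person: person in recent)

    return eligible[:count]


team = ["ana", "bo", "chen", "dev", "eli"]
reviewers = pick_reviewers(team, builders=["ana", "bo"], recent_reviewers=["chen"])
print(reviewers)
```

the third defense, a document specific to this launch, is a judgment call for the reviewers and is not something this sketch tries to automate.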

the cost: an hour. the alternative: an outage at 3am.
