AI Ships Fast — But Who's Checking the Code?
I shipped a broken deploy script last week. Not because I wrote bad code — because I didn’t write the code. Claude did. And I merged it without reading it carefully enough.
The script worked. It built, it pushed, it deployed. But it was missing a single file — .nojekyll — and that one omission nuked every stylesheet on my site. The whole blog went live looking like raw HTML from 1997.
Here’s the thing: AI didn’t make a mistake. I made a mistake. I trusted the output without verifying it. And that’s the part I keep getting wrong.
The speed problem
AI makes you fast. Dangerously fast.
When Claude scaffolds a feature in three minutes, there’s this gravitational pull toward shipping it immediately. The code compiles. The types check. It looks right. So you merge it and move on to the next thing.
But “compiles and looks right” is not the same as “correct.” I’ve merged AI-generated code that passed every lint check and still had a subtle logic bug that didn’t surface for days. The code was syntactically perfect and functionally wrong.
Speed is only valuable if what you’re shipping actually works. Otherwise you’re just accumulating debt faster.
What I actually do now
After getting burned enough times, I’ve landed on a set of habits that catch most issues before they hit production. Nothing revolutionary — just discipline.
Read the diff, not the file. When AI generates or modifies code, I review the diff, not the final file. The diff shows exactly what changed. It’s harder to miss a deleted line or a weird addition when you’re looking at deltas instead of the whole file.
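If it helps to see what that looks like in practice, here's a rough sketch of the kind of throwaway helper I mean. It assumes Node, a git repo, and main as the base branch; all it does is surface the delta so you read that instead of the finished file, and flag anything suspiciously large.

```typescript
// review-delta.ts: summarize what an AI edit actually touched before committing it.
// A sketch, not my real tooling: assumes Node 18+, a git repo, and "main" as the base ref.
import { execSync } from "node:child_process";

const base = process.argv[2] ?? "main";
const numstat = execSync(`git diff --numstat ${base}`, { encoding: "utf8" });

for (const line of numstat.trim().split("\n").filter(Boolean)) {
  const [added, removed, file] = line.split("\t");
  const big = Number(added) + Number(removed) > 100; // arbitrary "slow down" threshold
  console.log(`${big ? "!! " : "   "}${file}: +${added} -${removed}`);
}
console.log(`\nNow read the actual patch: git diff ${base}`);
```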
Run it locally before pushing. This sounds obvious but I skipped it constantly when I first started using AI tools. The code looked good, the agent said it was done, and I pushed. Now I run the build, hit the feature manually, and check the output. Every time.
Ask “what could this break?” Before merging anything non-trivial, I spend thirty seconds thinking about blast radius. Does this touch auth? Does it change a database schema? Does it modify a deploy pipeline? The higher the blast radius, the slower I go.
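Here's a sketch of what that thirty-second check can look like when you automate the obvious part. The path patterns below are placeholders; the real list is whatever counts as scary in your repo.

```typescript
// blast-radius.ts: a rough pre-merge gut check. Path patterns are illustrative only.
import { execSync } from "node:child_process";

const HIGH_RISK = [/auth/i, /migrations?\//i, /schema/i, /deploy/i, /\.github\/workflows\//];

// Files changed relative to main (assumes "main" is the trunk branch).
const changed = execSync("git diff --name-only main", { encoding: "utf8" })
  .trim()
  .split("\n")
  .filter(Boolean);

const risky = changed.filter((file) => HIGH_RISK.some((pattern) => pattern.test(file)));

if (risky.length > 0) {
  console.log("Slow down. These files have a wide blast radius:");
  for (const file of risky) console.log("  " + file);
} else {
  console.log("No obviously risky paths touched. Still read the diff.");
}
```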
Check the edges, not just the happy path. AI is great at the happy path. It’ll build you a form that works perfectly — when every field is filled in correctly. But what about empty submissions? Special characters? Network failures? I’ve learned to specifically ask about edge cases and then verify the answers in the code.
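When I say "ask and then verify," I mean getting the unhappy paths written down as tests. A sketch, with a hypothetical validateSubmission function standing in for whatever the AI just built:

```typescript
// edge-cases.test.ts: the kind of tests I ask for explicitly.
// validateSubmission and its module are hypothetical; the point is that the
// unhappy paths get named and asserted, not just the filled-in-correctly case.
import { test } from "node:test";
import assert from "node:assert/strict";
import { validateSubmission } from "./validate"; // hypothetical module

test("rejects an entirely empty submission", () => {
  assert.equal(validateSubmission({}).ok, false);
});

test("handles special characters without mangling them", () => {
  const result = validateSubmission({ name: "O'Brien <script>", email: "a+b@example.com" });
  assert.equal(result.ok, true);
});

test("surfaces a useful error when a required field is missing", () => {
  const result = validateSubmission({ email: "a@example.com" });
  assert.match(result.error ?? "", /name/i);
});
```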
Use the tools you already have. Type checking, linting, tests — these aren’t just for human-written code. If anything, they matter more for AI-generated code because the AI doesn’t have the intuitive sense of “this feels off” that experienced developers carry. pnpm build catches things that vibes don’t.
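My version of this is a single script that chains the mechanical checks, so "run the tools" is one command instead of four I might skip. The script names are guesses at a typical pnpm setup, not a prescription; substitute whatever your project actually defines.

```typescript
// verify.ts: run every mechanical check in sequence, stop at the first failure.
// Script names assume a conventional pnpm project; adjust to your package.json.
import { execSync } from "node:child_process";

const steps = [
  "pnpm typecheck", // type checking
  "pnpm lint",      // linting
  "pnpm test",      // tests
  "pnpm build",     // the build that vibes don't catch
];

for (const cmd of steps) {
  console.log(`\n-> ${cmd}`);
  execSync(cmd, { stdio: "inherit" }); // throws (and stops the loop) on the first failure
}

console.log("\nAll checks passed. Now go click through the feature anyway.");
```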
The verification tax
I know what you’re thinking. If I have to verify everything, doesn’t that cancel out the speed gains?
No. But it changes where the time goes.
Without AI, I’d spend 80% of my time writing code and 20% reviewing it. With AI, I spend maybe 20% guiding the generation and 40% reviewing it — and the remaining 40% is time I just didn’t have before. That’s time I now spend on architecture, on testing, on shipping more features.
The verification step isn’t a tax. It’s the part of the job that was always there — it just used to be invisible because writing and reviewing happened simultaneously in your head.
The practices that compound
The best thing I’ve done is build verification into my workflow system rather than relying on willpower.
My CLAUDE.md files include validation steps. My slash commands run checks automatically. My deploy pipeline has gates. These aren’t optional checklists I heroically remember to follow — they’re baked into the process so I can’t skip them.
I wrote about system evolution a while back — the idea that every bug should improve your system, not just get fixed. That applies here too. When I find an issue that slipped through, I don’t just fix the bug. I ask: what check would have caught this? And then I add that check.
The .nojekyll incident? I now have a mental model for deploy scripts: what files does the target platform expect that aren’t in the build output? That question would have caught it in thirty seconds.
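That question is now a literal gate in my pipeline rather than a mental note. Something like this, with the output directory and file list standing in for whatever your platform actually expects:

```typescript
// predeploy-check.ts: the check that would have caught the .nojekyll incident.
// OUT_DIR and REQUIRED_FILES are specific to my setup; treat them as examples.
import { existsSync } from "node:fs";
import { join } from "node:path";

const OUT_DIR = "public";
const REQUIRED_FILES = [".nojekyll", "index.html"];

const missing = REQUIRED_FILES.filter((file) => !existsSync(join(OUT_DIR, file)));

if (missing.length > 0) {
  console.error(`Deploy blocked. Missing from ${OUT_DIR}: ${missing.join(", ")}`);
  process.exit(1);
}

console.log("Build output looks complete. Deploying.");
```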
Where this is headed
AI-generated code is only going to get better. The models will make fewer mistakes, the tooling will catch more issues, and the verification step will get lighter over time.
But it will never hit zero. There will always be a gap between “the AI produced code” and “this code is correct for my specific context.” Closing that gap is the job now. Not writing code — verifying code. Not generating features — validating features.
The developers who ship reliable software with AI won’t be the ones who trust it the most. They’ll be the ones who built the best systems for catching what it gets wrong.
That’s the skill I’m investing in. And honestly, it’s making me a better engineer than I was before AI showed up.