Table of Contents
Would I trust AI to resolve merge conflicts? My answer keeps surprising me.
Yes — but only when three things are true.
I’m the only one touching the code. I know what I wrote. And there’s a real QA pass plus automation behind me to catch anything that slips through.
Pull any one of those out and my answer flips to no.
Condition 1: I’m the only one touching the codeLink to heading
This is the foundation. When the branch is mine — when nobody else has been editing the same files in parallel — the conflict is almost always trivial. It’s me versus an older version of myself, and “myself” doesn’t refactor in surprising directions overnight.
Multi-author conflicts are a different animal. Two people refactoring the same module in different directions can produce a resolution that compiles, passes types, and is semantically wrong. The agent doesn’t know that the team decided last week to deprecate the pattern one of those branches was building on.
Solo code means bounded ambiguity. Shared code means ambiguity I can’t see from the diff.
Condition 2: I’m familiar with the codeLink to heading
The agent’s resolution lands inside a space I already understand. If it picks the wrong side, I’d feel it — not on the diff, but downstream, the first time the feature behaves oddly.
Familiarity is the verification I’d otherwise have to do explicitly. I’m not reading the resolution line by line, but I am reading the running app, the deploy preview, the QA report. And I’d recognize wrong behavior in a codebase I wrote in a way I’d never recognize it in someone else’s.
Strip the familiarity and the agent’s mistakes go quiet. They ship.
Condition 3: QA and automation catch the restLink to heading
This is the load-bearing one.
Every change runs through a stack of automated checks before it gets near a user: unit tests, integration tests, lint, type-check, E2E suites, plus a manual QA pass on a deployed preview. That pipeline doesn’t care whether the code came from me, from Claude, or from the conflict-resolution agent. If the resolution broke something real, the gates catch it before anything ships.
This is what lets me skip the review entirely. The agent isn’t the last line of defense. I’m not either. The test suite is. The QA pass is. Those gates were built to catch bad code — and bad code is bad code, whether a human or an agent wrote it.
What this looks like for meLink to heading
In my own projects, all three conditions hold. So I let the agent run end-to-end — check out the branch, rebase against origin/main, resolve any conflicts, force-push it back. I don’t read the diff. I don’t open the file. I merge.
git fetch origingit checkout <branch>git rebase origin/main# any conflicts — agent resolves them inlinegit rebase --continuegit push --force-with-leaseTwo months in, this is working — not perfectly. Some bugs still make it to production. The gates catch most of them, not all of them. Like everything else, it’s a work in progress. What I care about is the time. The agent takes the boring step, the gates carry most of the risk, and I get hours of my week back to help with other things instead of getting stuck on a rebase.
Human intervention, last — not firstLink to heading
Here’s the bigger principle underneath all of this. There is still a need for human judgment in the loop. The question is where to spend it.
My answer: at the end, not the middle.
If AI is good at one thing, it’s writing code. Scaffolding components, wiring integrations, composing the boring mechanics of a merge — those are the parts it’s already faster and more accurate at than I am. Why would I waste my attention writing what it can scaffold in seconds?
So I push the human step as late as I can. The agent writes. The agent rebases. The agent resolves. I show up at the end — final pass, deploy decision, the read on whether the running thing actually serves the user. That’s where my judgment moves the needle. Earlier than that, I’m just slowing the loop down.
That’s also why the three conditions matter so much. They define the perimeter inside which “human, last” is safe. On a shared branch, in code I don’t know, without QA and automation behind me — the perimeter collapses, and the human has to step back into the middle of the loop. Review every line. Re-run the tests. Carry the trust manually.
Inside the perimeter, the agent does the composing. I do the deciding. That split is what’s working.
Where I landLink to heading
Trust lives in the system the agent runs inside, not in the agent itself. My system has three solid layers around it, and that’s what makes “agent fixes it, I push” something other than reckless.
The day any one of those layers gives out, the workflow has to change with it. I’m watching for that more closely than I’m watching the agent.