Definition
Mutation-on-diff means running mutation testing (or mutation-style checks) only against the code that changed in a PR, to show that your tests would fail if that logic were wrong.
It’s “high-signal testing, scoped to what you touched.”
Why it exists
Full mutation testing is powerful but expensive. Diff-scoping gives you most of the benefit:
- catches “tests that don’t test,”
- forces assertions on meaningful behavior,
- and makes reward hacking harder.
What counts as “mutation”
A mutation is a small change that should break a good test suite:
- flip a boolean
- remove a condition
- change a comparison
- return null instead of a value
If the tests still pass against a mutated version of the code (the mutant “survives”), they aren’t protecting that behavior.
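To make this concrete, here is a minimal sketch of one "change a comparison" mutation. The function `is_adult`, its mutant, and the two test helpers are hypothetical names invented for illustration; a real mutation tool would generate and run the mutant automatically.

```python
def is_adult(age: int) -> bool:
    """Original implementation (hypothetical example)."""
    return age >= 18


def is_adult_mutant(age: int) -> bool:
    """Mutant: the comparison `>=` has been changed to `>`."""
    return age > 18


def weak_test(fn) -> bool:
    """Checks only values far from the boundary."""
    return fn(30) is True and fn(5) is False


def strong_test(fn) -> bool:
    """Also checks the boundary value, where the mutant behaves differently."""
    return fn(30) is True and fn(5) is False and fn(18) is True


def mutant_survives(test, mutant) -> bool:
    """A mutant survives when the test still passes against it."""
    return test(mutant)


if __name__ == "__main__":
    print("weak test lets mutant survive:", mutant_survives(weak_test, is_adult_mutant))
    print("strong test lets mutant survive:", mutant_survives(strong_test, is_adult_mutant))
```

Both tests pass against the original code, so a green test run alone cannot tell them apart. Only the test that asserts on the boundary value fails against the mutant.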
How to implement
- Determine the diff surface (files/functions touched).
- Run a mutation tool (or a custom mutator) only on that surface.
- Fail the gate if the number or share of surviving mutants exceeds a threshold.
- Emit results into the build receipt.
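The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not a reference implementation: it assumes the diff is a zero-context unified diff (for example from `git diff --unified=0`), that some mutation tool has already reported each mutant's file, line, and killed/survived status, and that the threshold is expressed as a survival rate. The names `changed_lines`, `Mutant`, and `gate`, and the receipt fields, are hypothetical.

```python
import json
import re
from dataclasses import dataclass

# Matches unified-diff hunk headers such as "@@ -10,0 +11,2 @@" or "@@ -20 +22 @@".
HUNK_RE = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,(\d+))? @@")


def changed_lines(diff_text: str) -> dict[str, set[int]]:
    """Map each file in a zero-context unified diff to its new-side changed lines."""
    surface: dict[str, set[int]] = {}
    current = None
    for line in diff_text.splitlines():
        if line.startswith("+++ "):
            path = line[4:].strip()
            # Deleted files appear as /dev/null and contribute no new lines.
            current = None if path == "/dev/null" else path.removeprefix("b/")
        elif current is not None:
            match = HUNK_RE.match(line)
            if match:
                start = int(match.group(1))
                count = int(match.group(2)) if match.group(2) is not None else 1
                if count:
                    surface.setdefault(current, set()).update(range(start, start + count))
    return surface


@dataclass(frozen=True)
class Mutant:
    path: str
    line: int
    killed: bool  # True if at least one test failed against this mutant


def gate(mutants: list[Mutant], surface: dict[str, set[int]], max_survival_rate: float) -> dict:
    """Keep only mutants on the diff surface and compare survival rate to the threshold."""
    in_scope = [m for m in mutants if m.line in surface.get(m.path, set())]
    survived = [m for m in in_scope if not m.killed]
    rate = len(survived) / len(in_scope) if in_scope else 0.0
    return {
        "mutants_in_scope": len(in_scope),
        "survived": len(survived),
        "survival_rate": rate,
        "threshold": max_survival_rate,
        "passed": rate <= max_survival_rate,
    }


def receipt_json(result: dict) -> str:
    """Serialize the gate result so it can be attached to a build receipt."""
    return json.dumps(result, sort_keys=True)
```

Filtering by changed lines is one possible definition of the diff surface. Scoping by touched function instead would also catch mutants on unchanged lines of an edited function, at the cost of needing a parser for the language.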
Where it fits
Mutation-on-diff is a strong gate late in a gated multi-agent flow:
- after unit tests are green,
- before merge.
Practical rules
- Use it on security utilities, validators, and boundary code first.
- Keep thresholds strict on critical paths, lenient elsewhere.
- Don’t run it on every PR until it’s fast and reliable.
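One way to express "strict on critical paths, lenient elsewhere" is a small table of path patterns and allowed survival rates. The directory names and numbers below are placeholders chosen for illustration, not recommended values, and `threshold_for` is a hypothetical helper.

```python
from fnmatch import fnmatchcase

# First matching pattern wins, so list the strictest rules first.
# Paths and limits are illustrative assumptions only.
THRESHOLDS = [
    ("src/security/*", 0.0),    # no surviving mutants allowed
    ("src/validators/*", 0.0),
    ("src/api/*", 0.10),
]
DEFAULT_THRESHOLD = 0.30  # lenient fallback for everything else


def threshold_for(path: str) -> float:
    """Return the maximum allowed mutant survival rate for a changed file."""
    for pattern, limit in THRESHOLDS:
        if fnmatchcase(path, pattern):
            return limit
    return DEFAULT_THRESHOLD


if __name__ == "__main__":
    for changed in ("src/security/tokens.py", "src/api/v1/users.py", "docs/build.py"):
        print(changed, threshold_for(changed))
```

Keeping the table in version control next to the code makes threshold changes reviewable like any other change.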