Mutation Testing in Rust Works, but Your Compile Times Will Hate You

Your Tests Pass. Your Code Is Still Wrong.

You have 100% line coverage. Every branch is hit. Every function is called. Then someone changes a + to a - in your pricing logic, runs the tests, and they all pass.

That is not a theoretical problem. It is what happens when your tests execute the code but do not actually verify the behavior. Coverage measures which lines run, not which outputs get checked. Mutation testing closes that gap by introducing small bugs on purpose and verifying that your tests catch them.

The question for Rust teams is not whether mutation testing is a good idea. It is whether cargo-mutants, the dominant tool in the ecosystem, is practical given Rust’s compile times and type system. The answer is yes, with caveats that matter.

What Mutation Testing Actually Does

Mutation testing is simple in concept. The tool makes a tiny change to your source code, runs your test suite, and checks whether anything fails.

If the test suite fails, the mutant is “killed.” That is what you want. It means your tests noticed the bug.

If the test suite passes, the mutant “survives.” That means your tests executed the mutated code and did not notice anything was wrong. You have a weak test.

Common mutations include replacing arithmetic operators (+ becomes -), swapping comparison operators (> becomes >=), replacing boolean literals (true becomes false), and deleting function calls that return values. Each change is small enough that a human would recognize it as a bug. The test suite should recognize it too.
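To make the idea concrete, here is a minimal sketch of a single mutation, written out by hand rather than produced by any tool. The names is_adult, is_adult_mutant, weak_check, and boundary_check are hypothetical. The mutant swaps >= for >, and only a check that probes the boundary value can tell the two versions apart.

```rust
/// Original: adults are 18 or older.
pub fn is_adult(age: u32) -> bool {
    age >= 18
}

/// Hand-written mutant: `>=` replaced with `>`.
pub fn is_adult_mutant(age: u32) -> bool {
    age > 18
}

/// A weak check that only uses inputs far from the boundary.
/// Returns true if the implementation under test passes.
pub fn weak_check(f: fn(u32) -> bool) -> bool {
    f(30) && !f(5)
}

/// A stronger check that also probes both sides of the boundary.
pub fn boundary_check(f: fn(u32) -> bool) -> bool {
    f(30) && !f(5) && f(18) && !f(17)
}
```

Passing is_adult_mutant to weak_check returns true, so the mutant survives. Passing it to boundary_check returns false, so the mutant is killed.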

How cargo-mutants Works on Rust Code

cargo-mutants is a mutation testing tool built specifically for Rust. It does not require you to annotate your tests or change your build system. You install it and run it.

cargo install cargo-mutants
cargo mutants

The tool scans your source files, generates mutants by applying transformation rules to the AST, and runs cargo test for each one. It tracks which mutants survive and prints a report.

Here is a function with a test that looks solid but is not:

pub fn apply_discount(price: f64, rate: f64) -> f64 {
    price * (1.0 - rate)
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_apply_discount() {
        let result = apply_discount(100.0, 0.2);
        // We ran the function. Coverage is 100%.
        // But we never asserted the result.
    }
}

cargo mutants will generate a mutant that changes the * to / or replaces 1.0 - rate with 1.0 + rate. The test will still pass because it never checks result. The surviving mutant flags the problem.

A real test that kills the mutant looks like this:

#[test]
fn test_apply_discount() {
    assert_eq!(apply_discount(100.0, 0.2), 80.0);
    assert_eq!(apply_discount(50.0, 0.0), 50.0);
}

Now every arithmetic mutant fails because the assertions catch the wrong output.
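One way to convince yourself is to write a couple of those mutants out by hand and run the same assertions against each. This is only a sketch: mutant_div and mutant_plus are hand-written stand-ins for the kinds of operator changes described above, not output from cargo-mutants, and assertions_hold is a hypothetical helper that wraps the two assertions as a predicate.

```rust
pub fn apply_discount(price: f64, rate: f64) -> f64 {
    price * (1.0 - rate)
}

/// Hand-written mutant: `*` replaced with `/`.
pub fn mutant_div(price: f64, rate: f64) -> f64 {
    price / (1.0 - rate)
}

/// Hand-written mutant: `-` replaced with `+`.
pub fn mutant_plus(price: f64, rate: f64) -> f64 {
    price * (1.0 + rate)
}

/// The two assertions from the stronger test, expressed as a predicate
/// so the same checks can be run against any candidate implementation.
pub fn assertions_hold(f: fn(f64, f64) -> f64) -> bool {
    f(100.0, 0.2) == 80.0 && f(50.0, 0.0) == 50.0
}
```

The original satisfies the predicate and both mutants fail it. Exact equality on f64 works for these particular inputs. For arbitrary values a tolerance-based comparison is usually the safer choice.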

What the Output Looks Like

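Run cargo mutants and you get a line per mutant followed by a summary. The tallies below are illustrative, and the exact wording varies by version (cargo-mutants itself labels killed mutants as "caught"):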

Found 42 mutants
Killed 38 mutants
Missed 4 mutants
Timeout 0 mutants
Unviable 0 mutants

Missed mutants are the ones that survived. cargo mutants writes each one to mutants.out/ with the diff and file path. You read the diff and add the missing assertion.

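Timeouts happen when a mutant causes an infinite loop or a severe slowdown. cargo-mutants stops the test run once it exceeds a time limit and reports the mutant in its own timeout category. The tests did not pass, so the mutant was not missed, but a timeout is still worth a look: the limit may be too low, or the function may be a candidate for skipping.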

Unviable mutants are changes that do not compile. Rust’s type system rejects them before the tests even run.
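These categories are often boiled down to a single mutation score. The sketch below assumes one common convention, caught divided by caught plus missed, with timeouts and unviable mutants left out of the ratio. The Outcomes struct and its field names are hypothetical and do not mirror cargo-mutants' own report format.

```rust
/// Outcome counts from one mutation run (hypothetical field names).
#[derive(Debug, Clone, Copy)]
pub struct Outcomes {
    pub caught: u32,
    pub missed: u32,
    pub timeout: u32,
    pub unviable: u32,
}

impl Outcomes {
    /// Share of mutants with a clear pass/fail result that the tests caught.
    /// Returns None when no mutant produced such a result.
    pub fn score(&self) -> Option<f64> {
        let decided = self.caught + self.missed;
        if decided == 0 {
            None
        } else {
            Some(f64::from(self.caught) / f64::from(decided))
        }
    }
}
```

With the sample numbers above (38 killed, 4 missed), the score is 38 / 42, or roughly 90%.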

Rust’s Type System Is a Double-Edged Sword

In JavaScript or Python, mutation testing tools can replace almost any operator and the code will still run. It will just produce wrong results. In Rust, many mutations are caught by the compiler before the tests even run.

Replace + with - on unsigned integers and the code still compiles, though it may underflow at runtime. Replace + with * in a generic function whose only bound is Add and the compiler rejects it, because nothing guarantees the type implements Mul. Delete a function call that returns a value the caller expects, and the compiler errors out.

This means cargo-mutants generates fewer viable mutants than equivalent tools in other languages. A Python project might see 200 mutants for a module. A Rust project might see 40. The mutants that do compile are the ones that could actually slip into production. The type system filters out noise.

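The trade-off is compile time. Every mutant triggers an incremental rebuild followed by a full test run, so the total grows roughly with the number of mutants times the cost of one build-and-test cycle. A project with a five-minute test suite and 40 mutants is looking at more than three hours on a single job.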
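A back-of-the-envelope estimate helps when budgeting a run. The sketch below assumes every mutant costs one incremental build plus one full test run and that parallel jobs scale ideally, which real runs will not quite achieve. The function name estimate_run and the 30-second build time used in the example are hypothetical.

```rust
use std::time::Duration;

/// Rough wall-clock estimate for a mutation run.
/// Assumes each mutant costs one incremental build plus one test run,
/// and that `jobs` workers split the mutants evenly with no contention.
pub fn estimate_run(mutants: u32, build: Duration, test: Duration, jobs: u32) -> Duration {
    let jobs = jobs.max(1); // treat 0 as a single job
    let rounds = mutants.div_ceil(jobs); // mutants per worker, rounded up
    (build + test) * rounds
}
```

For 40 mutants, a 30-second incremental build, and a five-minute test suite on one job, this gives 40 × 330 s = 13,200 s, or 3 hours 40 minutes. Four ideal jobs would bring that down to 55 minutes, which is an optimistic lower bound given the I/O contention a real runner sees.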

The Compile-Time Tax Is Real

This is the main reason teams hesitate. Mutation testing is embarrassingly parallel in theory. Each mutant is independent. In practice, Rust’s build system does not parallelize cleanly across dozens of compiler invocations on the same source tree.

cargo-mutants has a --jobs flag, but disk I/O and crate graph locking become bottlenecks. On a typical CI runner with two cores, the job scales poorly.

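You can mitigate this. Use --in-place to mutate the working tree directly instead of building in a scratch copy. That suits throwaway CI checkouts, though at the time of writing it cannot be combined with parallel jobs. Use --file or --exclude to target specific modules. Run mutation testing nightly or weekly, not on every push.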

What cargo-mutants Misses

No mutation testing tool catches everything. cargo-mutants has specific limitations you should know about.

It does not mutate macro expansions. If your critical logic lives inside a macro, the tool sees the invocation, not the generated code.

It does not understand semantic equivalence. Some mutants change the code without changing its observable behavior, so no test could ever kill them. Mutating a redundant x + 0 into x - 0 survives every test because the result is identical, even though the mutation is not a real bug. You have to triage these manually.
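An equivalent mutant is easy to see in a small sketch. The functions below are hypothetical and the mutant is written by hand: swapping + for - on a redundant zero yields a function that agrees with the original on every input, so a surviving mutant here says nothing about test quality.

```rust
/// Original with a redundant `+ 0` (hypothetical function).
pub fn line_total(unit_cents: u64, qty: u64) -> u64 {
    unit_cents * qty + 0
}

/// Hand-written mutant: `+` replaced with `-`.
/// Subtracting zero changes nothing, so behavior is identical.
pub fn line_total_mutant(unit_cents: u64, qty: u64) -> u64 {
    unit_cents * qty - 0
}

/// Returns true if the two functions agree on every input in a small grid.
pub fn agree_on_grid() -> bool {
    (0..50u64).all(|p| (0..50u64).all(|q| line_total(p, q) == line_total_mutant(p, q)))
}
```

The practical fix in a case like this is usually to delete the redundant code rather than to write a test for it.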

When Mutation Testing Is Worth the Cost

You do not need to run cargo mutants on every commit. You need it when your test suite is large enough that you no longer trust your own assertions.

Run it when a critical module has high coverage but you have shipped bugs in it anyway, or when a refactor changed logic in subtle ways and you want confidence that assertions are tight.

Do not run it when your test suite is already flaky or your compile times are the bottleneck everyone complains about. Fix the fundamentals first.

Adding It to CI Without Breaking the Pipeline

The practical setup is a scheduled job, not a gate on every pull request.

Here is a GitHub Actions workflow that runs weekly:

name: Mutation Testing

on:
  schedule:
    - cron: "0 3 * * 1"
  workflow_dispatch:

jobs:
  mutants:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: dtolnay/rust-toolchain@stable
      - uses: Swatinem/rust-cache@v2
      - name: Install cargo-mutants
        run: cargo install cargo-mutants
      - name: Run mutation testing
        run: cargo mutants --in-place
      - name: Upload report
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: mutants-report
          path: mutants.out/

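The --in-place flag keeps disk usage reasonable. rust-cache cuts the initial build time. The scheduled trigger avoids blocking developers. cargo mutants exits with a non-zero status when mutants are missed, so the upload step uses if: always() to run even when the previous step fails. Upload the report as an artifact so you can review surviving mutants without scrolling through CI logs.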

Start with One Module

You do not need to mutate your entire codebase. Pick one module with business-critical logic and a history of bugs. Run cargo mutants --file src/pricing.rs. Read the report. Fix the weakest test.

The first run is always the worst. You will find tests that execute the code but assert nothing. You will find branches covered by tests that do not check the branch outcome. You will wonder how those tests ever felt adequate.
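The second pattern, a branch that is covered but never checked, tends to look like the sketch below. The shipping_cost function and its thresholds are hypothetical. Both tests execute both branches, but only the second would fail if the branch values were swapped or the comparison flipped.

```rust
/// Hypothetical rule: orders of 50.0 or more ship free.
pub fn shipping_cost(order_total: f64) -> f64 {
    if order_total >= 50.0 { 0.0 } else { 4.99 }
}

#[cfg(test)]
mod tests {
    use super::*;

    // Weak: runs both branches, so coverage is complete, but it only
    // checks that the result is non-negative. Swapping the branch values
    // or changing `>=` to `<` would still pass.
    #[test]
    fn covers_both_branches_but_checks_little() {
        assert!(shipping_cost(10.0) >= 0.0);
        assert!(shipping_cost(80.0) >= 0.0);
    }

    // Stronger: pins the value of each branch and the boundary itself.
    #[test]
    fn pins_each_branch_and_the_boundary() {
        assert_eq!(shipping_cost(49.99), 4.99);
        assert_eq!(shipping_cost(50.0), 0.0);
        assert_eq!(shipping_cost(80.0), 0.0);
    }
}
```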

That is the point. Mutation testing does not find bugs in your code. It finds bugs in your tests. In Rust, where the compiler already catches the obvious mistakes, that is exactly the feedback loop you need.

Frequently Asked Questions

What is mutation testing?

Mutation testing evaluates your test suite by introducing small, deliberate bugs into your source code. If your tests fail, the mutant is “killed.” If your tests pass, the mutant “survives” and you have a gap.

How does mutation testing differ from code coverage?

Coverage measures which lines executed. Mutation testing measures whether your tests would detect wrong output from those lines. A test suite can reach 100% coverage and still catch zero mutants.

Is mutation testing slow for all Rust projects?

The cost scales with compile time and test count. Small libraries can finish in minutes. Large workspace projects take significantly longer. Use --file and --exclude to scope runs to specific modules.

Can I ignore false positive mutants?

Yes. cargo-mutants reads a .cargo/mutants.toml configuration file that can exclude files by glob or individual mutants by regex, and you can mark a function with the #[mutants::skip] attribute to leave it out. Use this sparingly so you do not mask real test gaps.