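How to Kill a Surviving Mutant When You Don't Understand What It Changed

Your mutation testing report is full of survivors, and at least one of them makes no sense to you.

The tool says it flipped > to >= on line 47, or replaced an entire conditional block with true, or mutated a string literal you didn't even know was under test. You read the diff three times. You still can't see what behavior the mutant breaks or what test would catch it. So you skip it. The mutant survives. Your score stays low.

This is the most common place where mutation testing adoption stalls. Not execution time. Not equivalent mutants. It's engineers staring at a survivor, unable to map it to a missing test, and concluding that mutation testing is just noise.

It isn't. You just need a different starting point.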

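The Problem: You Start From the Mutation, Not From the Code

Most developers approach surviving mutants backwards. They read the mutation diff, try to work out what synthetic bug was introduced, and then try to think of a test that would catch that specific bug.

That works for the obvious cases. It fails for anything subtle.

The mutation may be buried in a helper function three calls deep. It may affect a side effect you didn't know existed. It may sit in generated code or a framework callback. The diff shows what changed, but not why the existing tests don't care. If you start by decoding the mutation, you are reverse engineering synthetic code, which is hard even for experienced engineers.

A better approach is to ignore the mutation entirely and treat the survivor as a signal about your code, not about a synthetic bug.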

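A Surviving Mutant Is Just a Line Your Tests Don't Verify

Every surviving mutant points to a line of code that ran during testing but whose output or side effects were never asserted.

The mutation could be anything. Its survival means exactly one thing: if that line produced a wrong result, your tests would still pass. You don't need to understand the specific mutation to fix that. You need to understand what the line is supposed to do, and then write a test that checks whether it does.

This reframing turns the problem from reverse engineering synthetic diffs into ordinary test design.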

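The Method: Work Backward From the Line, Not Forward From the Mutation

Here is a four-step process that works for any surviving mutant, no matter how confusing the diff looks.

Step 1: Find the line the mutation touched

Your mutation testing tool's HTML report shows the mutated line inline in the source code. Open that file and find the original line, not the diff.

For example, suppose Stryker reports a survivor in this function: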

// pricing.js
function calculateDiscount(price, customer) {
  if (customer.loyaltyYears > 5) {
    return price * 0.85;
  }
  if (customer.isStudent) {
    return price * 0.90;
  }
  return price;
}

module.exports = { calculateDiscount };

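The mutation changed > to >= in the first conditional. That is the detail that might confuse you. Forget it for now. The line is if (customer.loyaltyYears > 5).

Step 2: Ask what this line is supposed to enforce

Don't think about the mutation. Think about the business rule.

This line is supposed to check whether the customer has been loyal for more than five years. If so, they get a 15% discount. The boundary matters. A customer with exactly five years should not get the discount. A customer with six years should.

Now look at the existing tests: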

// pricing.test.js
const { calculateDiscount } = require('./pricing');

test('returns full price for new customers', () => {
  expect(calculateDiscount(100, { loyaltyYears: 0 })).toBe(100);
});

test('gives loyalty discount to long-term customers', () => {
  expect(calculateDiscount(100, { loyaltyYears: 6 })).toBe(85);
});

test('gives student discount to students', () => {
  expect(calculateDiscount(100, { isStudent: true })).toBe(90);
});

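These tests cover both branches of the first if statement, but they never test the boundary. loyaltyYears: 5 never appears, and for the inputs 0 and 6 the conditions > 5 and >= 5 give the same answer. That is why the >= mutant survived. The tool found a gap you didn't know existed.

Step 3: Write a test that fails if this line is wrong

You don't need to write a test that targets this specific mutation. You need to write a test that fails if the business rule is violated.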

// pricing.test.js
test('does not give loyalty discount at exactly 5 years', () => {
  expect(calculateDiscount(100, { loyaltyYears: 5 })).toBe(100);
});

test('gives loyalty discount at 6 years', () => {
  expect(calculateDiscount(100, { loyaltyYears: 6 })).toBe(85);
});

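Now the boundary is explicit. If someone changes > to >=, the first test fails, because a customer with exactly five years would wrongly get the discount. The mutant is dead. You never had to work out what >= meant in the synthetic diff.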
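To see the effect without running a mutation tool, the sketch below stands in for one by hand. It assumes a hand-written copy of the function with > replaced by >= (the names calculateDiscountMutant, originalCases, boundaryCase, and suitePasses are made up for this illustration) and checks each set of cases against both versions.

```javascript
// Original rule: strictly more than five years earns the 15% loyalty discount.
function calculateDiscount(price, customer) {
  if (customer.loyaltyYears > 5) return price * 0.85;
  if (customer.isStudent) return price * 0.90;
  return price;
}

// Hand-written stand-in for the mutant (hypothetical name): > became >=.
function calculateDiscountMutant(price, customer) {
  if (customer.loyaltyYears >= 5) return price * 0.85;
  if (customer.isStudent) return price * 0.90;
  return price;
}

// Each case is [customer, expected price for a base price of 100].
const originalCases = [
  [{ loyaltyYears: 0 }, 100],
  [{ loyaltyYears: 6 }, 85],
  [{ isStudent: true }, 90],
];
const boundaryCase = [{ loyaltyYears: 5 }, 100];
const withBoundary = [...originalCases, boundaryCase];

// A suite "passes" against an implementation when every case matches.
function suitePasses(impl, cases) {
  return cases.every(([customer, expected]) => impl(100, customer) === expected);
}

console.log('old suite vs original:', suitePasses(calculateDiscount, originalCases)); // true
console.log('old suite vs mutant:  ', suitePasses(calculateDiscountMutant, originalCases)); // true, the mutant survives
console.log('new suite vs original:', suitePasses(calculateDiscount, withBoundary)); // true
console.log('new suite vs mutant:  ', suitePasses(calculateDiscountMutant, withBoundary)); // false, the mutant is killed
```

The only case that separates the two versions is the one that states the business rule at its boundary, which is why the test was found by reading the rule rather than the diff.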

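Step 4: Re-run the mutation test and confirm

Run your mutation tool against just this file, or against the full suite if you have the patience. The survivor should be gone. If it isn't, either your test doesn't execute the line you think it does, or its assertion doesn't depend on that line's result. Check the coverage data to find out which.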
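If the tool is StrykerJS, limiting a run to one file might look like the configuration sketch below. The file name stryker.conf.js, the choice of Jest as the test runner, and the reporter list are assumptions for this example, so adapt them to your project and check the option names against the version you have installed.

```javascript
// stryker.conf.js (file name and values are assumptions for this sketch)
const config = {
  // Only mutate the file under investigation so the run stays short.
  mutate: ['pricing.js'],
  // Assumes Jest runs the tests and @stryker-mutator/jest-runner is installed.
  testRunner: 'jest',
  // The HTML report shows mutants inline; clear-text prints a summary to the terminal.
  reporters: ['html', 'clear-text'],
  // Lets Stryker run only the tests that cover each mutant.
  coverageAnalysis: 'perTest',
};

module.exports = config;
```

With a file like this in the project root, npx stryker run would pick it up. Widen the mutate patterns again once the survivor is gone.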

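When the Line Itself Is Confusing

Sometimes the mutated line is in a library wrapper, a framework hook, or generated code you didn't write. In those cases the survivor is telling you something else: your codebase contains code that nobody understands well enough to test.

That isn't a mutation testing problem. It is a code quality problem that mutation testing surfaced.

Your options are the same as they would be without mutation testing: refactor the code until it has a testable surface, or accept that the code is untested and mark it as such. Some tools let you ignore specific lines or files. Use that power sparingly. Every ignored mutant is a potential bug that could ship.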
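As one illustration of marking code as deliberately untested, StrykerJS accepts in-source comments that switch mutators off for a line. The sketch below assumes that comment syntax (// Stryker disable next-line, followed by a mutator list and an optional reason). The function formatGeneratedId is hypothetical, and other tools use different mechanisms, so check your tool's documentation before relying on this.

```javascript
// Hypothetical wrapper around generated code that the team has chosen not to test.
function formatGeneratedId(prefix, sequence) {
  // Stryker disable next-line all: generated formatting, tracked for refactoring
  return `${prefix}-${String(sequence).padStart(6, '0')}`;
}

console.log(formatGeneratedId('ord', 42)); // ord-000042
```

Writing the reason next to the comment keeps the decision visible in code review, which makes it harder for ignored lines to pile up unnoticed.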

The Hard Case: Mutations That Change Side Effects

Boundary checks are easy. Side effects are harder.

Consider this function:

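// logger.js
// Assumption: metrics is a project-local client exposing increment(name).
// The original snippet used it without showing where it comes from.
const metrics = require('./metrics');

function logError(error, context) {
  const timestamp = new Date().toISOString();
  console.error(`[${timestamp}] ${context}: ${error.message}`);
  metrics.increment('error.count');
}

module.exports = { logError };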

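A mutation testing tool might replace the entire console.error call with nothing, or replace the template string with an empty string. If your tests don't verify the log output, those mutants survive.

Most teams don't test logging, and that is usually fine. But if your logs are consumed by an alerting system, or metrics.increment drives a dashboard that pages whoever is on call, skipping these tests becomes a risk.

The method is the same. Don't study the mutation. Ask what behavior this line is supposed to produce. If the answer is "a structured log entry with a timestamp", write a test that asserts on the log output: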

// logger.test.js
const { logError } = require('./logger');

test('logs error with timestamp and context', () => {
  const spy = jest.spyOn(console, 'error').mockImplementation(() => {});
  logError(new Error('db timeout'), 'payment-service');
  expect(spy).toHaveBeenCalledWith(
    expect.stringMatching(/\d{4}-\d{2}-\d{2}T.*payment-service.*db timeout/)
  );
  spy.mockRestore();
});

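The mutant that deletes the console.error call is now killed because the spy sees no call. The mutant that breaks the template string is killed because the regex doesn't match. You didn't need to understand either mutation.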
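The same reasoning applies to the metrics.increment line, which the test above does not check. The sketch below is one way to observe both side effects without a mocking library. It assumes logError is refactored to receive its collaborators; the factory createLogError and the recording fakes are hypothetical and are not part of the original module.

```javascript
// Hypothetical variant of logError that receives its collaborators,
// so a test can observe both side effects without a mocking library.
function createLogError({ logger, metrics }) {
  return function logError(error, context) {
    const timestamp = new Date().toISOString();
    logger.error(`[${timestamp}] ${context}: ${error.message}`);
    metrics.increment('error.count');
  };
}

// Minimal hand-rolled fakes that record every call they receive.
const logged = [];
const incremented = [];
const logError = createLogError({
  logger: { error: (line) => logged.push(line) },
  metrics: { increment: (name) => incremented.push(name) },
});

logError(new Error('db timeout'), 'payment-service');

// Behavior 1: exactly one structured line with timestamp, context, and message.
const linePattern = /^\[\d{4}-\d{2}-\d{2}T[^\]]+\] payment-service: db timeout$/;
const logOk = logged.length === 1 && linePattern.test(logged[0]);

// Behavior 2: the error counter is incremented exactly once.
const metricOk = incremented.length === 1 && incremented[0] === 'error.count';

console.log('log line ok:', logOk);
console.log('metric ok:', metricOk);
```

A mutant that removes the increment call, or changes the counter name, would leave metricOk false. Whether that check is worth keeping depends on whether anything downstream relies on the counter.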

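Why This Approach Scales Better Than Studying Mutations

The set of possible mutations is effectively unbounded. The set of behaviors your code is supposed to have is finite.

If you try to write tests that target specific mutations, you are playing whack-a-mole with synthetic bugs. If you write tests that verify what the code actually does, the mutants die as a side effect. The second approach is sustainable. The first is not.

This is also how you avoid writing tests that are overly coupled to the mutation tool. A test asserting that line 47 uses > is brittle. A test asserting that a customer with five years of loyalty pays full price is the right one.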

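The Limitation: Equivalent Mutants Still Exist

This method doesn't help with equivalent mutants, because equivalent mutants don't indicate missing tests. They are transformations that produce identical behavior.

If a mutation changes a + b to b + a in a commutative operation, no test can kill it. There is no missing behavior to assert. These are false positives, and every mutation testing tool produces them. Learn to recognize them, ignore them, and move on. Don't let a 2% equivalent-mutant noise floor convince you that the other 98% is noise too.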
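A quick way to build confidence that a survivor is equivalent is to compare the original and the mutated expression over a spread of inputs. The sketch below does that for numeric addition; the functions total and totalMutant are hypothetical stand-ins. Sampling cannot prove equivalence, and the result depends on the operand types: with strings, + is concatenation, the swap changes the output, and the mutant would be killable.

```javascript
// Original: sum of two numbers.
function total(a, b) {
  return a + b;
}

// Hand-written equivalent mutant (hypothetical): operands swapped.
function totalMutant(a, b) {
  return b + a;
}

// Compare both versions over a grid of numeric inputs, including edge values.
const samples = [0, 1, -1, 0.1, 0.2, 1e308, -1e308, Number.MAX_SAFE_INTEGER, Infinity, -Infinity];
let differences = 0;
for (const a of samples) {
  for (const b of samples) {
    // Object.is treats NaN as equal to NaN, so Infinity + -Infinity compares cleanly.
    if (!Object.is(total(a, b), totalMutant(a, b))) differences += 1;
  }
}

console.log('inputs where the mutant differs:', differences); // 0
```

When a check like this finds no difference and the types rule out the non-commutative cases, mark the mutant as equivalent and move on.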

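Start With Your Three Worst Files

If your mutation score is low and you have dozens of survivors, don't try to understand all of them. Pick the three files with the most survivors. In each file, pick the three most suspicious lines. Apply this method to each line.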
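Picking the worst files can be scripted if your tool emits a machine-readable report. The sketch below assumes a report shaped like the mutation testing report schema that StrykerJS's JSON reporter follows: a files object keyed by path, each entry holding a mutants array whose items carry a status. The sample data, the helper m, and the function worstFiles are hypothetical.

```javascript
// Helper for building sample data: a mutant with only the field this sketch reads.
const m = (status) => ({ status });

// Hypothetical, trimmed-down report: files keyed by path, each with a list of mutants.
const report = {
  files: {
    'src/auth.js': { mutants: [m('Survived'), m('Survived'), m('Survived'), m('Survived'), m('Killed')] },
    'src/logger.js': { mutants: [m('Survived'), m('Survived'), m('Survived')] },
    'src/pricing.js': { mutants: [m('Survived'), m('Killed'), m('Survived')] },
    'src/cart.js': { mutants: [m('Killed'), m('Survived')] },
    'src/util.js': { mutants: [m('Killed'), m('Killed')] },
  },
};

// Count mutants with status "Survived" per file and return the worst `limit` files.
function worstFiles(mutationReport, limit = 3) {
  return Object.entries(mutationReport.files)
    .map(([path, file]) => ({
      path,
      survivors: file.mutants.filter((mutant) => mutant.status === 'Survived').length,
    }))
    .filter((entry) => entry.survivors > 0)
    .sort((a, b) => b.survivors - a.survivors)
    .slice(0, limit);
}

const worst = worstFiles(report);
console.log(worst.map((entry) => `${entry.path}: ${entry.survivors}`).join('\n'));
```

For a real run, replace the inline sample with the parsed contents of your tool's JSON report. Choosing the three most suspicious lines inside each file is still a judgment call that the count cannot make for you.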

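Within an hour you will have written nine tests that make your codebase more correct. Re-run mutation testing. Your score should rise noticeably. More importantly, you will understand your own code better than you did before.

Mutants aren't asking you to understand them. They're asking you to understand your code.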

FAQ

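Do I need to understand the mutation operator to write the test?

No. The mutation operator is a distraction. Focus on what the original line is supposed to do. Write a test for that behavior. The mutant dies as a side effect.

What if the mutated line is in a private function I can't test directly?

That's a design signal. If a function has behavior worth testing, it should be testable. Either expose it for testing or test it through the public API that calls it. If no public API test can reach the behavior, it may be dead code.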
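Testing through the public API usually means choosing an input that makes the private behavior visible in the public result. The sketch below assumes a hypothetical module in which roundToCents is private and invoiceTotal is the only exported function; the prices and tax rate are made-up values.

```javascript
// Private helper: not exported, so tests cannot call it directly.
function roundToCents(amount) {
  return Math.round(amount * 100) / 100;
}

// Public API (hypothetical): the only entry point tests are allowed to use.
function invoiceTotal(lineItems, taxRate) {
  const subtotal = lineItems.reduce((sum, item) => sum + item.price * item.quantity, 0);
  return roundToCents(subtotal * (1 + taxRate));
}

// An input whose unrounded total has extra decimals (about 64.9175),
// so the helper's rounding is observable from the public return value.
const total = invoiceTotal([{ price: 19.99, quantity: 3 }], 0.0825);

console.log('rounded through the public API:', total === 64.92);
```

A mutant inside roundToCents that drops or alters the rounding changes what invoiceTotal returns for this input, so a test on the public function kills it without the helper ever being exported.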

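Should I kill every surviving mutant?

No. Some mutants touch logging, metrics, or other observability code where the cost of testing outweighs the value. Set a reasonable threshold for your codebase and focus your effort on mutants in business logic.

What if my test kills the mutant but still feels wrong?

Trust that feeling. A test that happens to kill the mutant without clearly asserting a business rule is technical debt. Rewrite it to express the expected behavior in domain language rather than test language.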