The Wrong Metric

Most conversations about AI-generated code quality focus on correctness at generation time. Does the output compile? Does it pass the tests? Does it match the spec?

Those are table stakes. They tell you nothing about the real cost.

The real metric is replaceability: how cheaply can you delete this module and re-implement it behind the same contract when requirements shift?

If the answer is “trivially,” the speed gains from AI compound. If the answer is “we need to trace six implicit dependencies first,” you have already lost the speed advantage you thought you were buying.

Why AI Code Tends Toward Coupling

Large language models optimize for the immediate ask. They do not optimize for future change.

When you prompt for a login flow, you get a login flow that works. You do not get a login flow behind an auth interface that can be swapped from Firebase to Supabase to a custom JWT service without touching any screen that consumes it.

This is not a model failure. It is a context failure. The model was not asked to optimize for replaceability, so it did not.

The result: AI-generated code tends toward tight coupling by default. Not because the model is bad, but because isolation is never the path of least resistance for a next-token predictor.
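
To make the login example concrete, here is a minimal TypeScript sketch of an app-owned auth contract. Every name in it (`AuthService`, `AuthSession`, `InMemoryAuth`, `handleLoginSubmit`) is hypothetical, and the in-memory class only stands in for a real Firebase, Supabase, or JWT adapter, whose SDK calls are deliberately not shown.

```typescript
// App-owned contract: screens depend on this, never on a vendor SDK.
interface AuthSession {
  userId: string;
  token: string;
}

interface AuthService {
  signIn(email: string, password: string): Promise<AuthSession>;
  signOut(): Promise<void>;
  currentSession(): AuthSession | null;
}

// Hypothetical stand-in for a vendor adapter. A Firebase- or Supabase-backed
// class would implement the same three methods.
class InMemoryAuth implements AuthService {
  private session: AuthSession | null = null;
  private readonly accounts: Map<string, string>;

  constructor(accounts: Map<string, string>) {
    this.accounts = accounts;
  }

  async signIn(email: string, password: string): Promise<AuthSession> {
    if (this.accounts.get(email) !== password) {
      throw new Error("Invalid credentials");
    }
    this.session = { userId: email, token: `token-for-${email}` };
    return this.session;
  }

  async signOut(): Promise<void> {
    this.session = null;
  }

  currentSession(): AuthSession | null {
    return this.session;
  }
}

// A consumer (for example, a login screen handler) sees only the contract.
async function handleLoginSubmit(
  auth: AuthService,
  email: string,
  password: string,
): Promise<string> {
  try {
    const session = await auth.signIn(email, password);
    return `Welcome, ${session.userId}`;
  } catch {
    return "Login failed";
  }
}
```

Under these assumptions, swapping vendors means writing another class that implements `AuthService` and changing which one gets constructed; `handleLoginSubmit` is not touched.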

Replaceability Is an Architecture Decision

Replaceability does not happen by accident. It is a deliberate structural choice:

  • Every external dependency sits behind an interface the app owns.
  • Every generated module exposes a contract, not an implementation.
  • Every optional capability is injected, never imported directly.
  • Every composition decision lives in one root, not scattered across fifty files.

This is not new. It is basic dependency inversion. What is new is that AI-generated code makes violating these principles effortless and invisible until you need to change something.
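
As a rough illustration of the four rules above, the sketch below wires a small checkout feature through a single composition root. The names (`PaymentGateway`, `UsageTracker`, `CheckoutService`, `FakeGateway`, `RecordingTracker`, `buildCheckout`) are invented for this example, and the fake classes stand in for real vendor adapters.

```typescript
// Contracts the app owns. Vendor SDKs would be wrapped by adapters that
// implement these; no vendor package is imported by consumers.
interface PaymentGateway {
  charge(amountCents: number): string; // returns a receipt id
}

interface UsageTracker {
  track(eventName: string): void;
}

// Depends only on contracts. The tracker is an optional capability and is
// injected rather than imported.
class CheckoutService {
  private readonly gateway: PaymentGateway;
  private readonly tracker: UsageTracker | undefined;

  constructor(gateway: PaymentGateway, tracker?: UsageTracker) {
    this.gateway = gateway;
    this.tracker = tracker;
  }

  checkout(amountCents: number): string {
    const receipt = this.gateway.charge(amountCents);
    this.tracker?.track("checkout_completed");
    return receipt;
  }
}

// Hypothetical implementations, standing in for real adapters.
class FakeGateway implements PaymentGateway {
  charge(amountCents: number): string {
    return `receipt-${amountCents}`;
  }
}

class RecordingTracker implements UsageTracker {
  readonly events: string[] = [];

  track(eventName: string): void {
    this.events.push(eventName);
  }
}

// Composition root: the only place that names concrete classes.
function buildCheckout(options: { tracker?: UsageTracker } = {}): CheckoutService {
  return new CheckoutService(new FakeGateway(), options.tracker);
}
```

With this shape, replacing `FakeGateway` with a different adapter is a one-line change inside `buildCheckout`, and `CheckoutService` never learns which implementation it received.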

The Compound Effect

When every module is replaceable:

  • Bad generations cost minutes, not days.
  • Vendor changes are interface swaps, not rewrites.
  • AI-assisted throughput stays roughly constant across iterations instead of tapering off as the codebase grows.
  • The team can say “regenerate this behind the same contract” and mean it.

When modules are entangled:

  • Every change requires understanding the full dependency graph.
  • Every AI-assisted refactor risks breaking unrelated systems.
  • The team gradually stops trusting AI output because the blast radius is unpredictable.
  • Speed collapses back to pre-AI levels, but now with more code to maintain.

The Practical Test

Before accepting any AI-generated module into a codebase, apply one test:

Can I delete this file and re-implement it from scratch, using only the interface it exposes, without modifying any consumer?

If yes, ship it. If no, fix the boundary before you ship.

This is cheap to enforce. A composition root plus interface-first design gives you this property structurally. You do not need elaborate governance. You need one architectural rule applied consistently.
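
One way to make the delete-and-reimplement test mechanical is a contract check written once against the interface and run against every implementation. The sketch below assumes a hypothetical `KeyValueStore` contract; `MapStore` plays the original module and `ArrayStore` plays a from-scratch rewrite with different internals.

```typescript
// The contract consumers depend on.
interface KeyValueStore {
  get(key: string): string | undefined;
  set(key: string, value: string): void;
  delete(key: string): void;
}

// Contract check: exercises only the interface and returns a list of
// violations. An empty list means the implementation honors the contract.
function checkKeyValueStoreContract(makeStore: () => KeyValueStore): string[] {
  const violations: string[] = [];
  const store = makeStore();

  if (store.get("missing") !== undefined) {
    violations.push("get() on an unknown key must return undefined");
  }

  store.set("k", "v1");
  if (store.get("k") !== "v1") {
    violations.push("get() must return the value passed to set()");
  }

  store.set("k", "v2");
  if (store.get("k") !== "v2") {
    violations.push("set() must overwrite an existing key");
  }

  store.delete("k");
  if (store.get("k") !== undefined) {
    violations.push("delete() must remove the key");
  }

  return violations;
}

// Original implementation.
class MapStore implements KeyValueStore {
  private readonly data = new Map<string, string>();

  get(key: string): string | undefined {
    return this.data.get(key);
  }
  set(key: string, value: string): void {
    this.data.set(key, value);
  }
  delete(key: string): void {
    this.data.delete(key);
  }
}

// From-scratch rewrite with different internals, same contract.
class ArrayStore implements KeyValueStore {
  private entries: Array<[string, string]> = [];

  get(key: string): string | undefined {
    return this.entries.find(([k]) => k === key)?.[1];
  }
  set(key: string, value: string): void {
    this.delete(key);
    this.entries.push([key, value]);
  }
  delete(key: string): void {
    this.entries = this.entries.filter(([k]) => k !== key);
  }
}
```

If both `checkKeyValueStoreContract(() => new MapStore())` and `checkKeyValueStoreContract(() => new ArrayStore())` return an empty list, the rewrite can replace the original without any consumer changing.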

The Connection to AI-Native Architecture

I wrote about this broader problem in Stanford CS146S Is Right About AI Coding — The Missing Subject Is Architecture. The replaceability principle is the specific mechanism that makes AI-native codebases survivable past version one.

Stanford is teaching developers how to use AI tools effectively. That matters. But tool fluency without replaceability is a trap: you ship faster until the codebase becomes expensive to change, and then you ship slower than teams that never used AI at all.

The discipline is not “prompt better.” The discipline is “architect so that prompting mistakes are cheap to undo.”

FAQ

What does “replaceable architecture” mean for AI-generated code?

Replaceable architecture means the rest of the system depends on an interface that each AI-generated module implements, never on the implementation itself. When requirements change or the generation is wrong, you delete the module and re-implement it without touching consumers.

How do you enforce replaceability in an AI-generated codebase?

Three mechanisms: interfaces owned by the application (not the dependency), a single composition root where all implementations are wired together, and a CI check that fails if any module imports another module’s internals directly.
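
The CI check can be quite small. The sketch below assumes a hypothetical layout in which module code lives under `src/modules/<name>/`, the specifier `@modules/<name>` is the module's public entry, and any deeper path is internal; `findDeepImports` and `ImportViolation` are names invented here. It works on source text with a regular expression, so it is an approximation rather than a real parser.

```typescript
interface ImportViolation {
  file: string;
  specifier: string;
}

// Assumed convention (hypothetical): module code lives in src/modules/<name>/,
// "@modules/<name>" is its public entry, and any deeper path is internal.
function findDeepImports(files: Record<string, string>): ImportViolation[] {
  const violations: ImportViolation[] = [];

  for (const file of Object.keys(files)) {
    const ownerMatch = /^src\/modules\/([^/]+)\//.exec(file);
    // Files outside src/modules (such as the composition root) are out of
    // scope for this sketch.
    if (ownerMatch === null) continue;
    const owner = ownerMatch[1];

    // Matches `import ... from "x"` and `import "x"`. Dynamic import() and
    // `export ... from` are not covered.
    const importPattern = /import\s+(?:[^'"]*?\s+from\s+)?['"]([^'"]+)['"]/g;
    const source = files[file] ?? "";
    let match: RegExpExecArray | null;
    while ((match = importPattern.exec(source)) !== null) {
      const specifier = match[1] ?? "";
      const deep = /^@modules\/([^/]+)\/.+/.exec(specifier);
      // A deep path is allowed only inside the module that owns it.
      if (deep !== null && deep[1] !== owner) {
        violations.push({ file, specifier });
      }
    }
  }

  return violations;
}
```

A CI step would read the project's source files into the `files` map and fail the build when the returned list is non-empty. Existing lint rules, such as ESLint's `no-restricted-imports` with path patterns, can express a similar constraint without custom code.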

Does replaceability slow down initial development?

Not meaningfully. Defining an interface before generating an implementation adds minutes at most. The cost of not doing it shows up weeks later when a vendor swap or refactor turns into a full rewrite.

Is this the same as dependency injection?

Dependency injection is one mechanism for achieving replaceability, but it is not the whole picture. Replaceability also requires contract tests, boundary validation, and a composition root — not just constructor parameters.
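
Boundary validation is the piece that constructor parameters alone do not give you. As a minimal sketch, assume a vendor returns untyped profile data with a `name` field (a hypothetical payload shape), and the app owns a `UserProfile` type; `parseUserProfile` and `ParseResult` are likewise invented for this example.

```typescript
// App-owned type that consumers rely on.
interface UserProfile {
  id: string;
  email: string;
  displayName: string;
}

type ParseResult =
  | { ok: true; value: UserProfile }
  | { ok: false; error: string };

// Boundary validation: untyped data from outside is checked once, here, and
// only a well-formed UserProfile crosses into the rest of the app.
function parseUserProfile(raw: unknown): ParseResult {
  if (typeof raw !== "object" || raw === null) {
    return { ok: false, error: "expected an object" };
  }

  const { id, email, name } = raw as Record<string, unknown>;

  if (typeof id !== "string" || id.length === 0) {
    return { ok: false, error: "id must be a non-empty string" };
  }
  if (typeof email !== "string" || !email.includes("@")) {
    return { ok: false, error: "email must be a string containing '@'" };
  }

  // The assumed vendor field `name` is mapped to the app's `displayName`,
  // falling back to the email when it is missing or empty.
  const displayName = typeof name === "string" && name.length > 0 ? name : email;

  return { ok: true, value: { id, email, displayName } };
}
```

Because consumers only ever see `UserProfile`, a replacement adapter with a different payload shape needs its own parser at the boundary and nothing else.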