Your service crashes halfway through a POST /charge request. The client sees a timeout and retries. You now have two charges. The customer is angry. The database is consistent. Your business logic is not.
This is not an edge case. It is the default behavior of distributed systems. Networks drop packets. Containers get OOM-killed mid-request. Load balancers return 502s for requests that already reached the backend. If your API assumes “I got an error, so the operation definitely didn’t happen,” you are building a bug factory.
The at-least-once reality you cannot escape
In practice, HTTP traffic is at-least-once. TCP retransmits lost segments. Your HTTP client retries on timeout. Your proxies and load balancers retry on connection errors and 5xx responses. Every layer assumes the previous one might fail and resends.
The problem is not retries. The problem is non-idempotent operations.
A GET is safe to retry because reading twice is the same as reading once. A POST /charge or POST /orders is not. Running it twice creates two resources. Running it zero times loses a sale. You cannot choose “exactly once” on an unreliable network. You can only choose between “at least once” with deduplication, or “maybe zero, maybe two” with data corruption.
Idempotency is how you make “at least once” safe.
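To make the difference concrete, here is a toy in-memory sketch. ChargeService and its methods are hypothetical names invented for illustration, not part of any real payment API. The only point is how a retried call behaves with and without a key.

```python
import uuid


class ChargeService:
    """Toy in-memory service; hypothetical, for illustration only."""

    def __init__(self):
        self.charges = []   # every charge that was actually executed
        self.seen = {}      # idempotency key -> stored result

    def charge(self, amount_cents):
        # Non-idempotent: every call creates a new charge.
        charge_id = str(uuid.uuid4())
        self.charges.append((charge_id, amount_cents))
        return charge_id

    def charge_with_key(self, key, amount_cents):
        # Idempotent: a repeated key returns the first result unchanged.
        if key in self.seen:
            return self.seen[key]
        charge_id = self.charge(amount_cents)
        self.seen[key] = charge_id
        return charge_id


# A client that times out and retries ends up calling the endpoint twice.
naive = ChargeService()
naive.charge(5000)
naive.charge(5000)  # retry: a second charge is created

safe = ChargeService()
key = str(uuid.uuid4())
first = safe.charge_with_key(key, 5000)
second = safe.charge_with_key(key, 5000)  # retry: same result, no new charge

print(len(naive.charges), len(safe.charges))
```

The dictionary here does not survive a crash, so this only shows the behavior you want. Making it durable is the hard part.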
How idempotency keys actually work
Stripe popularized this pattern, but the idea is older. The client generates a unique key (a UUID) and sends it in a header: Idempotency-Key: <uuid>. The server records the key together with the side effect and stores the response alongside it once the operation completes. If the same key arrives again, the server returns the stored response without re-running the operation.
The key insight: the key must be recorded atomically with the work, never as a separate step afterward. If you write the charge row first and crash before storing the key, the retry creates a second charge. The storage of the idempotency key and the business mutation must be atomic.
In practice, this means one of two things:
1. The idempotency store and the business store share a database transaction. You insert into idempotency_keys and charges in the same BEGIN ... COMMIT. If the commit succeeds, both exist. If it fails, neither does.
2. The idempotency store is the business store. Your charges table has a client_idempotency_key column with a UNIQUE constraint. The retry fails the unique check, and you return the existing row.
Option 2 is simpler and what most teams should start with.
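A minimal sketch of the second option, assuming SQLite and a charges table shaped like the one described above. The function name create_charge, the amount_cents column, and the returned dictionary are illustrative choices, not a prescribed API.

```python
import sqlite3
import uuid


def create_charge(conn, idempotency_key, amount_cents):
    """Insert a charge, or return the existing one for a repeated key."""
    charge_id = str(uuid.uuid4())
    try:
        with conn:  # commits on success, rolls back on exception
            conn.execute(
                "INSERT INTO charges (id, amount_cents, client_idempotency_key) "
                "VALUES (?, ?, ?)",
                (charge_id, amount_cents, idempotency_key),
            )
    except sqlite3.IntegrityError:
        # The UNIQUE constraint rejected the retry: look up the original row.
        row = conn.execute(
            "SELECT id FROM charges WHERE client_idempotency_key = ?",
            (idempotency_key,),
        ).fetchone()
        return {"charge_id": row[0], "duplicate": True}
    return {"charge_id": charge_id, "duplicate": False}


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE charges ("
    " id TEXT PRIMARY KEY,"
    " amount_cents INTEGER NOT NULL,"
    " client_idempotency_key TEXT UNIQUE NOT NULL)"
)

key = str(uuid.uuid4())
first = create_charge(conn, key, 5000)
retry = create_charge(conn, key, 5000)
count = conn.execute("SELECT COUNT(*) FROM charges").fetchone()[0]
```

Because the database enforces uniqueness, two concurrent requests with the same key cannot both insert. One wins and the other falls into the lookup branch.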
A working implementation with atomic storage
Here is a minimal Python implementation using SQLite. The idempotency key lives in the same transaction as the business mutation. If the server crashes after COMMIT, the retry hits the cache. If it crashes before COMMIT, nothing is persisted and the retry re-runs safely.
```python
import json
import sqlite3
import uuid

DB_PATH = "/data/inventory.db"


def init_db():
    # isolation_level=None turns off the sqlite3 module's implicit
    # transactions, so the explicit BEGIN/COMMIT below are the only ones.
    conn = sqlite3.connect(DB_PATH, isolation_level=None)
    # The UPDATE in reserve_inventory needs this table; the schema is the
    # minimum that statement implies.
    conn.execute("""
        CREATE TABLE IF NOT EXISTS inventory (
            item_id TEXT PRIMARY KEY,
            available INTEGER NOT NULL
        )
    """)
    conn.execute("""
        CREATE TABLE IF NOT EXISTS reservations (
            id TEXT PRIMARY KEY,
            item_id TEXT NOT NULL,
            quantity INTEGER NOT NULL,
            idempotency_key TEXT UNIQUE NOT NULL,
            created_at INTEGER
        )
    """)
    conn.execute("""
        CREATE TABLE IF NOT EXISTS idempotency_responses (
            key TEXT PRIMARY KEY,
            response_body TEXT NOT NULL,
            created_at INTEGER
        )
    """)
    return conn


def reserve_inventory(conn, idempotency_key, item_id, quantity):
    # Take the write lock first so two requests with the same key cannot
    # both pass the duplicate check.
    conn.execute("BEGIN IMMEDIATE")
    try:
        # Step 1: Check for an existing response.
        row = conn.execute(
            "SELECT response_body FROM idempotency_responses WHERE key = ?",
            (idempotency_key,),
        ).fetchone()
        if row:
            conn.execute("ROLLBACK")
            return json.loads(row[0])

        # Step 2: Deduct inventory atomically.
        cursor = conn.execute(
            "UPDATE inventory SET available = available - ? "
            "WHERE item_id = ? AND available >= ?",
            (quantity, item_id, quantity),
        )
        # rowcount covers this statement only; conn.total_changes counts
        # every change made since the connection was opened.
        if cursor.rowcount == 0:
            conn.execute("ROLLBACK")
            return {"error": "insufficient inventory"}

        # Record the reservation.
        reservation_id = str(uuid.uuid4())
        conn.execute(
            "INSERT INTO reservations "
            "(id, item_id, quantity, idempotency_key, created_at) "
            "VALUES (?, ?, ?, ?, strftime('%s','now'))",
            (reservation_id, item_id, quantity, idempotency_key),
        )

        # Cache the response for retries.
        response = {"reservation_id": reservation_id, "status": "reserved"}
        conn.execute(
            "INSERT INTO idempotency_responses (key, response_body, created_at) "
            "VALUES (?, ?, strftime('%s','now'))",
            (idempotency_key, json.dumps(response)),
        )
        conn.execute("COMMIT")
        return response
    except Exception:
        # Leave no half-finished transaction behind.
        if conn.in_transaction:
            conn.execute("ROLLBACK")
        raise
```
The idempotency_responses table is the safety net. The first request performs the mutation, commits the result, and caches the response. Every subsequent request with the same key skips the work and returns the cached JSON. The reservation and the cache entry are written in the same transaction, so they are either both visible or both absent.
Where this breaks down: the side-effect boundary
Idempotency keys deduplicate retries of the same logical request. They do not arbitrate between different requests competing for the same resource. If two users click “Buy” on the last item in stock, each request carries its own key, and you still need pessimistic locking or optimistic concurrency control. The available >= ? check in the example above is a primitive form of this, but real inventory systems need more.
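As one illustration of optimistic concurrency control, here is a sketch that uses a version column as a compare-and-swap guard. The version column and the reserve_optimistic function are assumptions added for this example and are not part of the schema shown earlier.

```python
import sqlite3


def reserve_optimistic(conn, item_id, quantity, max_attempts=3):
    """Compare-and-swap on a version column; re-read if another writer won."""
    for _ in range(max_attempts):
        row = conn.execute(
            "SELECT available, version FROM inventory WHERE item_id = ?",
            (item_id,),
        ).fetchone()
        if row is None or row[0] < quantity:
            return False
        available, version = row
        with conn:  # commits on success, rolls back on exception
            cursor = conn.execute(
                "UPDATE inventory SET available = ?, version = version + 1 "
                "WHERE item_id = ? AND version = ?",
                (available - quantity, item_id, version),
            )
        if cursor.rowcount == 1:
            return True  # our read was still current when we wrote
        # Someone else changed the row between our read and write: re-read.
    return False


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE inventory ("
    " item_id TEXT PRIMARY KEY,"
    " available INTEGER NOT NULL,"
    " version INTEGER NOT NULL DEFAULT 0)"
)
conn.execute("INSERT INTO inventory (item_id, available) VALUES ('sku-1', 1)")
conn.commit()

first_buyer = reserve_optimistic(conn, "sku-1", 1)
second_buyer = reserve_optimistic(conn, "sku-1", 1)
left = conn.execute(
    "SELECT available FROM inventory WHERE item_id = 'sku-1'"
).fetchone()[0]
```

The write only lands if nobody else modified the row since it was read, so the last item can be sold once. This is independent of idempotency keys: the two buyers send different keys and are still serialized correctly.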
The bigger problem is side effects outside the transaction. If you charge a card via Stripe, send an email via SendGrid, and write to your database, the idempotency key only protects the database part. The email might send twice. The card might charge twice if Stripe’s own idempotency window expired. True safety requires every downstream system to participate.
This is why Stripe accepts its own Idempotency-Key on charge creation. They deduplicate at their layer. You should do the same at yours. Pass the same key through to any downstream service that supports idempotency keys. For services that don’t support it, track the call’s status in your own database so a retry can at least detect an attempt that may already have gone through, or accept the risk.
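A sketch of passing the key downstream, using a fake in-memory gateway so it runs without a network. FakeGateway and handle_charge are hypothetical names, and the ":charge" suffix is an assumption of this example: it keeps keys distinct if one incoming request fans out to several downstream calls.

```python
import uuid


class FakeGateway:
    """Stand-in for a downstream API that deduplicates by key (hypothetical)."""

    def __init__(self):
        self.results = {}
        self.executions = 0

    def charge(self, amount_cents, idempotency_key):
        if idempotency_key not in self.results:
            self.executions += 1
            self.results[idempotency_key] = {"charge_id": f"ch_{self.executions}"}
        return self.results[idempotency_key]


def handle_charge(gateway, incoming_key, amount_cents):
    # Derive the downstream key from the incoming one, so every retry of the
    # same logical request maps to the same downstream key.
    downstream_key = f"{incoming_key}:charge"
    return gateway.charge(amount_cents, idempotency_key=downstream_key)


gateway = FakeGateway()
key = str(uuid.uuid4())
first = handle_charge(gateway, key, 5000)
# Suppose our server crashed before replying and the client retried.
retry = handle_charge(gateway, key, 5000)
```

The downstream key is a pure function of the incoming key, so the server needs no extra state to get the same key on the second attempt.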
Key collisions, TTLs, and other operational traps
UUID4 has 122 bits of randomness. The probability of a collision is negligible for any realistic volume. Do not use sequential integers, timestamps, or hashed request bodies as keys. A client-generated UUID is the industry standard for a reason.
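To put a number on “negligible,” the birthday-bound approximation can be computed directly. This is a rough estimate that holds when the resulting probability is small, and it assumes keys come from a good random source.

```python
def collision_probability(num_keys, random_bits=122):
    """Birthday-bound approximation: p is about n(n-1) / (2 * 2**bits)."""
    pairs = num_keys * (num_keys - 1) // 2
    return pairs / 2 ** random_bits


# One trillion keys, far more than most APIs will ever hold at once.
p = collision_probability(10 ** 12)
print(f"{p:.2e}")
```

Even at that volume the estimate stays many orders of magnitude below one in a billion.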
Key storage grows forever unless you expire old entries. Set a TTL: 24 hours is a common choice. After that, delete old keys. If a client retries after the TTL, the server treats the request as new and the operation runs again. Document this. The retry window and the TTL are a business contract, not a technical detail.
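A minimal cleanup sketch, assuming the idempotency_responses table from the earlier example and a created_at column holding Unix seconds. TTL_SECONDS and purge_expired_keys are illustrative names. In a real deployment this would run on a schedule.

```python
import sqlite3
import time

TTL_SECONDS = 24 * 60 * 60  # assumed retention window


def purge_expired_keys(conn, now=None):
    """Delete cached responses older than the TTL; returns rows removed."""
    now = int(time.time()) if now is None else now
    with conn:  # commits on success, rolls back on exception
        cursor = conn.execute(
            "DELETE FROM idempotency_responses WHERE created_at < ?",
            (now - TTL_SECONDS,),
        )
    return cursor.rowcount


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE idempotency_responses ("
    " key TEXT PRIMARY KEY,"
    " response_body TEXT NOT NULL,"
    " created_at INTEGER)"
)
now = 1_700_000_000  # fixed clock so the example is deterministic
conn.execute(
    "INSERT INTO idempotency_responses VALUES ('old', '{}', ?)",
    (now - TTL_SECONDS - 1,),
)
conn.execute(
    "INSERT INTO idempotency_responses VALUES ('fresh', '{}', ?)",
    (now - 60,),
)
conn.commit()

removed = purge_expired_keys(conn, now=now)
remaining = [r[0] for r in conn.execute("SELECT key FROM idempotency_responses")]
```

Note that in the earlier schema the reservations table also has a UNIQUE idempotency_key column, so purging only the response cache would not make an old key reusable there. Decide deliberately which tables the TTL applies to.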
The idempotency store must be at least as available as the API. If you keep keys in a separate store such as a Redis cache and it goes down, you cannot verify keys. Some teams fall back to “assume new request,” which creates duplicates during outages. Others reject the request, which is safer but creates a different failure mode. There is no free lunch here.
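The two fallback policies can be sketched as a single flag. Everything here is hypothetical scaffolding: StoreUnavailable, handle_request, and the fail_open parameter are names made up for this example, and the 503 status is one reasonable choice rather than a requirement.

```python
class StoreUnavailable(Exception):
    """Raised by the (hypothetical) key store when it cannot be reached."""


def handle_request(store_lookup, key, do_work, fail_open=False):
    """Return (status_code, body) for one request."""
    try:
        cached = store_lookup(key)
    except StoreUnavailable:
        if not fail_open:
            # Fail closed: refuse rather than risk a duplicate side effect.
            return 503, {"error": "idempotency store unavailable, retry later"}
        cached = None  # fail open: treat as new, accepting possible duplicates
    if cached is not None:
        return 200, cached
    return 200, do_work()


def broken_lookup(key):
    raise StoreUnavailable()


calls = []


def do_work():
    calls.append(1)
    return {"status": "done"}


closed = handle_request(broken_lookup, "k1", do_work, fail_open=False)
opened = handle_request(broken_lookup, "k1", do_work, fail_open=True)
```

Failing closed pushes the problem back to the client, which can retry later with the same key. Failing open keeps the API up but runs the work without any duplicate protection.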
The client side matters just as much
Server-side idempotency keys are useless if the client does not send them. Every mutating request should generate a key at the call site and retry on timeout with the same key:
```python
import time
import uuid

import requests


def safe_post(url, payload, max_retries=3):
    # Generate the key once per logical operation, outside the retry loop.
    key = str(uuid.uuid4())
    for attempt in range(max_retries):
        try:
            return requests.post(
                url,
                json=payload,
                headers={"Idempotency-Key": key},
                timeout=10,
            )
        except (requests.Timeout, requests.ConnectionError):
            if attempt == max_retries - 1:
                raise
            # Back off briefly, then retry with the SAME key.
            # The server deduplicates.
            time.sleep(2 ** attempt)
```
The key must be generated once per logical operation, not once per HTTP attempt. If you generate a new UUID on every retry, you’ve missed the point. The key is the contract that ties the retries together.
FAQ
What if the client is a browser and the user refreshes the page?
A page refresh creates a new JavaScript context. The new request gets a new idempotency key unless you persist the key to localStorage or the URL. Most teams don’t bother for non-critical flows. For payments, you should.
Should GET requests use idempotency keys?
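No. GET is already safe by HTTP semantics. Idempotency keys are for requests that change state and are not naturally idempotent, chiefly POST and PATCH. HTTP defines PUT and DELETE as idempotent, so a key helps there only when the handler has side effects that are not safe to repeat.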
Can I use the request body hash as the idempotency key?
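Only if the body is deterministic and contains no timestamps or random values. Even then, two intentionally identical operations, such as buying the same item twice, would collapse into one. In practice, client-generated UUIDs are simpler and more reliable.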
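A short sketch of why determinism matters, hashing a canonical JSON encoding of the body. The function name body_fingerprint and the sample payloads are illustrative.

```python
import hashlib
import json


def body_fingerprint(payload):
    """SHA-256 over a canonical JSON encoding (sorted keys, fixed separators)."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Same data with different key order: the canonical form makes these equal.
a = body_fingerprint({"item": "sku-1", "qty": 2})
b = body_fingerprint({"qty": 2, "item": "sku-1"})

# A client-side timestamp changes the hash on every attempt.
c = body_fingerprint({"item": "sku-1", "qty": 2, "ts": 1700000001})
```

A fingerprint like this can still be useful next to a client-generated key: storing it with the key lets the server reject a request that reuses a key with a different body.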
How long should I keep idempotency keys?
Longer than your longest client retry window. If clients retry for 60 seconds, keep keys for 24 hours. If clients might retry tomorrow because of a batch job, keep keys for a week.
Make your first request idempotent today
Add an Idempotency-Key header to every mutating endpoint. Start with a 24-hour cache. Use a UUID4. Make the storage atomic with your business transaction. The first time a container dies mid-request and the client retries, you’ll be glad you did. The alternative is explaining to your finance team why a single customer has seventeen identical charges.