A lock that outlives your process: how distributed leases actually work

Your sync.Mutex does not survive a kill -9. It does not survive an OOM, a deployment rollout, or a node reboot. The instant the process exits, the lock is gone. If that lock was protecting a scheduled job, a data migration, or a leadership election, you now have two processes convinced they are the only ones running.

That is not a bug in your mutex. It is a category error. A process-local lock cannot protect a cluster-wide resource.

The fix is a distributed lease: a lock stored in an external system with a Time-To-Live, a unique owner token, and a mechanism for safe release. Redis, PostgreSQL, etcd, and ZooKeeper all implement the same idea with different trade-offs. The pattern is straightforward. The edge cases are not.

What is a distributed lease?

A lease is a promise from an external store that only one client holds a named lock for a bounded period of time. Unlike a mutex, which lives in heap memory, a lease lives in Redis or PostgreSQL. It persists across process restarts because the store persists.

The basic operations are:

Acquire: atomically write a key with a TTL, but only if it does not already exist.
Renew: extend the TTL while you still hold the lock.
Release: delete the key, but only if it still contains your unique token.

The unique token is the critical part. Without it, a slow client whose lease has already expired can release a lock that someone else has since acquired.
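The three operations can be sketched against an in-memory map. This is a toy stand-in for the external store, not something to deploy: memStore, entry, demo, and the injectable clock are all names invented for this illustration.

```go
package main

import (
    "fmt"
    "sync"
    "time"
)

// entry is one lease record: who holds it and when it expires.
type entry struct {
    token     string
    expiresAt time.Time
}

// memStore is a hypothetical in-memory stand-in for Redis or PostgreSQL.
// The now field is injectable so expiry can be shown without sleeping.
type memStore struct {
    mu     sync.Mutex
    leases map[string]entry
    now    func() time.Time
}

func newMemStore(now func() time.Time) *memStore {
    return &memStore{leases: make(map[string]entry), now: now}
}

// acquire writes the key only if it is absent or already expired.
func (s *memStore) acquire(key, token string, ttl time.Duration) bool {
    s.mu.Lock()
    defer s.mu.Unlock()
    if e, ok := s.leases[key]; ok && s.now().Before(e.expiresAt) {
        return false
    }
    s.leases[key] = entry{token: token, expiresAt: s.now().Add(ttl)}
    return true
}

// renew extends the TTL only if the caller still owns an unexpired lease.
func (s *memStore) renew(key, token string, ttl time.Duration) bool {
    s.mu.Lock()
    defer s.mu.Unlock()
    e, ok := s.leases[key]
    if !ok || e.token != token || !s.now().Before(e.expiresAt) {
        return false
    }
    e.expiresAt = s.now().Add(ttl)
    s.leases[key] = e
    return true
}

// release deletes the key only if it still carries the caller's token.
func (s *memStore) release(key, token string) bool {
    s.mu.Lock()
    defer s.mu.Unlock()
    e, ok := s.leases[key]
    if !ok || e.token != token || !s.now().Before(e.expiresAt) {
        return false
    }
    delete(s.leases, key)
    return true
}

// demo replays the slow-client scenario and returns each step's outcome.
func demo() []bool {
    clock := time.Unix(0, 0)
    s := newMemStore(func() time.Time { return clock })

    a := s.acquire("job", "token-a", 10*time.Second) // A takes the lease
    clock = clock.Add(11 * time.Second)              // A stalls past its TTL
    b := s.acquire("job", "token-b", 10*time.Second) // B takes over
    lateA := s.release("job", "token-a")             // A's late release
    renewB := s.renew("job", "token-b", 10*time.Second)
    return []bool{a, b, lateA, renewB}
}

func main() {
    fmt.Println(demo()) // [true true false true]
}
```

The third value is the point: A's late release is refused because the key now carries B's token, so B's lease survives.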

Why TTL is mandatory, and why it is dangerous

If a client acquires a lock and then dies, the lock must eventually free itself. The only way to do that without human intervention is an expiration time. Redis supports this natively with SET key value NX EX 30. PostgreSQL advisory locks are session-bound and die when the TCP connection closes, which is elegant but less portable.

TTLs introduce a new failure mode: if your work takes longer than the TTL, the lock expires while you are still running. Another client acquires the lock and starts the same work. You now have two concurrent processes mutating the same data.

The naive fix is a very long TTL. That works until a client dies immediately after acquiring the lock. The remaining TTL becomes a mandatory downtime window where no one else can take over.

The correct fix is a heartbeat. The holder spawns a background goroutine that renews the lease at a fraction of the TTL. If the holder crashes, the heartbeat stops, the TTL expires, and a new owner can acquire the lock within one TTL. If the holder is merely slow, the heartbeat keeps the lease alive as long as the process lives.

A working Redis lease in Go

Here is a complete implementation that handles acquisition, renewal, and safe release. It uses a Lua script for release so that the delete only succeeds if the token matches.

package lease

import (
    "context"
    "crypto/rand"
    "encoding/hex"
    "fmt"
    "sync"
    "time"

    "github.com/redis/go-redis/v9"
)

type Lease struct {
    client     *redis.Client
    key        string
    token      string
    ttl        time.Duration
    renewEvery time.Duration
    stopRenew  chan struct{}
    once       sync.Once
}

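// generateToken returns 128 random bits, hex-encoded, identifying this holder.
func generateToken() string {
    b := make([]byte, 16)
    if _, err := rand.Read(b); err != nil {
        // Without a unique token, safe release is impossible, so fail loudly.
        panic(fmt.Sprintf("lease: reading random bytes: %v", err))
    }
    return hex.EncodeToString(b)
}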

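// Acquire tries once to take the lease and, on success, starts the heartbeat.
func Acquire(ctx context.Context, client *redis.Client, key string, ttl time.Duration) (*Lease, error) {
    // The heartbeat ticks every ttl/3 and renews with ttl.Milliseconds(),
    // so both must be non-zero.
    if ttl < time.Millisecond {
        return nil, fmt.Errorf("ttl %v is too short", ttl)
    }
    token := generateToken()
    ok, err := client.SetNX(ctx, key, token, ttl).Result()
    if err != nil {
        return nil, err
    }
    if !ok {
        return nil, fmt.Errorf("lock already held")
    }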

    l := &Lease{
        client:     client,
        key:        key,
        token:      token,
        ttl:        ttl,
        renewEvery: ttl / 3,
        stopRenew:  make(chan struct{}),
    }
    go l.renew()
    return l, nil
}

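// renewScript extends the TTL only if the key still holds our token.
const renewScript = `
    if redis.call("get", KEYS[1]) == ARGV[1] then
        return redis.call("pexpire", KEYS[1], ARGV[2])
    else
        return 0
    end
`

// renew is the heartbeat loop. It exits on Release or once the lease is lost.
func (l *Lease) renew() {
    ticker := time.NewTicker(l.renewEvery)
    defer ticker.Stop()
    for {
        select {
        case <-ticker.C:
            // Bound each attempt by the renewal interval so a stalled call
            // cannot run into the next tick.
            ctx, cancel := context.WithTimeout(context.Background(), l.renewEvery)
            res, err := l.client.Eval(ctx, renewScript, []string{l.key}, l.token, l.ttl.Milliseconds()).Int64()
            cancel()
            if err == nil && res == 0 {
                // The key expired or now belongs to someone else.
                // Renewing again cannot bring it back.
                return
            }
            // On a transient error, try again on the next tick.
        case <-l.stopRenew:
            return
        }
    }
}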

func (l *Lease) Release(ctx context.Context) error {
    l.once.Do(func() { close(l.stopRenew) })
    script := `
        if redis.call("get", KEYS[1]) == ARGV[1] then
            return redis.call("del", KEYS[1])
        else
            return 0
        end
    `
    res, err := l.client.Eval(ctx, script, []string{l.key}, l.token).Int64()
    if err != nil {
        return err
    }
    if res == 0 {
        return fmt.Errorf("lock was lost or stolen")
    }
    return nil
}

Usage:

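// Name the variable l so it does not shadow the lease package.
l, err := lease.Acquire(ctx, redisClient, "job:invoice-generation", 10*time.Second)
if err != nil {
    // Another instance holds the lease, or Redis could not be reached.
    return
}
// Release with a fresh context: ctx may already be cancelled when the job ends.
defer l.Release(context.Background())

// Do the work. If this takes 30 seconds, the heartbeat keeps the lease alive.
generateInvoices()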

The Lua scripts are necessary because Redis does not support compare-and-delete as a single native command. Without them, a release could race with a re-acquisition and delete someone else’s lock.
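The Lease type above gives the worker no way to learn, mid-job, that its lease is gone. One possible shape for that signal is a context that is cancelled when a renewal reports the lease lost. The sketch below is an assumption-laden illustration rather than part of the implementation above: keepAlive, renewFunc, and lostAfter are hypothetical names, and the renew hook stands in for the token-checked PEXPIRE call.

```go
package main

import (
    "context"
    "fmt"
    "time"
)

// renewFunc is a hypothetical hook that reports whether the lease is still
// ours, for example by wrapping the token-checked PEXPIRE script.
type renewFunc func(ctx context.Context) (stillHeld bool, err error)

// keepAlive renews on every tick and returns a context that is cancelled as
// soon as a renewal reports the lease is gone. Transient errors are retried
// on the next tick. Calling the returned CancelFunc stops the heartbeat.
func keepAlive(parent context.Context, every time.Duration, renew renewFunc) (context.Context, context.CancelFunc) {
    ctx, cancel := context.WithCancel(parent)
    go func() {
        ticker := time.NewTicker(every)
        defer ticker.Stop()
        for {
            select {
            case <-ctx.Done():
                return
            case <-ticker.C:
                held, err := renew(ctx)
                if err == nil && !held {
                    cancel() // lease lost: tell the worker to stop
                    return
                }
            }
        }
    }()
    return ctx, cancel
}

// lostAfter simulates a lease that renews successfully n times and is then
// lost. It reports whether the work context was cancelled as a result.
func lostAfter(n int) bool {
    calls := 0
    renew := func(context.Context) (bool, error) {
        calls++
        return calls <= n, nil
    }
    ctx, stop := keepAlive(context.Background(), 5*time.Millisecond, renew)
    defer stop()
    select {
    case <-ctx.Done():
        return true
    case <-time.After(2 * time.Second):
        return false
    }
}

func main() {
    fmt.Println("work cancelled after lease loss:", lostAfter(2))
}
```

The worker would pass the returned context to every database call and check ctx.Err() between steps. This narrows the window in which a former holder keeps working, but it does not close it: a paused process cannot observe a cancelled context.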

Where this breaks down: the fencing token problem

Even with perfect TTL and renewal, there is a subtle race condition identified by Martin Kleppmann in his critique of Redlock. Imagine:

Client A acquires the lease.
Client A pauses for 45 seconds (GC stop-the-world, VM suspension, CPU throttling).
The lease expires.
Client B acquires the lease.
Client A unpauses and writes to the database.
Client B writes to the database.

Both clients believed they held the lock. Both mutated shared state.

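The TTL protects against permanent deadlocks, but it cannot protect against delayed processes. The fix is a fencing token: a monotonically increasing number, issued with each acquisition, that the lock holder attaches to every write. The storage layer rejects any write whose token is older than the newest one it has seen. A random value such as a UUID cannot play this role, because the storage layer has no way to tell which of two UUIDs is newer.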

In practice, this means your database table needs a lock_version column, or your blob store needs conditional writes. Most applications skip this step because it requires changing the data layer, not just the locking layer. That is a reasonable trade-off, but it is a trade-off. You should know you are making it.
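A minimal sketch of the storage-side check, assuming the lock service issues an increasing integer with each acquisition (the random token in the Redis lease above does not provide that). fencedStore, write, and replay are hypothetical names, and the token values 33 and 34 are arbitrary.

```go
package main

import (
    "errors"
    "fmt"
    "sync"
)

var errStaleToken = errors.New("write rejected: stale fencing token")

// fencedStore is a hypothetical storage layer that remembers the highest
// fencing token it has accepted and refuses anything older.
type fencedStore struct {
    mu      sync.Mutex
    highest int64
    value   string
}

// write applies the update only if the token is not older than the newest
// token the store has already seen.
func (s *fencedStore) write(token int64, value string) error {
    s.mu.Lock()
    defer s.mu.Unlock()
    if token < s.highest {
        return errStaleToken
    }
    s.highest = token
    s.value = value
    return nil
}

// replay runs the paused-client scenario: A holds token 33 and stalls,
// B acquires token 34 and writes, then A wakes up and tries to write.
func replay() (errA, errB error, final string) {
    s := &fencedStore{}
    errB = s.write(34, "written by B")
    errA = s.write(33, "written by A")
    return errA, errB, s.value
}

func main() {
    errA, errB, final := replay()
    fmt.Println("A:", errA)
    fmt.Println("B:", errB)
    fmt.Println("stored value:", final)
}
```

Note the limit of the technique: the store can only reject a token once it has seen a newer one. If the paused client's write arrives before the new holder has written anything, it is accepted.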

Alternatives worth considering

PostgreSQL advisory locks are session-scoped. When the TCP connection closes, the lock releases automatically. There is no TTL management and no clock skew. The downside is that they are tied to a single database connection, so they do not work well with connection pooling or multi-region setups.
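A sketch of the session-scoped approach using only database/sql, assuming a PostgreSQL driver that accepts $1 placeholders is registered elsewhere. advisoryKey and tryAdvisoryLock are hypothetical helpers; pg_try_advisory_lock and pg_advisory_unlock are the PostgreSQL functions. The lock is taken on one pinned connection, which is exactly the pooling constraint described above.

```go
package main

import (
    "context"
    "database/sql"
    "fmt"
    "hash/fnv"
)

// advisoryKey maps a lock name onto the 64-bit integer key that advisory
// locks use. Hashing is an assumption of this sketch: two names can collide,
// so a fixed registry of keys is an equally valid choice.
func advisoryKey(name string) int64 {
    h := fnv.New64a()
    h.Write([]byte(name)) // hash.Hash.Write never returns an error
    return int64(h.Sum64())
}

// tryAdvisoryLock pins one connection from the pool and tries to take the
// lock on it without blocking. The lock lives as long as that session, so
// the caller must call release when the work is done.
func tryAdvisoryLock(ctx context.Context, db *sql.DB, name string) (func() error, bool, error) {
    conn, err := db.Conn(ctx)
    if err != nil {
        return nil, false, err
    }
    key := advisoryKey(name)
    var locked bool
    err = conn.QueryRowContext(ctx, "SELECT pg_try_advisory_lock($1)", key).Scan(&locked)
    if err != nil || !locked {
        conn.Close() // nothing is held, so the connection can go back to the pool
        return nil, false, err
    }
    release := func() error {
        // If the unlock fails, the session may still hold the lock when the
        // connection returns to the pool; production code would want to
        // discard the connection in that case.
        defer conn.Close()
        _, err := conn.ExecContext(context.Background(), "SELECT pg_advisory_unlock($1)", key)
        return err
    }
    return release, true, nil
}

func main() {
    // Only the key derivation runs here; tryAdvisoryLock needs a live database.
    fmt.Println(advisoryKey("job:invoice-generation"))
}
```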

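etcd leases are designed exactly for this problem. They support TTL, automatic revocation, and watch-based notifications when a lease dies. If you are already running Kubernetes, there is an etcd in your cluster, although the control plane's instance is usually not something applications should talk to directly, and managed clusters generally do not expose it. The API is more verbose than Redis, but the semantics are cleaner.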

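ZooKeeper ephemeral sequential nodes are the classic solution. An ephemeral node disappears when its session ends, and ZooKeeper is CP (consistent and partition-tolerant) under the CAP theorem, so clients never have to compare wall clocks with each other. Session expiry is still a timeout, though, so a paused client can keep acting on a lock it has already lost, and fencing still applies. ZooKeeper is also slower and operationally heavier than Redis.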

What we did not try

We did not implement a custom consensus protocol on top of a relational database. Every team eventually tries this: a locks table with INSERT ... ON CONFLICT and a last_heartbeat column swept by a cron job. It works in the happy path. It falls apart under contention because MVCC databases serialize conflicting writes, and your lock acquisition becomes the bottleneck for your entire system. Use the right tool for the job.

Choosing a TTL

Too short: heartbeats become chatty, and a single slow GC pause can lose the lease.

Too long: a crashed holder blocks failover for the full TTL.

A good starting point is 10 seconds with a heartbeat every 3 seconds. Tune from there based on your observed GC pauses and network latency. Measure your p99 heartbeat latency. If it is 500 ms, your TTL must be at least an order of magnitude larger.
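The rule of thumb can be written down as a small check. planLease is a hypothetical helper; the ttl/3 heartbeat interval and the 10x margin over p99 latency are the ratios from the text, not universal constants.

```go
package main

import (
    "errors"
    "fmt"
    "time"
)

// planLease derives a heartbeat interval from a TTL and rejects TTLs that
// are less than ten times the observed p99 heartbeat latency.
func planLease(ttl, p99 time.Duration) (time.Duration, error) {
    if ttl <= 0 {
        return 0, errors.New("ttl must be positive")
    }
    if ttl < 10*p99 {
        return 0, fmt.Errorf("ttl %v is under 10x the p99 heartbeat latency %v", ttl, p99)
    }
    return ttl / 3, nil
}

func main() {
    // The starting point from the text: 10 s TTL, 500 ms p99 latency.
    every, err := planLease(10*time.Second, 500*time.Millisecond)
    fmt.Println(every, err)

    // A 3 s TTL leaves too little margin for the same latency.
    _, err = planLease(3*time.Second, 500*time.Millisecond)
    fmt.Println(err)
}
```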

Try it

If you are currently using a sync.Mutex to protect a background job, replace it with a leased lock backed by Redis or PostgreSQL. Start with the implementation above. Add metrics for lease_acquired, lease_lost, and heartbeat_latency. The first time you deploy during a long-running job and watch the second instance politely wait instead of colliding, you will know the category error is fixed.