rerun

Durable execution for Go

crash · restart · resume

rerun lets a multi-step process run to completion even when the machine crashes halfway through and restarts hours later. It resumes from where it left off instead of starting over.

go get github.com/sylvester-francis/rerun

The thirty-second pitch

Completed steps are replayed from a journal rather than re-executed, so with idempotent steps there are no double charges, no skipped steps, and no half-finished state left behind. It is the guarantee you would reach for Temporal to get — without the cluster. No servers, no queues, no YAML: under 1000 lines of engine, a pure-Go SQLite default (zero CGO), and zero third-party dependencies in the core.

Write it as if crashes did not exist

// checkout assumes order, chargeCard, and sendReceipt are defined elsewhere.
func checkout(w *rerun.W) error {
    txn, err := rerun.Do(w, "charge", func(ctx context.Context) (string, error) {
        return chargeCard(ctx, order) // once journaled, never re-executed, even across crashes
    })
    if err != nil {
        return err
    }

    rerun.Sleep(w, 24*time.Hour) // durable: survives restarts, skipped on replay once elapsed

    _, err = rerun.Do(w, "receipt", func(ctx context.Context) (any, error) {
        return nil, sendReceipt(ctx, txn)
    })
    return err
}

If the process dies after charge but before receipt, Recover replays the journal on the next boot: charge returns its stored result without re-charging, the Sleep is skipped if its deadline has already passed, and execution continues into receipt.
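One way a durable delay like this can work is to journal an absolute deadline rather than a duration. The sketch below is an illustration under that assumption, not rerun's implementation: deadlines, remaining, day, and epoch are hypothetical names, and the map stands in for durable storage.

```go
package main

import (
	"fmt"
	"time"
)

// deadlines is a hypothetical stand-in for the journal: it maps a sleep's
// key to the absolute time at which that sleep ends.
type deadlines map[string]time.Time

const day = 24 * time.Hour

// epoch is a fixed reference instant so the example is deterministic.
var epoch = time.Date(2024, 1, 1, 0, 0, 0, 0, time.UTC)

// remaining reports how long the sleep identified by key still has to wait.
// The first call records now+d as the deadline; later calls (for example
// after a restart) reuse the recorded deadline instead of starting over.
func (d deadlines) remaining(key string, dur time.Duration, now time.Time) time.Duration {
	deadline, ok := d[key]
	if !ok {
		deadline = now.Add(dur)
		d[key] = deadline
	}
	if left := deadline.Sub(now); left > 0 {
		return left
	}
	return 0 // deadline already passed: nothing left to wait for
}

func main() {
	d := deadlines{}
	fmt.Println(d.remaining("sleep-1", day, epoch))                   // first run
	fmt.Println(d.remaining("sleep-1", day, epoch.Add(20*time.Hour))) // restart 20h later
	fmt.Println(d.remaining("sleep-1", day, epoch.Add(30*time.Hour))) // restart 30h later
}
```

In this sketch a restart before the deadline waits only for the remainder, and a restart after it does not wait at all.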

Why it holds up

Journal, then replay

The engine persists the result of every completed step. Recovery is just running the function again: done steps return instantly from the journal, so recovery is not guesswork.
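The idea fits in a few lines. What follows is a minimal sketch of journal-then-replay, not rerun's engine: journal, do, and workflow are hypothetical names, results are assumed to be JSON-encodable, and an in-memory map stands in for the durable store.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// journal maps a step name to its JSON-encoded result.
// Hypothetical stand-in for durable storage; not rerun's real type.
type journal map[string][]byte

// do returns the journaled result for name if one exists. Otherwise it runs
// fn, records the result, and returns it.
func do[T any](j journal, name string, fn func() (T, error)) (T, error) {
	var zero T
	if raw, ok := j[name]; ok {
		// Replay: decode the stored result instead of re-executing fn.
		var out T
		err := json.Unmarshal(raw, &out)
		return out, err
	}
	out, err := fn()
	if err != nil {
		return zero, err // failed steps are not journaled in this sketch
	}
	raw, err := json.Marshal(out)
	if err != nil {
		return zero, err
	}
	j[name] = raw
	return out, nil
}

// calls counts how many times the charge step really executed.
var calls int

func workflow(j journal) (string, error) {
	return do(j, "charge", func() (string, error) {
		calls++
		return "txn-1", nil
	})
}

func main() {
	j := journal{}
	first, _ := workflow(j)  // executes the step and journals "txn-1"
	second, _ := workflow(j) // "recovery": same function, result comes from the journal
	fmt.Println(first, second, calls)
}
```

Running the workflow twice against the same journal executes the step body once. The second call is what recovery looks like: the same function, with completed steps answered from storage.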

Zero-dependency core

Under 1000 lines, stdlib only. The pure-Go SQLite and Postgres backends stay out of your build unless you import them.

Correct under failure

A poisoned run is contained and never crashes its neighbours. Graceful shutdown parks in-flight runs and resumes them. Tested with a SIGKILL crash harness at every step boundary.
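Assuming a "poisoned" run is one whose code panics, containment in Go usually comes down to a deferred recover around each run. The sketch below shows that general technique with a hypothetical runContained helper; it is not taken from rerun's source.

```go
package main

import (
	"fmt"
	"sync"
)

// runContained runs fn and converts a panic into an ordinary error, so one
// bad run cannot bring down the process or the runs beside it.
func runContained(fn func() error) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("run panicked: %v", r)
		}
	}()
	return fn()
}

func main() {
	runs := []func() error{
		func() error { return nil },
		func() error { panic("poisoned") },
		func() error { return nil },
	}

	errs := make([]error, len(runs))
	var wg sync.WaitGroup
	for i, run := range runs {
		wg.Add(1)
		go func(i int, run func() error) {
			defer wg.Done()
			errs[i] = runContained(run) // each goroutine writes its own slot
		}(i, run)
	}
	wg.Wait()

	for i, err := range errs {
		fmt.Printf("run %d: %v\n", i, err)
	}
}
```

The healthy runs finish normally, and the panicking one surfaces as an error value that the caller can log or retry.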

One store, any process

Swap sqlite.New(path) for postgres.New(dsn) to run across machines. The workflow code is identical against any Store.
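The swap works because workflow code depends on an interface rather than a concrete backend. The sketch below illustrates that shape only: stepStore, memStore, and record are hypothetical, and rerun's real Store interface may look quite different.

```go
package main

import (
	"errors"
	"fmt"
)

// stepStore is a hypothetical interface for illustration only.
type stepStore interface {
	Load(runID, step string) (result []byte, found bool, err error)
	Save(runID, step string, result []byte) error
}

// memStore is one backend. A SQLite- or Postgres-backed type with the same
// two methods could replace it without touching code written against
// stepStore. It is not safe for concurrent use.
type memStore struct{ data map[string][]byte }

func newMemStore() *memStore { return &memStore{data: map[string][]byte{}} }

func (m *memStore) Load(runID, step string) ([]byte, bool, error) {
	v, ok := m.data[runID+"/"+step]
	return v, ok, nil
}

func (m *memStore) Save(runID, step string, result []byte) error {
	if runID == "" || step == "" {
		return errors.New("runID and step must be non-empty")
	}
	m.data[runID+"/"+step] = result
	return nil
}

// record journals a step result unless one is already stored. It sees only
// the interface, never the backend behind it.
func record(s stepStore, runID, step string, result []byte) error {
	if _, done, err := s.Load(runID, step); err != nil || done {
		return err // lookup failed, or already journaled: do not overwrite
	}
	return s.Save(runID, step, result)
}

func main() {
	var s stepStore = newMemStore()
	_ = record(s, "run-1", "charge", []byte("txn-1"))
	_ = record(s, "run-1", "charge", []byte("txn-2")) // ignored: already journaled
	v, _, _ := s.Load("run-1", "charge")
	fmt.Println(string(v))
}
```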

The whole API

Do: run a step exactly once · Sleep: a durable delay · Recover: resume everything mid-flight

Plus Return/Result, Retry, signals (Wait/Deliver), Cancel, and Version — each one an ordinary Do underneath.

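Honest asterisk: side effects are at-least-once. A step that crashes after its side effect but before its result reaches the journal runs again on recovery. Make your steps idempotent and replay makes them effectively exactly-once.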
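A common way to make a step idempotent is an idempotency key that the downstream service deduplicates on. The sketch below simulates a step running twice, as it would if a crash landed after the charge but before its result was journaled. The gateway type and its key format are hypothetical, not part of rerun or of any real payment API.

```go
package main

import "fmt"

// gateway is a hypothetical payment provider that deduplicates by key.
type gateway struct {
	charges map[string]string // idempotency key -> transaction ID
	billed  int               // number of times a card was actually charged
}

// charge bills the card at most once per key. A repeated key returns the
// original transaction ID without billing again.
func (g *gateway) charge(key string) string {
	if txn, ok := g.charges[key]; ok {
		return txn
	}
	g.billed++
	txn := fmt.Sprintf("txn-%d", g.billed)
	g.charges[key] = txn
	return txn
}

func main() {
	g := &gateway{charges: map[string]string{}}

	// The same step runs twice with the same key: once before the
	// simulated crash, once during recovery.
	first := g.charge("order-42/charge")
	second := g.charge("order-42/charge")

	fmt.Println(first == second, g.billed)
}
```

Deriving the key from stable data, such as the order ID plus the step name, keeps it the same on both sides of the crash.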