Background job monitoring

Background job monitoring for the work customers feel when it fails.

Background work fails differently from web requests. Luota tracks the run lifecycle so operators can see when work started, how long it ran, how it failed, what changed, and who owns the next decision.

Operating spine

The workflow operating spine.

A drained queue or a clean worker exit does not prove the promise was kept. The proof path follows the work until the customer-visible state is true.

01 / Promise

Name the outcome

The background job updates the state, file, provider, message, or customer-visible result people rely on.

02 / Signal

Send the signal

Open a run when the job starts, and report success only after the final reconciliation or delivery check passes.

03 / Drill

Break it on purpose

Delay, fail, or mark one run stuck so Luota proves it can distinguish incomplete work from normal queue movement.

04 / Incident

Inspect the bad day

The incident should show external run id, stage, duration, failure summary, payload tags, deploy SHA, and host.

05 / Handoff

Leave an operator path

Route the owner to retry, reconcile, or suppress duplicate work with enough context to act safely.

Start and close calls

Send /runs/start when the job begins and /success or /fail when it ends. Re-sending a start with the same externalRunId is idempotent.
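A minimal sketch of that lifecycle in Python, under stated assumptions: the endpoint paths are written exactly as they appear above (their full form may differ, so check the integration docs), `send` is a hypothetical transport you would supply, and every payload field other than externalRunId is illustrative rather than Luota's documented schema.

```python
import time
import uuid
from typing import Any, Callable, Dict, Optional

# Hypothetical transport: receives an endpoint path and a JSON-ready payload.
# A real one would POST to the Luota API; base URL and auth are not shown here.
Send = Callable[[str, Dict[str, Any]], None]


def run_monitored(
    job: Callable[[], Any],
    verify: Callable[[Any], bool],
    send: Send,
    external_run_id: Optional[str] = None,
) -> Any:
    """Open a run, execute the job, and close it as success or failure."""
    run_id = external_run_id or str(uuid.uuid4())
    send("/runs/start", {"externalRunId": run_id})
    started = time.monotonic()
    try:
        result = job()
        # Success is reported only after the final check passes.
        if not verify(result):
            raise RuntimeError("final reconciliation check did not pass")
    except Exception as exc:
        send("/fail", {
            "externalRunId": run_id,
            "summary": str(exc),  # field name is an assumption
            "durationSeconds": time.monotonic() - started,  # assumption
        })
        raise
    send("/success", {
        "externalRunId": run_id,
        "durationSeconds": time.monotonic() - started,  # assumption
    })
    return result
```

Passing a deterministic `external_run_id`, for example the job name plus its scheduled time, means a retried start resends the same ID and leans on the idempotency described above instead of opening a second run.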

Failure evidence

Attach summary, exit code, output snippet, payload, environment, host, tags, and deploy SHA so the incident page reads like a supportable timeline.
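One way to assemble that evidence before sending a failure, sketched in Python. The key names below are assumptions chosen to mirror the list above, not Luota's documented schema, and `build_failure_evidence` is a hypothetical helper.

```python
import socket
from typing import Any, Dict, List, Optional


def build_failure_evidence(
    external_run_id: str,
    summary: str,
    exit_code: int,
    output: str,
    payload: Dict[str, Any],
    environment: str,
    tags: List[str],
    deploy_sha: str,
    host: Optional[str] = None,
    snippet_chars: int = 2000,
) -> Dict[str, Any]:
    """Collect failure context into one JSON-ready dict (key names assumed)."""
    # Keep only the tail of the output, where the error usually appears.
    snippet = output[-snippet_chars:] if snippet_chars > 0 else ""
    return {
        "externalRunId": external_run_id,
        "summary": summary,
        "exitCode": exit_code,
        "outputSnippet": snippet,
        "payload": payload,
        "environment": environment,
        "host": host or socket.gethostname(),
        "tags": tags,
        "deploySha": deploy_sha,
    }
```

Trimming the output to its tail keeps the request small while preserving the lines an operator is most likely to need.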

Late and stuck runs

Configure maximum duration, stuck-after time, schedule, timezone, and grace window. Luota opens incidents when runs break those promises.
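To make those thresholds concrete, here is an illustrative Python sketch of how a run could be classified against them. The definitions of "late", "stuck", and "slow" are one plausible reading of the settings above, not Luota's actual rules, and `RunPolicy` and `classify_run` are hypothetical names. Schedule and timezone handling is assumed to happen upstream and produce a timezone-aware `expected_start`.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass(frozen=True)
class RunPolicy:
    max_duration: timedelta   # a finished run longer than this counts as slow
    stuck_after: timedelta    # an unfinished run older than this counts as stuck
    grace_window: timedelta   # allowed delay after the expected start


def classify_run(
    policy: RunPolicy,
    expected_start: datetime,
    started_at: Optional[datetime],
    finished_at: Optional[datetime],
    now: datetime,
) -> List[str]:
    """Return the promises this run has broken (assumed semantics)."""
    problems: List[str] = []
    deadline = expected_start + policy.grace_window
    if started_at is None:
        # Never started: late once the grace window has passed.
        if now > deadline:
            problems.append("late")
        return problems
    if started_at > deadline:
        problems.append("late")
    if finished_at is None:
        if now - started_at > policy.stuck_after:
            problems.append("stuck")
    elif finished_at - started_at > policy.max_duration:
        problems.append("slow")
    return problems
```

A run can break more than one promise at once under this reading: a job that starts after its grace window and then never finishes would be both late and stuck.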

One incident model

Run failures, late starts, stuck runs, and slow runs land in the same queue as heartbeat and freshness incidents.

Need a concrete migration or monitoring pattern? Start with the docs, then adapt the payload to the evidence your operator needs.

Open integration docs