Webhook Observability: Correlation IDs & Structured Logs
When deliveries fail or retry, correlation makes the difference between guesswork and a deterministic incident response. This guide shows how to add correlation_id, structure logs safely, and trace retries back to Message Log so you can debug webhook delivery runs reliably.
You will implement correlation IDs in middleware, store them with your inbox rows, and use structured logs to match Message Log attempts to your system traces.
What you can rely on
Know the guarantees SendPromptly provides so your observability is accurate.
One delivery run per endpoint; retries on failure
Each endpoint receives its own delivery run; failures are retried with exponential backoff.
Confirm runs in Message Log (baseline workflow) ([SendPromptly][2])
Message Log shows every attempt and is the source of truth when investigating retries.
Micro checklist:
- Use Message Log to inspect per-attempt timing.
- Match delivery timestamps to your X-Correlation-Id.
- Don’t assume a single attempt maps to a single processing run.
Add correlation IDs everywhere
A correlation_id lets you join Message Log with your app logs and inbox rows.
Generate correlation_id if missing
Accept X-Correlation-Id from the sender or generate one server-side and return it in responses.
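A minimal sketch of the accept-or-generate step, assuming headers arrive as a plain dictionary. The function name, the UUID format for generated IDs, and the validation pattern are illustrative assumptions, not part of SendPromptly's API:

```python
import re
import uuid

# Assumption: IDs are limited to a conservative character set and length so a
# malformed or hostile header value never reaches logs or storage.
_VALID_ID = re.compile(r"[A-Za-z0-9._-]{1,128}")


def resolve_correlation_id(headers: dict[str, str]) -> str:
    """Return the sender's X-Correlation-Id if usable, else a fresh UUID."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    candidate = normalized.get("x-correlation-id", "").strip()
    if _VALID_ID.fullmatch(candidate):
        return candidate
    # Missing or unusable value: generate one server-side.
    return str(uuid.uuid4())
```

Whichever value this returns should also be echoed back in the X-Correlation-Id response header so the sender can record it.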
Include it in logs and inbox rows
Persist correlation_id with the inbox row and include it in structured logs for every processing step.
Common gotcha: Missing correlation_id makes it extremely difficult to match retries to app logs; always persist it.
Send a test event from the Sample Project and use Message Log to validate the timing, then match it to your logs via correlation_id.
Structured logging recommendations
Log minimal, consistent fields in JSON so tools can filter and correlate events.
Log headers subset (no secrets)
Log safe header values (correlation id, content length, message id) and never log secrets or signatures.
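One way to enforce this is an allowlist: only headers named explicitly are ever logged, so a new secret-bearing header cannot leak by default. The sketch below assumes headers arrive as a dictionary; the allowlist contents are illustrative, and X-SP-Message-Id is the only vendor header taken from this guide:

```python
# Assumption: this allowlist is illustrative. Extend it deliberately, one
# header at a time, rather than logging everything and masking afterwards.
SAFE_HEADERS = {
    "x-correlation-id",
    "x-sp-message-id",
    "content-length",
    "content-type",
}


def loggable_headers(headers: dict[str, str]) -> dict[str, str]:
    """Keep only allowlisted headers; everything else is dropped, not masked."""
    return {
        name.lower(): value
        for name, value in headers.items()
        if name.lower() in SAFE_HEADERS
    }
```

Dropping unknown headers is a deliberate choice here: a denylist of known secret headers fails open when a new one appears, while an allowlist fails closed.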
Log dedupe key + event_key + timing
Include dedupe_key, event_key, and processing durations to speed triage.
Micro checklist:
- Log correlation_id, X-SP-Message-Id, and dedupe_key.
- Avoid logging body or signature values (mask or omit them).
- Include timing fields (received_at, started_at, finished_at).
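The checklist above can be sketched with Python's standard logging module. The formatter, the logger name, and the "fields" attribute used to carry structured data are assumptions made for this example; only the field names come from this guide:

```python
import json
import logging
from datetime import datetime, timezone


def utc_now() -> str:
    """ISO 8601 timestamp in UTC, so timings compare cleanly across systems."""
    return datetime.now(timezone.utc).isoformat()


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        entry = {"event": record.getMessage(), "level": record.levelname}
        # "fields" is a hypothetical attribute set through logging's `extra`.
        entry.update(getattr(record, "fields", {}))
        return json.dumps(entry, sort_keys=True)


def build_logger(stream) -> logging.Logger:
    logger = logging.getLogger("webhook")
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


def log_event(logger: logging.Logger, event: str, **fields) -> None:
    """Log an event name plus structured fields; never pass body or signature."""
    logger.info(event, extra={"fields": fields})


if __name__ == "__main__":
    import sys

    log = build_logger(sys.stderr)
    received_at = utc_now()
    started_at = utc_now()
    # ... handler work would happen here ...
    log_event(
        log,
        "webhook.processed",
        correlation_id="cid-test-001",
        message_id="msg-123",  # value of X-SP-Message-Id
        dedupe_key="order-42",
        received_at=received_at,
        started_at=started_at,
        finished_at=utc_now(),
    )
```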
Retry-aware log patterns
Make retries visible in logs and mark idempotent hits.
Distinguish first attempt vs replay
Log an attempt number or a replayed: true/false field so you can see replays.
Mark “idempotent hit”
When dedupe prevents work, log idempotent_hit: true so triage knows the system behaved correctly.
Micro checklist:
- Add attempt_number to structured logs when available.
- Log idempotent_hit on dedupe paths.
- Emit metrics for replayed vs fresh attempts.
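A rough sketch of these retry-aware fields follows. It assumes the receiver counts deliveries per dedupe_key itself, so attempt_number is the receiver's observed count rather than a value supplied by the sender, and it treats every repeat as an idempotent hit, which only holds if the first delivery was processed successfully. The in-memory counters stand in for an inbox table and a metrics client:

```python
from collections import Counter


class DeliveryTracker:
    """Derive retry-aware log fields from deliveries seen per dedupe_key."""

    def __init__(self) -> None:
        self._attempts: Counter = Counter()
        # Stand-in for a real metrics client: counts "fresh" vs "replayed".
        self.metrics: Counter = Counter()

    def observe(self, dedupe_key: str) -> dict:
        self._attempts[dedupe_key] += 1
        attempt_number = self._attempts[dedupe_key]
        replayed = attempt_number > 1
        self.metrics["replayed" if replayed else "fresh"] += 1
        return {
            "dedupe_key": dedupe_key,
            "attempt_number": attempt_number,
            "replayed": replayed,
            # Simplification: a repeat is assumed to be skipped by dedupe.
            "idempotent_hit": replayed,
        }
```

The returned dictionary is meant to be merged into the structured log entry for the delivery, so a repeat shows up as an expected idempotent hit instead of an unexplained duplicate.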
Incident playbook
From Message Log to your app logs: a short playbook for deterministic debugging.
From Message Log → match to correlation_id → inspect handler result
- Open Message Log and find the failed run.
- Copy X-SP-Message-Id and timestamp; search your logs for correlation_id or dedupe_key.
- Inspect the handler result and replay the raw payload to a staging endpoint if needed.
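If your logs are JSON lines, the search step in this playbook can be scripted. The sketch below assumes one JSON object per line with flat field names such as correlation_id, message_id, and dedupe_key; the function name is hypothetical, and in practice a log platform's query language would do the same job:

```python
import json
from typing import Iterable, Iterator


def find_log_entries(lines: Iterable[str], **criteria: str) -> Iterator[dict]:
    """Yield JSON log entries where any given field equals the given value."""
    for line in lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not JSON (stack traces, banners)
        if not isinstance(entry, dict):
            continue
        if any(entry.get(field) == value for field, value in criteria.items()):
            yield entry


if __name__ == "__main__":
    sample = [
        '{"event": "webhook.received", "correlation_id": "cid-test-001"}',
        "plain text line that is not JSON",
        '{"event": "webhook.processed", "dedupe_key": "order-42"}',
    ]
    for match in find_log_entries(
        sample, correlation_id="cid-test-001", dedupe_key="order-42"
    ):
        print(match["event"])
```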
Suggested diagram: A triangle showing Message Log → correlation_id → app logs and how to follow timestamps across systems.
Failure modes
- Logging full headers/body including secrets/signatures.
- No correlation IDs ⇒ impossible to match retries to processing.
- Dedupe key not logged ⇒ duplicates look like “random repeats.”
- Treating 2xx as “processed” rather than “accepted” ⇒ false confidence.
- No consistent fields across services ⇒ debugging becomes guesswork.
- Missing ingestion idempotency key when sending events ⇒ duplicates during client retries.
Related
- Confirm delivery runs in Message Log
- Webhook delivery rules + retry behavior
- Idempotency TTL + 201 success semantics
- Backpressure patterns
Code snippets
1) Correlation ID middleware (and safe structured log)
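A minimal sketch of such middleware, written as a framework-neutral WSGI wrapper in Python. The class name, the logger name, and the "webhook.correlation_id" environ key are assumptions for illustration; adapt them to your own framework's middleware hooks:

```python
import json
import logging
import uuid

logger = logging.getLogger("webhook")


class CorrelationIdMiddleware:
    """Attach a correlation ID to each request, log it, and echo it back."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        # WSGI exposes the X-Correlation-Id header as HTTP_X_CORRELATION_ID.
        correlation_id = environ.get("HTTP_X_CORRELATION_ID") or str(uuid.uuid4())
        # Hypothetical key so downstream handlers can read the same ID.
        environ["webhook.correlation_id"] = correlation_id

        # Safe structured log: identifiers and sizes only, no body or signature.
        logger.info(json.dumps({
            "event": "webhook.received",
            "correlation_id": correlation_id,
            "message_id": environ.get("HTTP_X_SP_MESSAGE_ID"),
            "content_length": environ.get("CONTENT_LENGTH"),
        }))

        def start_with_id(status, headers, exc_info=None):
            # Return the ID so the caller can match the response to its logs.
            headers = list(headers) + [("X-Correlation-Id", correlation_id)]
            return start_response(status, headers, exc_info)

        return self.app(environ, start_with_id)
```

Wrapping the application once at the edge means every handler sees the same ID without having to parse the header itself.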
2) Store correlation + dedupe key
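A sketch of the persistence step using SQLite from the Python standard library. The table name, column names, and the choice of dedupe_key as primary key are assumptions for this example, not a prescribed schema:

```python
import sqlite3

# Hypothetical inbox schema: dedupe_key is unique, so a replayed delivery
# cannot create a second row.
SCHEMA = """
CREATE TABLE IF NOT EXISTS webhook_inbox (
    dedupe_key     TEXT PRIMARY KEY,
    correlation_id TEXT NOT NULL,
    message_id     TEXT,
    payload        TEXT NOT NULL,
    received_at    TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
)
"""


def store_inbox_row(
    conn: sqlite3.Connection,
    dedupe_key: str,
    correlation_id: str,
    message_id: str,
    payload: str,
) -> bool:
    """Insert the row; return False when the dedupe_key already exists."""
    with conn:  # commits on success, rolls back on error
        cursor = conn.execute(
            "INSERT OR IGNORE INTO webhook_inbox "
            "(dedupe_key, correlation_id, message_id, payload) "
            "VALUES (?, ?, ?, ?)",
            (dedupe_key, correlation_id, message_id, payload),
        )
    # rowcount is 0 when the insert was ignored because the key was present.
    return cursor.rowcount == 1
```

A False return is the dedupe path: skip the work and log idempotent_hit: true together with the same correlation_id.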
Test steps (curl + expected response)
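A Python sketch of an equivalent check, for cases where a script is more convenient than curl. The endpoint URL and the payload are hypothetical; only the X-Correlation-Id header and the cid-test-001 value come from this guide:

```python
import json
import urllib.request
from typing import Optional


def build_test_request(url: str, correlation_id: str) -> urllib.request.Request:
    """Build a POST carrying a known correlation ID and a tiny JSON body."""
    body = json.dumps({"event_key": "test.event"}).encode("utf-8")  # hypothetical payload
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "X-Correlation-Id": correlation_id,
        },
    )


def echoed_correlation_id(
    request: urllib.request.Request, timeout: float = 5.0
) -> Optional[str]:
    """Send the request and return the X-Correlation-Id response header."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.headers.get("X-Correlation-Id")


# Example against a hypothetical local endpoint:
# request = build_test_request("http://localhost:8000/webhooks", "cid-test-001")
# print(echoed_correlation_id(request))
```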
Expected:
- Response includes X-Correlation-Id: cid-test-001.
- Your logs contain webhook.received with that correlation_id.
Conclusion
- Add correlation_id at the edge and persist it with the inbox row.
- Use structured logs (no secrets) and include dedupe_key, message_id, and timing fields.
- Mark replays and idempotent hits in logs to speed triage.
- Always confirm delivery attempts in Message Log when debugging retries.
- Structured correlation makes incident response deterministic.