Webhook Observability: Correlation IDs & Structured Logs

Webhook observability with correlation IDs

When deliveries fail or retry, correlation makes the difference between guesswork and a deterministic incident response. This guide shows how to add correlation_id, structure logs safely, and trace retries back to Message Log so you can debug webhook delivery runs reliably.

You will implement correlation IDs in middleware, store them with your inbox rows, and use structured logs to match Message Log attempts to your system traces.

What you can rely on

Know the guarantees SendPromptly provides so your observability is accurate.

One delivery run per endpoint; retries on failure

Each endpoint receives its own delivery run; failures are retried with exponential backoff.

Confirm runs in Message Log (baseline workflow) ([SendPromptly][2])

Message Log shows every attempt and is the source of truth when investigating retries.

Micro checklist:

  • Use Message Log to inspect per-attempt timing.
  • Match delivery timestamps to the log entries that carry your X-Correlation-Id.
  • Don’t assume a single attempt maps to a single processing run.

Add correlation IDs everywhere

A correlation_id lets you join Message Log with your app logs and inbox rows.

Generate correlation_id if missing

Accept X-Correlation-Id from the sender or generate one server-side and return it in responses.
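
The accept-or-generate rule can be sketched in plain PHP, independent of any framework. The helper name resolveCorrelationId, the 64-character limit, and the allowed character set are assumptions made for illustration; nothing in SendPromptly requires them. Validating the inbound value keeps a malformed header from being written into your logs.

```php
<?php
// Hypothetical helper: reuse a well-formed inbound ID, otherwise generate one.
function resolveCorrelationId(?string $inbound): string
{
    // Assumed policy: accept only short, log-safe values
    // (letters, digits, dot, underscore, hyphen; at most 64 characters).
    if ($inbound !== null && preg_match('/\A[A-Za-z0-9._-]{1,64}\z/', $inbound) === 1) {
        return $inbound;
    }

    // Fall back to 16 random bytes rendered as 32 hex characters.
    return bin2hex(random_bytes(16));
}

echo resolveCorrelationId('cid-test-001'), PHP_EOL; // reuses the inbound value
echo resolveCorrelationId("bad\nvalue"), PHP_EOL;   // replaced with a generated ID
```

Whatever value this returns is the one to store on the request, echo back in the response header, and persist with the inbox row.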

Include it in logs and inbox rows

Persist correlation_id with the inbox row and include it in structured logs for every processing step.

Common gotcha: Missing correlation_id makes it extremely difficult to match retries to app logs — always persist it.

Send a test event from the Sample Project and use Message Log to validate the timing, then match it to your logs via correlation_id.

Structured logging recommendations

Log minimal, consistent fields in JSON so tools can filter and correlate events.

Log headers subset (no secrets)

Log safe header values (correlation id, content length, message id) and never log secrets or signatures.

Log dedupe key + event_key + timing

Include dedupe_key, event_key, and processing durations to speed triage.

Micro checklist:

  • Log correlation_id, X-SP-Message-Id, and dedupe_key.
  • Avoid logging body or signature values (mask or omit them).
  • Include timing fields (received_at, started_at, finished_at).
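
As a rough sketch of the checklist above, the following plain-PHP function builds one JSON log line from an allowlist of headers plus the dedupe and timing fields. The function name buildLogEntry, the SAFE_HEADERS constant, the duration_ms field, and the sample header values are all assumptions for illustration, not part of any SendPromptly API.

```php
<?php
// Hypothetical allowlist: only these headers are ever copied into a log entry.
const SAFE_HEADERS = ['x-correlation-id', 'x-sp-message-id', 'content-length'];

function buildLogEntry(string $event, array $headers, array $fields): string
{
    $safe = [];
    foreach ($headers as $name => $value) {
        $key = strtolower((string) $name);
        if (in_array($key, SAFE_HEADERS, true)) {
            $safe[$key] = $value;
        }
    }

    // Signatures and bodies never appear because they are not on the allowlist.
    $entry = ['event' => $event, 'headers' => $safe] + $fields;

    return json_encode($entry, JSON_UNESCAPED_SLASHES);
}

$receivedAt = microtime(true);
// ... handler work would run here ...
$finishedAt = microtime(true);

echo buildLogEntry('webhook.processed', [
    'X-Correlation-Id' => 'cid-test-001',
    'X-SP-Message-Id'  => 'msg-123',      // placeholder value
    'X-SP-Signature'   => 'never-logged', // dropped by the allowlist
], [
    'dedupe_key'  => hash('sha256', '{"event_key":"order.created"}'),
    'event_key'   => 'order.created',
    'received_at' => $receivedAt,
    'finished_at' => $finishedAt,
    'duration_ms' => (int) round(($finishedAt - $receivedAt) * 1000),
]), PHP_EOL;
```

An allowlist is used rather than a blocklist so that a header added later stays out of the logs until someone opts it in.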

Retry-aware log patterns

Make retries visible in logs and mark idempotent hits.

Distinguish first attempt vs replay

Log an attempt number or a replayed: true/false field so you can see replays.

Mark “idempotent hit”

When dedupe prevents work, log idempotent_hit: true so triage knows the system behaved correctly.

Micro checklist:

  • Add attempt_number to structured logs when available.
  • Log idempotent_hit on dedupe paths.
  • Emit metrics for replayed vs fresh attempts.
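
One way to produce these fields is sketched below, with an in-memory array standing in for the inbox table. The class InboxSketch is hypothetical, and attempt_number here is counted locally from repeated payloads; it is an assumption that the sender does not supply one.

```php
<?php
// In-memory stand-in for the inbox table; a real handler would use the database.
final class InboxSketch
{
    /** @var array<string,int> dedupe_key => number of deliveries seen */
    private array $seen = [];

    /** Returns the structured fields to log for this delivery. */
    public function record(string $rawBody, string $correlationId): array
    {
        $dedupeKey = hash('sha256', $rawBody);
        $this->seen[$dedupeKey] = ($this->seen[$dedupeKey] ?? 0) + 1;
        $attempt = $this->seen[$dedupeKey];

        return [
            'correlation_id' => $correlationId,
            'dedupe_key'     => $dedupeKey,
            'attempt_number' => $attempt,     // counted locally, not supplied by the sender
            'replayed'       => $attempt > 1,
            'idempotent_hit' => $attempt > 1, // work is skipped for an already-seen payload
        ];
    }
}

$inbox = new InboxSketch();
$body  = '{"event_key":"order.created","payload":{"order_id":"O-1"}}';

echo json_encode($inbox->record($body, 'cid-test-001')), PHP_EOL; // first delivery
echo json_encode($inbox->record($body, 'cid-test-001')), PHP_EOL; // replay of the same body
```

The second line logs replayed and idempotent_hit as true, which is the signal that a repeat was expected behavior and not a fault.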

Incident playbook

From Message Log to your app logs: a short playbook for deterministic debugging.

From Message Log → match to correlation_id → inspect handler result

  1. Open Message Log and find the failed run.
  2. Copy X-SP-Message-Id and the attempt timestamp; search your logs for that message ID to find the matching correlation_id and dedupe_key.
  3. Inspect the handler result and replay the raw payload to a staging endpoint if needed.
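
Step 2 of the playbook can be sketched as a small filter over JSON log lines. The sample lines, the message_id and result field names, and the helper findByMessageId are assumptions for illustration; in practice this search would usually run in your log tooling.

```php
<?php
// Hypothetical log lines, one JSON object per line.
$logLines = [
    '{"event":"webhook.received","correlation_id":"cid-test-001","message_id":"msg-123","dedupe_key":"abc"}',
    '{"event":"webhook.received","correlation_id":"cid-test-002","message_id":"msg-456","dedupe_key":"def"}',
    '{"event":"webhook.processed","correlation_id":"cid-test-001","message_id":"msg-123","dedupe_key":"abc","result":"failed"}',
];

/** Returns every decoded entry whose message_id matches the one copied from Message Log. */
function findByMessageId(array $lines, string $messageId): array
{
    $matches = [];
    foreach ($lines as $line) {
        $entry = json_decode($line, true);
        if (is_array($entry) && ($entry['message_id'] ?? null) === $messageId) {
            $matches[] = $entry;
        }
    }

    return $matches;
}

// Print the correlation_id, event name, and handler result for each match.
foreach (findByMessageId($logLines, 'msg-123') as $entry) {
    echo $entry['correlation_id'], ' ', $entry['event'], ' ', $entry['result'] ?? '-', PHP_EOL;
}
```

Once the correlation_id is known, the same filter keyed on correlation_id pulls every processing step for that delivery.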

Suggested diagram: A triangle showing Message Log → correlation_id → app logs and how to follow timestamps across systems.

Failure modes

  1. Logging full headers/body including secrets/signatures.
  2. No correlation IDs ⇒ impossible to match retries to processing.
  3. Dedupe key not logged ⇒ duplicates look like “random repeats.”
  4. Treating 2xx as “processed” rather than “accepted” ⇒ false confidence.
  5. No consistent fields across services ⇒ debugging becomes guesswork.
  6. Missing ingestion idempotency key when sending events ⇒ duplicates during client retries.

Code snippets

1) Correlation ID middleware (and safe structured log)

<?php
// app/Http/Middleware/CorrelationId.php
namespace App\Http\Middleware;

use Closure;
use Illuminate\Http\Request;
use Illuminate\Support\Facades\Log;
use Illuminate\Support\Str;

class CorrelationId
{
    public function handle(Request $r, Closure $next)
    {
        // Reuse the sender's ID when present; otherwise generate one server-side.
        $cid = $r->header('X-Correlation-Id') ?: (string) Str::uuid();
        $r->attributes->set('correlation_id', $cid);

        // Log minimal safe context (avoid secrets/signature values)
        Log::info('webhook.received', [
            'correlation_id' => $cid,
            'path' => $r->path(),
            'content_length' => strlen($r->getContent()),
        ]);

        $resp = $next($r);
        // Set via the header bag so this also works for response types without a header() helper.
        $resp->headers->set('X-Correlation-Id', $cid);

        return $resp;
    }
}

2) Store correlation + dedupe key

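$cid = $request->attributes->get('correlation_id');
// Dedupe on the raw body so a retried delivery maps to the same inbox row.
$dedupeKey = hash('sha256', $request->getContent());

// insertOrIgnore leaves the first row intact on retries (assumes a unique index on
// dedupe_key); updateOrInsert would reset status and created_at on every retry.
$inserted = \DB::table('webhook_inbox')->insertOrIgnore([
  'dedupe_key' => $dedupeKey,
  'correlation_id' => $cid,
  'status' => 'queued',
  'created_at' => now(),
  'updated_at' => now(),
]);
$idempotentHit = ($inserted === 0); // true when this payload was already stored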

Test steps (curl + expected response)

curl -i -X POST http://localhost:8000/webhooks/sendpromptly \
  -H "Content-Type: application/json" \
  -H "X-Correlation-Id: cid-test-001" \
  -H "X-SP-Timestamp: 1700000000" \
  -H "X-SP-Signature: <valid>" \
  -d '{"event_key":"order.created","payload":{"order_id":"O-1"}}'

Expected:

  • Response includes X-Correlation-Id: cid-test-001.
  • Your logs contain webhook.received with that correlation_id.


Conclusion

  • Add correlation_id at the edge and persist it with the inbox row.
  • Use structured logs (no secrets) and include dedupe_key, message_id, and timing fields.
  • Mark replays and idempotent hits in logs to speed triage.
  • Always confirm delivery attempts in Message Log when debugging retries.
  • Structured correlation makes incident response deterministic.