Scale Your Webhook Consumer: Ack Fast & Apply Backpressure
Webhooks can burst, and the only robust pattern is to accept quickly, persist, then process asynchronously. This guide shows how to implement that in Laravel: an ack-fast inbox backed by queue workers, per-tenant throttles, and backpressure responses.
You will build the inbox, make workers idempotent, and return clear overload responses (429 with Retry-After) when you cannot keep up.
Know the sender’s behavior
Understand sender semantics before shaping your backpressure strategy.
Success is 2xx; failures retry with exponential backoff
SendPromptly retries failed deliveries with exponential backoff — a 2xx means the sender will stop retrying.
Common gotcha: treating 2xx as "processed" instead of "accepted"; track async completion separately.
The only scalable pattern: persist → ack → async
Persist the raw webhook, return 200 quickly, and process in background workers.
Inbox table
Persist the raw payload, dedupe key, and correlation id for replay and auditing.
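A minimal migration sketch of such an inbox; the webhook_inbox name and exact columns are assumptions, not a required schema:

```php
<?php

use Illuminate\Database\Migrations\Migration;
use Illuminate\Database\Schema\Blueprint;
use Illuminate\Support\Facades\Schema;

// Inbox sketch: raw payload, dedupe key, correlation id, and a processing status.
return new class extends Migration {
    public function up(): void
    {
        Schema::create('webhook_inbox', function (Blueprint $table) {
            $table->id();
            $table->string('dedupe_key')->unique();        // e.g. the sender's event id
            $table->string('correlation_id')->nullable();  // for tracing across systems
            $table->string('event_type');
            $table->longText('payload');                   // raw body, kept verbatim for replay/audit
            $table->string('status')->default('accepted'); // accepted -> processed / failed
            $table->timestamps();
            $table->index(['status', 'created_at']);
        });
    }

    public function down(): void
    {
        Schema::dropIfExists('webhook_inbox');
    }
};
```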
Queue worker concurrency
Workers should be idempotent and concurrency-controlled (throttles, partitioning).
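A sketch of an idempotent, concurrency-limited worker over that table; the job name, the tries/backoff values, and the "heavy work" placeholder are assumptions:

```php
<?php

namespace App\Jobs;

use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\Middleware\WithoutOverlapping;
use Illuminate\Queue\SerializesModels;
use Illuminate\Support\Facades\DB;

class ProcessWebhook implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    public int $tries = 5;    // retries happen on the queue, not in the HTTP request
    public int $backoff = 30; // seconds between attempts

    public function __construct(public int $inboxId)
    {
    }

    // Never let two workers process the same inbox row at once.
    public function middleware(): array
    {
        return [(new WithoutOverlapping((string) $this->inboxId))->releaseAfter(60)];
    }

    public function handle(): void
    {
        $entry = DB::table('webhook_inbox')->find($this->inboxId);

        // Idempotency: skip rows already handled (e.g. the sender retried the delivery).
        if (! $entry || $entry->status === 'processed') {
            return;
        }

        // ... heavy processing of $entry->payload goes here ...

        DB::table('webhook_inbox')
            ->where('id', $this->inboxId)
            ->update(['status' => 'processed', 'updated_at' => now()]);
    }
}
```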
Micro checklist (a controller sketch putting these together follows the list):
- Persist to webhook_inbox immediately.
- Return 200 quickly to the sender.
- Use queued workers for heavy processing, with idempotency checks.
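A hedged controller sketch of the persist → ack → async flow; the route, header names, and event-type field are assumptions about what SendPromptly sends, not documented behavior:

```php
<?php

use App\Jobs\ProcessWebhook;
use Illuminate\Http\Request;
use Illuminate\Support\Facades\DB;
use Illuminate\Support\Facades\Route;

// Ack-fast endpoint: persist, enqueue, return 200 in a few milliseconds.
Route::post('/webhooks/sendpromptly', function (Request $request) {
    // Dedupe on the sender's event id (header name is an assumption); fall back to a body hash.
    $dedupeKey = $request->header('X-Event-Id', hash('sha256', $request->getContent()));

    // insertOrIgnore makes retried deliveries a no-op at the storage layer.
    $inserted = DB::table('webhook_inbox')->insertOrIgnore([
        'dedupe_key'     => $dedupeKey,
        'correlation_id' => $request->header('X-Correlation-Id'),
        'event_type'     => $request->input('type', 'unknown'), // 'type' field is assumed
        'payload'        => $request->getContent(),
        'status'         => 'accepted',
        'created_at'     => now(),
        'updated_at'     => now(),
    ]);

    if ($inserted) {
        $id = DB::table('webhook_inbox')->where('dedupe_key', $dedupeKey)->value('id');
        ProcessWebhook::dispatch($id);
    }

    // 200 means "accepted", not "processed"; the sender stops retrying.
    return response()->json(['status' => 'accepted'], 200);
});
```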
Backpressure options
When you are overloaded, communicate clearly to the sender so retries are sensible.
429 + Retry-After (for overload)
Return 429 with a Retry-After header when you cannot accept more work.
503 for transient dependency outage
Return 503 for dependency outages where retrying shortly is reasonable.
Common gotcha: returning 429 without Retry-After causes the sender to retry aggressively.
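One way to surface that signal is a small middleware in front of the webhook route. This is a sketch: the queue name, depth threshold, and 60-second Retry-After are arbitrary assumptions about your setup:

```php
<?php

namespace App\Http\Middleware;

use Closure;
use Illuminate\Http\Request;
use Illuminate\Support\Facades\Queue;

// Shed load before persisting anything when the backlog is already too deep.
class ShedWebhookLoad
{
    public function handle(Request $request, Closure $next)
    {
        // Pending jobs on the webhook queue as a rough overload signal.
        if (Queue::size('webhooks') > 10000) {
            return response()->json(['error' => 'overloaded'], 429)
                ->header('Retry-After', 60); // seconds; lets the sender back off sanely
        }

        return $next($request);
    }
}
```

Register it only on the webhook route so other traffic is unaffected; a failed health check on a critical dependency could return 503 with a Retry-After header in the same way.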
Concurrency control
Use tenant-level throttles and partitioned queues to limit blast radius.
Per-tenant throttles
Prevent a noisy tenant from consuming all worker capacity.
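A sketch using Laravel's named job rate limiters; the limiter name, the per-minute budget, and the tenantId property on the queued job are assumptions:

```php
<?php

namespace App\Providers;

use Illuminate\Cache\RateLimiting\Limit;
use Illuminate\Support\Facades\RateLimiter;
use Illuminate\Support\ServiceProvider;

class AppServiceProvider extends ServiceProvider
{
    public function boot(): void
    {
        // Named limiter: at most 300 webhook jobs per minute per tenant.
        // $job->tenantId is an assumed property on the queued job.
        RateLimiter::for('webhook-processing', function (object $job) {
            return Limit::perMinute(300)->by($job->tenantId);
        });
    }
}

// In the queued job, attach the limiter as middleware; limited jobs are
// released back onto the queue instead of consuming a worker:
//
//     use Illuminate\Queue\Middleware\RateLimited;
//
//     public function middleware(): array
//     {
//         return [new RateLimited('webhook-processing')];
//     }
```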
Partitioned queues
Route heavy tenants or event types to separate queues with configured concurrency.
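A sketch of queue routing at dispatch time; the queue names, config key, helper name, and worker flags are assumptions:

```php
<?php

use App\Jobs\ProcessWebhook;

// Dispatch heavy tenants (or heavy event types) to a dedicated queue so they
// cannot starve the default worker pool. The "heavy tenants" config key is hypothetical.
function dispatchWebhookJob(int $inboxId, string $tenantId): void
{
    $queue = in_array($tenantId, config('webhooks.heavy_tenants', []), true)
        ? 'webhooks-heavy'
        : 'webhooks';

    ProcessWebhook::dispatch($inboxId)->onQueue($queue);
}

// Run separate worker pools with their own concurrency, e.g. under Supervisor:
//   php artisan queue:work --queue=webhooks-heavy --max-jobs=250
//   php artisan queue:work --queue=webhooks --max-jobs=1000
```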
Micro checklist:
- Implement per-tenant rate limits.
- Partition queues by load or tenant.
- Monitor queue depth and consumer lag.
Operational guardrails
Protect the system with payload limits and circuit breakers.
Payload size limits
Reject overly large payloads early (document limits to SendPromptly if needed).
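A minimal size-guard middleware sketch; the 512 KB cap is an arbitrary assumption, so align it with whatever limit you document:

```php
<?php

namespace App\Http\Middleware;

use Closure;
use Illuminate\Http\Request;

// Reject oversized webhook bodies before doing any work.
class LimitWebhookPayloadSize
{
    private const MAX_BYTES = 512 * 1024; // 512 KB; pick and document your own limit

    public function handle(Request $request, Closure $next)
    {
        if (strlen($request->getContent()) > self::MAX_BYTES) {
            // 413 Payload Too Large: the same payload will never succeed, so do not 5xx.
            return response()->json(['error' => 'payload too large'], 413);
        }

        return $next($request);
    }
}
```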
Timeouts and circuit breakers
Fail fast on downstream timeouts and trigger circuit breakers for repeated failures.
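Laravel does not ship a circuit breaker, so the sketch below is a simple cache-counter stand-in; the timeout, failure threshold, cool-off, and downstream URL are placeholders:

```php
<?php

use Illuminate\Support\Facades\Cache;
use Illuminate\Support\Facades\Http;

// Fail fast on a slow dependency and open a breaker after repeated failures.
function callDownstream(array $payload): void
{
    if (Cache::get('downstream:breaker_open')) {
        // Breaker open: fail immediately instead of piling up slow requests.
        throw new \RuntimeException('Downstream circuit open, skipping call');
    }

    try {
        Http::timeout(5)                      // fail fast on slow dependencies
            ->post('https://downstream.example/api', $payload)
            ->throw();                        // treat 4xx/5xx as failures too

        Cache::forget('downstream:failures'); // success resets the failure count
    } catch (\Throwable $e) {
        $failures = Cache::increment('downstream:failures');

        if ($failures >= 5) {
            // Open the breaker for 60 seconds after repeated failures.
            Cache::put('downstream:breaker_open', true, 60);
        }

        throw $e;
    }
}
```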
Suggested diagram: Flow showing HTTP accept → persist → enqueue → worker pool with per-tenant throttle & circuit breaker.
Failure modes
- Doing heavy work in the HTTP request ⇒ timeouts ⇒ retries ⇒ a load spiral.
- No dedupe ⇒ retries create duplicate side effects.
- Always returning 5xx for “bad payload” ⇒ permanent poison events retry forever.
- 429 without Retry-After ⇒ sender retries too aggressively.
- No observability ⇒ you can’t tell if you’re overloaded or broken.
- Treating 429 from the SendPromptly API as "try a new idempotency key" ⇒ duplicates.
Related
- Retries and 2xx success rules
- rate_limited (429) and idempotency errors
- Ingestion behavior: idempotency + 201
- Stop duplicates under retries
Use the Sample Project to send a burst (same event key, different idempotency keys) and confirm clean handling in Message Log.
Conclusion
- Ack-fast: persist, ack, then process asynchronously.
- Return 429 + Retry-After when overloaded; return 200 when accepted.
- Make workers idempotent and add per-tenant throttles.
- Monitor queue depth and implement circuit breakers for downstream failures.
- Use Message Log to verify delivery attempts and timing.