Rate Limiting

Cap request rates to protect your service.

[Interactive demo: token bucket with a capacity of 6 tokens, refilling 1 token every 0.9 s, shown alongside a request log]

Each request spends one token. When the bucket is empty, the request is rejected with HTTP 429 (Too Many Requests). Bursts are allowed up to the bucket's capacity.
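
The behavior described above can be sketched in a few lines of Python. This is a minimal illustration, not the demo's actual code: the class name `TokenBucket`, the `allow` method, and the injectable `clock` parameter are all hypothetical, and the sketch assumes tokens accrue continuously (a fraction per call) rather than in whole steps. The defaults mirror the capacity of 6 and the refill of one token per 0.9 s shown above.

```python
import time


class TokenBucket:
    """Token bucket sketch: refills at a fixed rate, never above `capacity`."""

    def __init__(self, capacity=6, refill_interval=0.9, clock=time.monotonic):
        self.capacity = capacity
        self.refill_interval = refill_interval  # seconds needed to earn one token
        self.clock = clock                      # injectable so tests can fake time
        self.tokens = float(capacity)           # assumption: the bucket starts full
        self.last = clock()

    def _refill(self):
        # Credit tokens for the time elapsed since the last call, capped at capacity.
        now = self.clock()
        earned = (now - self.last) / self.refill_interval
        self.tokens = min(self.capacity, self.tokens + earned)
        self.last = now

    def allow(self):
        """Spend one token if available and return the HTTP status to send."""
        self._refill()
        if self.tokens >= 1:
            self.tokens -= 1
            return 200
        return 429  # Too Many Requests


if __name__ == "__main__":
    bucket = TokenBucket()
    # A burst of 8 back-to-back requests against a capacity of 6.
    print([bucket.allow() for _ in range(8)])
```

With these defaults, a burst sent faster than the refill rate should see the first six requests succeed and later ones rejected until enough time has passed to earn another token.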

How it works

Rate limiting caps how many requests a client can make in a window, protecting services from abuse and overload. Token bucket, leaky bucket, and sliding-window counters each balance burst tolerance against strictness.

Mental models

Token bucket refills at a fixed rate and allows controlled bursts.
Leaky bucket queues requests and drains at a constant rate, smoothing output.
Fixed window is cheap, but a burst that straddles a window boundary can reach up to 2× the limit.
Sliding window counts requests over the trailing interval: fairer, but it tracks more state.
Limits usually live at the edge (for example, an API gateway), keyed by IP, user, or API key.
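
The boundary-burst difference between fixed and sliding windows is easier to see with numbers. The sketch below is illustrative only: `FixedWindow`, `SlidingWindowLog`, and `accepted` are hypothetical names, timestamps are passed in explicitly instead of read from a clock, and the sliding variant is the simple "log" form that stores one timestamp per accepted request (which is where the extra state comes from).

```python
from collections import deque


class FixedWindow:
    """Counts requests per aligned window; the counter resets at each boundary."""

    def __init__(self, limit, window):
        self.limit = limit
        self.window = window
        self.current = None  # index of the window currently being counted
        self.count = 0

    def allow(self, now):
        index = int(now // self.window)
        if index != self.current:
            # A new window has started, so the count starts over.
            self.current, self.count = index, 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False


class SlidingWindowLog:
    """Keeps one timestamp per accepted request in the trailing `window` seconds."""

    def __init__(self, limit, window):
        self.limit = limit
        self.window = window
        self.stamps = deque()

    def allow(self, now):
        # Drop timestamps that have fallen out of the trailing interval.
        while self.stamps and self.stamps[0] <= now - self.window:
            self.stamps.popleft()
        if len(self.stamps) < self.limit:
            self.stamps.append(now)
            return True
        return False


def accepted(limiter, times):
    """Number of requests accepted when sent at the given timestamps."""
    return sum(limiter.allow(t) for t in times)


# Five requests just before the 60 s boundary, then five just after it.
burst = [59.0 + 0.1 * i for i in range(5)] + [60.0 + 0.1 * i for i in range(5)]
fixed_accepted = accepted(FixedWindow(limit=5, window=60), burst)
sliding_accepted = accepted(SlidingWindowLog(limit=5, window=60), burst)
print(fixed_accepted, sliding_accepted)
```

Under a limit of 5 per 60 s, the fixed window accepts all ten requests because its counter resets at the boundary, while the sliding log accepts only the first five, since all ten fall within the same trailing 60 seconds.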

Reach for it when

API abuse prevention
Fair multi-tenant usage
DDoS mitigation