Rate Limiting
Cap request rates to protect your service.
Demo: token bucket with a capacity of 6 tokens, refilling 1 token every 0.9 s.
Each request spends one token. When the bucket is empty, the request is rejected with HTTP 429 (Too Many Requests). Bursts are allowed up to the bucket's capacity.
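As a rough illustration of the behavior described above, here is a minimal token bucket sketch in Python. The class name `TokenBucket`, the `handle` method, and the injectable `clock` parameter are assumptions made for this example; the defaults mirror the demo's 6-token capacity and 0.9 s refill. Unlike the demo, which adds whole tokens, this sketch accrues fractional tokens continuously.

```python
import time


class TokenBucket:
    """Token bucket: refills over time, spends one token per request."""

    def __init__(self, capacity=6, refill_interval=0.9, clock=time.monotonic):
        self.capacity = capacity                # maximum burst size
        self.refill_interval = refill_interval  # seconds per new token
        self.clock = clock                      # injectable so tests can fake time
        self.tokens = float(capacity)           # start with a full bucket
        self.last = clock()

    def _refill(self):
        # Credit the tokens earned since the last call, capped at capacity.
        now = self.clock()
        earned = (now - self.last) / self.refill_interval
        self.tokens = min(self.capacity, self.tokens + earned)
        self.last = now

    def handle(self):
        """Return an HTTP-style status: 200 if a token was spent, else 429."""
        self._refill()
        if self.tokens >= 1:
            self.tokens -= 1
            return 200
        return 429
```

With a full bucket, six back-to-back calls to `handle()` succeed and the seventh returns 429 until enough time passes for another token to accrue.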
How it works
Rate limiting caps how many requests a client can make within a time window, protecting services from abuse and overload. Token bucket, leaky bucket, and sliding-window counters each trade burst tolerance against strictness in a different way.
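Of the algorithms named above, the leaky bucket sits at the strict end: it accepts bursts into a bounded queue but releases them at a constant rate. The sketch below is one possible simulation of that idea, not a reference implementation. `LeakyBucket`, `offer`, and `drain` are hypothetical names, and time is passed in explicitly as `now` to keep the example deterministic.

```python
from collections import deque


class LeakyBucket:
    """Bounded queue drained at a fixed rate; overflow is rejected."""

    def __init__(self, capacity, drain_interval):
        self.capacity = capacity              # maximum number of queued requests
        self.drain_interval = drain_interval  # seconds between releases
        self.queue = deque()
        self.next_release = 0.0               # earliest time the next request may leave

    def offer(self, request, now):
        """Queue a request; return False if the bucket is full."""
        if len(self.queue) >= self.capacity:
            return False
        if not self.queue:
            # After an idle period, do not release in the past.
            self.next_release = max(self.next_release, now)
        self.queue.append(request)
        return True

    def drain(self, now):
        """Return (release_time, request) pairs for every slot due by `now`."""
        released = []
        while self.queue and self.next_release <= now:
            released.append((self.next_release, self.queue.popleft()))
            self.next_release += self.drain_interval
        return released
```

Offering four requests at once to a bucket with `capacity=3` queues three and rejects the fourth; the queued requests then leave one per `drain_interval`, regardless of how tightly they arrived.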
Mental models
- Token bucket refills at a fixed rate and allows controlled bursts.
- Leaky bucket queues requests and drains at a constant rate, smoothing output.
- Fixed window is cheap (one counter per key), but a burst straddling a window boundary can reach up to 2× the limit.
- Sliding window counts requests over the trailing interval; it is fairer but tracks more state.
- Limits are typically enforced at the edge (for example, an API gateway), keyed by IP, user, or API key.
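To make the fixed-window boundary effect concrete, the following sketch compares a fixed-window counter with a sliding-window log on the same burst. Both classes and the chosen numbers (limit of 5 per 10 s) are illustrative assumptions; the sliding variant shown here stores one timestamp per accepted request, which is the extra state mentioned above.

```python
from collections import deque


class FixedWindow:
    """Counts requests per aligned window; the counter resets at each boundary."""

    def __init__(self, limit, window):
        self.limit, self.window = limit, window
        self.current = None   # index of the window being counted
        self.count = 0

    def allow(self, now):
        index = int(now // self.window)
        if index != self.current:       # crossed a boundary: reset the counter
            self.current, self.count = index, 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False


class SlidingWindowLog:
    """Keeps a timestamp per accepted request over the trailing window."""

    def __init__(self, limit, window):
        self.limit, self.window = limit, window
        self.log = deque()

    def allow(self, now):
        # Drop timestamps that have aged out of the trailing interval.
        while self.log and self.log[0] <= now - self.window:
            self.log.popleft()
        if len(self.log) < self.limit:
            self.log.append(now)
            return True
        return False


# Five requests just before a window boundary, five just after.
burst = [9.9] * 5 + [10.1] * 5
fixed = FixedWindow(limit=5, window=10)
sliding = SlidingWindowLog(limit=5, window=10)
fixed_accepted = sum(fixed.allow(t) for t in burst)
sliding_accepted = sum(sliding.allow(t) for t in burst)
print(fixed_accepted, sliding_accepted)  # 10 5
```

The fixed window accepts all ten requests within 0.2 s because the counter resets at t = 10, while the sliding log accepts only five.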
Reach for it when
- API abuse prevention
- Fair multi-tenant usage
- DDoS mitigation
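For the multi-tenant case, one common approach is to keep a separate bucket per key so that a noisy client exhausts only its own allowance. The sketch below assumes an in-memory dictionary and hypothetical names (`PerKeyLimiter`, `allow`, keys such as `"tenant-a"`); a real gateway would likely need shared storage and eviction of idle keys, which are left out here.

```python
import time
from collections import defaultdict


class PerKeyLimiter:
    """One token bucket per key (IP, user, or API key)."""

    def __init__(self, capacity, refill_per_sec, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.clock = clock
        # key -> [tokens, last_seen]; unseen keys start with a full bucket.
        self.buckets = defaultdict(lambda: [float(capacity), clock()])

    def allow(self, key):
        bucket = self.buckets[key]
        now = self.clock()
        tokens, last = bucket
        # Credit tokens earned since this key was last seen, capped at capacity.
        tokens = min(self.capacity, tokens + (now - last) * self.refill_per_sec)
        allowed = tokens >= 1
        bucket[0] = tokens - 1 if allowed else tokens
        bucket[1] = now
        return allowed
```

With `capacity=2`, a third immediate request from `"tenant-a"` is refused while the first request from `"tenant-b"` still succeeds.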