Getting Started with Rate Limiting
Rate limiting caps how many requests a client can make to your API within a time
window. It protects your backend from traffic spikes, enforces fair usage across
consumers, and supports tiered access for different customer plans. When a
client exceeds the configured limit, they receive a 429 Too Many Requests
response with a Retry-After header indicating when they can retry.
This guide walks you through picking a rateLimitBy strategy, adding the policy
to a route, and testing it end to end. For the sliding window algorithm,
every rateLimitBy mode in detail, and the full set of configuration levers,
read How Rate Limiting Works alongside or after this guide.
Choose an approach
Pick a rateLimitBy mode based on what your API looks like today. If you are
not sure, start from the first row that matches and follow the linked guide or
section below.
| Use case | rateLimitBy | Policy | Learn more |
|---|---|---|---|
| Public API with no authentication | ip | Rate Limiting | Follow the steps below |
| Authenticated API, same limit for every consumer | user | Rate Limiting | §5 Rate limit authenticated users |
| Tiered limits (free, pro, enterprise) from API key metadata | function | Rate Limiting with a custom function | Dynamic Rate Limiting |
| Tiered limits sourced from a database | function | Rate Limiting with a custom function | Per-user limits with a database |
| Single global cap on an expensive endpoint | all | Rate Limiting | How rate limiting works |
| Usage-based pricing counting multiple resources per request | user | Complex Rate Limiting (enterprise) | How rate limiting works |
rateLimitBy: "user" requires an authentication policy (such as API key or JWT
authentication) earlier in the route's policy pipeline. Without it, the rate
limit policy has no user to group requests by and returns an error. Section 5
below walks through the full authenticated setup.
For a definition of rateLimitBy, the sliding window algorithm, and the full
list of configuration options (mode, headerMode, throwOnFailure, and
more), see How Rate Limiting Works.
Prerequisites
- An existing Zuplo project with at least one route configured in
  config/routes.oas.json.
- The Zuplo CLI installed, or access to the Zuplo Portal.
- To test rate limiting locally, the project must be linked to a Zuplo
  environment. Run npx zuplo link once in the project directory and select an
  environment. Rate limiting uses a globally distributed counter service, so an
  unlinked local project cannot enforce limits. See Connecting to Zuplo
  Services Locally for more detail.
1. Add the policy
Open config/policies.json and add a rate limiting policy to the policies
array. This example limits each IP address to 2 requests per minute, which makes
it easy to test.
config/policies.json
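A minimal entry might look like the following sketch. The handler export and
module follow the standard Zuplo policy shape, but treat this as illustrative
and check the Rate Limiting policy reference for the exact fields in your
runtime version:

```json
{
  "policies": [
    {
      "name": "rate-limit-inbound",
      "policyType": "rate-limit-inbound",
      "handler": {
        "export": "RateLimitInboundPolicy",
        "module": "$import(@zuplo/runtime)",
        "options": {
          "rateLimitBy": "ip",
          "requestsAllowed": 2,
          "timeWindowMinutes": 1
        }
      }
    }
  ]
}
```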
The key options are:
- rateLimitBy -- How to group requests into rate limit buckets. "ip" groups by
  the caller's IP address and requires no authentication.
- requestsAllowed -- The maximum number of requests allowed in the time window.
- timeWindowMinutes -- The length of the sliding time window in minutes.
If your project already has other policies in config/policies.json, add the
rate limiting entry to the existing policies array rather than replacing it.
The name field (rate-limit-inbound above) is what scopes the counter. Every
route that references this exact name shares the same counter. If you later copy
this policy block to create a second limit, change the name — a forgotten
rename silently merges two unrelated limits into one. Policy names must also
match exactly between config/policies.json and config/routes.oas.json; a
typo there causes the policy to be skipped without any error. See
Counter scoping for the full rules.
2. Attach the policy to a route
Open config/routes.oas.json and add the policy name to the policies.inbound
array inside the x-zuplo-route object of the route you want to protect.
config/routes.oas.json
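As a sketch, where the /products path, summary, and forwarding handler are
hypothetical and only the policies.inbound entry is the part this step adds:

```json
{
  "paths": {
    "/products": {
      "get": {
        "summary": "List products",
        "x-zuplo-route": {
          "corsPolicy": "none",
          "handler": {
            "export": "urlForwardHandler",
            "module": "$import(@zuplo/runtime)",
            "options": {
              "baseUrl": "https://my-backend.example.com"
            }
          },
          "policies": {
            "inbound": ["rate-limit-inbound"]
          }
        }
      }
    }
  }
}
```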
The "rate-limit-inbound" string must match the name field from the policy
you defined in config/policies.json. When a request hits this route, Zuplo
runs each inbound policy in array order before forwarding to the handler.
You can attach the same policy to multiple routes. Add its name to the
policies.inbound array on each route that needs rate limiting.
3. Test the rate limit
Start your local dev server (or deploy to a Zuplo environment) and send requests
to the protected route. With the configuration above, the third request within a
one-minute window returns a 429 response.
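One way to exercise the limit, assuming the gateway is running locally on the
default dev port (substitute your own URL and route):

```shell
# Send three requests in a row; with requestsAllowed: 2, the third
# should be rejected with HTTP 429.
for i in 1 2 3; do
  curl -s -o /dev/null -w "request $i: %{http_code}\n" \
    "http://localhost:9000/products"
done
```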
The first two requests return a 200 response from your upstream service. The
third request returns a 429 Too Many Requests response in
Problem Details format:
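The body is shaped roughly like this; the field values are illustrative, and
the exact detail text may differ in your deployment:

```json
{
  "type": "https://httpproblems.com/http-status/429",
  "title": "Too Many Requests",
  "status": 429,
  "detail": "Rate limit exceeded"
}
```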
The response also includes a Retry-After header with the number of seconds
until the client can send another request (for example, Retry-After: 42).
4. Choose production limits
The requestsAllowed: 2 value above exists so the limit triggers on your third
curl. Production APIs need numbers that reflect real usage. There is no single
right answer, but these reference points from widely used APIs are a useful
starting point:
| API | Typical per-consumer limit |
|---|---|
| Stripe | 100 read and 100 write requests per second per account |
| GitHub | 5,000 authenticated requests per hour per user |
| Twilio | 100 requests per second per account (varies by resource) |
| Shopify | 40 requests per app per store (bucket refills at 2/second) |
When sizing your own limit, consider three inputs:
- What your backend can sustain. Start from a conservative fraction of your backend's measured capacity so that a single caller cannot exhaust it.
- What legitimate callers actually do. If p99 usage for your best customers is 10 requests per minute, a 100-per-minute limit leaves headroom without being permissive.
- How your customers are structured. Per-API-key limits usually give tighter control than per-IP; a single corporate IP can hide dozens of real users.
It is almost always easier to raise a limit in response to a support ticket than to lower one that customers have started relying on. When in doubt, start low, measure, and increase.
5. Rate limit authenticated users
IP-based limits are a good first layer, but they penalize every user behind a
shared NAT or corporate proxy. For an authenticated API, limit per consumer
instead. This requires an authentication policy earlier in the pipeline so that
request.user is populated before the rate limit policy runs.
The full policies configuration looks like this:
config/policies.json
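A sketch of the two policies follows. The rate limit policy name
rate-limit-by-user is a placeholder, and the API key options shown are
illustrative; consult the API Key Authentication reference for the exact
fields:

```json
{
  "policies": [
    {
      "name": "api-key-auth",
      "policyType": "api-key-inbound",
      "handler": {
        "export": "ApiKeyInboundPolicy",
        "module": "$import(@zuplo/runtime)",
        "options": {
          "allowUnauthenticatedRequests": false
        }
      }
    },
    {
      "name": "rate-limit-by-user",
      "policyType": "rate-limit-inbound",
      "handler": {
        "export": "RateLimitInboundPolicy",
        "module": "$import(@zuplo/runtime)",
        "options": {
          "rateLimitBy": "user",
          "requestsAllowed": 60,
          "timeWindowMinutes": 1
        }
      }
    }
  ]
}
```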
Attach both policies to the route, with authentication first so the rate limit policy has a user to group by:
config/routes.oas.json (excerpt)
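An excerpt of the route object, with authentication listed before the rate
limiter (the policy names here are placeholders that must match whatever you
defined in config/policies.json):

```json
"x-zuplo-route": {
  "policies": {
    "inbound": ["api-key-auth", "rate-limit-by-user"]
  }
}
```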
Create two API keys in the Zuplo Portal (or with the CLI) so you can verify that each consumer has its own counter. Then send requests with each key:
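A sketch of the verification, assuming a 60-requests-per-minute limit, a local
dev gateway, and two hypothetical environment variables KEY_A and KEY_B
holding the keys you created:

```shell
# Key A: requests 1-60 should succeed; request 61 should return 429.
for i in $(seq 1 61); do
  curl -s -o /dev/null -w "key A, request $i: %{http_code}\n" \
    -H "Authorization: Bearer $KEY_A" "http://localhost:9000/products"
done

# Key B has its own counter, so its first request should still return 200.
curl -s -o /dev/null -w "key B, request 1: %{http_code}\n" \
  -H "Authorization: Bearer $KEY_B" "http://localhost:9000/products"
```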
Requests 1–60 for key A return 200, request 61 returns 429, and the first
request for key B still returns 200. That confirms the counter is scoped to
each consumer, not shared across the API key pool.
See API Key Authentication for the
full walkthrough of creating and managing API keys. If you use JWT
authentication instead, replace the api-key-auth policy with your JWT policy —
the rate limit policy works the same way as long as request.user.sub is
populated.
Next steps
Understand the mechanics:
- How Rate Limiting Works — The sliding window algorithm, every rateLimitBy
  mode in detail, and advanced options like mode, headerMode, and
  throwOnFailure.
Customize the behavior:
- Dynamic Rate Limiting — Vary limits per caller using a custom TypeScript function (for example, higher limits for paid plans).
- Per-user limits with a database — An advanced example using ZoneCache and a database lookup to drive limits per customer.
Combine with other policies:
- Combining Policies — Stack per-minute and per-hour limits, pair rate limiting with quotas, and layer in monetization.
Operate in production:
- Monitoring and Troubleshooting — Observe limits in production, alert on silent failures, and diagnose unexpected 429s.
Reference:
- Rate Limiting policy reference — Every configuration option for the standard policy.
- Complex Rate Limiting policy reference — Multi-counter configuration for usage-based pricing (enterprise).