Rate Limits
The API limits how many requests each key can make over time to keep the platform fast and reliable for everyone. Build for these limits and your integration will degrade gracefully under load.
Rate limits are enforced per API key. Because the operating account is derived from the key itself, each agency key and sub-account key gets its own independent budget — traffic on one key never consumes another's allowance.
Limits
Each key is allowed a sustained rate of 100 requests per second. To absorb short spikes, the limiter uses a token bucket that permits a burst of up to 200 requests before sustained throttling kicks in. The bucket refills continuously at the sustained rate.
- Sustained: 100 requests/second per key.
- Burst: up to 200 requests in a brief window, drawing down the token bucket.
- Scope: limits apply independently to each key — agency and sub-account keys are metered separately.
Limits are evaluated on a rolling window. A request that would exceed the available budget is rejected with 429 Too Many Requests rather than queued.
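To make the burst and refill behavior concrete, here is a minimal client-side sketch of a token bucket using the documented numbers (100 tokens per second, capacity 200). It only illustrates the general technique. The class name `TokenBucket`, its parameters, and the injectable `clock` are assumptions for this example, and the server's actual limiter may differ in detail.

```python
import time


class TokenBucket:
    """Client-side model of a token bucket (illustrative, not the server's code)."""

    def __init__(self, rate=100.0, capacity=200.0, clock=time.monotonic):
        self.rate = rate            # tokens added per second (sustained rate)
        self.capacity = capacity    # maximum tokens held (burst size)
        self.clock = clock          # injectable so the bucket can be tested
        self.tokens = capacity      # start with a full bucket
        self.updated = clock()

    def _refill(self):
        # Add tokens for the time elapsed since the last check, up to capacity.
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now

    def try_acquire(self):
        """Return True if a request may be sent now, False if it would be throttled."""
        self._refill()
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

Calling `try_acquire()` before each request lets a client hold back locally instead of collecting 429 responses. With these numbers, a fully drained bucket takes 2 seconds to refill.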
Rate limit headers
Every response includes headers describing your current budget so you can throttle proactively instead of waiting to be rejected:
- X-RateLimit-Limit: the maximum number of requests permitted in the current window.
- X-RateLimit-Remaining: requests remaining in the current window before throttling.
- X-RateLimit-Reset: Unix timestamp (seconds) at which the window resets and the budget is replenished.
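One way to use these headers is to pause when the remaining budget runs low. The sketch below assumes the headers arrive as a plain dictionary of strings. The function name `seconds_to_pause` and the `low_water` threshold are hypothetical choices for this example, not part of the API.

```python
import time


def seconds_to_pause(headers, now=None, low_water=10):
    """Suggest a pause (in seconds) based on the X-RateLimit-* headers.

    Returns 0.0 when enough budget remains or the headers are missing.
    `low_water` is an assumed threshold; tune it for your workload.
    """
    now = time.time() if now is None else now
    try:
        remaining = int(headers["X-RateLimit-Remaining"])
        reset_at = int(headers["X-RateLimit-Reset"])
    except (KeyError, ValueError):
        # Headers absent or malformed: do not block on them.
        return 0.0
    if remaining > low_water:
        return 0.0
    # Budget is nearly spent: wait until the window resets.
    return max(0.0, reset_at - now)
```

A plain `dict` lookup is case-sensitive, so if your HTTP library does not already expose headers through a case-insensitive mapping, normalize the names first.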
429 responses
When you exceed the limit, the API responds with 429 Too Many Requests and the standard error envelope. A Retry-After header tells you how many seconds to wait before retrying.
Always read Retry-After and pause for at least that long. An immediate retry will simply be rejected again and counts against your budget.
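A retry loop that honors Retry-After might look like the sketch below. To keep it independent of any HTTP library, it takes a hypothetical `send` callable that performs one request and returns a `(status, headers, body)` tuple. That callable, the function name, and the `default_wait` fallback are all assumptions for this example.

```python
import time


def send_with_retry_after(send, max_attempts=5, sleep=time.sleep, default_wait=1.0):
    """Call send() until it returns a non-429 status or attempts run out.

    `send` is a hypothetical zero-argument callable returning
    (status, headers, body). The last response is returned either way.
    """
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429:
            return status, headers, body
        if attempt == max_attempts - 1:
            break  # out of attempts: give the 429 back to the caller
        try:
            # Retry-After is documented as a number of seconds.
            wait = float(headers.get("Retry-After", default_wait))
        except ValueError:
            wait = default_wait  # unparseable value: fall back
        sleep(max(0.0, wait))
    return status, headers, body
```

Capping the number of attempts keeps a persistently throttled client from looping forever. The caller still sees the final 429 and can decide what to do with it.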
Best practices
Design your client to stay comfortably within these limits:
- Exponential backoff: on a 429 (or 5xx), retry with increasing delays — e.g. 1s, 2s, 4s, 8s — plus a little random jitter to avoid thundering-herd retries.
- Cache: store responses that change infrequently (agent configs, contact lookups) and reuse them instead of re-fetching on every request.
- Batch: prefer list endpoints and bulk operations over many single-resource calls in a tight loop.
- Respect Retry-After: honor the Retry-After header and the X-RateLimit-* headers to slow down before you hit the wall, not after.
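The exponential backoff schedule from the first bullet can be sketched as a small generator. The function name `backoff_delays` and its parameters, including the half-second jitter ceiling, are assumptions for this example. Adjust them to suit your client.

```python
import random


def backoff_delays(retries=4, base=1.0, factor=2.0, max_jitter=0.5, rng=random.random):
    """Yield one delay per retry: base * factor**attempt plus random jitter.

    With the defaults the un-jittered delays are 1s, 2s, 4s, 8s, and each
    gets up to `max_jitter` extra seconds so clients do not retry in lockstep.
    """
    for attempt in range(retries):
        yield base * factor ** attempt + rng() * max_jitter
```

A client would sleep for each yielded delay between attempts and give up once the generator is exhausted. When a 429 carries a Retry-After value, waiting for the larger of the two delays keeps both rules satisfied.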
Need higher limits?