Rate limits ⏱️

The Cardda API enforces a global rate limit to keep the service fair and stable for everyone.

Limit

Window      Limit         Bucketed by
1 second    10 requests   company-id, falling back to API key, then user, then IP

The bucketing rule means: requests that share the same company-id are counted together regardless of which API key signed them. Requests without a company-id (a few endpoints; see The company-id header) are bucketed by API key. As a result, burst traffic from several machines targeting the same company can trip the limit even when each machine is well-behaved on its own.
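As a rough sketch of that fallback order (the field names here are illustrative, not the server's actual request representation):

```python
def bucket_key(request):
    """Pick the rate-limit bucket for a request, most specific first.

    Mirrors the documented fallback: company-id, then API key,
    then user, then client IP. Field names are illustrative.
    """
    if request.get("company_id"):
        return ("company", request["company_id"])
    if request.get("api_key"):
        return ("api_key", request["api_key"])
    if request.get("user"):
        return ("user", request["user"])
    return ("ip", request["ip"])
```

Note that two machines with different API keys but the same company-id land in the same bucket.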

Need higher limits for a specific integration (e.g. month-end batch jobs)? Email [email protected] with the company-id, the endpoint, and an estimated peak QPS. We grant case-by-case increases.

How rejections look

When you exceed the limit:

HTTP/1.1 429 Too Many Requests
Retry-After: 1
Content-Type: application/json

{
  "error": "rate_limit_exceeded",
  "message": "Too many requests. Please retry after 1 second.",
  "retry_after": 1
}

Always honor Retry-After. It is sent as an integer number of seconds.

Recommended client behavior

Exponential backoff with jitter

// Retry `fn` on 429, preferring the server's Retry-After value;
// otherwise back off exponentially, capped at 30s, plus up to 30% jitter.
async function carddaWithBackoff(fn, { maxAttempts = 6 } = {}) {
  let attempt = 0;
  while (true) {
    try {
      return await fn();
    } catch (err) {
      attempt += 1;
      if (err.status === 429 && attempt < maxAttempts) {
        const baseSec = Number(err.retryAfter ?? Math.min(2 ** attempt, 30));
        const jitter = Math.random() * 0.3 * baseSec;
        await new Promise(r => setTimeout(r, (baseSec + jitter) * 1000));
        continue;
      }
      throw err;
    }
  }
}

The same pattern in Python (`RateLimited` stands for whatever exception your HTTP layer raises on a 429, carrying the parsed `retry_after`):

import random, time

def cardda_with_backoff(fn, max_attempts=6):
    attempt = 0
    while True:
        try:
            return fn()
        except RateLimited as e:
            attempt += 1
            if attempt >= max_attempts:
                raise
            # Prefer the server's Retry-After; otherwise back off
            # exponentially, capped at 30 seconds, with up to 30% jitter.
            base = e.retry_after or min(2 ** attempt, 30)
            time.sleep(base + random.uniform(0, 0.3 * base))

Avoid hot-loops

When backfilling, prefer page sizes of 100 with _start += 100 over 100 concurrent requests of size 1. The latter is the fastest way to trip the limit and the slowest way to finish.
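A minimal sequential backfill loop, assuming a hypothetical `fetch_page(start, limit)` helper that returns an empty list once the data is exhausted:

```python
def backfill(fetch_page, page_size=100):
    """Fetch all records sequentially: one request per 100 records,
    instead of 100 concurrent size-1 requests."""
    records, start = [], 0
    while True:
        page = fetch_page(start, page_size)  # e.g. ?_start=...&_limit=...
        if not page:
            return records
        records.extend(page)
        start += page_size  # the _start += 100 pattern
```

At 10 requests/second, this fetches up to 1,000 records per second without ever tripping the limit.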

Coalesce parallel processes

If you have multiple workers, route Cardda calls through a single per-company token bucket on your side (any standard rate-limiter library works). This avoids the case where 10 workers each "feel responsible" and together produce 100 QPS.
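A minimal sketch of a per-company token bucket (in production, reach for a standard rate-limiter library; the defaults here assume the documented 10 requests/second):

```python
import time
from collections import defaultdict

class TokenBucket:
    """Refills `rate` tokens per second up to `capacity`;
    acquire() blocks until a token is available."""
    def __init__(self, rate=10.0, capacity=10.0):
        self.rate, self.capacity = rate, capacity
        self.tokens = capacity
        self.last = time.monotonic()

    def acquire(self):
        while True:
            now = time.monotonic()
            self.tokens = min(self.capacity,
                              self.tokens + (now - self.last) * self.rate)
            self.last = now
            if self.tokens >= 1:
                self.tokens -= 1
                return
            time.sleep((1 - self.tokens) / self.rate)

# One bucket per company-id, shared by all workers in this process.
buckets = defaultdict(TokenBucket)

def call_cardda(company_id, fn):
    buckets[company_id].acquire()
    return fn()
```

This only coalesces workers within one process; workers spread across machines would need a shared store (e.g. Redis) to enforce the same budget.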

Detection in your dashboards

Track these on your side to spot rate-limit pressure before it becomes an outage:

  • Count of 429 responses per minute, per company-id.
  • p50 / p95 latency on Cardda calls (a cluster of slow requests is often an early signal that you're queuing behind your own backoff timers).
  • Time spent sleeping in your backoff loop (this is "real money" — surface it).
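One way to track the first of these, assuming you can hook your response handling (class and method names here are illustrative):

```python
import time
from collections import Counter, defaultdict

class RateLimitMonitor:
    """Counts 429 responses per (company_id, minute) so per-company
    spikes are visible before they become an outage."""
    def __init__(self):
        self.counts = defaultdict(Counter)

    def record(self, company_id, status, now=None):
        if status == 429:
            minute = int((now or time.time()) // 60)
            self.counts[company_id][minute] += 1

    def last_minute(self, company_id, now=None):
        minute = int((now or time.time()) // 60)
        return self.counts[company_id][minute]
```

Export these counts to whatever metrics system you already run and alert when a company's per-minute 429 count stays above zero for several minutes.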

Related

  • Errors — full error catalog.
  • Pagination — the most common source of high-volume calls.