# Rate limits

Pivotal applies a sliding-window rate limit per API key. The default is **60 requests per minute** per live key. Test keys get **30/min**. Workspaces on a paid plan get **600/min** per key — open a ticket if you need more.

There is no global per-workspace cap on top of the per-key limit; create more keys (one per service is the usual pattern) if a single key is the bottleneck.

## Response headers

Every API response — whether it succeeded, errored, or got throttled — carries the current bucket state:

| Header                  | Meaning                                                   |
| ----------------------- | --------------------------------------------------------- |
| `X-RateLimit-Limit`     | Requests allowed in the current window                    |
| `X-RateLimit-Remaining` | Requests left in this window                              |
| `X-RateLimit-Reset`     | Unix epoch seconds when the window resets                 |
| `Retry-After`           | (429 only) seconds the client should wait before retrying |

Example:

```http
HTTP/1.1 200 OK
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 47
X-RateLimit-Reset: 1748275869
Content-Type: application/json
```
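Reading these headers into a typed value makes them easier to act on. The following is a minimal sketch, not part of any Pivotal SDK: `RateLimitState`, `HeaderReader`, and `readRateLimit` are hypothetical names, and the helper accepts anything with a `get(name)` method, which the standard `Headers` object satisfies.

```typescript
// Hypothetical shape for the three rate-limit headers on one response.
interface RateLimitState {
  limit: number;     // X-RateLimit-Limit
  remaining: number; // X-RateLimit-Remaining
  resetAt: Date;     // X-RateLimit-Reset, converted from epoch seconds
}

// Minimal structural type; the standard Headers class satisfies it.
type HeaderReader = { get(name: string): string | null };

// Parses one numeric header, returning null when it is absent or malformed.
function headerNumber(headers: HeaderReader, name: string): number | null {
  const raw = headers.get(name);
  if (raw === null || raw.trim() === "") return null;
  const value = Number(raw);
  return Number.isFinite(value) ? value : null;
}

// Returns null if any of the three headers is missing or non-numeric.
function readRateLimit(headers: HeaderReader): RateLimitState | null {
  const limit = headerNumber(headers, "X-RateLimit-Limit");
  const remaining = headerNumber(headers, "X-RateLimit-Remaining");
  const reset = headerNumber(headers, "X-RateLimit-Reset");
  if (limit === null || remaining === null || reset === null) return null;
  return { limit, remaining, resetAt: new Date(reset * 1000) };
}
```

Calling `readRateLimit(res.headers)` on a `fetch` response would yield the state for that response, or `null` if the headers are absent.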

## 429 responses

When the bucket is empty, Pivotal returns `429 Too Many Requests` with the standard error envelope:

```json
{
  "error": {
    "type": "rate_limit_error",
    "code": "rate_limited",
    "message": "Rate limit exceeded. Retry after 12 seconds."
  }
}
```

`Retry-After` tells you exactly how long to wait. Don't parse the message — use the header.

## Handling 429 in client code

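Wait for the `Retry-After` interval plus a little jitter, and fall back to exponential backoff if the header is missing. A sketch:

```typescript
const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Retries on 429, up to 5 retries (6 attempts in total).
async function call(req: Request, attempt = 0): Promise<Response> {
  // Send a clone so the original request (and its body) can be reused on retry.
  const res = await fetch(req.clone());
  if (res.status !== 429 || attempt >= 5) return res;

  // Prefer the server's Retry-After; otherwise back off exponentially (1 s, 2 s, 4 s, ...).
  const header = Number(res.headers.get("Retry-After"));
  const waitSeconds = Number.isFinite(header) && header > 0 ? header : 2 ** attempt;
  const jitter = Math.random() * 0.5; // 0–500 ms
  await sleep((waitSeconds + jitter) * 1000);
  return call(req, attempt + 1);
}
```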

Two points to keep in mind:

1. **Honor `X-RateLimit-Remaining`.** If it drops to 1 or 2, slow down before you hit zero.
2. **Keep the retry depth capped.** The snippet above gives up after five retries; beyond that something else is wrong and the 429 should bubble up to the caller.
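One way to act on the first point is to compute a pause from the headers before sending the next request. This is a rough sketch under stated assumptions: `pacingDelayMs` and its `lowWatermark` parameter are hypothetical, and spreading the remaining requests evenly until `X-RateLimit-Reset` is a simplification of how a sliding window actually refills.

```typescript
// Hypothetical helper: how long to pause before the next request.
// remaining         -> value of X-RateLimit-Remaining
// resetEpochSeconds -> value of X-RateLimit-Reset
// lowWatermark      -> start pacing once remaining is at or below this (assumed default: 2)
function pacingDelayMs(
  remaining: number,
  resetEpochSeconds: number,
  nowMs: number = Date.now(),
  lowWatermark = 2,
): number {
  // Plenty of headroom: send immediately.
  if (remaining > lowWatermark) return 0;

  // Spread what is left evenly across the time until the reset.
  const msUntilReset = Math.max(0, resetEpochSeconds * 1000 - nowMs);
  return Math.ceil(msUntilReset / Math.max(remaining, 1));
}
```

With 2 requests left and 10 seconds until the reset, the helper suggests a 5-second pause; with 47 left it returns 0 and the client proceeds at full speed.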

## Bulk operations

The API does not yet expose bulk endpoints. If you're moving thousands of records, see [Bulk operations](/api/guides/bulk-operations) for the recommended concurrency settings. Running 4–8 in-flight requests against a 60/min bucket while respecting `X-RateLimit-Remaining` is usually enough.
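A bounded number of in-flight requests can be sketched with a small worker pool. This is an illustration only, not the implementation from the Bulk operations guide: `runPool` is a hypothetical helper, and the `worker` callback is assumed to wrap your own request function, including its 429 handling.

```typescript
// Hypothetical helper: runs `worker` over `items` with at most `concurrency`
// calls in flight, and returns results in the same order as the input.
async function runPool<T, R>(
  items: readonly T[],
  concurrency: number,
  worker: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0; // index of the next item nobody has claimed yet

  // Each lane repeatedly claims the next unprocessed item until none remain.
  async function lane(): Promise<void> {
    while (next < items.length) {
      const i = next++;
      results[i] = await worker(items[i] as T, i);
    }
  }

  const laneCount = Math.max(1, Math.min(concurrency, items.length));
  await Promise.all(Array.from({ length: laneCount }, () => lane()));
  return results;
}
```

A call such as `runPool(records, 4, (record) => upload(record))`, where `upload` stands in for your own function, keeps four requests in flight at a time. The pool only bounds concurrency; it does not pace requests against the per-minute limit by itself.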

## What does NOT count

* A 401 from a missing or revoked key. It is rejected before the rate limiter sees it.
* Token refresh. There is none: authentication is a static Bearer key, so SDKs make no hidden refresh requests.

## What DOES count

* Successful 2xx responses.
* 4xx responses other than the 401 case above (validation, not found, conflict): they still spend a request slot.
* 5xx responses — same. The rate limiter doesn't know the request "shouldn't" have failed.

This means a buggy client looping on a 404 will burn its quota fast. Add a circuit breaker around any code that fails twice in a row against the same resource.
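A per-resource circuit breaker along those lines might look like the sketch below. Everything here is hypothetical: the `ResourceBreaker` class, the two-failure threshold taken from the sentence above, the 30-second cooldown, and the choice to leave 429s out of the failure count because the retry logic already handles them.

```typescript
// Hypothetical breaker: stops calls to a resource after consecutive failures.
class ResourceBreaker {
  private readonly threshold: number;
  private readonly cooldownMs: number;
  private readonly state = new Map<string, { failures: number; lastFailureAt: number }>();

  constructor(threshold = 2, cooldownMs = 30_000) {
    this.threshold = threshold;
    this.cooldownMs = cooldownMs;
  }

  // True while the resource has reached the threshold and the cooldown has not elapsed.
  isOpen(resource: string, nowMs: number = Date.now()): boolean {
    const entry = this.state.get(resource);
    if (!entry || entry.failures < this.threshold) return false;
    if (nowMs - entry.lastFailureAt >= this.cooldownMs) {
      this.state.delete(resource); // cooldown over: allow a fresh attempt
      return false;
    }
    return true;
  }

  // Records the outcome of one call. Any non-failure resets the streak.
  record(resource: string, status: number, nowMs: number = Date.now()): void {
    const failed = status >= 400 && status !== 429; // 429 is left to the retry logic
    if (!failed) {
      this.state.delete(resource);
      return;
    }
    const failures = (this.state.get(resource)?.failures ?? 0) + 1;
    this.state.set(resource, { failures, lastFailureAt: nowMs });
  }
}
```

Check `isOpen(resource)` before each request and call `record(resource, res.status)` after it; the resource key can be whatever identifies the target in your client, such as the request path.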