The Polymarket US API enforces rate limits to ensure fair usage and system stability.
Rate Limit
| Limit | Value |
|---|---|
| Requests per second | 20 |
| Scope | Per IP address |
Exceeding 20 requests per second from a single IP address will result in HTTP 429 Too Many Requests responses.
Rate Limit Response
When rate limited, the API returns:
```json
{
  "code": 8,
  "message": "rate limit exceeded",
  "details": []
}
```
HTTP Status: 429 Too Many Requests
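A client can recognize this response by its HTTP status and read the body for diagnostics. The sketch below is a minimal illustration; the function name `describe_error` is hypothetical, and it assumes the body is the JSON object shown above.

```python
import json


def describe_error(status_code, body_text):
    """Summarize an error response as (rate_limited, code, message)."""
    try:
        body = json.loads(body_text)
    except (TypeError, ValueError):
        body = {}
    if not isinstance(body, dict):
        body = {}
    # Decide on the HTTP status; the body is only used for diagnostics.
    rate_limited = status_code == 429
    return rate_limited, body.get("code"), body.get("message")
```

Keying the decision on the status code rather than the body keeps the check working even if the body is missing or malformed.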
Retry Strategy
When receiving a 429 response:
- Stop making requests immediately
- Wait 1 second before retrying
- Implement exponential backoff for repeated 429s
- Consider reducing your request rate
```python
import time

import requests


def make_request_with_retry(url, headers, max_retries=3):
    """GET `url`, backing off exponentially on HTTP 429 responses."""
    for attempt in range(max_retries):
        response = requests.get(url, headers=headers)
        if response.status_code != 429:
            return response
        wait_time = 2 ** attempt  # 1, 2, 4 seconds
        print(f"Rate limited. Waiting {wait_time}s...")
        time.sleep(wait_time)
    raise RuntimeError("Max retries exceeded")
```
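When several clients behind the same IP back off on the same schedule, they can all retry at the same instant. A common mitigation, not described on this page, is to cap the delay and add a small random jitter. The following is a sketch under those assumptions; `backoff_delay`, the 30-second cap, and the 25% jitter are illustrative choices, not documented values.

```python
import random


def backoff_delay(attempt, base=1.0, cap=30.0, jitter=0.25, rng=random.random):
    """Seconds to wait before retry number `attempt` (0-based).

    The delay doubles each attempt (1, 2, 4, ...), is capped at `cap`,
    and is then stretched by up to `jitter` (a fraction) at random.
    """
    ceiling = min(cap, base * (2 ** attempt))
    # Jitter only ever lengthens the wait, so the first retry
    # still waits at least `base` seconds.
    return ceiling * (1 + jitter * rng())
```

The result can replace the fixed `2 ** attempt` wait in a retry loop, for example `time.sleep(backoff_delay(attempt))`.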
Best Practices
Use Streaming Instead of Polling
The API is designed as a streaming-first system. Instead of repeatedly polling for updates, subscribe to real-time streams:
| Don’t Poll | Use Streaming Instead |
|---|---|
| Repeated calls to /v1/report/orders/search | CreateOrderSubscription gRPC stream |
| Repeated calls to /v1/positions | CreatePositionSubscription gRPC stream |
| Repeated calls to /v1/orderbook | CreateMarketDataSubscription gRPC stream |
Streaming connections don’t count against the REST rate limit. One streaming connection can replace hundreds of polling requests.
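To see how quickly polling consumes the REST budget, the arithmetic can be sketched as follows. The function name and the polling rates in the usage note are illustrative assumptions; only the limit of 20 requests per second comes from this page.

```python
REST_LIMIT_PER_SECOND = 20  # documented per-IP REST limit


def polling_budget_share(num_endpoints, polls_per_second_each):
    """Fraction of the per-IP REST budget consumed by steady polling."""
    return (num_endpoints * polls_per_second_each) / REST_LIMIT_PER_SECOND
```

For example, polling the three endpoints in the table twice per second each is 6 requests per second, so `polling_budget_share(3, 2)` comes to 30% of the budget before any other request is made.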
Cache Reference Data
Reference data (instruments, symbols, metadata) changes infrequently. Cache it locally:
```python
import time


class InstrumentCache:
    """Caches instruments locally and refreshes them every 5 minutes."""

    REFRESH_SECONDS = 300

    def __init__(self, api):
        # `api` is your API client; it must provide list_instruments().
        self.api = api
        self.instruments = {}
        self.last_refresh = None

    def get_instrument(self, symbol):
        if self._needs_refresh():
            self._refresh_instruments()
        return self.instruments.get(symbol)

    def _needs_refresh(self):
        if self.last_refresh is None:
            return True
        return (time.time() - self.last_refresh) > self.REFRESH_SECONDS

    def _refresh_instruments(self):
        response = self.api.list_instruments()
        # Rebuild the map so instruments removed upstream do not linger.
        self.instruments = {inst.symbol: inst for inst in response.instruments}
        self.last_refresh = time.time()
```
Batch Operations
Where possible, batch your operations instead of making individual requests:
- Use SearchOrders with filters instead of fetching orders one by one
- Use ListInstruments with symbol filters instead of individual lookups
- Subscribe to multiple symbols in a single streaming connection
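One way to prepare batched lookups is to deduplicate the symbols you need and split them into groups, so each group becomes the filter of a single request. This is a sketch only: `plan_batches` is a hypothetical helper, and the maximum number of symbols a filter accepts is not stated here, so `batch_size` is a placeholder you would need to confirm.

```python
def plan_batches(symbols, batch_size):
    """Deduplicate `symbols` (keeping order) and split them into batches."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    unique = list(dict.fromkeys(symbols))
    return [unique[i:i + batch_size] for i in range(0, len(unique), batch_size)]
```

Each returned list would then be sent as one filtered request instead of one request per symbol.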
Connection Limits
| Connection Type | Limit |
|---|---|
| REST requests | 20/sec per IP |
| gRPC/Connect streaming connections | 10 concurrent per account |
| Streaming ingress (client to server) | 20 messages/sec |
| WebSocket connections | Not supported (use gRPC/Connect) |
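To stay under the concurrent streaming limit, a client can guard stream creation with a counting semaphore. The sketch below assumes all of an account's streams are opened from one process; `StreamSlots` and `reserve` are hypothetical names, and the code does not open any connection itself.

```python
import threading
from contextlib import contextmanager

MAX_STREAMS = 10  # documented limit: concurrent streaming connections per account


class StreamSlots:
    """Tracks how many streaming connections this process holds open."""

    def __init__(self, limit=MAX_STREAMS):
        self._slots = threading.BoundedSemaphore(limit)

    @contextmanager
    def reserve(self, timeout=None):
        """Hold one slot for the lifetime of a streaming connection."""
        if not self._slots.acquire(timeout=timeout):
            raise RuntimeError("no streaming slots available")
        try:
            yield
        finally:
            self._slots.release()
```

Open each stream inside `with slots.reserve():` so the slot is returned when the stream closes, even on error. If the same account streams from several processes or hosts, this local counter alone is not enough.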
Endpoint-Specific Rate Limits
Some endpoints have stricter rate limits due to computational cost:
| Endpoint | Limit | Scope |
|---|---|---|
| /v1/valuations/positions | 5/min | Per account |
| /v1/valuations/positions/download | 5/min | Per account |
Valuation API Rate Limit: The Valuation API is rate-limited to 5 requests per minute per account. This API performs mark-to-market calculations across all positions and is intended for periodic reporting (e.g., end-of-day P&L), not real-time monitoring.
Streaming Ingress Rate Limit: Client-to-server messages on streaming connections (gRPC and Connect) are limited to 20 messages per second. This applies to requests sent by your client, not to server-pushed updates like market data. Messages exceeding this rate may be dropped or the connection may be throttled.
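A client can enforce the stricter endpoint limits locally before sending a request. The following sketch uses a sliding window per endpoint; `EndpointBudget` is a hypothetical name, and it assumes each endpoint in the table has its own budget of 5 requests per minute. If the two valuation endpoints share one budget, they should share one history instead.

```python
import time
from collections import deque

# Documented limits: (max requests, window in seconds).
ENDPOINT_LIMITS = {
    "/v1/valuations/positions": (5, 60),
    "/v1/valuations/positions/download": (5, 60),
}


class EndpointBudget:
    def __init__(self, limits=ENDPOINT_LIMITS, clock=time.monotonic):
        self._limits = dict(limits)
        self._clock = clock
        self._history = {path: deque() for path in self._limits}

    def try_acquire(self, path):
        """Record a call and return True if `path` is within its budget."""
        if path not in self._limits:
            return True  # no endpoint-specific limit listed for this path
        max_requests, window = self._limits[path]
        now = self._clock()
        calls = self._history[path]
        # Drop timestamps that have aged out of the window.
        while calls and calls[0] <= now - window:
            calls.popleft()
        if len(calls) >= max_requests:
            return False
        calls.append(now)
        return True
```

When `try_acquire` returns False, skip or defer the call rather than sending it and receiving a 429.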
Monitoring Your Usage
Track your request patterns to stay within limits:
```python
import time
from collections import deque


class RateLimiter:
    """Client-side sliding-window limiter (default: 20 requests per second)."""

    def __init__(self, max_requests=20, window_seconds=1):
        self.max_requests = max_requests
        self.window = window_seconds
        self.requests = deque()  # timestamps of recent requests

    def can_make_request(self):
        now = time.time()
        # Remove old requests outside the window
        while self.requests and self.requests[0] < now - self.window:
            self.requests.popleft()
        return len(self.requests) < self.max_requests

    def record_request(self):
        self.requests.append(time.time())

    def wait_if_needed(self):
        """Block until a request is allowed, then record it."""
        while not self.can_make_request():
            time.sleep(0.05)  # 50ms
        self.record_request()
```
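Alongside throttling, it can help to count how often requests are actually rejected. The sketch below tallies response status codes and reports the share of 429s; `UsageStats` is a hypothetical helper, not part of the API.

```python
from collections import Counter


class UsageStats:
    """Tallies HTTP status codes seen by the client."""

    def __init__(self):
        self.status_counts = Counter()

    def record(self, status_code):
        self.status_counts[status_code] += 1

    def rate_limited_fraction(self):
        """Share of recorded responses that were HTTP 429."""
        total = sum(self.status_counts.values())
        if total == 0:
            return 0.0
        return self.status_counts[429] / total
```

Call `record(response.status_code)` after each request; a fraction that stays above zero suggests the request rate should be lowered.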
Abuse Prevention
Patterns that may result in temporary or permanent restrictions:
- Sustained requests above the rate limit
- Polling for data available via streaming
- Requesting the same unchanged data repeatedly
- Automated retry loops without backoff
Abuse of the API may result in temporary or permanent restrictions on your API credentials. Contact onboarding@qcex.com if you need higher limits for legitimate use cases.
Troubleshooting Rate Limits
Consistently Hitting Limits
If you’re consistently receiving 429 errors, try the following:
- Reduce request frequency
- Batch multiple operations where possible
- Cache responses that don’t change frequently (reference data, instrument lists)
- Use streaming endpoints instead of polling
- Contact support to discuss higher rate limits for production use
Rate Limits Seem Inconsistent
Rate limits are enforced at multiple levels:
- Per IP address
- Per account (for certain endpoints)
- Per API scope
Different endpoints may have different limits. See the Endpoint-Specific Rate Limits section above.
Need Higher Limits
For production use cases requiring higher limits:
- Document your use case and expected volume
- Contact support at onboarding@qcex.com
- Provide environment (dev, preprod, prod)
- Specify which endpoints you need higher limits for