The Polymarket US API enforces rate limits to ensure fair usage and system stability.

Rate Limit

Limit | Value
Requests per second | 20
Scope | Per IP address
Exceeding 20 requests per second from a single IP address will result in HTTP 429 Too Many Requests responses.

Rate Limit Response

When rate limited, the API returns:
{
  "code": 8,
  "message": "rate limit exceeded",
  "details": []
}
HTTP Status: 429 Too Many Requests
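
A client can recognize this response from the status code and, where present, the JSON body shown above. The following is a minimal sketch; `rate_limit_message` is a hypothetical helper name and the `"HTTP 429"` fallback string is an illustrative choice, neither is part of the API.

```python
import json
from typing import Optional

RATE_LIMIT_STATUS = 429  # HTTP status documented above


def rate_limit_message(status_code: int, body: str) -> Optional[str]:
    """Return the error message if the response is a rate-limit error, else None."""
    if status_code != RATE_LIMIT_STATUS:
        return None
    try:
        payload = json.loads(body)
    except ValueError:
        # The status code alone identifies the condition; the body is extra detail.
        return "HTTP 429"
    if isinstance(payload, dict) and isinstance(payload.get("message"), str):
        return payload["message"]
    return "HTTP 429"
```

Checking the status code first means the helper still works if the body is empty or is not valid JSON.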

Retry Strategy

When receiving a 429 response:
  1. Stop making requests immediately
  2. Wait 1 second before retrying
  3. Implement exponential backoff for repeated 429s
  4. Consider reducing your request rate
import time

import requests

def make_request_with_retry(url, headers, max_retries=3):
    """GET `url`, retrying with exponential backoff on HTTP 429."""
    for attempt in range(max_retries):
        response = requests.get(url, headers=headers)

        if response.status_code != 429:
            return response

        # Skip the sleep after the final attempt; no retry follows it.
        if attempt < max_retries - 1:
            wait_time = 2 ** attempt  # 1s, 2s, 4s, ...
            print(f"Rate limited. Waiting {wait_time}s...")
            time.sleep(wait_time)

    raise RuntimeError("Max retries exceeded")
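
When several clients share one IP address and back off on the same schedule, they tend to retry at the same moment. A common refinement, not something this API requires, is to add a small random jitter on top of each exponential delay. The sketch below assumes a 1-second base, a 30-second cap, and up to 25% jitter; all three values and the name `backoff_delay` are illustrative.

```python
import random


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0,
                  jitter: float = 0.25) -> float:
    """Seconds to wait before retry number `attempt` (0-based).

    The delay is never shorter than the plain exponential value, so the
    first retry still waits at least `base` seconds.
    """
    exponential = min(cap, base * (2 ** attempt))
    # Add between 0% and `jitter` * 100% of the exponential delay.
    return exponential + random.uniform(0, jitter * exponential)
```

Passing the result to `time.sleep` in place of a fixed `2 ** attempt` spreads retries out without shortening the minimum wait.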

Best Practices

Use Streaming Instead of Polling

The API is designed as a streaming-first system. Instead of repeatedly polling for updates, subscribe to real-time streams:
Don’t Poll | Use Streaming Instead
Repeated calls to /v1/report/orders/search | CreateOrderSubscription gRPC stream
Repeated calls to /v1/positions | CreatePositionSubscription gRPC stream
Repeated calls to /v1/orderbook | CreateMarketDataSubscription gRPC stream
Streaming connections don’t count against the REST rate limit. One streaming connection can replace hundreds of polling requests.
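
To see how quickly polling uses up the REST budget, the arithmetic can be worked through in a few lines. The polling intervals below are hypothetical values chosen for illustration; only the 20 requests per second limit comes from this page.

```python
REST_LIMIT_PER_SEC = 20  # from the Rate Limit table

# Hypothetical polling intervals in seconds, chosen only for illustration.
polling_intervals = {
    "/v1/report/orders/search": 0.25,
    "/v1/positions": 0.5,
    "/v1/orderbook": 0.05,
}


def combined_rate(intervals):
    """Total requests per second when each endpoint is polled at its own interval."""
    return sum(1.0 / seconds for seconds in intervals.values())


rate = combined_rate(polling_intervals)
print(f"{rate:.0f} requests/sec against a limit of {REST_LIMIT_PER_SEC}")
```

With these assumed intervals, the order book poll alone consumes the full per-IP budget, so the other two polls push the client over the limit.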

Cache Reference Data

Reference data (instruments, symbols, metadata) changes infrequently. Cache it locally:
import time

class InstrumentCache:
    """Caches instruments in memory and refreshes them every 5 minutes."""

    REFRESH_SECONDS = 300

    def __init__(self, api):
        # `api` is your API client object; it must provide list_instruments()
        self.api = api
        self.instruments = {}
        self.last_refresh = None

    def get_instrument(self, symbol):
        if self._needs_refresh():
            self._refresh_instruments()
        return self.instruments.get(symbol)

    def _needs_refresh(self):
        if self.last_refresh is None:
            return True
        return (time.time() - self.last_refresh) > self.REFRESH_SECONDS

    def _refresh_instruments(self):
        response = self.api.list_instruments()
        # Rebuild the mapping so instruments no longer listed are dropped
        self.instruments = {inst.symbol: inst for inst in response.instruments}
        self.last_refresh = time.time()
Batch Operations

Where possible, batch your operations instead of making individual requests:
  • Use SearchOrders with filters instead of fetching orders one by one
  • Use ListInstruments with symbol filters instead of individual lookups
  • Subscribe to multiple symbols in a single streaming connection
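
The batching idea above can be sketched as grouping symbols so that one filtered request replaces many individual lookups. The batch size of 50 is an assumption (this page does not state a maximum filter size), and `fetch_batch` is a hypothetical callback standing in for whatever client call performs a filtered ListInstruments request.

```python
from typing import Callable, Dict, Iterable, Iterator, List


def chunked(items: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items` with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def fetch_instruments(
    symbols: Iterable[str],
    fetch_batch: Callable[[List[str]], Dict[str, object]],
    batch_size: int = 50,  # assumed value, not documented
) -> Dict[str, object]:
    """Fetch instruments in batches; one call to `fetch_batch` per batch."""
    unique = list(dict.fromkeys(symbols))  # drop duplicates, keep order
    result: Dict[str, object] = {}
    for batch in chunked(unique, batch_size):
        result.update(fetch_batch(batch))
    return result
```

Under these assumptions, looking up 120 symbols costs 3 requests instead of 120.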

Connection Limits

Connection Type | Limit
REST requests | 20/sec per IP
gRPC/Connect streaming connections | 10 concurrent per account
Streaming ingress (client to server) | 20 messages/sec
WebSocket connections | Not supported (use gRPC/Connect)
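
A client can guard against opening too many streams by tracking slots locally. The sketch below uses a semaphore sized to the 10-connection limit from the table; `StreamSlots` is a hypothetical name. Because the limit is per account, a guard like this only helps when all of an account's streams are opened from the same process.

```python
import threading

MAX_STREAMS = 10  # concurrent streaming connections per account, from the table


class StreamSlots:
    """Tracks how many streaming connections this process has open."""

    def __init__(self, limit: int = MAX_STREAMS):
        self._sem = threading.BoundedSemaphore(limit)

    def try_acquire(self) -> bool:
        """Reserve a slot without blocking; False means the limit is reached."""
        return self._sem.acquire(blocking=False)

    def release(self) -> None:
        """Return a slot after its stream has been closed."""
        self._sem.release()
```

Call `try_acquire()` before opening a stream and `release()` once it closes; a `False` result signals that an existing stream should be reused or closed first.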

Endpoint-Specific Rate Limits

Some endpoints have stricter rate limits due to computational cost:
Endpoint | Limit | Scope
/v1/valuations/positions | 5/min | Per account
/v1/valuations/positions/download | 5/min | Per account
Valuation API Rate Limit: The Valuation API is rate-limited to 5 requests per minute per account. This API performs mark-to-market calculations across all positions and is intended for periodic reporting (e.g., end-of-day P&L), not real-time monitoring.
Streaming Ingress Rate Limit: Client-to-server messages on streaming connections (gRPC and Connect) are limited to 20 messages per second. This applies to requests sent by your client, not to server-pushed updates like market data. Messages exceeding this rate may be dropped or the connection may be throttled.
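
A per-minute budget such as the Valuation API limit can be tracked on the client with a sliding window. The following is a sketch only: `SlidingWindowBudget` is a hypothetical name, and how the server measures its window is not documented here, so the client-side count is an approximation.

```python
import time
from collections import deque


class SlidingWindowBudget:
    """Allows at most `max_calls` within any `window_seconds` span."""

    def __init__(self, max_calls=5, window_seconds=60.0, clock=time.monotonic):
        self._max_calls = max_calls
        self._window = window_seconds
        self._clock = clock  # injectable so the class can be tested without sleeping
        self._calls = deque()

    def _evict(self, now):
        # Forget calls that have aged out of the window.
        while self._calls and now - self._calls[0] >= self._window:
            self._calls.popleft()

    def try_acquire(self) -> bool:
        """Record a call if budget remains; return False otherwise."""
        now = self._clock()
        self._evict(now)
        if len(self._calls) >= self._max_calls:
            return False
        self._calls.append(now)
        return True

    def seconds_until_free(self) -> float:
        """Time until the next call would be allowed (0.0 if allowed now)."""
        now = self._clock()
        self._evict(now)
        if len(self._calls) < self._max_calls:
            return 0.0
        return self._window - (now - self._calls[0])
```

The same class, constructed with `max_calls=20` and `window_seconds=1.0`, could pace client-to-server streaming messages.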

Monitoring Your Usage

Track your request patterns to stay within limits:
import time
from collections import deque

class RateLimiter:
    def __init__(self, max_requests=20, window_seconds=1):
        self.max_requests = max_requests
        self.window = window_seconds
        self.requests = deque()

    def can_make_request(self):
        # monotonic() is unaffected by system clock adjustments
        now = time.monotonic()
        # Remove old requests outside the window
        while self.requests and self.requests[0] < now - self.window:
            self.requests.popleft()
        return len(self.requests) < self.max_requests

    def record_request(self):
        self.requests.append(time.monotonic())

    def wait_if_needed(self):
        while not self.can_make_request():
            time.sleep(0.05)  # 50ms
        self.record_request()
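
To review request patterns after the fact, rather than throttle them, timestamps can be bucketed by second and compared against the limit. This is a sketch with a hypothetical `UsageMonitor` class; fixed one-second buckets only approximate however the server actually measures its window.

```python
from collections import Counter


class UsageMonitor:
    """Counts requests per wall-clock second to reveal bursts."""

    def __init__(self, limit_per_sec: int = 20):
        self.limit = limit_per_sec
        self.per_second = Counter()

    def record(self, timestamp: float) -> None:
        """Record one request sent at `timestamp` (seconds, e.g. time.time())."""
        self.per_second[int(timestamp)] += 1

    def peak(self) -> int:
        """Highest number of requests seen in any single second."""
        return max(self.per_second.values(), default=0)

    def seconds_over_limit(self) -> list:
        """Seconds (as integer timestamps) in which the limit was exceeded."""
        return sorted(s for s, n in self.per_second.items() if n > self.limit)
```

Logging `peak()` periodically shows how close a client runs to the limit before any 429 responses appear.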

Abuse Prevention

Patterns that may result in temporary or permanent restrictions:
  • Sustained requests above the rate limit
  • Polling for data available via streaming
  • Requesting the same unchanged data repeatedly
  • Automated retry loops without backoff
Abuse of the API may result in temporary or permanent restrictions on your API credentials. Contact onboarding@qcex.com if you need higher limits for legitimate use cases.

Troubleshooting Rate Limits

Consistently Hitting Limits

If you’re consistently receiving 429 errors, try the following:
  • Reduce request frequency
  • Batch multiple operations where possible
  • Cache responses that don’t change frequently (reference data, instrument lists)
  • Use streaming endpoints instead of polling
  • Contact support to discuss higher rate limits for production use

Rate Limits Seem Inconsistent

Rate limits are enforced at multiple levels:
  • Per IP address
  • Per account (for certain endpoints)
  • Per API scope
Different endpoints may have different limits. Check endpoint-specific limits above.
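
One way to keep these levels straight in client code is a small lookup table built from the limits listed on this page. The sketch below is an assumption-laden illustration: `Limit`, `ENDPOINT_LIMITS`, and `limit_for` are hypothetical names, per-scope limits are not modeled because this page gives no values for them, and whether the two valuation endpoints share one budget is not stated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limit:
    max_requests: int
    window_seconds: int
    scope: str  # "ip" or "account"


# Default REST limit: 20 requests per second per IP address.
DEFAULT_LIMIT = Limit(max_requests=20, window_seconds=1, scope="ip")

# Stricter endpoint-specific limits: 5 requests per minute per account.
ENDPOINT_LIMITS = {
    "/v1/valuations/positions": Limit(5, 60, "account"),
    "/v1/valuations/positions/download": Limit(5, 60, "account"),
}


def limit_for(path: str) -> Limit:
    """Return the documented limit for `path`, falling back to the default."""
    return ENDPOINT_LIMITS.get(path.rstrip("/"), DEFAULT_LIMIT)
```

A client can then pick the right local throttle per request instead of applying the 20 requests per second figure everywhere.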

Need Higher Limits

For production use cases requiring higher limits:
  1. Document your use case and expected volume
  2. Contact support at onboarding@qcex.com
  3. Provide environment (dev, preprod, prod)
  4. Specify which endpoints you need higher limits for
