
Rate Limits

Tier                 RPM (Requests/Min)   TPM (Tokens/Min)   Concurrency
Free (new signup)    20                   50,000             2
Starter              60                   200,000            5
Pro                  300                  1,000,000          20
Enterprise           Custom               Custom             Custom
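
One way to respect the Concurrency column on the client side is to cap the number of in-flight requests with a bounded thread pool. The following is a minimal sketch, assuming "Concurrency" means simultaneous in-flight requests; `TIER_CONCURRENCY` and `run_bounded` are hypothetical names used for illustration and are not part of the API.

```python
from concurrent.futures import ThreadPoolExecutor

# Concurrency limits copied from the tier table above (Enterprise is custom).
TIER_CONCURRENCY = {"free": 2, "starter": 5, "pro": 20}


def run_bounded(task, items, tier="starter"):
    """Run task(item) for every item, keeping at most the tier's limit in flight."""
    limit = TIER_CONCURRENCY[tier]
    # A pool with max_workers=limit never executes more than `limit` tasks at once.
    with ThreadPoolExecutor(max_workers=limit) as pool:
        # pool.map preserves the order of `items` in the returned results.
        return list(pool.map(task, items))
```

Here `task` would typically be a function that sends one request, so a call such as `run_bounded(send_one, prompts, tier="free")` keeps at most two requests open at a time.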

Every response includes rate limit headers:

x-ratelimit-limit-requests: 60
x-ratelimit-remaining-requests: 58
x-ratelimit-reset-requests: 45s
x-ratelimit-limit-tokens: 200000
x-ratelimit-remaining-tokens: 189000
x-ratelimit-reset-tokens: 30s
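
These headers can be turned into numbers that a client can act on. The sketch below parses them from any dict-like mapping of header names to values. It assumes reset values are durations such as `45s`, and also accepts compound forms like `1m30s`, which is an assumption beyond the samples shown above. `parse_duration` and `rate_limit_status` are hypothetical helper names.

```python
import re

# Seconds per unit; units other than "s" are an assumption about the format.
_UNIT_SECONDS = {"ms": 0.001, "s": 1.0, "m": 60.0, "h": 3600.0}


def parse_duration(value):
    """Convert a duration such as '45s' or '1m30s' to seconds."""
    parts = re.findall(r"(\d+(?:\.\d+)?)(ms|s|m|h)", value)
    if not parts:
        raise ValueError(f"unrecognized duration: {value!r}")
    return sum(float(number) * _UNIT_SECONDS[unit] for number, unit in parts)


def rate_limit_status(headers):
    """Summarize the x-ratelimit-* headers for requests and tokens."""
    status = {}
    for kind in ("requests", "tokens"):
        status[kind] = {
            "limit": int(headers[f"x-ratelimit-limit-{kind}"]),
            "remaining": int(headers[f"x-ratelimit-remaining-{kind}"]),
            "reset_seconds": parse_duration(headers[f"x-ratelimit-reset-{kind}"]),
        }
    return status
```

With the OpenAI Python client, one way to reach the headers is `client.chat.completions.with_raw_response.create(...)`, whose result exposes `.headers` alongside `.parse()` for the usual completion object. A plain `dict` is case-sensitive, so the lookup above expects lowercase header names when it is not given a case-insensitive mapping.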

When you exceed a rate limit, the API returns 429 Too Many Requests. Retry with exponential backoff plus random jitter, as in this example using the OpenAI Python client:

import time
import random

from openai import OpenAI, RateLimitError

client = OpenAI(
    api_key="tsn_live_xxx",
    base_url="https://api.tokensupernova.com/v1",
)


def chat_with_retry(messages, model="deepseek-chat", max_retries=5):
    """Call the chat completions endpoint, retrying on 429 responses."""
    for attempt in range(max_retries):
        try:
            return client.chat.completions.create(
                model=model,
                messages=messages,
            )
        except RateLimitError:
            # Out of retries: surface the error to the caller.
            if attempt == max_retries - 1:
                raise
            # Exponential backoff (1s, 2s, 4s, ...) plus up to 1s of random jitter.
            delay = (2 ** attempt) + random.uniform(0, 1)
            time.sleep(delay)
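
To see how long the retry loop above can wait in total, the jitter-free part of the schedule can be computed directly. This is a small worked sketch of the arithmetic; `base_delays` is a hypothetical helper, not part of the client.

```python
def base_delays(max_retries=5):
    """Return the jitter-free delay in seconds slept after each failed attempt."""
    # The final attempt re-raises instead of sleeping, so there are
    # max_retries - 1 sleeps in total.
    return [2 ** attempt for attempt in range(max_retries - 1)]


delays = base_delays(5)
print(delays)       # [1, 2, 4, 8]
print(sum(delays))  # 15
```

Each sleep adds between 0 and 1 second of jitter, so with the default `max_retries=5` the loop waits between 15 and 19 seconds in total before giving up.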

To stay under your limits:

  • Batch requests where possible instead of sending many single requests in quick succession
  • Cache responses for repeated prompts
  • Monitor the x-ratelimit-remaining-* headers and slow down before they reach zero
  • Upgrade your tier for production workloads
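
The caching suggestion can be sketched as a thin wrapper that keys responses on the model name and the message list. This is a minimal in-memory illustration, assuming that identical prompts may share one response. `cache_key` and `CachedChat` are hypothetical names, and the wrapped function is assumed to accept `(messages, model=...)`.

```python
import hashlib
import json


def cache_key(model, messages):
    """Build a stable key from the model name and the message list."""
    payload = json.dumps({"model": model, "messages": messages}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class CachedChat:
    """Wrap a chat function so identical prompts trigger only one API call."""

    def __init__(self, call):
        self._call = call   # any callable taking (messages, model=...)
        self._store = {}    # unbounded in-memory cache, for illustration only

    def __call__(self, messages, model="deepseek-chat"):
        key = cache_key(model, messages)
        if key not in self._store:
            self._store[key] = self._call(messages, model=model)
        return self._store[key]
```

A long-running service would likely want an eviction policy or an external store instead of a plain dict, and caching only makes sense where reusing an earlier answer is acceptable.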