Rate limits and usage tiers
When you use the Mistral API, your requests are subject to rate limits. These limits help us ensure fair usage, balance load, and prevent abuse.
Rate limits are set at the Organization level, meaning they apply across all Workspaces within your organization.
Visit Admin›Limits to see the current rate limits and usage tier for your Workspace.
How rate limits work
We enforce three types of limits:
- Requests per second (RPS): the maximum number of API requests you can send each second.
- Tokens per minute: the maximum number of tokens processed each minute (input and output tokens combined).
- Tokens per month: the maximum number of tokens you can consume each month.
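To stay under the first two limits, a client can track its own usage before sending each request. The following is a minimal sketch of a rolling-window throttle, not part of any Mistral SDK: the class `SlidingWindowLimiter`, the helper `try_send`, and the limit values used with them are all hypothetical, and the real limits are the ones shown for your account. The monthly cap is not modeled here.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `capacity` units within any rolling `window_s` seconds."""

    def __init__(self, capacity, window_s, clock=time.monotonic):
        self.capacity = capacity
        self.window_s = window_s
        self.clock = clock          # injectable so the limiter can be tested
        self.events = deque()       # (timestamp, cost) pairs still in the window
        self.used = 0               # total cost of the events in the window

    def _evict(self, now):
        # Drop events that have aged out of the rolling window.
        while self.events and now - self.events[0][0] >= self.window_s:
            _, cost = self.events.popleft()
            self.used -= cost

    def has_room(self, cost=1):
        """Return True if `cost` more units fit in the current window."""
        self._evict(self.clock())
        return self.used + cost <= self.capacity

    def record(self, cost=1):
        """Count `cost` units against the current window."""
        self.events.append((self.clock(), cost))
        self.used += cost


def try_send(rps_limiter, tpm_limiter, estimated_tokens):
    """Reserve one request and `estimated_tokens` tokens if both budgets allow it."""
    if rps_limiter.has_room(1) and tpm_limiter.has_room(estimated_tokens):
        rps_limiter.record(1)
        tpm_limiter.record(estimated_tokens)
        return True
    return False
```

A caller would build one limiter per limit, for example `SlidingWindowLimiter(2, 1.0)` for requests and `SlidingWindowLimiter(1000, 60.0)` for tokens, and wait or queue the request whenever `try_send` returns `False`. Because the token limit counts input and output together, `estimated_tokens` should include the expected completion length as well as the prompt.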
Plans and tiers
Rate limit tiers depend on your AI Studio plan:
Experiment plan (free)
The free API tier is intended for evaluation and prototyping only, so its rate limits are restrictive. To increase your limits, you must upgrade to a Scale plan.
Scale plan (pay-as-you-go)
The Scale plan gives you access to Tier 1 and above. Upgrade in Admin›Subscriptions.
Usage tiers
Once on the Scale plan, tier upgrades happen automatically based on your cumulative billed amount:
| Cumulative billing | Tier | Upgrade |
|---|---|---|
| $0 / €0 (Experiment plan) | Free | Limited rate limits for evaluation and prototyping |
| $0 / €0 (Scale plan) | Tier 1 | Automatic on plan upgrade |
| > $20 / €20 | Tier 2 | Automatic |
| > $100 / €100 | Tier 3 | Automatic |
| > $500 / €500 | Tier 4 | Automatic |
| > $2,000 / €2,000 | Higher limits | Contact support |
Cumulative billing is the total sum of all your invoices, not a monthly amount.
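The table can be read as a simple lookup from plan and cumulative billing to tier. The sketch below is illustrative only: the function names are hypothetical, the amount is assumed to be in a single currency (the dollar and euro thresholds share the same numbers), and the strict "greater than" comparison is taken literally from the table, so behavior exactly at a threshold is an assumption.

```python
# Thresholds from the table, highest first. An amount strictly above
# the threshold qualifies for the tier.
TIER_THRESHOLDS = [
    (500, "Tier 4"),
    (100, "Tier 3"),
    (20, "Tier 2"),
]

# Cumulative billing above this amount makes you eligible to ask support
# for limits beyond Tier 4. It does not change the tier by itself.
HIGHER_LIMITS_THRESHOLD = 2000


def usage_tier(cumulative_billed, on_scale_plan):
    """Return the tier name for a cumulative billed amount (all invoices summed)."""
    if not on_scale_plan:
        return "Free"  # Experiment plan
    for threshold, tier in TIER_THRESHOLDS:
        if cumulative_billed > threshold:
            return tier
    return "Tier 1"  # Scale plan with little or no billing yet


def can_request_higher_limits(cumulative_billed, on_scale_plan):
    """Return True if the account meets the billing threshold for a support request."""
    return on_scale_plan and cumulative_billed > HIGHER_LIMITS_THRESHOLD
```

For example, an account on the Scale plan whose invoices total 150 falls in Tier 3, because 150 is above 100 but not above 500.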
Request higher limits
To request limits beyond Tier 4, you must first reach Tier 4 and meet the required billing threshold (> $2,000 / €2,000). Then, contact support with:
- Your target requests per second
- The specific model you plan to use
- An estimate of tokens required per minute and per month
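One rough way to produce the token estimates is to derive them from the target request rate and an average request size. The sketch below assumes traffic runs at the target rate for a fixed number of hours per day, which overstates usage for bursty workloads. The function `estimate_limits` and all of its parameters are hypothetical, and the input values are placeholders to replace with your own measurements.

```python
def estimate_limits(target_rps, avg_input_tokens, avg_output_tokens,
                    active_hours_per_day, days_per_month=30):
    """Estimate tokens per minute and per month from a sustained request rate."""
    # Input and output tokens are counted together.
    tokens_per_request = avg_input_tokens + avg_output_tokens
    # Requests per second -> tokens per minute.
    tokens_per_minute = target_rps * 60 * tokens_per_request
    # Tokens per minute -> tokens per month, for the hours the service is busy.
    active_minutes_per_month = 60 * active_hours_per_day * days_per_month
    tokens_per_month = tokens_per_minute * active_minutes_per_month
    return {
        "requests_per_second": target_rps,
        "tokens_per_minute": tokens_per_minute,
        "tokens_per_month": tokens_per_month,
    }


if __name__ == "__main__":
    # Placeholder workload: 5 requests/s, 800 prompt + 200 completion tokens,
    # busy 8 hours a day.
    print(estimate_limits(5, 800, 200, 8))
```

With these placeholder inputs, each request uses 1,000 tokens, giving 300,000 tokens per minute and 4,320,000,000 tokens per month.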