Rate limits and usage tiers

When you use the Mistral API, your requests are subject to rate limits. These limits help us ensure fair usage, balance load, and prevent abuse.

Rate limits are set at the Organization level, meaning they apply across all Workspaces within your organization.

Visit Admin > Limits to see the current rate limits and usage tier for your Workspace.

How rate limits work

We enforce three types of limits:

  • Requests per second (RPS): the maximum number of API requests you can send each second.
  • Tokens per minute: throughput limit for token processing (input and output tokens combined).
  • Tokens per month: overall consumption cap.
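
One way to stay under the first two limits is to throttle on the client side before sending a request. The sketch below is only an illustration, not part of the Mistral SDK: the class name `SlidingWindowLimiter`, its parameters, and the limit values are hypothetical, and it does not track the monthly cap. It counts requests over a one-second window and tokens over a sixty-second window.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Client-side guard for a requests-per-second and a tokens-per-minute budget.

    Hypothetical helper for illustration; take the real limits from your account.
    """

    def __init__(self, max_rps, max_tpm, clock=time.monotonic):
        self.max_rps = max_rps
        self.max_tpm = max_tpm
        self.clock = clock             # injectable so the logic can be tested
        self.request_times = deque()   # timestamps from the last second
        self.token_events = deque()    # (timestamp, tokens) from the last minute
        self.tokens_in_window = 0

    def _expire(self, now):
        # Drop requests older than one second.
        while self.request_times and now - self.request_times[0] >= 1.0:
            self.request_times.popleft()
        # Drop token usage older than sixty seconds.
        while self.token_events and now - self.token_events[0][0] >= 60.0:
            _, tokens = self.token_events.popleft()
            self.tokens_in_window -= tokens

    def try_acquire(self, tokens):
        """Record the request and return True if it fits both budgets, else False."""
        now = self.clock()
        self._expire(now)
        if len(self.request_times) >= self.max_rps:
            return False
        if self.tokens_in_window + tokens > self.max_tpm:
            return False
        self.request_times.append(now)
        self.token_events.append((now, tokens))
        self.tokens_in_window += tokens
        return True
```

A caller would check `try_acquire(estimated_tokens)` before each API call and wait briefly when it returns `False`. Because the token count is an estimate made before the response arrives, it is safer to leave some headroom below the actual limit.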

Plans and tiers

Rate limit tiers depend on your AI Studio plan:

Experiment plan (free)

The free API tier is intended for evaluation and prototyping only and comes with restricted rate limits. To increase your limits, upgrade to a Scale plan.

Usage tiers

Once on the Scale plan, tier upgrades happen automatically based on your cumulative billed amount:

Cumulative billing          Tier            Upgrade
$0 / €0 (Experiment plan)   Free            Limited rate limits for evaluation and prototyping
$0 / €0 (Scale plan)        Tier 1          Automatic on plan upgrade
> $20 / €20                 Tier 2          Automatic
> $100 / €100               Tier 3          Automatic
> $500 / €500               Tier 4          Automatic
> $2,000 / €2,000           Higher limits   Contact support

Cumulative billing is the total of all your invoices to date, not a monthly amount.
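
To make the thresholds concrete, here is a small sketch that maps a list of invoice amounts to the tier described above. The function names and the list-of-invoices input are assumptions made for illustration; the actual tier is determined by the platform, not by client code. Amounts above $2,000 / €2,000 are reported separately, because limits beyond Tier 4 are granted through support rather than automatically.

```python
# Thresholds for automatic upgrades, in USD or EUR.
# A tier applies once the cumulative amount is strictly greater than its threshold.
AUTOMATIC_TIERS = [
    (500, "Tier 4"),
    (100, "Tier 3"),
    (20, "Tier 2"),
]
HIGHER_LIMITS_THRESHOLD = 2000


def usage_tier(invoice_amounts, on_scale_plan=True):
    """Return the tier name for the sum of all invoices (hypothetical helper)."""
    if not on_scale_plan:
        return "Free"
    cumulative = sum(invoice_amounts)
    for threshold, tier in AUTOMATIC_TIERS:
        if cumulative > threshold:
            return tier
    return "Tier 1"


def eligible_for_higher_limits(invoice_amounts):
    """True once cumulative billing exceeds the threshold for contacting support."""
    return sum(invoice_amounts) > HIGHER_LIMITS_THRESHOLD
```

For example, three invoices of 10, 15, and 90 add up to 115, which is above the Tier 3 threshold even though no single invoice exceeds 100.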

Request higher limits

To request limits beyond Tier 4, you must first reach Tier 4 and meet the required billing threshold (> $2,000 / €2,000). Then, contact support with:

  • Your target requests per second
  • The specific model you plan to use
  • An estimate of tokens required per minute and per month
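
A rough way to prepare these estimates is to derive them from expected traffic. The sketch below assumes you know your peak request rate, your average tokens per request (input and output combined), and your daily request volume. The function name, its parameters, and the 30-day month are all illustrative assumptions.

```python
def estimate_token_needs(peak_rps, avg_tokens_per_request, requests_per_day,
                         days_per_month=30):
    """Estimate per-minute and per-month token needs from expected traffic."""
    # Peak throughput: requests per second * tokens per request * 60 seconds.
    tokens_per_minute = peak_rps * avg_tokens_per_request * 60
    # Monthly volume: daily requests * tokens per request * days in the month.
    tokens_per_month = requests_per_day * avg_tokens_per_request * days_per_month
    return {
        "requests_per_second": peak_rps,
        "tokens_per_minute": tokens_per_minute,
        "tokens_per_month": tokens_per_month,
    }
```

With a peak of 5 requests per second, 1,200 tokens per request, and 100,000 requests per day, this gives 360,000 tokens per minute and 3.6 billion tokens per month.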