Rate limits and usage tiers

When you use the Mistral API, your requests are subject to rate limits. These limits help us ensure fair usage, balance load, and prevent abuse.

Rate limits are set at the Organization level, meaning they apply across all Workspaces within your organization.

Information

Visit Admin > Limits to see the current rate limits and usage tier for your Workspace.

How rate limits work

We enforce three types of limits:

  • Requests per second (RPS): the maximum number of API requests you can send each second.
  • Tokens per minute: the maximum number of tokens processed each minute (input and output tokens combined).
  • Tokens per month: the cap on total tokens consumed each month.
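
To stay under the per-second and per-minute limits, a client can track its own recent traffic before sending a request. The following is a minimal sketch, not part of any Mistral client library: the class name, its parameters, and the example limit values are hypothetical, and the real limits are the ones shown for your Workspace. It does not cover the monthly cap.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Client-side guard for a requests-per-second and a tokens-per-minute budget.

    Hypothetical helper for illustration; the limit values are supplied by the caller.
    """

    def __init__(self, max_rps, max_tokens_per_minute, clock=time.monotonic):
        self.max_rps = max_rps
        self.max_tpm = max_tokens_per_minute
        self.clock = clock
        self._requests = deque()  # timestamps of accepted requests
        self._tokens = deque()    # (timestamp, token_count) of accepted requests

    def _evict(self, now):
        # Drop entries that have left the 1-second and 60-second windows.
        while self._requests and now - self._requests[0] >= 1.0:
            self._requests.popleft()
        while self._tokens and now - self._tokens[0][0] >= 60.0:
            self._tokens.popleft()

    def try_acquire(self, tokens):
        """Return True and record the request if it fits both budgets."""
        now = self.clock()
        self._evict(now)
        if len(self._requests) >= self.max_rps:
            return False
        tokens_in_window = sum(count for _, count in self._tokens)
        if tokens_in_window + tokens > self.max_tpm:
            return False
        self._requests.append(now)
        self._tokens.append((now, tokens))
        return True
```

A caller would invoke `try_acquire(estimated_tokens)` before each API call and wait briefly when it returns `False`. The token count passed in should cover input and output tokens combined, since the per-minute limit counts both.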

Plans and tiers

Rate limit tiers depend on your Studio plan:

Warning

Free mode (default)

Free mode is enabled by default. Its rate limits are restrictive and intended for evaluation and prototyping. To increase your limits, upgrade to the Scale plan.

Usage tiers

Once on the Scale plan, tier upgrades happen automatically based on your cumulative billed amount:

Cumulative billing          Tier            Upgrade
$0 / €0 (Free mode)         Free            Limited rate limits for evaluation and prototyping
$0 / €0 (Scale plan)        Tier 1          Automatic on plan upgrade
> $20 / €20                 Tier 2          Automatic
> $100 / €100               Tier 3          Automatic
> $500 / €500               Tier 4          Automatic
> $2,000 / €2,000           Higher limits   Contact support

Cumulative billing is the total sum of all your invoices, not a monthly amount.
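
The tier thresholds can be read as a simple lookup on the sum of all invoices. The sketch below is illustrative only: the function name and its arguments are hypothetical, amounts are assumed to be in a single currency, and it assumes an account above the 2,000 threshold stays on Tier 4 until support grants higher limits.

```python
def usage_tier(invoice_totals, on_scale_plan):
    """Return (tier, may_request_higher_limits) for a list of invoice amounts.

    Hypothetical helper; thresholds are strict "greater than" comparisons.
    """
    if not on_scale_plan:
        return ("Free", False)

    # Cumulative billing is the sum of every invoice, not a monthly amount.
    cumulative = sum(invoice_totals)

    if cumulative > 500:
        tier = "Tier 4"
    elif cumulative > 100:
        tier = "Tier 3"
    elif cumulative > 20:
        tier = "Tier 2"
    else:
        tier = "Tier 1"

    # Above 2,000 the account may contact support for higher limits.
    return (tier, cumulative > 2000)
```

For example, two invoices of 60 each give a cumulative total of 120, which falls in Tier 3 even though neither invoice alone exceeds 100.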

Request higher limits

To request limits beyond Tier 4, you must first reach Tier 4 and meet the required billing threshold (> $2,000 / €2,000). Then, contact support with:

  • Your target requests per second
  • The specific model you plan to use
  • An estimate of tokens required per minute and per month
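
One rough way to produce the token estimates is to multiply the target request rate by an average request size. This is a back-of-the-envelope sketch: the function name, its parameters, and the sample figures are hypothetical, and it assumes the peak rate is sustained for all active hours, so it gives an upper bound.

```python
def estimate_token_needs(peak_rps, avg_tokens_per_request,
                         active_hours_per_day, days_per_month=30):
    """Return (tokens_per_minute, tokens_per_month) as upper-bound estimates.

    avg_tokens_per_request should count input and output tokens combined.
    """
    tokens_per_minute = peak_rps * 60 * avg_tokens_per_request
    active_minutes_per_month = 60 * active_hours_per_day * days_per_month
    tokens_per_month = tokens_per_minute * active_minutes_per_month
    return tokens_per_minute, tokens_per_month


# Example: 5 requests per second, 1,200 tokens per request, 8 active hours a day.
per_minute, per_month = estimate_token_needs(5, 1200, 8)
print(per_minute)  # 360000
print(per_month)   # 5184000000
```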