Known limitations
This page documents current limitations of the Mistral platform. We actively work to address these. Check the changelogs for updates.
Context window
| Model | Max context length |
|---|---|
| Mistral Small | 32,768 tokens |
| Mistral Medium | 32,768 tokens |
| Mistral Large | 131,072 tokens |
| Codestral | 32,768 tokens |
| Ministral 3B / 8B | 131,072 tokens |
| Pixtral Large | 131,072 tokens |
- Requests exceeding the model's context window return a `400 Bad Request` error.
- Token counts include both input and output tokens. Plan your `max_tokens` accordingly.
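Because input and output tokens share one window, the largest safe `max_tokens` is the window size minus the prompt's token count. The following is a minimal sketch of that arithmetic. The dictionary keys are illustrative labels rather than official API model identifiers, `max_output_tokens` is a hypothetical helper, and the prompt token count is assumed to come from your own tokenizer.

```python
# Context window sizes from the table above, keyed by illustrative labels
# (assumption: these keys are not official API model identifiers).
CONTEXT_WINDOWS = {
    "mistral-small": 32_768,
    "mistral-medium": 32_768,
    "mistral-large": 131_072,
    "codestral": 32_768,
    "ministral": 131_072,
    "pixtral-large": 131_072,
}


def max_output_tokens(model: str, prompt_tokens: int, requested: int) -> int:
    """Clamp a requested max_tokens so prompt + output fits the window."""
    window = CONTEXT_WINDOWS[model]
    remaining = window - prompt_tokens
    if remaining <= 0:
        raise ValueError(
            f"prompt uses {prompt_tokens} tokens; window is {window}"
        )
    return min(requested, remaining)
```

For example, a 30,000-token prompt on a 32,768-token model leaves at most 2,768 tokens for the response.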
Rate limits
Rate limits vary by subscription tier and model. When exceeded, the API returns 429 Too Many Requests.
- Requests per second and tokens per minute are enforced independently.
- Limits apply per API key, not per workspace.
- Batch processing does not count against real-time rate limits.
Tip: Check the X-RateLimit-Remaining response header to monitor your usage before hitting the limit.
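One way to act on these limits is to slow down when the remaining quota is low and to retry with backoff after a 429. The sketch below is an illustration, not an official client behavior: `should_throttle` and `backoff_delay` are hypothetical helpers, the header value is assumed to be an integer string, and exponential backoff with jitter is a general retry technique rather than something this page prescribes.

```python
import random
from typing import Mapping


def should_throttle(headers: Mapping[str, str], threshold: int = 1) -> bool:
    """Return True when the reported remaining quota is at or below threshold."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    raw = lowered.get("x-ratelimit-remaining")
    if raw is None:
        return False  # header absent: nothing to act on
    try:
        return int(raw) <= threshold
    except ValueError:
        return False  # assumption violated: value was not an integer


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Exponential backoff with full jitter, for retrying after a 429."""
    return random.uniform(0, min(cap, base * 2 ** attempt))
```

A caller would sleep for `backoff_delay(attempt)` seconds before retry number `attempt`, starting from 0.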
File uploads
- Maximum file size: 512 MB
- Supported formats for OCR: PDF, PNG, JPG, JPEG, TIFF, BMP, GIF, WEBP
- Uploaded files are retained for 30 days unless deleted earlier.
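Checking a file locally before uploading avoids a failed round trip. This is a minimal sketch: `upload_problems` is a hypothetical helper, and it assumes 1 MB means 1024 × 1024 bytes, which this page does not specify.

```python
from pathlib import Path
from typing import List

MAX_UPLOAD_BYTES = 512 * 1024 * 1024  # assumption: 1 MB = 1024 * 1024 bytes
OCR_EXTENSIONS = {
    ".pdf", ".png", ".jpg", ".jpeg", ".tiff", ".bmp", ".gif", ".webp",
}


def upload_problems(filename: str, size_bytes: int, for_ocr: bool = False) -> List[str]:
    """Return a list of reasons the file would be rejected (empty if none)."""
    problems = []
    if size_bytes > MAX_UPLOAD_BYTES:
        problems.append(f"{filename}: {size_bytes} bytes exceeds the 512 MB limit")
    suffix = Path(filename).suffix.lower()
    if for_ocr and suffix not in OCR_EXTENSIONS:
        problems.append(f"{filename}: '{suffix}' is not a supported OCR format")
    return problems
```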
Batch processing
- Maximum batch file size: 512 MB.
- Maximum requests per batch: 100,000.
- Batch jobs are processed asynchronously; completion time depends on queue depth and request complexity.
- Batch results are available for download for 24 hours after completion.
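A workload larger than either limit has to be split across several batch files. The sketch below shows one way to do that, assuming requests are serialized one JSON object per line and that 1 MB means 1024 × 1024 bytes. `split_batches` is a hypothetical helper and the request dictionaries are placeholders, not the actual batch request schema.

```python
import json
from typing import Any, Dict, Iterable, Iterator, List

MAX_REQUESTS_PER_BATCH = 100_000
MAX_BATCH_BYTES = 512 * 1024 * 1024  # assumption: 1 MB = 1024 * 1024 bytes


def split_batches(
    requests: Iterable[Dict[str, Any]],
    max_requests: int = MAX_REQUESTS_PER_BATCH,
    max_bytes: int = MAX_BATCH_BYTES,
) -> Iterator[List[str]]:
    """Yield lists of JSON lines, each list within both batch limits."""
    batch: List[str] = []
    batch_bytes = 0
    for request in requests:
        line = json.dumps(request)
        line_bytes = len(line.encode("utf-8")) + 1  # +1 for the newline
        if line_bytes > max_bytes:
            raise ValueError("a single request exceeds the batch size limit")
        # Start a new batch when adding this line would break either limit.
        if batch and (
            len(batch) >= max_requests or batch_bytes + line_bytes > max_bytes
        ):
            yield batch
            batch, batch_bytes = [], 0
        batch.append(line)
        batch_bytes += line_bytes
    if batch:
        yield batch
```

Each yielded list can be written to its own file with `"\n".join(batch) + "\n"`.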
Streaming
- Streaming connections time out after 10 minutes of inactivity.
- `stream_options.include_usage` must be explicitly set to receive token usage in stream events.
- Some client HTTP libraries may buffer streamed responses; ensure chunked transfer encoding is handled correctly.
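The sketch below illustrates how a client might accumulate streamed text and pick up the usage report when it arrives. It rests on several assumptions not stated on this page: that events arrive as server-sent `data: {...}` lines ending with `data: [DONE]`, that text is carried in `choices[].delta.content`, and that usage appears under a `usage` key. `collect_stream` is a hypothetical helper.

```python
import json
from typing import Any, Dict, Iterable, List, Optional, Tuple


def collect_stream(lines: Iterable[str]) -> Tuple[str, Optional[Dict[str, Any]]]:
    """Join streamed text deltas and return the usage object if one was sent."""
    text_parts: List[str] = []
    usage = None
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and other fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":  # assumed end-of-stream sentinel
            break
        event = json.loads(payload)
        for choice in event.get("choices", []):
            delta = choice.get("delta", {})
            text_parts.append(delta.get("content") or "")
        if event.get("usage"):
            usage = event["usage"]
    return "".join(text_parts), usage
```

If `usage` comes back as `None`, the request most likely did not set `stream_options.include_usage`.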
Function calling
- Maximum number of tools per request: 128.
- Tool descriptions are included in the token count. Long descriptions reduce available context for messages.
- Parallel function calls are supported but may return calls in any order.
- `tool_choice: "any"` forces a tool call but does not guarantee which tool is selected.
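Since parallel calls may arrive in any order, and the selected tool is not guaranteed, it is safer to key results by call identifier and to look tools up by name rather than by position. The sketch below assumes each tool call is a dictionary with an `id` and a `function` entry holding a `name` and a JSON-encoded `arguments` string; that shape, and the `run_tool_calls` helper, are assumptions for illustration.

```python
import json
from typing import Any, Callable, Dict, List


def run_tool_calls(
    tool_calls: List[Dict[str, Any]],
    registry: Dict[str, Callable[..., Any]],
) -> Dict[str, Any]:
    """Execute each requested tool and return results keyed by call id."""
    results: Dict[str, Any] = {}
    for call in tool_calls:
        name = call["function"]["name"]
        if name not in registry:
            raise KeyError(f"model requested unknown tool: {name}")
        # Assumption: arguments arrive as a JSON-encoded string.
        arguments = json.loads(call["function"]["arguments"])
        results[call["id"]] = registry[name](**arguments)
    return results
```

Keying by `id` lets each result be matched to its originating call regardless of the order in which the calls were returned.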
JSON mode
- When `response_format: {"type": "json_object"}` is set, the model always returns valid JSON.
- You must include "JSON" in the system or user prompt. Otherwise the model may produce an infinite whitespace stream.
- JSON mode does not guarantee adherence to a specific schema. Use function calling for structured outputs.
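Because JSON mode guarantees syntax but not shape, a client that depends on particular fields may want to check them after parsing. The sketch below is one lightweight way to do that with the standard library; `parse_json_reply` and the example field names are hypothetical.

```python
import json
from typing import Any, Dict


def parse_json_reply(text: str, required: Dict[str, type]) -> Dict[str, Any]:
    """Parse a JSON-mode reply and check required keys and their types."""
    data = json.loads(text)  # raises json.JSONDecodeError on malformed input
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    for key, expected_type in required.items():
        if key not in data:
            raise ValueError(f"missing key: {key}")
        if not isinstance(data[key], expected_type):
            raise ValueError(
                f"key {key!r} should be {expected_type.__name__}, "
                f"got {type(data[key]).__name__}"
            )
    return data
```

For example, `parse_json_reply(reply, {"name": str, "age": int})` raises `ValueError` when either field is missing or has the wrong type.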
Vision
- Maximum image size: 20 MB per image.
- Supported formats: PNG, JPG, JPEG, GIF, WEBP.
- Maximum images per request depends on the model and total token budget.
- Images are resized internally; very small images may lose detail.
Audio transcription
- Supported formats: WAV, MP3, FLAC, OGG, WEBM.
- Maximum audio duration: 60 minutes.
- Maximum file size: 500 MB.
- Transcription is optimized for clear speech; heavy background noise reduces accuracy.
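The duration limit can be checked locally before uploading. The sketch below covers WAV files only, using Python's standard `wave` module; other formats would need a third-party decoder. `wav_duration_seconds` and `within_duration_limit` are hypothetical helpers.

```python
import wave

MAX_AUDIO_SECONDS = 60 * 60  # the 60-minute limit listed above


def wav_duration_seconds(source) -> float:
    """Return the duration of a WAV file given a path or binary file object."""
    with wave.open(source, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def within_duration_limit(source) -> bool:
    """Return True when the WAV audio is no longer than the limit."""
    return wav_duration_seconds(source) <= MAX_AUDIO_SECONDS
```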
Regional availability
- The Mistral API is served from EU data centers by default.
- Some models may not be available in all regions. Check the models page for details.