Supported languages
Mistral's language models are trained on multilingual data and support a wide range of languages. The lists below cover the languages with strong expected performance. Because the models are trained on multilingual data, they can also perform well in languages not listed here.
Language models
| Region | Languages |
|---|---|
| European | French, German, Spanish, Portuguese, Italian, Dutch, Polish, Czech, Danish, Finnish, Greek, Norwegian, Romanian, Swedish, Croatian, Serbian, Ukrainian, Catalan, Breton, Russian |
| Middle Eastern | Arabic, Farsi, Hebrew, Turkish, Urdu |
| South Asian | Hindi, Bengali, Gujarati, Kannada, Marathi, Nepali, Punjabi, Tamil, Telugu |
| Southeast Asian | Indonesian, Lao, Malaysian, Tagalog, Thai, Vietnamese |
| East Asian | Chinese, Japanese, Korean |
OCR
| Region | Languages |
|---|---|
| European | English, French, German, Spanish, Portuguese, Italian, Dutch, Polish, Czech, Danish, Finnish, Greek, Hungarian, Norwegian, Romanian, Swedish, Serbian, Catalan, Ukrainian, Russian |
| Middle Eastern | Arabic, Hebrew, Persian |
| South Asian | Bengali, Gujarati, Hindi, Kannada, Marathi, Nepali, Punjabi, Tamil, Telugu |
| Southeast Asian | Indonesian, Tagalog, Thai, Vietnamese |
| East Asian | Chinese, Japanese, Korean |
| Central Asian | Armenian, Georgian, Turkish |
OCR performance is also good for additional languages, including Icelandic, Malayalam, Urdu, and Kazakh.
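If an application needs to check a language against these lists before routing a request, the tables can be encoded as a simple lookup. The following is a minimal sketch, not part of any Mistral SDK: the names `LANGUAGE_MODEL_LANGUAGES`, `OCR_LANGUAGES`, and `is_listed`, and the `"language_models"` / `"ocr"` keys, are hypothetical and chosen only for illustration. The language names are copied from the tables as written, so `"Farsi"` and `"Persian"` are treated as distinct strings.

```python
def _names(csv):
    """Split a comma-separated string into a set of case-folded language names."""
    return frozenset(name.strip().casefold() for name in csv.split(","))


# Languages listed for language models (copied from the first table).
LANGUAGE_MODEL_LANGUAGES = _names(
    "French, German, Spanish, Portuguese, Italian, Dutch, Polish, Czech, "
    "Danish, Finnish, Greek, Norwegian, Romanian, Swedish, Croatian, Serbian, "
    "Ukrainian, Catalan, Breton, Arabic, Farsi, Hebrew, Turkish, Urdu, "
    "Hindi, Bengali, Gujarati, Kannada, Marathi, Nepali, Punjabi, Tamil, "
    "Telugu, Indonesian, Lao, Malaysian, Tagalog, Thai, Vietnamese, "
    "Chinese, Japanese, Korean, Russian"
)

# Languages listed for OCR (copied from the second table).
OCR_LANGUAGES = _names(
    "English, French, German, Spanish, Portuguese, Italian, Dutch, Polish, "
    "Czech, Danish, Finnish, Greek, Hungarian, Norwegian, Romanian, Swedish, "
    "Serbian, Catalan, Ukrainian, Arabic, Hebrew, Persian, Bengali, Gujarati, "
    "Hindi, Kannada, Marathi, Nepali, Punjabi, Tamil, Telugu, Indonesian, "
    "Tagalog, Thai, Vietnamese, Chinese, Japanese, Korean, Russian, "
    "Armenian, Georgian, Turkish"
)

_LISTS = {"language_models": LANGUAGE_MODEL_LANGUAGES, "ocr": OCR_LANGUAGES}


def is_listed(language, product="language_models"):
    """Return True if `language` appears in the documented list for `product`.

    False only means the language is not in the list; it does not mean the
    language is unsupported.
    """
    return language.strip().casefold() in _LISTS[product]


if __name__ == "__main__":
    print(is_listed("Breton"))            # listed for language models
    print(is_listed("Hungarian", "ocr"))  # listed for OCR
```

A `False` result should be treated as "not listed" rather than "unsupported", since models can also perform well in languages outside these lists.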