Supported languages

Mistral's language models are trained on multilingual data and support a wide range of languages. The lists below reflect languages with strong expected performance. Because the training data is multilingual, the models may also perform well in languages not listed here.

Language models

Region | Languages
European | French, German, Spanish, Portuguese, Italian, Dutch, Polish, Czech, Danish, Finnish, Greek, Norwegian, Romanian, Swedish, Croatian, Serbian, Ukrainian, Catalan, Breton
Middle Eastern | Arabic, Farsi, Hebrew, Turkish, Urdu
South Asian | Hindi, Bengali, Gujarati, Kannada, Marathi, Nepali, Punjabi, Tamil, Telugu
Southeast Asian | Indonesian, Lao, Malaysian, Tagalog, Thai, Vietnamese
East Asian | Chinese, Japanese, Korean, Russian

OCR

Region | Languages
European | English, French, German, Spanish, Portuguese, Italian, Dutch, Polish, Czech, Danish, Finnish, Greek, Hungarian, Norwegian, Romanian, Swedish, Serbian, Catalan, Ukrainian
Middle Eastern | Arabic, Hebrew, Persian
South Asian | Bengali, Gujarati, Hindi, Kannada, Marathi, Nepali, Punjabi, Tamil, Telugu
Southeast Asian | Indonesian, Tagalog, Thai, Vietnamese
East Asian | Chinese, Japanese, Korean, Russian
Central Asian | Armenian, Georgian, Turkish

OCR performance is also good for additional languages, including Icelandic, Malayalam, Urdu, and Kazakh.
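
If an application needs to check a user's language against these lists before routing a request, the tables above could be encoded as a simple lookup. The following is a minimal sketch, not part of any Mistral SDK: the names SUPPORTED and is_listed and the capability keys "language_models" and "ocr" are hypothetical, chosen only for illustration. The language names are copied from the two tables as written.

```python
# Hypothetical lookup built from the two tables above.
# SUPPORTED, is_listed, and the capability keys are illustrative names,
# not identifiers from any Mistral library.
SUPPORTED = {
    "language_models": {
        "European": [
            "French", "German", "Spanish", "Portuguese", "Italian", "Dutch",
            "Polish", "Czech", "Danish", "Finnish", "Greek", "Norwegian",
            "Romanian", "Swedish", "Croatian", "Serbian", "Ukrainian",
            "Catalan", "Breton",
        ],
        "Middle Eastern": ["Arabic", "Farsi", "Hebrew", "Turkish", "Urdu"],
        "South Asian": [
            "Hindi", "Bengali", "Gujarati", "Kannada", "Marathi", "Nepali",
            "Punjabi", "Tamil", "Telugu",
        ],
        "Southeast Asian": [
            "Indonesian", "Lao", "Malaysian", "Tagalog", "Thai", "Vietnamese",
        ],
        "East Asian": ["Chinese", "Japanese", "Korean", "Russian"],
    },
    "ocr": {
        "European": [
            "English", "French", "German", "Spanish", "Portuguese", "Italian",
            "Dutch", "Polish", "Czech", "Danish", "Finnish", "Greek",
            "Hungarian", "Norwegian", "Romanian", "Swedish", "Serbian",
            "Catalan", "Ukrainian",
        ],
        "Middle Eastern": ["Arabic", "Hebrew", "Persian"],
        "South Asian": [
            "Bengali", "Gujarati", "Hindi", "Kannada", "Marathi", "Nepali",
            "Punjabi", "Tamil", "Telugu",
        ],
        "Southeast Asian": ["Indonesian", "Tagalog", "Thai", "Vietnamese"],
        "East Asian": ["Chinese", "Japanese", "Korean", "Russian"],
        "Central Asian": ["Armenian", "Georgian", "Turkish"],
    },
}


def is_listed(language: str, capability: str) -> bool:
    """Return True if `language` appears in the table for `capability`.

    The comparison ignores case and surrounding whitespace. A False result
    only means the language is not in the table; it may still work well.
    """
    wanted = language.strip().casefold()
    regions = SUPPORTED[capability]
    return any(
        wanted == name.casefold()
        for names in regions.values()
        for name in names
    )


if __name__ == "__main__":
    for lang in ("Hungarian", "Lao", "Tamil"):
        print(lang, is_listed(lang, "language_models"), is_listed(lang, "ocr"))
```

Because the lists reflect strong expected performance rather than a hard limit, a check like this is probably best used to flag languages for extra evaluation, not to reject requests outright.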