Apache 2.0 models

We open-source both pre-trained models and instruction-tuned models. These models are not tuned for safety, as we want to empower users to test and refine moderation for their own use cases. For safer models, follow our guardrailing tutorial.

License

All models on this page are released under the Apache 2.0 license.

Downloading

| Model | Download links | Features |
|---|---|---|
| Mistral-7B-v0.1 | Hugging Face<br>raw_weights (md5sum: 37dab53973db2d56b2da0a033a15307f) | - 32k vocabulary size<br>- Rope Theta = 1e4<br>- With sliding window |
| Mistral-7B-Instruct-v0.2 | Hugging Face<br>raw_weights (md5sum: fbae55bc038f12f010b4251326e73d39) | - 32k vocabulary size<br>- Rope Theta = 1e6<br>- No sliding window |
| Mistral-7B-v0.3 | Hugging Face<br>raw_weights (md5sum: 0663b293810d7571dad25dae2f2a5806) | - Extended vocabulary to 32768 |
| Mistral-7B-Instruct-v0.3 | Hugging Face<br>raw_weights (md5sum: 80b71fcb6416085bcb4efad86dfb4d52) | - Extended vocabulary to 32768<br>- Supports v3 Tokenizer<br>- Supports function calling |
| Mixtral-8x7B-v0.1 | Hugging Face | - 32k vocabulary size<br>- Rope Theta = 1e6 |
| Mixtral-8x7B-Instruct-v0.1 | Hugging Face<br>raw_weights (md5sum: 8e2d3930145dc43d3084396f49d38a3f) | - 32k vocabulary size<br>- Rope Theta = 1e6 |
| Mixtral-8x7B-v0.3 | Updated model coming soon! | - Extended vocabulary to 32768<br>- Supports v3 Tokenizer |
| Mixtral-8x7B-Instruct-v0.3 | Updated model coming soon! | - Extended vocabulary to 32768<br>- Supports v3 Tokenizer<br>- Supports function calling |
| Mixtral-8x22B-v0.1 | Hugging Face<br>raw_weights (md5sum: 0535902c85ddbb04d4bebbf4371c6341) | - 32k vocabulary size |
| Mixtral-8x22B-Instruct-v0.1 / Mixtral-8x22B-Instruct-v0.3 | Hugging Face<br>raw_weights (md5sum: 471a02a6902706a2f1e44a693813855b) | - 32768 vocabulary size |
| Mixtral-8x22B-v0.3 | raw_weights (md5sum: a2fa75117174f87d1197e3a4eb50371a) | - 32768 vocabulary size<br>- Supports v3 Tokenizer |
| Codestral-22B-v0.1 | Hugging Face<br>raw_weights (md5sum: 1ea95d474a1d374b1d1b20a8e0159de3) | - 32768 vocabulary size<br>- Supports v3 Tokenizer |
| Codestral-Mamba-7B-v0.1 | Hugging Face<br>raw_weights (md5sum: d3993e4024d1395910c55db0d11db163) | - 32768 vocabulary size<br>- Supports v3 Tokenizer |
| Mathstral-7B-v0.1 | Hugging Face<br>raw_weights (md5sum: 5f05443e94489c261462794b1016f10b) | - 32768 vocabulary size<br>- Supports v3 Tokenizer |
| Mistral-Nemo-Base-2407 | Hugging Face<br>raw_weights (md5sum: c5d079ac4b55fc1ae35f51f0a3c0eb83) | - 131k vocabulary size<br>- Supports tekken.json tokenizer |
| Mistral-Nemo-Instruct-2407 | Hugging Face<br>raw_weights (md5sum: 296fbdf911cb88e6f0be74cd04827fe7) | - 131k vocabulary size<br>- Supports tekken.json tokenizer<br>- Supports function calling |
| Mistral-Large-Instruct-2407 | Hugging Face<br>raw_weights (md5sum: fc602155f9e39151fba81fcaab2fa7c4) | - 32768 vocabulary size<br>- Supports v3 Tokenizer<br>- Supports function calling |
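
If you download a raw_weights archive, it is worth checking the file against the md5sum listed above before extracting it. Below is a minimal sketch using Python's standard hashlib; the archive name and checksum are placeholders, so substitute the file you actually downloaded and its md5sum from the table.

```python
import hashlib
from pathlib import Path

def md5sum(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the md5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Placeholder path: use whatever archive you downloaded.
archive = Path("mistral-7B-v0.3.tar")
# md5sum listed above for the Mistral-7B-v0.3 raw weights.
expected = "0663b293810d7571dad25dae2f2a5806"

actual = md5sum(archive)
print("checksum OK" if actual == expected else f"checksum mismatch: {actual}")
```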

Sizes

| Name | Number of parameters | Number of active parameters | Min. GPU RAM for inference (GB) |
|---|---|---|---|
| Mistral-7B-v0.3 | 7.3B | 7.3B | 16 |
| Mixtral-8x7B-v0.1 | 46.7B | 12.9B | 100 |
| Mixtral-8x22B-v0.3 | 140.6B | 39.1B | 300 |
| Codestral-22B-v0.1 | 22.2B | 22.2B | 60 |
| Codestral-Mamba-7B-v0.1 | 7.3B | 7.3B | 16 |
| Mathstral-7B-v0.1 | 7.3B | 7.3B | 16 |
| Mistral-Nemo-Instruct-2407 | 12B | 12B | 28 (bf16)<br>16 (fp8) |
| Mistral-Large-Instruct-2407 | 123B | 123B | 228 |
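
As a rough, unofficial sanity check on these figures: the weights alone take approximately parameter count times bytes per parameter, and the listed minimums add headroom for activations and the KV cache. The sketch below illustrates that back-of-the-envelope estimate; the numbers are approximations, not official requirements.

```python
# Rough, unofficial estimate of the memory needed just to hold model weights.
# The table's minimums are higher because inference also needs room for
# activations and the KV cache.
BYTES_PER_PARAM = {"fp32": 4, "bf16": 2, "fp16": 2, "fp8": 1}

def weight_memory_gb(params_billion: float, dtype: str = "bf16") -> float:
    """Approximate GB required to store the weights at the given precision."""
    return params_billion * BYTES_PER_PARAM[dtype]

print(weight_memory_gb(7.3, "bf16"))  # ~14.6 GB, in line with the 16 GB listed for Mistral-7B
print(weight_memory_gb(12, "fp8"))    # ~12 GB, in line with the 16 GB (fp8) listed for Mistral-Nemo
```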

How to run?

Check out mistral-inference, a Python package for running our models. You can install mistral-inference with:

pip install mistral-inference

To learn more about how to use mistral-inference, take a look at the README and dive into the Colab notebook below to get started:

Open In Colab
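
For a sense of what basic usage looks like once the weights and tokenizer are downloaded, here is a minimal sketch following the chat-completion example in the mistral-inference README. The model directory path is a placeholder, and exact module paths and signatures may differ between mistral-inference versions, so treat the README as authoritative.

```python
from mistral_common.protocol.instruct.messages import UserMessage
from mistral_common.protocol.instruct.request import ChatCompletionRequest
from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
from mistral_inference.generate import generate
from mistral_inference.transformer import Transformer

# Placeholder: directory where you downloaded/extracted the model files.
models_path = "mistral_models/7B-Instruct-v0.3"

# Load the v3 tokenizer and the model weights from the same folder.
tokenizer = MistralTokenizer.from_file(f"{models_path}/tokenizer.model.v3")
model = Transformer.from_folder(models_path)

# Build a chat request, tokenize it, and generate a completion.
request = ChatCompletionRequest(
    messages=[UserMessage(content="Explain the sliding window attention in one sentence.")]
)
tokens = tokenizer.encode_chat_completion(request).tokens

out_tokens, _ = generate(
    [tokens],
    model,
    max_tokens=128,
    temperature=0.0,
    eos_id=tokenizer.instruct_tokenizer.tokenizer.eos_id,
)
print(tokenizer.instruct_tokenizer.tokenizer.decode(out_tokens[0]))
```

If anything here does not match the version you installed, the README and the Colab notebook above show the currently recommended invocation.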