

Mistral 7B v0.1 is Mistral AI's first Large Language Model (LLM). An LLM is an artificial intelligence model trained on massive amounts of data that can generate coherent text and perform a range of natural language processing tasks.

The raw model weights are available for download from the documentation and from Hugging Face.

A Docker image that bundles vLLM, a fast Python inference server, with everything required to run our model is provided, so you can quickly spin up a completion API on any major cloud provider with NVIDIA GPUs.

Where to start?

If you want to deploy the Mistral AI LLM on your own infrastructure, check out the Quickstart. If you want to use the API served by an already deployed instance, go to the "Interacting with the model" page or to the API specification.
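As a rough illustration of what using the API of a deployed instance can look like, here is a minimal Python sketch. It assumes the instance exposes vLLM's OpenAI-compatible completions route (`/v1/completions`) and is reachable at `http://localhost:8000`; the host, port, and the helper names `build_completion_request` and `complete` are hypothetical and not taken from this documentation. Refer to the API specification for the authoritative request format.

```python
import json
import urllib.request

# Assumed values: adjust these to match your own deployment.
BASE_URL = "http://localhost:8000"   # hypothetical host and port
MODEL = "mistralai/Mistral-7B-v0.1"  # model id as published on Hugging Face


def build_completion_request(prompt, max_tokens=64, temperature=0.7):
    """Build (but do not send) a POST request for a text completion."""
    payload = {
        "model": MODEL,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        f"{BASE_URL}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request and return the text of the first completion."""
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request, timeout=60) as response:
        body = json.load(response)
    # Assumes an OpenAI-style response: {"choices": [{"text": ...}, ...]}
    return body["choices"][0]["text"]
```

With an instance running, calling `complete("The capital of France is")` should return the generated continuation as a string. Building the request separately from sending it makes it easy to inspect the payload before anything goes over the network.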

For local deployment on consumer-grade hardware, check out the llama.cpp project or Ollama.

Get Help

Join our Discord community to discuss our models and talk to our engineers. Alternatively, reach out to our sales team if you have enterprise needs or want more information about our products.


Mistral AI is committed to open source software development and welcomes external contributions. Please open a PR!