Azure AI

Mistral AI's open and commercial models can be deployed on your Azure subscription.

This page explains how to get started with Mistral Large deployed as an Azure AI endpoint. If you use the Mistral AI Python client, the switch should be a drop-in replacement: you only need to change the client parameters (endpoint URL, API key, model name).

Deploying Mistral Large

Mistral AI models can be deployed on Azure AI either as:

  • pay-as-you-go managed services billed on endpoint usage,
  • real-time endpoints with quota-based billing tied to the infrastructure you choose (available only for the open-weight models).

To deploy Mistral Large as a pay-as-you-go managed service, follow the instructions from the Azure AI documentation.

Querying the model

Once your model is deployed, and provided that you have the relevant permissions, querying it works much the same way as querying a Mistral AI platform endpoint.

To run the examples below, you will need to define the following environment variables:

  • AZURE_AI_MISTRAL_LARGE_ENDPOINT is your endpoint URL, which should be of the form https://your-endpoint.inference.ai.azure.com (without the /v1/chat/completions path, which is appended in the examples below).
  • AZURE_AI_MISTRAL_LARGE_KEY is your authentication key.

curl --location "$AZURE_AI_MISTRAL_LARGE_ENDPOINT/v1/chat/completions" \
  --header 'Content-Type: application/json' \
  --header "Authorization: Bearer $AZURE_AI_MISTRAL_LARGE_KEY" \
  --data '{
    "model": "azureai",
    "messages": [
      {
        "role": "user",
        "content": "What is the best French cheese?"
      }
    ]
  }'
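
As a sketch of the drop-in replacement mentioned above, the equivalent call with the Mistral AI Python client looks like the following. This assumes the mistralai v0.x package, where MistralClient accepts an endpoint keyword argument, and reuses the environment variables defined earlier; only the endpoint, API key, and model name differ from a call against the Mistral platform.

import os

from mistralai.client import MistralClient
from mistralai.models.chat_completion import ChatMessage

# Point the standard Mistral client at the Azure AI endpoint instead of
# the Mistral platform. The endpoint is the base URL; the client is
# expected to append the /v1/chat/completions path itself.
client = MistralClient(
    endpoint=os.environ["AZURE_AI_MISTRAL_LARGE_ENDPOINT"],
    api_key=os.environ["AZURE_AI_MISTRAL_LARGE_KEY"],
)

# On Azure AI, the deployed model is addressed as "azureai" rather than
# by its Mistral platform model name.
chat_response = client.chat(
    model="azureai",
    messages=[ChatMessage(role="user", content="What is the best French cheese?")],
)

print(chat_response.choices[0].message.content)

Streaming with client.chat_stream should work against the same endpoint in the same way.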

Going further

For other usage examples, you can also check the following notebooks: