Azure AI

Mistral AI's open and commercial models can be deployed on the Microsoft Azure AI cloud platform in two ways:

  • Pay-as-you-go managed services: Model-as-a-Service (MaaS) serverless API deployments, billed on endpoint usage. No GPU capacity quota is required for deployment.
  • Real-time endpoints: Quota-based deployments, billed according to the underlying GPU infrastructure you choose to deploy on.

As of today, the following models are available:

  • Mistral Large (24.11, 24.07)
  • Mistral Medium (25.05)
  • Mistral Small (25.03)
  • Mistral Document AI (25.05)
  • Mistral OCR (25.05)
  • Ministral 3B (24.10)
  • Mistral Nemo

For more details, visit the models page.

Getting Started

The following sections outline the steps to deploy and query a Mistral model on the Azure AI MaaS platform.

Deploying the Model

Follow the instructions in the Azure documentation to create a new deployment for the model of your choice. Once the model is deployed, note its endpoint URL and secret key.

Querying the Model

Deployed endpoints expose a REST API that you can query using Mistral's SDKs or plain HTTP calls.

Before running the examples below, ensure you:

  • Set the following environment variables:
    • AZUREAI_ENDPOINT: Your endpoint URL, which should be of the form https://your-endpoint.inference.ai.azure.com/v1/chat/completions.
    • AZUREAI_API_KEY: Your secret key.
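
A quick preflight check can catch a missing or malformed variable before any request is sent. The sketch below is illustrative only: `check_azureai_env` and `REQUIRED_VARS` are hypothetical names, not part of any Mistral or Azure SDK, and the only rule assumed for the endpoint is that it is a full https:// URL.

```python
import os
from urllib.parse import urlparse

# Names of the two variables described above.
REQUIRED_VARS = ("AZUREAI_ENDPOINT", "AZUREAI_API_KEY")


def check_azureai_env(env):
    """Return a list of problems found; an empty list means the setup looks usable."""
    problems = [f"{name} is not set" for name in REQUIRED_VARS if not env.get(name)]
    endpoint = env.get("AZUREAI_ENDPOINT", "")
    if endpoint:
        parsed = urlparse(endpoint)
        # Assumption: the endpoint is always a full https:// URL.
        if parsed.scheme != "https" or not parsed.netloc:
            problems.append("AZUREAI_ENDPOINT should be a full https:// URL")
    return problems


if __name__ == "__main__":
    for problem in check_azureai_env(os.environ):
        print("Configuration problem:", problem)
```

Passing `os.environ` checks the current shell; passing a plain dictionary makes the helper easy to exercise without touching the real environment.
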

# This code requires a virtual environment with the following packages: mistralai-azure>=1.0.0
import os

from mistralai_azure import MistralAzure

# Read the endpoint URL and secret key from the environment.
endpoint = os.environ.get("AZUREAI_ENDPOINT", "")
api_key = os.environ.get("AZUREAI_API_KEY", "")

client = MistralAzure(azure_endpoint=endpoint, azure_api_key=api_key)

# Send a single-turn chat request to the deployed model.
resp = client.chat.complete(
    messages=[
        {
            "role": "user",
            "content": "Who is the best French painter? Answer in one short sentence.",
        },
    ],
    model="azureai",
)

if resp:
    print(resp)

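
For the plain HTTP route, a minimal sketch using only the Python standard library might look like the following. It rests on several assumptions that should be checked against your deployment: that AZUREAI_ENDPOINT already holds the full chat completions URL (as in the form given above), that the endpoint accepts the secret key as a Bearer token in the Authorization header, and that the request body follows the usual chat completions JSON schema. The helper names `build_chat_request` and `ask` are hypothetical.

```python
import json
import os
import urllib.request


def build_chat_request(endpoint, api_key, prompt):
    """Build (but do not send) a POST request for a single-turn chat completion."""
    payload = {
        "model": "azureai",
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: the deployment accepts the secret key as a Bearer token.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def ask(endpoint, api_key, prompt, timeout=30):
    """Send the request and return the decoded JSON response body."""
    request = build_chat_request(endpoint, api_key, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    endpoint = os.environ.get("AZUREAI_ENDPOINT", "")
    api_key = os.environ.get("AZUREAI_API_KEY", "")
    if endpoint and api_key:
        question = "Who is the best French painter? Answer in one short sentence."
        print(ask(endpoint, api_key, question))
```

Separating request construction from sending keeps the headers and payload easy to inspect or log before anything goes over the network.
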
Going Further

For more details and examples, refer to the following resources: