Azure AI
Introduction
Mistral AI's open and commercial models can be deployed on the Microsoft Azure AI cloud platform in two ways:
- Pay-as-you-go managed services: Using Model-as-a-Service (MaaS) serverless API deployments billed on endpoint usage. No GPU capacity quota is required for deployment.
- Real-time endpoints: With quota-based billing tied to the underlying GPU infrastructure you choose to deploy.
This page focuses on the MaaS offering, where the following models are available:
- Mistral Large
- Mistral Small
- Mistral NeMo
For more details, visit the models page.
Getting started
The following sections outline the steps to deploy and query a Mistral model on the Azure AI MaaS platform.
Deploying the model
Follow the instructions in the Azure documentation to create a new deployment for the model of your choice. Once the model is deployed, take note of the endpoint URL and secret key.
Querying the model
Deployed endpoints expose a REST API that you can query using Mistral's SDKs or plain HTTP calls.
To run the examples below, set the following environment variables:
- AZUREAI_ENDPOINT: Your endpoint URL, of the form https://your-endpoint.inference.ai.azure.com. Do not include the /v1/chat/completions path here, since the cURL example below appends it.
- AZUREAI_API_KEY: Your secret key.
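Before running any of the examples, it can help to check that both variables are set. The following is a minimal sketch, not part of any Mistral or Azure SDK: the helper name load_azure_settings is hypothetical, and it assumes that the variable should hold the base URL, so it strips a trailing /v1/chat/completions if one is present.

```python
import os

# Path that the cURL example appends to the endpoint URL.
CHAT_PATH = "/v1/chat/completions"


def load_azure_settings(env=os.environ):
    """Return (endpoint, api_key) read from the environment.

    Hypothetical helper for illustration: raises RuntimeError when a
    variable is missing, and assumes the endpoint should be a base URL.
    """
    endpoint = env.get("AZUREAI_ENDPOINT", "").strip()
    api_key = env.get("AZUREAI_API_KEY", "").strip()

    missing = [
        name
        for name, value in (
            ("AZUREAI_ENDPOINT", endpoint),
            ("AZUREAI_API_KEY", api_key),
        )
        if not value
    ]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))

    # Drop a trailing slash, then the chat path if it was included.
    endpoint = endpoint.rstrip("/")
    if endpoint.endswith(CHAT_PATH):
        endpoint = endpoint[: -len(CHAT_PATH)]
    return endpoint, api_key
```

Calling load_azure_settings() with no arguments reads from the real process environment. Passing a plain dictionary instead makes the helper easy to check in isolation.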
cURL
curl --location "$AZUREAI_ENDPOINT/v1/chat/completions" \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer $AZUREAI_API_KEY" \
    --data '{
    "model": "azureai",
    "messages": [
        {
            "role": "user",
            "content": "Who is the best French painter? Answer in one short sentence."
        }
    ]
}'
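The same plain HTTP call can be made from Python without an SDK. The sketch below uses only the standard library and mirrors the cURL request: the URL path, headers, and "azureai" model name are taken from that example, while the function names build_chat_request and send_chat_request are hypothetical.

```python
import json
import urllib.request


def build_chat_request(endpoint, api_key, user_msg):
    """Build (but do not send) a POST request mirroring the cURL example."""
    payload = {
        "model": "azureai",
        "messages": [{"role": "user", "content": user_msg}],
    }
    return urllib.request.Request(
        url=endpoint.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send_chat_request(request):
    """Send the request and return the decoded JSON body as a dict."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Usage, assuming both environment variables are set:
#   import os
#   req = build_chat_request(
#       os.environ["AZUREAI_ENDPOINT"],
#       os.environ["AZUREAI_API_KEY"],
#       "Who is the best French painter? Answer in one short sentence.",
#   )
#   print(send_chat_request(req))
```

Keeping request construction separate from sending makes it possible to inspect the URL, headers, and body without calling the endpoint.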
Python
This code requires a virtual environment with the following package:
mistralai-azure>=1.0.0

import os

from mistralai_azure import MistralAzure

# Read the endpoint URL and secret key from the environment
endpoint = os.environ.get("AZUREAI_ENDPOINT", "")
api_key = os.environ.get("AZUREAI_API_KEY", "")

client = MistralAzure(azure_endpoint=endpoint, azure_api_key=api_key)

# Send a single-turn chat completion request
resp = client.chat.complete(
    messages=[
        {
            "role": "user",
            "content": "Who is the best French painter? Answer in one short sentence.",
        },
    ],
    model="azureai",
)

if resp:
    print(resp)
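The example above prints the whole response object. To keep only the reply text, something like the following sketch may help. It assumes a response already parsed from the REST API's JSON body, with the reply stored under choices, then message, then content. The helper name first_reply_text is hypothetical, and the sample dictionary is illustrative rather than real model output.

```python
from typing import Optional


def first_reply_text(response: dict) -> Optional[str]:
    """Return the content of the first choice, or None if there is none.

    Assumes the chat completion JSON shape:
    {"choices": [{"message": {"role": ..., "content": ...}}]}
    """
    choices = response.get("choices") or []
    if not choices:
        return None
    message = choices[0].get("message") or {}
    return message.get("content")


# Illustrative input only, not an actual model response.
sample = {
    "choices": [
        {"message": {"role": "assistant", "content": "example reply"}}
    ]
}
print(first_reply_text(sample))
```

With the SDK's response object, the analogous attribute path would presumably be resp.choices[0].message.content, but check the SDK's own types before relying on that.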
TypeScript
This code requires the following package: @mistralai/mistralai-azure (version >=1.0.0)

import { MistralAzure } from "@mistralai/mistralai-azure";

// Read the endpoint URL and secret key from the environment
const client = new MistralAzure({
    endpoint: process.env.AZUREAI_ENDPOINT || "",
    apiKey: process.env.AZUREAI_API_KEY || ""
});

// Send a single-turn chat completion request and log the first choice
async function chat_completion(user_msg: string) {
    const resp = await client.chat.complete({
        model: "azureai",
        messages: [
            {
                content: user_msg,
                role: "user",
            },
        ],
    });
    if (resp.choices && resp.choices.length > 0) {
        console.log(resp.choices[0]);
    }
}

chat_completion("Who is the best French painter? Answer in one short sentence.");
Going further
For more details and examples, refer to the following resources:
- Release blog post for Mistral Large 2 and Mistral NeMo.
- Azure documentation for MaaS deployment of Mistral models.
- Azure ML examples GitHub repository with several Mistral-based samples.