IBM watsonx.ai

Mistral AI's Large model is available on the IBM watsonx.ai platform, both as a fully managed SaaS solution and as an on-premises deployment.

Getting Started

The following sections outline the steps to query Mistral Large on the SaaS version of IBM watsonx.ai.

Prerequisites

The following items are required:

  • An IBM watsonx project (IBM_CLOUD_PROJECT_ID)
  • A Service ID with an access policy enabling the use of the Watson Machine Learning service.

To enable access to the API, ensure that:

  • Your Service ID has been added to the project as EDITOR.
  • You have generated an API key (IBM_CLOUD_API_KEY) for your Service ID.
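
With these in place, the snippets below read the API key from the environment. As a minimal, hypothetical sanity check (the variable names match those used in the snippets that follow):

import os

# Hypothetical sanity check: confirm both values are available before running
# the examples below. IBM_CLOUD_PROJECT_ID may also be hardcoded, as shown later.
IBM_CLOUD_PROJECT_ID = os.environ["IBM_CLOUD_PROJECT_ID"]
IBM_CLOUD_API_KEY = os.environ["IBM_CLOUD_API_KEY"]
print(f"Using watsonx project {IBM_CLOUD_PROJECT_ID}")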

Querying the Model (Chat Completion)

You can query Mistral Large using either IBM's SDK or plain HTTP calls; both approaches are shown below.

Warning

The examples below use the mistral-common Python package to properly format user messages with special tokens. Avoid passing raw strings or handling special tokens manually, as this may result in silent tokenization errors that degrade model output quality.

You will need to run your code in a virtual environment with the following packages (pinned versions are shown after this list):

  • httpx (tested with 0.27.2)
  • ibm-watsonx-ai (tested with 1.1.11)
  • mistral-common (tested with 1.4.4)
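
To reproduce the tested setup, you can pin these versions in a requirements.txt file:

httpx==0.27.2
ibm-watsonx-ai==1.1.11
mistral-common==1.4.4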

In the following snippet, your API key is first exchanged for an IAM token, and the model call then uses this token for authentication.

from ibm_watsonx_ai import Credentials
from ibm_watsonx_ai.foundation_models import ModelInference
from ibm_watsonx_ai.metanames import GenTextParamsMetaNames as GenParams
from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
from mistral_common.protocol.instruct.request import ChatCompletionRequest
from mistral_common.protocol.instruct.messages import UserMessage
import os
import httpx

IBM_CLOUD_REGIONS = {
    "dallas": "us-south",
    "london": "eu-gb",
    "frankfurt": "eu-de",
    "tokyo": "jp-tok"
}
IBM_CLOUD_PROJECT_ID = "xxx-xxx-xxx"  # Replace with your project ID

def get_iam_token(api_key: str) -> str:
    """Return an IAM access token generated from an API key."""
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    data = f"apikey={api_key}&grant_type=urn\:ibm\:params\:oauth\:grant-type\:apikey"
    resp = httpx.post(
        url="https://iam.cloud.ibm.com/identity/token",
        headers=headers,
        data=data,
    )
    token = resp.json().get("access_token")
    return token

def format_user_message(raw_user_msg: str) -> str:
    """Return a formatted prompt using the official Mistral tokenizer."""
    tokenizer = MistralTokenizer.v3()  # Use v3 for Mistral Large
    tokenized = tokenizer.encode_chat_completion(
        ChatCompletionRequest(
            messages=[UserMessage(content=raw_user_msg)],
            model="mistral-large"
        )
    )
    return tokenized.text

region = "frankfurt"  # Define your region here
api_key = os.environ["IBM_CLOUD_API_KEY"]
access_token = get_iam_token(api_key=api_key)
credentials = Credentials(
    url=f"https://{IBM_CLOUD_REGIONS[region]}.ml.cloud.ibm.com",
    token=access_token
)
params = {
    GenParams.MAX_NEW_TOKENS: 256,
    GenParams.TEMPERATURE: 0.0
}
model_inference = ModelInference(
    project_id=IBM_CLOUD_PROJECT_ID,
    model_id="mistralai/mistral-large",
    params=params,
    credentials=credentials,
)
user_msg_content = "Who is the best French painter? Answer in one short sentence."
resp = model_inference.generate_text(prompt=format_user_message(user_msg_content))
print(resp)
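
As an alternative to the SDK, the same generation can be performed with a plain HTTP call, as mentioned above. The sketch below reuses get_iam_token and format_user_message from the previous snippet; the /ml/v1/text/generation path and the version query parameter are based on IBM's public REST API and should be verified against the current watsonx.ai API reference.

# Hedged sketch: plain-HTTP call to the watsonx.ai text generation endpoint,
# reusing region, access_token, IBM_CLOUD_PROJECT_ID, and user_msg_content
# from the snippet above.
url = f"https://{IBM_CLOUD_REGIONS[region]}.ml.cloud.ibm.com/ml/v1/text/generation"
payload = {
    "input": format_user_message(user_msg_content),
    "model_id": "mistralai/mistral-large",
    "project_id": IBM_CLOUD_PROJECT_ID,
    "parameters": {"max_new_tokens": 256, "temperature": 0.0},
}
http_resp = httpx.post(
    url,
    params={"version": "2023-05-29"},  # API version date; check IBM's docs for the current value
    headers={
        "Authorization": f"Bearer {access_token}",
        "Content-Type": "application/json",
    },
    json=payload,
)
http_resp.raise_for_status()
print(http_resp.json()["results"][0]["generated_text"])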

Going Further

For more information and examples, check: