# IBM watsonx.ai

Mistral AI's Large model is available on the IBM watsonx.ai platform as a fully managed solution, as well as an on-premises deployment.
## Getting Started
The following sections outline the steps to query Mistral Large on the SaaS version of IBM watsonx.ai.
### Prerequisites
The following items are required:

- An IBM watsonx project (`IBM_CLOUD_PROJECT_ID`)
- A Service ID with an access policy enabling the use of the Watson Machine Learning service.

To enable access to the API, ensure that:

- Your Service ID has been added to the project as `EDITOR`.
- You have generated an API key (`IBM_CLOUD_API_KEY`) for your Service ID.
### Querying the Model (Chat Completion)
You can query Mistral Large using either IBM's SDK or plain HTTP calls; a raw-HTTP sketch follows the SDK example below.
The examples below use the `mistral-common` Python package to properly format user messages with special tokens. Avoid passing raw strings or handling special tokens manually, as this may result in silent tokenization errors that degrade model output quality.
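To see what this formatting adds, you can print the templated prompt directly. The short sketch below uses the same `mistral-common` calls as the full example further down; the exact rendering of the special tokens depends on the tokenizer version.

```python
from mistral_common.protocol.instruct.messages import UserMessage
from mistral_common.protocol.instruct.request import ChatCompletionRequest
from mistral_common.tokens.tokenizers.mistral import MistralTokenizer

tokenizer = MistralTokenizer.v3()  # v3 is the tokenizer family used by Mistral Large
tokenized = tokenizer.encode_chat_completion(
    ChatCompletionRequest(
        messages=[UserMessage(content="Hello!")],
        model="mistral-large",
    )
)
# The templated prompt wraps the message in special tokens (e.g. the
# [INST] ... [/INST] instruction markers); this string is what gets sent
# as the prompt, rather than the raw user text.
print(tokenized.text)
```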
You will need to run your code from a virtual environment with the following packages:

- `httpx` (tested with `0.27.2`)
- `ibm-watsonx-ai` (tested with `1.1.11`)
- `mistral-common` (tested with `1.4.4`)
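For example, one way to set up such an environment with the tested versions pinned (the environment name here is arbitrary):

```
python -m venv watsonx-env
source watsonx-env/bin/activate
pip install "httpx==0.27.2" "ibm-watsonx-ai==1.1.11" "mistral-common==1.4.4"
```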
In the following snippet, your API key is exchanged for an IAM token, and the model call uses this token for authentication.
```python
from ibm_watsonx_ai import Credentials
from ibm_watsonx_ai.foundation_models import ModelInference
from ibm_watsonx_ai.metanames import GenTextParamsMetaNames as GenParams
from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
from mistral_common.protocol.instruct.request import ChatCompletionRequest
from mistral_common.protocol.instruct.messages import UserMessage

import os
import httpx

IBM_CLOUD_REGIONS = {
    "dallas": "us-south",
    "london": "eu-gb",
    "frankfurt": "eu-de",
    "tokyo": "jp-tok",
}

IBM_CLOUD_PROJECT_ID = "xxx-xxx-xxx"  # Replace with your project ID


def get_iam_token(api_key: str) -> str:
    """Return an IAM access token generated from an API key."""
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    # The colons in the grant type must be percent-encoded in the form body.
    data = f"apikey={api_key}&grant_type=urn%3Aibm%3Aparams%3Aoauth%3Agrant-type%3Aapikey"
    resp = httpx.post(
        url="https://iam.cloud.ibm.com/identity/token",
        headers=headers,
        data=data,
    )
    token = resp.json().get("access_token")
    return token


def format_user_message(raw_user_msg: str) -> str:
    """Return a formatted prompt using the official Mistral tokenizer."""
    tokenizer = MistralTokenizer.v3()  # Use v3 for Mistral Large
    tokenized = tokenizer.encode_chat_completion(
        ChatCompletionRequest(
            messages=[UserMessage(content=raw_user_msg)],
            model="mistral-large",
        )
    )
    return tokenized.text


region = "frankfurt"  # Define your region here
api_key = os.environ["IBM_CLOUD_API_KEY"]  # The API key generated for your Service ID
access_token = get_iam_token(api_key=api_key)
credentials = Credentials(
    url=f"https://{IBM_CLOUD_REGIONS[region]}.ml.cloud.ibm.com",
    token=access_token,
)

params = {
    GenParams.MAX_NEW_TOKENS: 256,
    GenParams.TEMPERATURE: 0.0,
}
model_inference = ModelInference(
    project_id=IBM_CLOUD_PROJECT_ID,
    model_id="mistralai/mistral-large",
    params=params,
    credentials=credentials,
)

user_msg_content = "Who is the best French painter? Answer in one short sentence."
resp = model_inference.generate_text(prompt=format_user_message(user_msg_content))
print(resp)
```
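If you prefer plain HTTP calls over IBM's SDK, you can send the same request directly to the watsonx.ai text generation REST endpoint. The sketch below reuses `get_iam_token`, `format_user_message`, and the variables defined above; the endpoint path and `version` date are assumptions to verify against IBM's watsonx.ai API reference.

```python
# Raw-HTTP variant of the same call, reusing the helpers and variables above.
# The endpoint path and the `version` date are assumptions; consult IBM's
# watsonx.ai API reference for the values that apply to your deployment.
generation_url = f"https://{IBM_CLOUD_REGIONS[region]}.ml.cloud.ibm.com/ml/v1/text/generation"

payload = {
    "input": format_user_message(user_msg_content),
    "model_id": "mistralai/mistral-large",
    "project_id": IBM_CLOUD_PROJECT_ID,
    "parameters": {"max_new_tokens": 256, "temperature": 0.0},
}
resp = httpx.post(
    url=generation_url,
    params={"version": "2023-05-29"},  # API version date (assumed)
    headers={
        "Authorization": f"Bearer {access_token}",
        "Content-Type": "application/json",
        "Accept": "application/json",
    },
    json=payload,
)
resp.raise_for_status()
print(resp.json()["results"][0]["generated_text"])
```

## Going Further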
For more information and examples, check:
- The IBM watsonx.ai Python SDK documentation
- This IBM Developer tutorial on using Mistral Large with IBM watsonx.ai flows engine.