IBM watsonx.ai
Introduction
Mistral AI's Mistral Large model is available on the IBM watsonx.ai platform, both as a fully managed (SaaS) solution and as an on-premise deployment.
Getting started
The following steps outline how to query Mistral Large on the SaaS version of IBM watsonx.ai.
Pre-requisites
The following items are required:
- An IBM watsonx project (`IBM_CLOUD_PROJECT_ID`)
- A Service ID with an access policy enabling the use of the Watson Machine Learning service.
To enable access to the API, you must make sure that:
- Your Service ID has been added to the project as `EDITOR`,
- You have generated an API key (`IBM_CLOUD_API_KEY`) for your Service ID (a quick validity check is sketched below).
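A minimal sketch of that check, which exchanges the API key for an IAM token (assuming the key is exported as the `IBM_CLOUD_API_KEY` environment variable):

```python
import os

import httpx

# Exchange the API key for an IAM access token; a successful response
# confirms that the key is valid. Assumes IBM_CLOUD_API_KEY is set.
resp = httpx.post(
    "https://iam.cloud.ibm.com/identity/token",
    headers={"Content-Type": "application/x-www-form-urlencoded"},
    data={
        "apikey": os.environ["IBM_CLOUD_API_KEY"],
        "grant_type": "urn:ibm:params:oauth:grant-type:apikey",
    },
)
resp.raise_for_status()
print("API key is valid; IAM token obtained.")
```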
Querying the model (chat completion)
You can query Mistral Large using either IBM's SDK or plain HTTP calls; both approaches are shown below.
The examples below leverage the `mistral-common` Python package to properly format user messages with special tokens. It is strongly recommended to avoid passing raw strings or handling special tokens manually: doing so might result in silent tokenization errors that would severely degrade the quality of the model output.
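For illustration, here is a minimal sketch of what the tokenizer produces for a single user message (assuming `mistral-common` is installed; the exact rendering of special tokens may vary across versions):

```python
from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
from mistral_common.protocol.instruct.messages import UserMessage
from mistral_common.protocol.instruct.request import ChatCompletionRequest

tokenizer = MistralTokenizer.v3()  # v3 matches Mistral Large
tokenized = tokenizer.encode_chat_completion(
    ChatCompletionRequest(
        messages=[UserMessage(content="Hello!")], model="mistral-large"
    )
)
# The formatted prompt wraps the message in instruction tokens,
# e.g. something like "<s>[INST] Hello! [/INST]".
print(tokenized.text)
```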
You will need to run your code from a virtual environment with the following packages installed:

- `httpx` (tested with `0.27.2`)
- `ibm-watsonx-ai` (tested with `1.1.11`)
- `mistral-common` (tested with `1.4.4`)
In the following snippet, your API key is used to generate an IAM token; the call to the model is then performed using this token for authentication.
```python
from ibm_watsonx_ai import Credentials
from ibm_watsonx_ai.foundation_models import ModelInference
from ibm_watsonx_ai.metanames import GenTextParamsMetaNames as GenParams
from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
from mistral_common.protocol.instruct.request import ChatCompletionRequest
from mistral_common.protocol.instruct.messages import UserMessage
import os
import httpx

IBM_CLOUD_REGIONS = {
    "dallas": "us-south",
    "london": "eu-gb",
    "frankfurt": "eu-de",
    "tokyo": "jp-tok",
}

IBM_CLOUD_PROJECT_ID = "xxx-xxx-xxx"  # Replace with your project id


def get_iam_token(api_key: str) -> str:
    """
    Return an IAM access token generated from an API key.
    """
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    data = f"apikey={api_key}&grant_type=urn:ibm:params:oauth:grant-type:apikey"
    resp = httpx.post(
        url="https://iam.cloud.ibm.com/identity/token",
        headers=headers,
        data=data,
    )
    resp.raise_for_status()  # Fail loudly if the key exchange was rejected
    return resp.json()["access_token"]


def format_user_message(raw_user_msg: str) -> str:
    """
    Return a formatted prompt using the official Mistral tokenizer.
    """
    tokenizer = MistralTokenizer.v3()  # Use v3 for Mistral Large
    tokenized = tokenizer.encode_chat_completion(
        ChatCompletionRequest(
            messages=[UserMessage(content=raw_user_msg)], model="mistral-large"
        )
    )
    return tokenized.text


region = "frankfurt"  # Define the region of your choice here
api_key = os.environ["IBM_CLOUD_API_KEY"]
access_token = get_iam_token(api_key=api_key)

# Authenticate against the regional watsonx.ai endpoint with the IAM token
credentials = Credentials(
    url=f"https://{IBM_CLOUD_REGIONS[region]}.ml.cloud.ibm.com",
    token=access_token,
)
params = {GenParams.MAX_NEW_TOKENS: 256, GenParams.TEMPERATURE: 0.0}

model_inference = ModelInference(
    project_id=IBM_CLOUD_PROJECT_ID,
    model_id="mistralai/mistral-large",
    params=params,
    credentials=credentials,
)

user_msg_content = "Who is the best French painter? Answer in one short sentence."
resp = model_inference.generate_text(prompt=format_user_message(user_msg_content))
print(resp)
```
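If you prefer plain HTTP calls over the SDK, the same generation can be performed against the watsonx.ai REST API, reusing the `get_iam_token` and `format_user_message` helpers defined above. This is a sketch: the endpoint path and `version` query parameter below follow IBM's REST API reference at the time of writing and should be checked against the current documentation.

```python
import os

import httpx

# Reuses get_iam_token, format_user_message, IBM_CLOUD_REGIONS and
# IBM_CLOUD_PROJECT_ID from the snippet above.
region = "frankfurt"
access_token = get_iam_token(api_key=os.environ["IBM_CLOUD_API_KEY"])

resp = httpx.post(
    url=f"https://{IBM_CLOUD_REGIONS[region]}.ml.cloud.ibm.com/ml/v1/text/generation",
    params={"version": "2023-05-29"},  # API version date; check IBM's docs for the current value
    headers={
        "Authorization": f"Bearer {access_token}",
        "Content-Type": "application/json",
    },
    json={
        "model_id": "mistralai/mistral-large",
        "project_id": IBM_CLOUD_PROJECT_ID,
        "input": format_user_message(
            "Who is the best French painter? Answer in one short sentence."
        ),
        "parameters": {"max_new_tokens": 256, "temperature": 0.0},
    },
)
resp.raise_for_status()
print(resp.json()["results"][0]["generated_text"])
```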
Going further
For more information and examples, you can check:
- The IBM watsonx.ai Python SDK documentation
- This IBM Developer tutorial on how to use Mistral Large with IBM watsonx.ai flows engine.