Adjustable reasoning
Adjustable reasoning is available on mistral-small-latest via the reasoning_effort parameter, which controls how much the model thinks and whether thinking traces appear in the response.
Before continuing, we recommend reading the Chat Completions documentation to learn more about the chat completions API and how to use it.
Before you start
Model
mistral-small-latest: Supports adjustable reasoning via the reasoning_effort parameter. No extra configuration is required; just add the parameter to any chat completion request.
The reasoning_effort parameter controls how reasoning is surfaced in the output:
- reasoning_effort = "high": The response includes a full thinking chunk before the final answer, at the cost of increased token usage.
- reasoning_effort = "none": The model thinks minimally and the thinking chunk is omitted from the response.
reasoning_effort is also available on the Agents and Conversations endpoints via the API, inside the completion_args field. SDK support is coming soon.
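As a sketch of what that nesting looks like in a raw request body, here is an illustrative payload. Only the completion_args field and the reasoning_effort parameter come from the documentation above; the other keys and values are hypothetical placeholders, not a confirmed Conversations API schema.

```python
# Hypothetical Conversations request body. Only completion_args /
# reasoning_effort are taken from the docs; the rest is illustrative.
payload = {
    "model": "mistral-small-latest",      # placeholder model field
    "inputs": "How old is John?",         # placeholder prompt
    "completion_args": {
        # reasoning_effort goes inside completion_args on these endpoints
        "reasoning_effort": "high",
    },
}
```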
Usage
Adjustable reasoning with chat completions
Here is an example via our chat completions endpoint:
```python
import os

from mistralai import Mistral

api_key = os.environ["MISTRAL_API_KEY"]
model = "mistral-small-latest"

client = Mistral(api_key=api_key)

chat_response = client.chat.complete(
    model=model,
    messages=[
        {
            "role": "user",
            "content": "John is one of 4 children. The first sister is 4 years old. Next year, the second sister will be twice as old as the first sister. The third sister is two years older than the second sister. The third sister is half the age of her older brother. How old is John?",
        },
    ],
    reasoning_effort="high",
)

print(chat_response.choices[0].message.content)
```
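With reasoning_effort="high", the message content can arrive as a list of chunks rather than a plain string, with the thinking chunk preceding the final answer. The helper below sketches how you might separate the two; the exact chunk schema ("type", "thinking", "text" keys) is an assumption for illustration, not a confirmed response format.

```python
# Sketch of separating the thinking trace from the final answer.
# Assumption: with reasoning_effort="high" the content is a list of chunks,
# each tagged with a "type" of "thinking" or "text" (schema assumed here).
def split_thinking(content):
    """Return (thinking, answer) from a message content value."""
    if isinstance(content, str):
        # reasoning_effort="none": no thinking chunk, content is a plain string
        return "", content
    thinking_parts, text_parts = [], []
    for chunk in content:
        if chunk.get("type") == "thinking":
            inner = chunk.get("thinking", "")
            thinking_parts.append(inner if isinstance(inner, str) else str(inner))
        elif chunk.get("type") == "text":
            text_parts.append(chunk.get("text", ""))
    return "".join(thinking_parts), "".join(text_parts)

# Mocked response content, shaped per the assumption above:
mock_content = [
    {"type": "thinking", "thinking": "Next year the first sister is 5, so the second is 10 then..."},
    {"type": "text", "text": "John is 22 years old."},
]
thinking, answer = split_thinking(mock_content)
```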