Adjustable reasoning

Adjustable reasoning is available on mistral-small-latest via the reasoning_effort parameter, which controls how much the model thinks and whether thinking traces appear in the response.

tip

Before continuing, we recommend reading the Chat Completions documentation to learn about the chat completions API and how to use it.

Before you start

Model

  • mistral-small-latest: Supports adjustable reasoning via the reasoning_effort parameter. No extra configuration required — just add the parameter to any chat completion request.

The reasoning_effort parameter controls how reasoning is surfaced in the output:

  • reasoning_effort = "high": The response includes a full thinking chunk before the final answer, at the cost of increased token usage.
  • reasoning_effort = "none": The model thinks minimally and the thinking chunk is omitted from the response.

note

reasoning_effort is also available on the Agents and Conversations endpoints via the API, inside the completion_args field. SDK support is coming soon.
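As a sketch of how the parameter nests in a raw Conversations request (the surrounding field names follow the standard Conversations request shape; verify them against the API reference for your endpoint):

```json
{
  "agent_id": "<your-agent-id>",
  "inputs": "How old is John?",
  "completion_args": {
    "reasoning_effort": "high"
  }
}
```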

Usage

Adjustable reasoning with chat completions

Here is an example via our chat completions endpoint:

import os

from mistralai import Mistral

api_key = os.environ["MISTRAL_API_KEY"]
model = "mistral-small-latest"

client = Mistral(api_key=api_key)

chat_response = client.chat.complete(
    model=model,
    messages=[
        {
            "role": "user",
            "content": "John is one of 4 children. The first sister is 4 years old. Next year, the second sister will be twice as old as the first sister. The third sister is two years older than the second sister. The third sister is half the age of her older brother. How old is John?",
        },
    ],
    reasoning_effort="high",
)

print(chat_response.choices[0].message.content)
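With reasoning_effort="high", the message content is not a plain string but a list of chunks in which the thinking trace precedes the final answer. A minimal sketch of separating the two, using plain dicts as a stand-in for the SDK's typed chunk objects (the exact chunk shape here is an illustrative assumption, and the mocked response text is invented; inspect a real response to confirm the structure):

```python
# Illustrative sketch: split a thinking trace from the final answer.
# The chunk layout below (dicts with a "type" key) is an assumption made
# for illustration; the SDK returns typed objects with a similar shape.

def split_reasoning(content):
    """Return (thinking_text, answer_text) from a message content value."""
    if isinstance(content, str):
        # With reasoning_effort="none" the content is plain text.
        return "", content
    thinking_parts, answer_parts = [], []
    for chunk in content:
        if chunk.get("type") == "thinking":
            # A thinking chunk nests its own list of text chunks.
            thinking_parts.extend(t["text"] for t in chunk.get("thinking", []))
        elif chunk.get("type") == "text":
            answer_parts.append(chunk["text"])
    return "".join(thinking_parts), "".join(answer_parts)

# Mocked "high" effort response content:
content = [
    {"type": "thinking", "thinking": [{"type": "text", "text": "The second sister is 9 now..."}]},
    {"type": "text", "text": "John is 22."},
]
thinking, answer = split_reasoning(content)
print(answer)  # -> John is 22.
```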