Campaigns
Campaigns let you batch-annotate production traffic using a Judge. You define a filter, pick a Judge, and the Campaign runs the Judge on every matching event, writing the annotations back into Explorer.
When to use Campaigns
Campaigns are the right tool when you want to score existing production traffic at scale. Common scenarios:
- Detect problematic behavior: your agent might be rude, off-topic, or giving inaccurate answers. Run a Campaign with a rudeness or quality Judge to find out.
- Tag traffic for analysis: classify a batch of events (e.g., code/search/general) and filter by category in Explorer.
- Build quality-labeled Datasets: run a Campaign, then export the annotated events to a Dataset.
How to run a Campaign
Prerequisite: create a Judge
Before creating a Campaign, you need a Judge that defines your quality criteria.
A Campaign uses a single Judge. To run multiple checks on the same traffic, create separate Campaigns.
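Since each Campaign carries exactly one Judge, running several checks over the same traffic means one create call per Judge. A minimal sketch of that pattern, building one payload per Judge over a shared filter (the judge IDs and the helper function are illustrative, not part of the SDK):

```python
def campaign_payloads(judge_ids, search_params, max_nb_events=1000):
    """Build one Campaign create-payload per Judge, all sharing the same filter."""
    return [
        {
            "name": f"Quality check ({judge_id})",
            "judge_id": judge_id,
            "search_params": search_params,
            "max_nb_events": max_nb_events,
        }
        for judge_id in judge_ids
    ]

# Two checks over the same traffic -> two Campaigns.
payloads = campaign_payloads(
    ["judge-rudeness", "judge-accuracy"],  # hypothetical Judge IDs
    {"filters": {"AND": [
        {"field": "model_name", "op": "eq", "value": "mistral-medium-2508"},
    ]}},
)
print(len(payloads))  # 2
```

Each payload can then be passed to the Campaign create call shown in the developer section below.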
Step 1: Filter events
Select a time range, then add filter conditions to narrow the scope (see Explorer filter syntax).
If your filter matches more events than you want to annotate, cap the run by setting a maximum number of events (between 100 and 10,000).
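The filter is a set of field/op/value conditions combined under AND, matching the shape used by the SDK example later in this page. A sketch of one such payload, plus a small helper that keeps the event cap inside the allowed range (the field names are examples, and the helper is illustrative, not an SDK function):

```python
def clamp_max_events(n, lo=100, hi=10_000):
    """Keep the event cap inside the allowed 100-10,000 range."""
    return max(lo, min(hi, n))

# One week of traffic from a single model, expressed as AND-ed conditions.
search_params = {
    "filters": {
        "AND": [
            {"field": "timestamp", "op": "gte", "value": "2026-01-15T00:00:00Z"},
            {"field": "timestamp", "op": "lt", "value": "2026-01-22T00:00:00Z"},
        ]
    }
}

print(clamp_max_events(50_000))  # 10000
```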
Step 2: Launch the Campaign
Start the Campaign. It runs in the background: you can close the tab and check progress later in the Campaign details.
Step 3: Analyze results
Once complete, matching events appear with annotations in the Judge output column. From there you can:
- Filter by annotation value to surface flagged events (e.g., labeled rude or scored below 3).
- Inspect individual events to verify the Judge's assessments.
- Export to a Dataset for further review or analysis.
[Developer] Use Campaigns programmatically
The SDK lets you create and monitor Campaigns from code. Useful for scheduled quality checks, automated alerting pipelines, or CI/CD integration.
import os

from mistralai import Mistral

mistral = Mistral(
    api_key=os.getenv("MISTRAL_API_KEY", ""),
)

# Create a Campaign to annotate last week's support conversations
campaign = mistral.beta.observability.campaigns.create(
    name="Support Quality Review - Week 3",
    description="Evaluate quality of customer support responses from last week",
    judge_id="judge-456",  # replace with your Judge ID
    search_params={
        "filters": {
            "AND": [
                {"field": "timestamp", "op": "gte", "value": "2026-01-15T00:00:00Z"},
                {"field": "timestamp", "op": "lt", "value": "2026-01-22T00:00:00Z"},
                {"field": "model_name", "op": "eq", "value": "mistral-medium-2508"},
            ]
        }
    },
    max_nb_events=5000,
)

print(f"Campaign created: {campaign.id} — {campaign.name}")
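Because Campaigns run in the background, monitoring from code usually means polling until the run reaches a terminal state. A minimal sketch of such a loop: `fetch_status` stands in for whatever status call your SDK version exposes (a hypothetical campaigns "get" endpoint, for instance) and is injected so the loop itself is self-contained:

```python
import time


def wait_for_campaign(fetch_status, interval_s=30.0, timeout_s=3600.0,
                      done_states=("completed", "failed"), sleep=time.sleep):
    """Poll until the Campaign reaches a terminal state or the timeout expires."""
    waited = 0.0
    while True:
        status = fetch_status()
        if status in done_states:
            return status
        if waited >= timeout_s:
            raise TimeoutError(f"Campaign still '{status}' after {timeout_s}s")
        sleep(interval_s)
        waited += interval_s


# Example with a stubbed status sequence (no network involved):
states = iter(["pending", "running", "completed"])
print(wait_for_campaign(lambda: next(states), interval_s=0, sleep=lambda s: None))
# completed
```

In a real pipeline you would pass a closure over the SDK client as `fetch_status` and pick an interval that matches how long your Campaigns typically take.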