Observability
Observability helps you understand what your LLM applications are doing in production, measure response quality at scale, and iterate with confidence.
The entire Observability suite (Explorer, Judges, Campaigns, and Datasets) is available to Enterprise-tier organizations only.
What Observability does
The Observability suite gives you three core capabilities:
- Visibility: see what’s happening in your production traffic, event by event.
- Quality signals: score and classify assistant responses automatically with LLM-powered Judges.
- Iteration loops: use Campaigns to annotate traffic at scale and build quality-tagged Datasets.
The four components
These capabilities are built around four components that work together.
Explorer lets you search, filter, and inspect every chat completion event flowing through your workspace.
You can explore individual conversations (including messages, tool calls, and metadata) and export filtered slices to Datasets for deeper analysis.
When to use it: you want to understand what’s happening in production, investigate a quality issue, or find representative examples for later analysis.
How they connect
The typical flow moves through the components in sequence, but you don’t need to follow this exact order. Adjust the workflow to fit your needs.
Next steps
End-to-end guide
- Observability quickstart: learn how to set up a Judge and get a quality signal from real traffic.
Component deep dives