Observability

Observability helps you understand what your LLM applications are doing in production, measure response quality at scale, and iterate with confidence.

Information

The entire Observability suite (Explorer, Judges, Campaigns, and Datasets) is available to Enterprise-tier organizations only.

What Observability does

The Observability suite gives you three core capabilities:

  • Visibility: see what’s happening in your production traffic, event by event.
  • Quality signals: score and classify assistant responses automatically with LLM-powered Judges.
  • Iteration loops: use Campaigns to annotate traffic at scale and build quality-tagged Datasets.

The four components

These capabilities are built around four components that work together: Explorer, Judges, Campaigns, and Datasets.

Explorer lets you search, filter, and inspect every chat completion event flowing through your workspace.

You can explore individual conversations (including messages, tool calls, and metadata) and export filtered slices to Datasets for deeper analysis.

When to use it: you want to understand what’s happening in production, investigate a quality issue, or find representative examples for later analysis.

How they connect

The typical flow moves from left to right:

[Figure: the Observability workflow. Explorer → Judge → Campaign → Explorer (filter by annotations) → Dataset. Arrows indicate data flow.]

Tip

You don’t need to follow this exact sequence. Adjust the workflow based on your specific needs.
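
To make the typical flow concrete, here is a minimal in-memory sketch in plain Python. It does not call any Observability API: the `Event` class, the `explore` and `run_campaign` helpers, and the rule-based `toy_refusal_judge` are all hypothetical stand-ins invented for illustration (a real Judge is LLM-powered). It only shows how a filtered slice of events is annotated, filtered again by annotation, and collected into a dataset.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Event:
    """Hypothetical stand-in for one chat completion event."""
    event_id: str
    model: str
    response: str
    annotations: dict[str, str] = field(default_factory=dict)


def explore(events: list[Event], predicate: Callable[[Event], bool]) -> list[Event]:
    """Explorer step (sketch): keep only the events matching a filter."""
    return [e for e in events if predicate(e)]


def run_campaign(events: list[Event], judge_name: str,
                 judge: Callable[[Event], str]) -> list[Event]:
    """Campaign step (sketch): apply one judge to each event and store its label."""
    for e in events:
        e.annotations[judge_name] = judge(e)
    return events


def toy_refusal_judge(event: Event) -> str:
    """Placeholder rule standing in for an LLM-powered Judge."""
    return "refusal" if event.response.lower().startswith("i can't") else "answered"


# Made-up sample traffic, for illustration only.
events = [
    Event("e1", "model-a", "Here is the summary you asked for."),
    Event("e2", "model-a", "I can't help with that."),
    Event("e3", "model-b", "I can't access that file."),
]

# Explorer: narrow down to one slice of traffic.
traffic_slice = explore(events, lambda e: e.model == "model-a")

# Judge + Campaign: annotate every event in the slice.
run_campaign(traffic_slice, "refusal", toy_refusal_judge)

# Explorer again: filter by the annotations the campaign produced.
flagged = explore(traffic_slice, lambda e: e.annotations.get("refusal") == "refusal")

# Dataset: collect the flagged events as curated records.
dataset = [
    {"id": e.event_id, "response": e.response, "labels": dict(e.annotations)}
    for e in flagged
]
```

Events outside the filtered slice (here `e3`) are never annotated, which mirrors the idea that each step works on whatever the previous step selected.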

Next steps

End-to-end guide

Component deep dives

  • Explorer — Search, filter, inspect, and export production events.
  • Judges — Design and configure automated scoring criteria.
  • Campaigns — Run batch annotations on live production traffic.
  • Datasets — Build and manage curated collections of conversation records.