Quickstart
Build a RAG pipeline in 5 minutes: ingest documents into a vector store, then search them.
Prerequisites
- Python 3.12+
- Docker (for running Vespa locally)
- A Mistral API key from console.mistral.ai
Install
Install Search Toolkit with the Vespa plugin using uv:
uv add "mistralai-search-toolkit[vespa]"

Set up Vespa
Start a local Vespa instance with Docker:
docker run --detach \
  --name vespa \
  --hostname vespa-container \
  --publish 8080:8080 \
  --publish 19071:19071 \
  vespaengine/vespa

Wait for Vespa to be healthy:
curl --retry 10 --retry-delay 3 --retry-all-errors \
  http://localhost:19071/state/v1/health

Set your Mistral API key:
export MISTRAL_API_KEY=your-api-key

Define your schema
Create a migration to describe the structure of your documents:
mistral-vespa generate-migration \
  --app-dir ./vespa/migrations \
  initial_schema

Fill in the generated file:
from mistralai.search.toolkit.plugins.vespa.app.schemas.app import FieldDefinition, SearchMode
from mistralai.search.toolkit.plugins.vespa.migration import VespaMigration, create_default_schema, set_app_name


class InitialSchema(VespaMigration):
    def migrate(self) -> None:
        set_app_name("my_quickstart")
        create_default_schema(
            name="quickstart_collection",
            mode=SearchMode.INDEX,
            embedding_dimensions=1024,
            schema_version=1,
            additional_fields=[
                FieldDefinition.TextField(name="title"),
            ],
        )

For more details on Vespa application management and deployment, see Manage and deploy Vespa applications.
Deploy from migrations
Deploy the schema to your local Vespa instance:
mistral-vespa migrate \
  --app-dir ./vespa/migrations \
  --config-server http://localhost:19071 \
  --query-port 8080

This builds the application package from your migrations in memory, deploys it, and waits until the app is ready.
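If a script needs to confirm that Vespa is reachable before ingesting, the health check from the setup step can be repeated from Python. The sketch below uses only the standard library and is not part of Search Toolkit. It assumes the health endpoint returns JSON shaped like {"status": {"code": "up"}}; the function names and the injectable fetch parameter are hypothetical and exist only for illustration.

```python
import json
import time
import urllib.error
import urllib.request
from typing import Callable

# Config-server health endpoint used by the curl command in the setup step.
HEALTH_URL = "http://localhost:19071/state/v1/health"


def fetch_status(url: str) -> str:
    """Return the status code string reported by a Vespa health endpoint."""
    with urllib.request.urlopen(url, timeout=5) as response:
        payload = json.load(response)
    # Assumption: the body looks like {"status": {"code": "up"}, ...}.
    return payload.get("status", {}).get("code", "unknown")


def wait_until_up(
    url: str = HEALTH_URL,
    attempts: int = 10,
    delay_seconds: float = 3.0,
    fetch: Callable[[str], str] = fetch_status,
) -> bool:
    """Poll the endpoint until it reports "up" or the attempts run out."""
    for attempt in range(attempts):
        try:
            if fetch(url) == "up":
                return True
        except (urllib.error.URLError, OSError, ValueError):
            pass  # Not reachable or not valid JSON yet; retry below.
        if attempt < attempts - 1:
            time.sleep(delay_seconds)
    return False
```

Calling wait_until_up() with its defaults roughly mirrors the retry count and delay of the curl command. The fetch parameter is there so the retry loop can be exercised without a running container.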
Ingest documents
Create a pipeline that loads files, extracts text, splits into chunks, embeds, and indexes into Vespa:
import asyncio
import os
from pathlib import Path

from mistralai.client import Mistral
from mistralai.search.toolkit.embedders import MistralEmbedder
from mistralai.search.toolkit.ingestion.extractors import PlainTextExtractor
from mistralai.search.toolkit.ingestion.loaders import FilesystemFileLoader
from mistralai.search.toolkit.ingestion.pipelines import Pipeline
from mistralai.search.toolkit.ingestion.text_splitters import CharacterTextSplitter
from mistralai.search.toolkit.plugins.vespa import VespaClientConfig
from vespa_app import app


async def main():
    # Read the API key exported earlier instead of hard-coding it
    mistral_client = Mistral(api_key=os.environ["MISTRAL_API_KEY"])

    # Configure Vespa
    config = VespaClientConfig(
        endpoint="http://localhost:8080",
    )
    vector_store = app.get_search_index(config, collection_name="quickstart_collection")

    # Create the pipeline
    pipeline = Pipeline(
        loader=FilesystemFileLoader(),
        extractor=PlainTextExtractor(),
        text_splitter=CharacterTextSplitter(chunk_size=500),
        embedder=MistralEmbedder(client=mistral_client, model_name="mistral-embed"),
        vector_store=vector_store,
    )

    # Ingest documents
    await pipeline.run(
        documents=[Path("doc1.txt"), Path("doc2.txt")],
        collection_name="quickstart_collection",
    )
    print("Documents ingested!")


asyncio.run(main())

The pipeline chains five stages:
- FilesystemFileLoader reads raw file bytes from disk.
- PlainTextExtractor extracts text content from the file. For PDFs, use MistralOCRExtractor instead.
- CharacterTextSplitter breaks the text into chunks of 500 characters.
- MistralEmbedder generates a vector embedding for each chunk.
- vector_store (Vespa) indexes each chunk, storing the embedding for vector search.
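To make the splitting stage concrete, here is a minimal sketch of fixed-size character chunking in plain Python. It is not the toolkit's CharacterTextSplitter implementation, which may treat separators or overlap differently; split_by_characters is a hypothetical helper written only for illustration.

```python
def split_by_characters(text: str, chunk_size: int = 500) -> list[str]:
    """Cut text into consecutive, non-overlapping chunks of at most chunk_size characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    # Step through the text in chunk_size strides; the last chunk may be shorter.
    return [text[start:start + chunk_size] for start in range(0, len(text), chunk_size)]


if __name__ == "__main__":
    sample = "a" * 1200
    print([len(chunk) for chunk in split_by_characters(sample)])
```

Each chunk produced this way would then be embedded and indexed on its own, which is why chunk size affects both retrieval granularity and the number of embedding calls.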
Search
Query the indexed documents using vector search:
import asyncio
import os

from mistralai.client import Mistral
from mistralai.search.toolkit.embedders import MistralEmbedder
from mistralai.search.toolkit.plugins.vespa import VespaClientConfig
from mistralai.search.toolkit.retrieval import QueryEngine
from mistralai.search.toolkit.retrieval.retrievers import VectorRetriever
from vespa_app import app


async def main():
    # Read the API key exported earlier instead of hard-coding it
    mistral_client = Mistral(api_key=os.environ["MISTRAL_API_KEY"])
    embedder = MistralEmbedder(client=mistral_client, model_name="mistral-embed")

    # Configure Vespa for search
    config = VespaClientConfig(
        endpoint="http://localhost:8080",
    )
    vector_store = app.get_search_index(config, collection_name="quickstart_collection")

    # Build query engine
    query_engine = QueryEngine(
        retriever=[VectorRetriever(client=vector_store, embedder=embedder)],
    )

    # Search
    result = await query_engine.search(
        query="What is RAG?",
        top_k=5,
        include_metadata=True,
        include_content=True,
    )
    for i, r in enumerate(result.results, 1):
        print(f"{i}. [Score: {r.score:.3f}] {r.chunk.content[:200]}...")


asyncio.run(main())

Ingesting PDFs with OCR
For PDF documents, swap PlainTextExtractor with MistralOCRExtractor and use MarkdownTextSplitter for structure-aware chunking:
from mistralai.search.toolkit.ingestion.extractors import MistralOCRExtractor
from mistralai.search.toolkit.ingestion.text_splitters import (
    MarkdownTextSplitter,
    MarkdownTextSplitterConfig,
)

# Pipeline, FilesystemFileLoader, MistralEmbedder, mistral_client, and vector_store
# are imported or defined as in the ingestion script above.
pipeline = Pipeline(
    loader=FilesystemFileLoader(),
    extractor=MistralOCRExtractor(client=mistral_client),
    text_splitter=MarkdownTextSplitter(
        MarkdownTextSplitterConfig(chunk_size=5048, chunk_overlap=50)
    ),
    embedder=MistralEmbedder(client=mistral_client),
    vector_store=vector_store,
)
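As a rough illustration of what structure-aware chunking means, the sketch below splits Markdown at heading lines and only falls back to a sliding window, with overlap, when a single section exceeds the chunk size. It is not how MarkdownTextSplitter is implemented; split_markdown and its behavior are assumptions made for this example, and the defaults simply reuse the chunk_size and chunk_overlap values shown above. For simplicity it does not skip "#" lines inside code blocks.

```python
import re

# A Markdown ATX heading: one to six "#" characters followed by a space at line start.
HEADING = re.compile(r"^#{1,6} ", re.MULTILINE)


def split_markdown(text: str, chunk_size: int = 5048, chunk_overlap: int = 50) -> list[str]:
    """Split at Markdown headings, then window any section longer than chunk_size."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")

    # Section boundaries: the start of the text plus the start of every heading line.
    starts = sorted({0, *(match.start() for match in HEADING.finditer(text))})
    sections = [text[a:b] for a, b in zip(starts, starts[1:] + [len(text)])]

    chunks: list[str] = []
    step = chunk_size - chunk_overlap
    for section in sections:
        if not section.strip():
            continue
        if len(section) <= chunk_size:
            chunks.append(section.strip())
            continue
        # Oversized section: neighbouring windows share chunk_overlap characters.
        for start in range(0, len(section), step):
            chunks.append(section[start:start + chunk_size].strip())
            if start + chunk_size >= len(section):
                break
    return chunks
```

Keeping each heading together with its body tends to give chunks that are self-describing, and the overlap carries some context across a forced cut.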