Set up RAG with document search

Upload documents to a Library and query them in a chat completion.

  • Create a Library and upload a file
  • The model retrieves relevant passages from your documents
  • Answers are grounded in your content instead of general knowledge

Time to complete: ~10 minutes

Prerequisites

  • A Mistral API key (see Get your API key if you don't have one yet)
  • Python 3.9+ or Node.js 18+ installed
  • The Mistral SDK installed (see Install the SDK if you haven't yet)
  • A document to upload (PDF, TXT, or DOCX)
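
Before starting, it can help to confirm the prerequisites locally. The sketch below is only an illustration: `check_prerequisites` and `SUPPORTED_EXTENSIONS` are hypothetical names, not part of the Mistral SDK, and the check assumes the API key lives in the `MISTRAL_API_KEY` environment variable, as in Step 1.

```python
import os
from pathlib import Path

# File types taken from the prerequisites list; this set is an assumption of
# the sketch, not an authoritative list from the API.
SUPPORTED_EXTENSIONS = {".pdf", ".txt", ".docx"}


def check_prerequisites(document_path, env=None):
    """Return a list of problems found; an empty list means ready to start."""
    env = os.environ if env is None else env
    problems = []

    # The client in Step 1 reads the key from this environment variable.
    if not env.get("MISTRAL_API_KEY"):
        problems.append("MISTRAL_API_KEY is not set")

    path = Path(document_path)
    if path.suffix.lower() not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported file type: {path.suffix or '(none)'}")
    if not path.is_file():
        problems.append(f"file not found: {path}")

    return problems
```

Calling `check_prerequisites("company-handbook.pdf")` returns an empty list when the key is set and the file exists with a supported extension.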

Step 1: Create a Library

A Library is a container for documents that the model can search during conversations.

import os
from mistralai import Mistral

client = Mistral(api_key=os.environ["MISTRAL_API_KEY"])

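# Upload the document; the returned ID is the file reference used in Steps 2 and 3
with open("company-handbook.pdf", "rb") as f:
    library = client.files.upload(
        file={"file_name": "company-handbook.pdf", "content": f},
        purpose="retrieval",
    )
print(f"File uploaded: {library.id}")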

Step 2: Wait for processing

After upload, we process and index the document for retrieval. Poll the file status and query only once it reports processed.

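import time

# Poll until the file is processed, giving up after 5 minutes
deadline = time.monotonic() + 300
while True:
    file_info = client.files.retrieve(file_id=library.id)
    if file_info.status == "processed":
        print("File ready for retrieval")
        break
    if time.monotonic() > deadline:
        raise TimeoutError(f"File still {file_info.status} after 5 minutes")
    print(f"Status: {file_info.status}... waiting")
    time.sleep(2)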

We typically process small files (under 10 pages) in under 30 seconds. Larger documents may take a few minutes.
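
Because processing time varies with document size, a fixed 2-second poll can be wasteful for large files. The sketch below shows one possible alternative: polling with exponential backoff and a timeout. `wait_for_status` is a hypothetical helper, not an SDK function, and the default delays and timeout are assumptions to tune. It counts only the time spent sleeping, which keeps it easy to test with a fake `sleep`.

```python
import time


def wait_for_status(fetch_status, target="processed", timeout=300.0,
                    initial_delay=1.0, max_delay=15.0, sleep=time.sleep):
    """Poll fetch_status() until it returns `target`.

    Returns the number of polls made. Raises TimeoutError once the total
    time spent sleeping reaches `timeout` seconds without success.
    """
    delay = initial_delay
    waited = 0.0
    polls = 0
    while True:
        polls += 1
        if fetch_status() == target:
            return polls
        if waited >= timeout:
            raise TimeoutError(f"status was not {target!r} after {waited:.0f}s")
        sleep(delay)
        waited += delay
        # Double the delay after each attempt, capped at max_delay.
        delay = min(delay * 2, max_delay)
```

With the client from Step 1, this could be called as `wait_for_status(lambda: client.files.retrieve(file_id=library.id).status)`.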

Step 3: Query with RAG

Ask a question and include the file reference so the model retrieves relevant passages before answering.

response = client.chat.complete(
    model="mistral-medium-latest",
    messages=[
        {
            "role": "user",
            "content": "What is our company's remote work policy?",
        }
    ],
    documents=[{"type": "file", "id": library.id}],
)

print(response.choices[0].message.content)
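
To compare answers with and without your document, it can be convenient to build the request arguments in one place. The sketch below is a hypothetical helper (`build_chat_request` is not part of the SDK); it only assembles the same `model`, `messages`, and `documents` arguments used in the request above and leaves out `documents` when no file ID is given.

```python
def build_chat_request(question, file_id=None, model="mistral-medium-latest"):
    """Build keyword arguments for a chat completion request.

    When file_id is given, the file reference is attached in the same
    shape as the Step 3 request; otherwise the request has no documents.
    """
    request = {
        "model": model,
        "messages": [{"role": "user", "content": question}],
    }
    if file_id is not None:
        request["documents"] = [{"type": "file", "id": file_id}]
    return request
```

The two variants could then be sent as `client.chat.complete(**build_chat_request(question, file_id=library.id))` and `client.chat.complete(**build_chat_request(question))`.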

Step 4: Verify

The response should reference information from your uploaded document instead of general knowledge. For example, if you uploaded a company handbook and asked about remote work policy, you should see specific details like:

"According to the handbook, employees may work remotely up to 3 days per week with manager approval. Remote work requests must be submitted through the HR portal..."

Look for:

  • Specific details, names, or policies from your document
  • Answers grounded in the uploaded content rather than generic advice
  • Reduced hallucination compared to the same question without documents

If the response seems generic, confirm the file status is processed and that the documents parameter is included in your request.
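
For a rough, automated version of this check, you could measure how many of the answer's words also appear in the document text. The sketch below is a crude lexical heuristic, not a feature of the API: `content_words` and `overlap_ratio` are hypothetical helpers, and it assumes you already have the document's plain text available locally. A low ratio only hints that the answer may be generic; it does not prove the answer is ungrounded.

```python
import re


def content_words(text, min_length=4):
    """Return the set of lowercase alphanumeric words of at least min_length characters."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if len(w) >= min_length}


def overlap_ratio(answer, source_text):
    """Fraction of the answer's content words that also occur in the source text."""
    answer_words = content_words(answer)
    if not answer_words:
        return 0.0
    shared = answer_words & content_words(source_text)
    return len(shared) / len(answer_words)
```

Comparing the ratio for the same question asked with and without the documents parameter gives a quick signal of whether retrieval is influencing the answer.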

What's next