Deploy and operate | Mistral Docs

Production deployment and operation of Vespa applications.

VespaClientConfig

Configure the search backend with connection settings, timeouts, and retry policies:

from mistralai.search.toolkit.plugins.vespa import VespaClientConfig
from vespa_app import app

collection_name = "my_collection"
config = VespaClientConfig(
    endpoint="http://localhost:8080",
    # Advanced options
    timeout=30,              # Request timeout in seconds
    max_retries=3,          # Retry failed requests
    verify_ssl=True,        # Verify TLS certificates
)

vector_store = app.get_search_index(config, collection_name=collection_name)

Connection parameters:

Parameter	Purpose	Default
`endpoint`	Vespa query endpoint URL	Required
`timeout`	Request timeout in seconds	30
`max_retries`	Automatic retry attempts on failure	3
`verify_ssl`	Verify TLS certificates	True
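
In deployed environments these values usually come from configuration rather than source code. The sketch below reads them from environment variables and falls back to the defaults listed in the table. The `VESPA_*` variable names and the helper `vespa_settings_from_env` are hypothetical, not part of the toolkit; the returned dict is intended to be passed on as `VespaClientConfig(**settings)`.

```python
import os
from typing import Mapping, Optional


def vespa_settings_from_env(env: Optional[Mapping[str, str]] = None) -> dict:
    """Build VespaClientConfig keyword arguments from environment variables.

    The VESPA_* variable names are illustrative assumptions.
    """
    env = os.environ if env is None else env
    endpoint = env.get("VESPA_ENDPOINT")
    if not endpoint:
        # endpoint is the only parameter without a default
        raise ValueError("VESPA_ENDPOINT must be set")
    verify_raw = env.get("VESPA_VERIFY_SSL", "true").strip().lower()
    return {
        "endpoint": endpoint,
        "timeout": int(env.get("VESPA_TIMEOUT", "30")),
        "max_retries": int(env.get("VESPA_MAX_RETRIES", "3")),
        # Only an explicit false-like value turns verification off
        "verify_ssl": verify_raw not in {"0", "false", "no"},
    }
```

Failing fast on a missing endpoint keeps a misconfigured deployment from silently pointing at a default host.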

TLS/SSL configuration (production):

from mistralai.search.toolkit.plugins.vespa import VespaClientConfig
from vespa_app import app

config = VespaClientConfig(
    endpoint="https://vespa.example.com:8080",
    verify_ssl=True,
    # Custom CA certificate
    ca_cert_path="/path/to/ca-bundle.pem",
)

vector_store = app.get_search_index(config, collection_name="my_collection")
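
Before handing TLS settings to the client, it can help to catch obvious mistakes early. The following is a minimal pre-flight sketch using only the Python standard library; the helper `check_tls_settings` is hypothetical and does not reflect how the toolkit validates its own configuration.

```python
import os
import ssl
from typing import List, Optional
from urllib.parse import urlsplit


def check_tls_settings(
    endpoint: str, verify_ssl: bool, ca_cert_path: Optional[str] = None
) -> List[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    if urlsplit(endpoint).scheme != "https":
        problems.append("endpoint does not use https")
    if not verify_ssl:
        problems.append("certificate verification is disabled")
    if ca_cert_path is not None:
        if not os.path.isfile(ca_cert_path):
            problems.append(f"CA bundle not found: {ca_cert_path}")
        else:
            try:
                # Loading the bundle surfaces unreadable or malformed PEM files
                ssl.create_default_context(cafile=ca_cert_path)
            except OSError as exc:  # ssl.SSLError is a subclass of OSError
                problems.append(f"CA bundle could not be loaded: {exc}")
    return problems
```

Returning a list rather than raising on the first issue lets a startup script report every problem in one pass.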

Health Checks and Readiness

Before ingesting or querying, ensure Vespa is ready:

import asyncio

import aiohttp
from mistralai.search.toolkit.plugins.vespa import VespaClientConfig
from vespa_app import app

async def wait_for_vespa(endpoint: str, max_attempts: int = 30) -> bool:
    """Poll /ApplicationStatus until Vespa answers 200 or attempts run out."""
    timeout = aiohttp.ClientTimeout(total=5)  # Per-request limit in seconds
    # Reuse one session for all attempts instead of opening one per request
    async with aiohttp.ClientSession(timeout=timeout) as session:
        for attempt in range(max_attempts):
            try:
                async with session.get(f"{endpoint}/ApplicationStatus") as resp:
                    if resp.status == 200:
                        print("Vespa is ready")
                        return True
                    print(f"Attempt {attempt + 1}: HTTP {resp.status}")
            except (aiohttp.ClientError, asyncio.TimeoutError) as e:
                print(f"Attempt {attempt + 1}: {e}")
            await asyncio.sleep(2)

    raise RuntimeError("Vespa failed to become ready")

# Usage
async def main():
    endpoint = "http://localhost:8080"
    await wait_for_vespa(endpoint)
    config = VespaClientConfig(endpoint=endpoint)
    return app.get_search_index(config, collection_name="my_collection")

vector_store = asyncio.run(main())
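
The readiness loop waits a fixed two seconds between attempts. When startup time varies widely, a capped exponential schedule is a common alternative. The helper `backoff_delays` below is a hypothetical illustration, not part of the toolkit; each yielded value could replace the constant passed to `asyncio.sleep`.

```python
from typing import Iterator


def backoff_delays(
    max_attempts: int, base: float = 1.0, cap: float = 30.0
) -> Iterator[float]:
    """Yield one delay per attempt: base, 2*base, 4*base, ... never above cap."""
    for attempt in range(max_attempts):
        yield min(cap, base * (2 ** attempt))


# Delays for six attempts, starting at 1 second and capped at 8 seconds
print(list(backoff_delays(6, base=1.0, cap=8.0)))
```

The cap keeps late attempts from waiting unreasonably long while early attempts still react quickly to a fast startup.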

Health check endpoints:

# Application status
curl http://localhost:8080/ApplicationStatus

# Document count in collection (reported as root.fields.totalCount in the JSON response)
curl "http://localhost:8080/search/?yql=select%20*%20from%20my_collection%20limit%201"
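
Vespa's search API returns JSON with the number of matches under `root.fields.totalCount`, so the second request can double as a document-count probe. A minimal sketch of reading that field from an already-decoded response follows; the helper name `total_count` is hypothetical, and the sample payload is hand-written for illustration rather than captured from a real cluster.

```python
def total_count(response: dict) -> int:
    """Return root.fields.totalCount from a decoded search response, or 0 if absent."""
    return int(response.get("root", {}).get("fields", {}).get("totalCount", 0))


# Hand-written sample in the shape of a Vespa search response
sample = {"root": {"id": "toplevel", "fields": {"totalCount": 1250}, "children": []}}
print(total_count(sample))
```

Treating a missing field as zero keeps the probe usable in scripts, though an error payload should still be inspected separately.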

Production Deployment

For production environments, follow these guidelines:

Cluster setup:

Deploy at least 3 Vespa nodes for redundancy
Distribute nodes across availability zones
Use a load balancer with health checks in front of query endpoints

Network configuration:

Use private networks for inter-node communication
Expose query endpoint through a reverse proxy (nginx, Cloudflare, etc.)
Implement rate limiting and DDoS protection

Configuration example (distributed setup):

from mistralai.search.toolkit.plugins.vespa import VespaClientConfig
from vespa_app import app

# Load balancer endpoint (single entry point)
config = VespaClientConfig(
    endpoint="https://vespa-lb.example.com",
    timeout=30,
    max_retries=3,
    verify_ssl=True,
)

vector_store = app.get_search_index(config, collection_name="my_collection")

Monitoring:

Track ingestion latency and throughput
Monitor query latency and error rates
Set alerts on health check failures
Monitor disk usage and index size growth
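
As one way to track the query-side numbers in this list, the sketch below keeps a rolling window of request outcomes in process and reports 95th-percentile latency and error rate. The class `QueryStats` is hypothetical; a production setup would more likely export these values to an existing metrics system.

```python
from collections import deque


class QueryStats:
    """Rolling window of (latency_seconds, ok) samples for recent queries."""

    def __init__(self, window: int = 1000):
        # Oldest samples are dropped automatically once the window is full
        self._samples = deque(maxlen=window)

    def record(self, latency_s: float, ok: bool) -> None:
        self._samples.append((latency_s, ok))

    def error_rate(self) -> float:
        if not self._samples:
            return 0.0
        failures = sum(1 for _, ok in self._samples if not ok)
        return failures / len(self._samples)

    def p95_latency(self) -> float:
        """Nearest-rank 95th percentile of the recorded latencies."""
        if not self._samples:
            return 0.0
        ordered = sorted(latency for latency, _ in self._samples)
        rank = (95 * len(ordered) + 99) // 100  # integer ceil(0.95 * n)
        return ordered[rank - 1]
```

A caller would wrap each query in a timer, call `record` with the elapsed time and outcome, and compare `error_rate()` or `p95_latency()` against whatever alert thresholds suit the deployment.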
