Production deployment and operation of Vespa applications.

VespaClientConfig

Configure the search backend with connection settings, timeouts, and retry policies:

from mistralai.search.toolkit.plugins.vespa import VespaClientConfig
from vespa_app import app

collection_name = "my_collection"
config = VespaClientConfig(
    endpoint="http://localhost:8080",
    # Advanced options
    timeout=30,        # Request timeout in seconds
    max_retries=3,     # Retry failed requests
    verify_ssl=True,   # Verify TLS certificates
)

vector_store = app.get_search_index(config, collection_name=collection_name)

Connection parameters:

Parameter     Purpose                               Default
endpoint      Vespa query endpoint URL              Required
timeout       Request timeout in seconds            30
max_retries   Automatic retry attempts on failure   3
verify_ssl    Verify TLS certificates               True
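
In production these values are often read from the environment rather than hard-coded. The following is a minimal sketch of that idea, not part of the toolkit: the helper name `vespa_settings_from_env` and the `VESPA_*` variable names are assumptions made for illustration, and the fallbacks mirror the defaults listed above.

```python
import os


def vespa_settings_from_env(env=None):
    """Build keyword arguments for VespaClientConfig from environment variables.

    The VESPA_* variable names are hypothetical; rename them to fit your setup.
    """
    env = os.environ if env is None else env
    endpoint = env.get("VESPA_ENDPOINT")
    if not endpoint:
        # endpoint has no default, so fail early with a clear message
        raise ValueError("VESPA_ENDPOINT is required")
    return {
        "endpoint": endpoint,
        "timeout": int(env.get("VESPA_TIMEOUT", "30")),
        "max_retries": int(env.get("VESPA_MAX_RETRIES", "3")),
        "verify_ssl": env.get("VESPA_VERIFY_SSL", "true").lower() in ("1", "true", "yes"),
    }
```

The resulting dictionary can then be unpacked into the config object, for example `VespaClientConfig(**vespa_settings_from_env())`.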

TLS/SSL configuration (production):

from mistralai.search.toolkit.plugins.vespa import VespaClientConfig
from vespa_app import app

config = VespaClientConfig(
    endpoint="https://vespa.example.com:8080",
    verify_ssl=True,
    # Custom CA certificate
    ca_cert_path="/path/to/ca-bundle.pem",
)

vector_store = app.get_search_index(config, collection_name="my_collection")

Health Checks and Readiness

Before ingesting or querying, ensure Vespa is ready:

import asyncio

import aiohttp
from mistralai.search.toolkit.plugins.vespa import VespaClientConfig
from vespa_app import app

async def wait_for_vespa(endpoint: str, max_attempts: int = 30) -> bool:
    """Wait for Vespa to be ready for operations."""
    # aiohttp expects a ClientTimeout object rather than a bare number
    timeout = aiohttp.ClientTimeout(total=5)
    async with aiohttp.ClientSession(timeout=timeout) as session:
        for attempt in range(max_attempts):
            try:
                async with session.get(f"{endpoint}/ApplicationStatus") as resp:
                    if resp.status == 200:
                        print("Vespa is ready")
                        return True
                    print(f"Attempt {attempt + 1}: HTTP {resp.status}")
            except (aiohttp.ClientError, asyncio.TimeoutError) as e:
                print(f"Attempt {attempt + 1}: {e}")
            await asyncio.sleep(2)

    raise RuntimeError("Vespa failed to become ready")

# Usage (await is only valid inside a coroutine, so wrap the calls in main)
async def main():
    endpoint = "http://localhost:8080"
    await wait_for_vespa(endpoint)
    config = VespaClientConfig(endpoint=endpoint)
    return app.get_search_index(config, collection_name="my_collection")

vector_store = asyncio.run(main())
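
The readiness loop above polls at a fixed 2-second interval. If a node can take a long time to start, a capped exponential backoff is a common alternative that polls quickly at first and then less often. The sketch below only computes the delay schedule; `backoff_delays` and its parameters are hypothetical names, not part of the toolkit.

```python
def backoff_delays(max_attempts, base=1.0, cap=30.0):
    """Return the wait in seconds before each retry: base * 2**n, capped at cap."""
    return [min(cap, base * 2 ** n) for n in range(max_attempts)]


# Example schedule for five attempts, capped at 8 seconds
print(backoff_delays(5, base=1.0, cap=8.0))  # [1.0, 2.0, 4.0, 8.0, 8.0]
```

Each value could replace the constant passed to `asyncio.sleep` in the corresponding iteration of the loop.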

Health check endpoints:

# Application status
curl http://localhost:8080/ApplicationStatus

# Document count in collection (reported as root.fields.totalCount in the response)
curl "http://localhost:8080/search/?yql=select%20*%20from%20my_collection%20where%20true%20limit%201"
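
The same document-count check can be scripted. The sketch below uses only the Python standard library and assumes the standard Vespa JSON result layout, in which the number of matching documents appears under `root.fields.totalCount`. The helper names are hypothetical, and `where true` is used to match every document in the source.

```python
import json
import urllib.parse
import urllib.request


def total_count(response_body):
    """Extract root.fields.totalCount from a Vespa search response body."""
    data = json.loads(response_body)
    return data["root"]["fields"]["totalCount"]


def build_count_url(endpoint, collection):
    """Build a URL-encoded search request that matches all documents."""
    yql = f"select * from {collection} where true limit 1"
    return f"{endpoint}/search/?" + urllib.parse.urlencode({"yql": yql})


def fetch_total_count(endpoint, collection, timeout=5):
    """Query Vespa and return the number of matching documents."""
    url = build_count_url(endpoint, collection)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return total_count(resp.read().decode("utf-8"))
```

For example, `fetch_total_count("http://localhost:8080", "my_collection")` would return the count reported by a locally running instance.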

Production Deployment

For production environments, follow these guidelines:

Cluster setup:

  • Deploy at least 3 Vespa nodes for redundancy
  • Distribute nodes across availability zones
  • Use a load balancer with health checks in front of query endpoints

Network configuration:

  • Use private networks for inter-node communication
  • Expose the query endpoint through a reverse proxy (nginx, Cloudflare, etc.)
  • Implement rate limiting and DDoS protection

Configuration example (distributed setup):

from mistralai.search.toolkit.plugins.vespa import VespaClientConfig
from vespa_app import app

# Load balancer endpoint (single entry point)
config = VespaClientConfig(
    endpoint="https://vespa-lb.example.com",
    timeout=30,
    max_retries=3,
    verify_ssl=True,
)

vector_store = app.get_search_index(config, collection_name="my_collection")

Monitoring:

  • Track ingestion latency and throughput
  • Monitor query latency and error rates
  • Set alerts on health check failures
  • Monitor disk usage and index size growth
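
As one way to approach the query latency and error-rate items above, the following sketch keeps an in-process record of samples and derives an error rate and a nearest-rank percentile. `QueryStats` is a hypothetical helper written for illustration. A real deployment would more likely export these numbers to a metrics system than hold them in memory.

```python
import math
from dataclasses import dataclass, field


@dataclass
class QueryStats:
    """In-memory record of query latencies (ms) and failures."""
    latencies_ms: list = field(default_factory=list)
    errors: int = 0

    def record(self, latency_ms, ok=True):
        """Store one query's latency and whether it succeeded."""
        self.latencies_ms.append(latency_ms)
        if not ok:
            self.errors += 1

    def error_rate(self):
        """Fraction of recorded queries that failed (0.0 when empty)."""
        total = len(self.latencies_ms)
        return self.errors / total if total else 0.0

    def percentile(self, p):
        """Nearest-rank percentile of the recorded latencies."""
        if not self.latencies_ms:
            raise ValueError("no samples recorded")
        ordered = sorted(self.latencies_ms)
        rank = max(1, math.ceil(p / 100 * len(ordered)))
        return ordered[rank - 1]
```

Calling `record` after each query and checking `percentile(95)` and `error_rate()` on a schedule gives simple values to alert on.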

See Also