Text Embeddings
Embeddings are at the core of multiple enterprise use cases, such as retrieval systems, clustering, code analytics, classification, and a variety of search applications. Embedding content allows you to perform semantic search and diverse NLP tasks in your applications.
Mistral Embed API
How to Generate Embeddings
To generate text embeddings using Mistral AI's embeddings API, we can make a request to the API endpoint, specifying the embedding model mistral-embed and providing a list of input texts. The API then returns the corresponding embeddings as numerical vectors, which can be used for further analysis or processing in NLP applications.
import os
from mistralai import Mistral

# Read the API key from the environment and select the embedding model
api_key = os.environ["MISTRAL_API_KEY"]
model = "mistral-embed"

client = Mistral(api_key=api_key)

# Request embeddings for a batch of input texts
embeddings_batch_response = client.embeddings.create(
    model=model,
    inputs=["Embed this sentence.", "As well as this one."],
)

The output is an embedding object with the embeddings and the token usage information.
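To see what the response contains, we can iterate over the returned embeddings and print the usage information. This is a minimal sketch; it assumes each entry in data exposes index and embedding fields and that the response carries a usage attribute, matching the object shape described above:

# Each entry in data holds one embedding vector, in the same order as the inputs
for item in embeddings_batch_response.data:
    print(item.index, item.embedding[:5])  # show only the first five components

# Token usage for the request
print(embeddings_batch_response.usage)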
Let's take a look at the length of the first embedding:

len(embeddings_batch_response.data[0].embedding)

It returns 1024, which means that our embedding dimension is 1024. The mistral-embed model generates embedding vectors of dimension 1024 for each text string, regardless of the text length. It's worth noting that while higher-dimensional embeddings can better capture text information and improve the performance of NLP tasks, they may require more computational resources for hosting and inference, and may result in increased latency and memory usage for storing and processing these embeddings. This trade-off between performance and computational resources should be considered when designing NLP systems that rely on text embeddings.
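Because all vectors share the same fixed dimension, they can be compared directly. Below is a minimal sketch, assuming NumPy is installed, that computes the cosine similarity between the two embeddings returned above:

import numpy as np

# Convert the two embedding vectors from the batch response into arrays
a = np.array(embeddings_batch_response.data[0].embedding)
b = np.array(embeddings_batch_response.data[1].embedding)

# Cosine similarity: dot product divided by the product of the vector norms
similarity = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
print(similarity)

Values closer to 1 indicate that the two texts are more semantically similar.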
Usage Examples
Below you will find some examples of how to use the Mistral Embeddings API across different use cases.