In this notebook, we will demonstrate how to build a RAG pipeline using Ollama, Mistral models, and LlamaIndex. The following topics will be covered:
- Integrating Mistral with Ollama and LlamaIndex.
- Implementing RAG with Ollama and LlamaIndex using the Mistral model.
- Routing queries with RouterQueryEngine.
- Handling complex queries with SubQuestionQueryEngine.
Before running this notebook, you need to set up Ollama. Please follow the instructions here.
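If Ollama is already installed and its server is running, you can pull the model used in this notebook ahead of time (a minimal sketch; the mistral:instruct tag matches the model name passed to the LLM below):
!ollama pull mistral:instruct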
import nest_asyncio
nest_asyncio.apply()
from IPython.display import display, HTML
Setup LLM
from llama_index.llms.ollama import Ollama
llm = Ollama(model="mistral:instruct", request_timeout=60.0)
Querying
from llama_index.core.llms import ChatMessage
messages = [
    ChatMessage(role="system", content="You are a helpful assistant."),
    ChatMessage(role="user", content="What is the capital city of France?"),
]
response = llm.chat(messages)
display(HTML(f'<p style="font-size:20px">{response}</p>'))
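For single-turn prompts you can skip constructing ChatMessage objects and call complete() instead; a minimal sketch (the prompt is just illustrative):
# One-shot completion; the returned CompletionResponse exposes the generated text via .text
completion = llm.complete("Name three cities in France.")
display(HTML(f'<p style="font-size:20px">{completion.text}</p>'))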
Setup Embedding Model
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")
from llama_index.core import Settings
Settings.llm = llm
Settings.embed_model = embed_model
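Settings also controls how documents are chunked when indexes are built. The defaults are fine for this demo, but as an optional tweak (the values here are illustrative) you could set:
# Optional: adjust the chunk size/overlap used when splitting documents into nodes
Settings.chunk_size = 512
Settings.chunk_overlap = 50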
Download Data
We will use the Uber and Lyft 10-K SEC filings for the demonstration.
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/10k/uber_2021.pdf' -O './uber_2021.pdf'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/10k/lyft_2021.pdf' -O './lyft_2021.pdf'
Load Data
from llama_index.core import SimpleDirectoryReader
uber_docs = SimpleDirectoryReader(input_files=["./uber_2021.pdf"]).load_data()
lyft_docs = SimpleDirectoryReader(input_files=["./lyft_2021.pdf"]).load_data()
Create Index and Query Engines
from llama_index.core import VectorStoreIndex
from llama_index.core import SummaryIndex
uber_vector_index = VectorStoreIndex.from_documents(uber_docs)
uber_vector_query_engine = uber_vector_index.as_query_engine(similarity_top_k=2)
lyft_vector_index = VectorStoreIndex.from_documents(lyft_docs)
lyft_vector_query_engine = lyft_vector_index.as_query_engine(similarity_top_k=2)
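Building these indexes re-embeds the PDFs on every run. As an optional sketch (the ./storage_uber directory name is just an example), an index can be persisted to disk and reloaded later without re-embedding:
from llama_index.core import StorageContext, load_index_from_storage

# Persist the Uber index to disk
uber_vector_index.storage_context.persist(persist_dir="./storage_uber")

# Later: rebuild the same index from the persisted storage
storage_context = StorageContext.from_defaults(persist_dir="./storage_uber")
uber_vector_index = load_index_from_storage(storage_context)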
Querying
response = uber_vector_query_engine.query("What is the revenue of Uber in 2021 in millions?")
display(HTML(f'<p style="font-size:20px">{response.response}</p>'))
response = lyft_vector_query_engine.query("What is the revenue of Lyft in 2021 in millions?")
display(HTML(f'<p style="font-size:20px">{response.response}</p>'))
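Each query response also carries the retrieved source nodes, which is handy for checking what the answer was grounded on. A small inspection sketch (the file_name metadata key is populated by SimpleDirectoryReader):
# Inspect the chunks retrieved for the last query along with their similarity scores
for node_with_score in response.source_nodes:
    print(node_with_score.score, node_with_score.node.metadata.get("file_name"))
    print(node_with_score.node.get_text()[:200], "...")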
RouterQueryEngine
We will utilize the RouterQueryEngine to route each user query to the appropriate query engine, depending on whether the question concerns Uber or Lyft.
Create QueryEngine tools
from llama_index.core.tools import QueryEngineTool, ToolMetadata
from llama_index.core.query_engine.router_query_engine import RouterQueryEngine
from llama_index.core.selectors.llm_selectors import LLMSingleSelector
query_engine_tools = [
    QueryEngineTool(
        query_engine=lyft_vector_query_engine,
        metadata=ToolMetadata(
            name="vector_lyft_10k",
            description="Provides information about Lyft financials for year 2021",
        ),
    ),
    QueryEngineTool(
        query_engine=uber_vector_query_engine,
        metadata=ToolMetadata(
            name="vector_uber_10k",
            description="Provides information about Uber financials for year 2021",
        ),
    ),
]
Create RouterQueryEngine
query_engine = RouterQueryEngine(
    selector=LLMSingleSelector.from_defaults(),
    query_engine_tools=query_engine_tools,
    verbose=True,
)
Querying
response = query_engine.query("What are the investments made by Uber?")
display(HTML(f'<p style="font-size:20px">{response.response}</p>'))
response = query_engine.query("What are the investments made by Lyft in 2021?")
display(HTML(f'<p style="font-size:20px">{response.response}</p>'))
SubQuestionQueryEngine
We will explore how the SubQuestionQueryEngine can be leveraged to tackle complex queries by decomposing them into sub-questions, answering each against the relevant index, and synthesizing a final response.
Create SubQuestionQueryEngine
from llama_index.core.query_engine import SubQuestionQueryEngine
sub_question_query_engine = SubQuestionQueryEngine.from_defaults(
    query_engine_tools=query_engine_tools,
    verbose=True,
)
Querying
response = sub_question_query_engine.query("Compare the revenues of Uber and Lyft in 2021?")
display(HTML(f'<p style="font-size:20px">{response.response}</p>'))
response = sub_question_query_engine.query("What are the investments made by Uber and Lyft in 2021?")
display(HTML(f'<p style="font-size:20px">{response.response}</p>'))