User queries range from simple to complex, and a simple query does not always need a full RAG system. Adaptive RAG proposes an approach that handles both complex and simple queries.
In this notebook, we will implement an approach similar to Adaptive RAG, which differentiates between handling complex and simple queries. We'll focus on Lyft's 10k SEC filings for the years 2020, 2021, and 2022.
Our approach will use RouterQueryEngine and the function-calling capabilities of MistralAI to call different tools or indices based on the query's complexity.
- Complex Queries: These will leverage multiple tools that require context from several documents.
- Simple Queries: These will utilize a single tool that requires context from a single document or directly use an LLM to provide an answer.
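The split described above can be pictured with a toy classifier. This is only a sketch under a loud assumption: it counts the filing years mentioned in the query, whereas the notebook lets an LLM-based selector make the decision. The names `classify_query` and `KNOWN_YEARS` are hypothetical and are not part of LlamaIndex.

```python
import re

# Hypothetical: the filing years covered by this notebook.
KNOWN_YEARS = {"2020", "2021", "2022"}


def classify_query(query: str) -> str:
    """Return 'llm', 'single_document', or 'multi_document'.

    Toy heuristic: count the distinct known years named in the query.
    """
    years = set(re.findall(r"\b(?:19|20)\d{2}\b", query)) & KNOWN_YEARS
    if not years:
        return "llm"  # nothing document-specific: answer with the LLM alone
    if len(years) == 1:
        return "single_document"  # one filing is enough
    return "multi_document"  # needs context from several filings


for q in (
    "What is the capital of France?",
    "What did Lyft do in R&D in 2022?",
    "What did Lyft do in R&D in 2022 vs 2020?",
):
    print(classify_query(q), "<-", q)
```

A keyword heuristic like this is brittle (a query can span years without naming them), which is one reason to delegate the decision to an LLM.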
We follow these steps:
- Download Data.
- Load Data.
- Create indices for 3 documents.
- Create query engines with documents and LLM.
- Initialize a FunctionCallingAgentWorker for complex queries.
- Create tools.
- Create a RouterQueryEngine to route queries based on their complexity.
- Querying.
Installation
!pip install llama-index
!pip install llama-index-llms-mistralai
!pip install llama-index-embeddings-mistralai
Setup API Key
import os
os.environ['MISTRAL_API_KEY'] = '<YOUR MISTRAL API KEY>'
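A key typed directly into a notebook cell is easy to commit or share by accident. As a hedged alternative, the sketch below reads the key from the environment and prompts for it only when it is missing or still set to the placeholder. The helper name `ensure_api_key` and its `prompt` parameter are assumptions made for illustration.

```python
import os
from getpass import getpass
from typing import Callable

# The placeholder string used in the cell above.
PLACEHOLDER = "<YOUR MISTRAL API KEY>"


def ensure_api_key(
    name: str = "MISTRAL_API_KEY",
    prompt: Callable[[str], str] = getpass,
) -> str:
    """Return the key from the environment, prompting once if it is unset."""
    key = os.environ.get(name, "")
    if not key or key == PLACEHOLDER:
        key = prompt(f"Enter {name}: ").strip()
        if not key:
            raise ValueError(f"{name} is required")
        os.environ[name] = key  # make it visible to later cells
    return key
```

Calling `ensure_api_key()` in place of the hard-coded assignment keeps the secret out of the saved notebook.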
Setup LLM and Embedding Model
import nest_asyncio
nest_asyncio.apply()
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
from llama_index.llms.mistralai import MistralAI
from llama_index.embeddings.mistralai import MistralAIEmbedding
from llama_index.core import Settings
from llama_index.core.tools import QueryEngineTool, ToolMetadata
from llama_index.core.query_engine.router_query_engine import RouterQueryEngine
from llama_index.core.selectors.llm_selectors import LLMSingleSelector
# Note: Only `mistral-large-latest` supports function calling
llm = MistralAI(model='mistral-large-latest')
embed_model = MistralAIEmbedding()
Settings.llm = llm
Settings.embed_model = embed_model
Logging
# NOTE: This is ONLY necessary in jupyter notebook.
# Details: Jupyter runs an event-loop behind the scenes.
# This results in nested event-loops when we start an event-loop to make async queries.
# This is normally not allowed, so we use nest_asyncio to allow it for convenience.
import nest_asyncio
nest_asyncio.apply()
import logging
import sys
# Set up the root logger
logger = logging.getLogger()
logger.setLevel(logging.INFO) # Set logger level to INFO
# Clear out any existing handlers
logger.handlers = []
# Set up the StreamHandler to output to sys.stdout (Colab's output)
handler = logging.StreamHandler(sys.stdout)
handler.setLevel(logging.INFO) # Set handler level to INFO
# Add the handler to the logger
logger.addHandler(handler)
from IPython.display import display, HTML
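The comments in the cell above mention nested event loops. The following standard-library sketch reproduces the failure that nest_asyncio works around. Outside Jupyter there is no loop running yet, so the sketch starts the outer one itself. The function names are illustrative only.

```python
import asyncio


async def fetch_answer() -> str:
    return "answer"


async def call_from_running_loop() -> str:
    # We are already inside an event loop here, as code in a Jupyter cell is.
    coro = fetch_answer()
    try:
        asyncio.run(coro)  # tries to start a second loop inside the first
    except RuntimeError as exc:
        coro.close()  # avoid a "coroutine was never awaited" warning
        return str(exc)
    return "no error"


# Start the outer loop; the nested asyncio.run() inside it is rejected.
message = asyncio.run(call_from_running_loop())
print(message)
```

With `nest_asyncio.apply()` in effect, the inner call is allowed to run, which is why the notebook applies it before issuing queries.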
Download Data
We will download Lyft's 10k SEC filings for the years 2020, 2021, and 2022.
!wget "https://www.dropbox.com/scl/fi/ywc29qvt66s8i97h1taci/lyft-10k-2020.pdf?rlkey=d7bru2jno7398imeirn09fey5&dl=0" -q -O ./lyft_10k_2020.pdf
!wget "https://www.dropbox.com/scl/fi/lpmmki7a9a14s1l5ef7ep/lyft-10k-2021.pdf?rlkey=ud5cwlfotrii6r5jjag1o3hvm&dl=0" -q -O ./lyft_10k_2021.pdf
!wget "https://www.dropbox.com/scl/fi/iffbbnbw9h7shqnnot5es/lyft-10k-2022.pdf?rlkey=grkdgxcrib60oegtp4jn8hpl8&dl=0" -q -O ./lyft_10k_2022.pdf
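Because `-q` silences wget, a failed download or a share link that returns an HTML page instead of the file can go unnoticed until the loader fails. A small sanity check, sketched here with a hypothetical helper `looks_like_pdf`, verifies that each file starts with the PDF signature bytes.

```python
from pathlib import Path


def looks_like_pdf(path: str) -> bool:
    """Return True if the file exists and begins with the PDF signature."""
    file = Path(path)
    if not file.is_file():
        return False
    with file.open("rb") as handle:
        return handle.read(5) == b"%PDF-"


for year in (2020, 2021, 2022):
    pdf_path = f"./lyft_10k_{year}.pdf"
    print(pdf_path, "ok" if looks_like_pdf(pdf_path) else "missing or not a PDF")
```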
Load Data
# Lyft 2020 docs
lyft_2020_docs = SimpleDirectoryReader(input_files=["./lyft_10k_2020.pdf"]).load_data()
# Lyft 2021 docs
lyft_2021_docs = SimpleDirectoryReader(input_files=["./lyft_10k_2021.pdf"]).load_data()
# Lyft 2022 docs
lyft_2022_docs = SimpleDirectoryReader(input_files=["./lyft_10k_2022.pdf"]).load_data()
Create Indices
# Index on Lyft 2020 Document
lyft_2020_index = VectorStoreIndex.from_documents(lyft_2020_docs)
# Index on Lyft 2021 Document
lyft_2021_index = VectorStoreIndex.from_documents(lyft_2021_docs)
# Index on Lyft 2022 Document
lyft_2022_index = VectorStoreIndex.from_documents(lyft_2022_docs)
Create Query Engines
# Query Engine on Lyft 2020 Docs Index
lyft_2020_query_engine = lyft_2020_index.as_query_engine(similarity_top_k=5)
# Query Engine on Lyft 2021 Docs Index
lyft_2021_query_engine = lyft_2021_index.as_query_engine(similarity_top_k=5)
# Query Engine on Lyft 2022 Docs Index
lyft_2022_query_engine = lyft_2022_index.as_query_engine(similarity_top_k=5)
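The `similarity_top_k=5` argument asks each engine to retrieve the five chunks most similar to the query before the LLM writes an answer. The toy sketch below shows the idea with hand-made vectors and cosine similarity. It does not reproduce LlamaIndex internals, and the real scoring depends on the vector store and embedding model in use. `cosine` and `top_k` are hypothetical helpers.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(
    query_vec: Sequence[float],
    chunks: Dict[str, Sequence[float]],
    k: int = 5,
) -> List[Tuple[str, float]]:
    """Return the k chunk names with the highest similarity to the query."""
    scored = [(name, cosine(query_vec, vec)) for name, vec in chunks.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Made-up 2-dimensional "embeddings" for three chunks.
toy_chunks = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
print(top_k([1.0, 0.0], toy_chunks, k=2))
```

A larger k gives the LLM more context at the cost of a longer prompt.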
Query engine for the LLM. This engine answers the query directly with the LLM, without retrieving any documents.
from llama_index.core.query_engine import CustomQueryEngine
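class LLMQueryEngine(CustomQueryEngine):
    """Query engine that sends the query straight to the LLM, with no retrieval."""

    # Annotate the field with a type; `llm` itself is an instance, not a type.
    llm: MistralAI

    def custom_query(self, query_str: str) -> str:
        response = self.llm.complete(query_str)
        return str(response)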
llm_query_engine = LLMQueryEngine(llm=llm)
Initialize a FunctionCallingAgentWorker
# These tools are used to answer complex queries involving multiple documents.
query_engine_tools = [
    QueryEngineTool(
        query_engine=lyft_2020_query_engine,
        metadata=ToolMetadata(
            name="lyft_2020_10k_form",
            description="Annual report of Lyft's financial activities in 2020",
        ),
    ),
    QueryEngineTool(
        query_engine=lyft_2021_query_engine,
        metadata=ToolMetadata(
            name="lyft_2021_10k_form",
            description="Annual report of Lyft's financial activities in 2021",
        ),
    ),
    QueryEngineTool(
        query_engine=lyft_2022_query_engine,
        metadata=ToolMetadata(
            name="lyft_2022_10k_form",
            description="Annual report of Lyft's financial activities in 2022",
        ),
    ),
]
from llama_index.core.agent import FunctionCallingAgentWorker
from llama_index.core.agent import AgentRunner
agent_worker = FunctionCallingAgentWorker.from_tools(
    query_engine_tools,
    llm=llm,
    verbose=True,
    allow_parallel_tool_calls=True,
)
agent = AgentRunner(agent_worker)
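To make the agent's role concrete: for a query that spans several years, a function-calling model can request one tool call per relevant filing, the calls are executed, and the model then writes an answer from the results. The sketch below hard-codes the tool calls that the LLM would normally produce and uses stub functions in place of query engines, so it shows only the dispatch step. Everything except the tool names is hypothetical.

```python
from typing import Callable, Dict, List, Tuple


def make_stub_tool(year: int) -> Callable[[str], str]:
    """Stand-in for a per-year query engine."""

    def tool(question: str) -> str:
        return f"[{year} filing] answer to: {question}"

    return tool


# Registry keyed by the same tool names used in the notebook.
TOOLS: Dict[str, Callable[[str], str]] = {
    f"lyft_{year}_10k_form": make_stub_tool(year) for year in (2020, 2021, 2022)
}


def run_tool_calls(calls: List[Tuple[str, str]]) -> List[str]:
    """Execute each (tool_name, question) pair and collect the outputs in order."""
    results = []
    for name, question in calls:
        if name not in TOOLS:
            raise KeyError(f"unknown tool: {name}")
        results.append(TOOLS[name](question))
    return results


# Assumed plan for "What did Lyft do in R&D in 2022 vs 2020?".
planned_calls = [
    ("lyft_2020_10k_form", "R&D activities"),
    ("lyft_2022_10k_form", "R&D activities"),
]
observations = run_tool_calls(planned_calls)
print("\n".join(observations))
```

In the notebook, the final step of turning these observations into a single comparative answer is done by the LLM inside the agent.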
Create Tools
We will create tools using the query engines and the FunctionCallingAgentWorker created earlier.
query_engine_tools = [
    QueryEngineTool(
        query_engine=lyft_2020_query_engine,
        metadata=ToolMetadata(
            name="lyft_2020_10k_form",
            description="Queries related to only 2020 Lyft's financial activities.",
        ),
    ),
    QueryEngineTool(
        query_engine=lyft_2021_query_engine,
        metadata=ToolMetadata(
            name="lyft_2021_10k_form",
            description="Queries related to only 2021 Lyft's financial activities.",
        ),
    ),
    QueryEngineTool(
        query_engine=lyft_2022_query_engine,
        metadata=ToolMetadata(
            name="lyft_2022_10k_form",
            description="Queries related to only 2022 Lyft's financial activities.",
        ),
    ),
    QueryEngineTool(
        query_engine=agent,
        metadata=ToolMetadata(
            name="lyft_2020_2021_2022_10k_form",
            description=(
                "Useful for queries that span multiple years from 2020 to 2022 for Lyft's financial activities."
            ),
        ),
    ),
    QueryEngineTool(
        query_engine=llm_query_engine,
        metadata=ToolMetadata(
            name="general_queries",
            description=(
                "Provides information about general queries other than lyft."
            ),
        ),
    ),
]
Create RouterQueryEngine
RouterQueryEngine routes each user query to one of the tools, based on the complexity of the query.
from llama_index.core.query_engine import RouterQueryEngine
from llama_index.core.selectors import LLMSingleSelector
query_engine = RouterQueryEngine(
    selector=LLMSingleSelector.from_defaults(),
    query_engine_tools=query_engine_tools,
    verbose=True,
)
Querying
Simple Queries
Query: What is the capital of France?
You can see that it used the LLM tool, since this is a general query.
response = query_engine.query("What is the capital of France?")
display(HTML(f'<p style="font-size:20px">{response.response}</p>'))
Query: What did Lyft do in R&D in 2022?
You can see that it used the lyft_2022 tool to answer the query.
response = query_engine.query("What did Lyft do in R&D in 2022?")
display(HTML(f'<p style="font-size:20px">{response.response}</p>'))
Query: What did Lyft do in R&D in 2021?
You can see that it used the lyft_2021 tool to answer the query.
response = query_engine.query("What did Lyft do in R&D in 2021?")
display(HTML(f'<p style="font-size:20px">{response.response}</p>'))
Query: What did Lyft do in R&D in 2020?
You can see that it used the lyft_2020 tool to answer the query.
response = query_engine.query("What did Lyft do in R&D in 2020?")
display(HTML(f'<p style="font-size:20px">{response.response}</p>'))
Complex Queries
Let's test queries that require multiple tools.
Query: What did Lyft do in R&D in 2022 vs 2020?
You can see that it used the lyft_2020 and lyft_2022 tools with FunctionCallingAgent to answer the query.
response = query_engine.query("What did Lyft do in R&D in 2022 vs 2020?")
display(HTML(f'<p style="font-size:20px">{response.response}</p>'))
Query: What did Lyft do in R&D in 2020 vs 2021?
You can see that it used the lyft_2020 and lyft_2021 tools with FunctionCallingAgent to answer the query.
response = query_engine.query("What did Lyft do in R&D in 2020 vs 2021?")
display(HTML(f'<p style="font-size:20px">{response.response}</p>'))
Query: What did Lyft do in R&D in 2022 vs 2021 vs 2020?
You can see that it used the lyft_2020, lyft_2021 and lyft_2022 tools with FunctionCallingAgent to answer the query.
response = query_engine.query("What did Lyft do in R&D in 2022 vs 2021 vs 2020?")
display(HTML(f'<p style="font-size:20px">{response.response}</p>'))
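The display calls in this notebook interpolate the response text directly into an HTML string, so characters such as `&` or `<` in the model's output (for example in "R&D") are interpreted as markup. As a hedged suggestion, a small helper (hypothetical name `render_response`) can escape the text first with the standard library.

```python
import html


def render_response(text: str, font_size_px: int = 20) -> str:
    """Build the HTML paragraph used for display, escaping the response text."""
    return f'<p style="font-size:{font_size_px}px">{html.escape(text)}</p>'


print(render_response("R&D spend <increased> in 2022"))
```

In the notebook this would be used as `display(HTML(render_response(response.response)))`.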