RAG Pipeline with LlamaIndex


In this notebook we will build a RAG pipeline with LlamaIndex using the MistralAI LLM and embedding model. Additionally, we will look at using the index as a retriever.

  1. Basic RAG pipeline.
  2. Index as Retriever.

!pip install llama-index
!pip install llama-index-embeddings-mistralai
!pip install llama-index-llms-mistralai

Setup API Keys

import os
os.environ['MISTRAL_API_KEY'] = '<YOUR MISTRALAI API KEY>'
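
If you prefer not to hardcode the key, a small guard that reads it from the environment and fails fast is a reasonable alternative (illustrative only; it reuses the os import above):

# Fail early with a clear message if the key is missing.
if "MISTRAL_API_KEY" not in os.environ:
    raise EnvironmentError("Set MISTRAL_API_KEY before running this notebook.")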

Basic RAG pipeline

Following are the steps involved in building a basic RAG pipeline.

  1. Setup LLM and Embedding Model
  2. Download Data
  3. Load Data
  4. Create Nodes
  5. Create Index
  6. Create Query Engine
  7. Querying

The Query Engine combines the retrieval and response synthesis modules to generate a response for the given query; an explicit version of this composition is sketched after the query engine is created below.

Setup LLM and Embedding Model

from llama_index.llms.mistralai import MistralAI
from llama_index.embeddings.mistralai import MistralAIEmbedding
llm = MistralAI(model='mistral-large-latest')
embed_model = MistralAIEmbedding(model_name='mistral-embed')
from llama_index.core import Settings
Settings.llm = llm
Settings.embed_model = embed_model
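
As a quick sanity check that the key and models are wired up, you can embed a short string. This smoke test is not part of the pipeline; mistral-embed produces 1024-dimensional vectors:

# Optional: embed a short string and check the vector dimensionality.
sample_embedding = embed_model.get_text_embedding("hello world")
print(len(sample_embedding))  # mistral-embed returns 1024-dimensional vectors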

Download Data

We will use Uber's 2021 SEC 10-K filing for the demonstration.

!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/10k/uber_2021.pdf' -O './uber_2021.pdf'

Load Data

from llama_index.core import SimpleDirectoryReader
documents = SimpleDirectoryReader(input_files=["./uber_2021.pdf"]).load_data()
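
By default, SimpleDirectoryReader parses the PDF page by page and returns one Document per page, so it is worth a quick look at what was loaded before chunking:

# Inspect what was loaded: document count, metadata, and a text preview.
print(f"Loaded {len(documents)} document objects (one per PDF page)")
print(documents[0].metadata)      # file name, page label, etc.
print(documents[0].text[:200])    # first 200 characters of page 1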

Create Nodes

from llama_index.core.node_parser import TokenTextSplitter
splitter = TokenTextSplitter(
    chunk_size=512,
    chunk_overlap=0,
)

nodes = splitter.get_nodes_from_documents(documents)
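
With chunk_overlap=0, the splitter produces disjoint 512-token chunks. A quick inspection of the result:

# Check how many nodes were produced and preview the first chunk.
print(f"Created {len(nodes)} nodes")
print(nodes[0].get_content()[:200])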

Create Index

from llama_index.core import VectorStoreIndex
index = VectorStoreIndex(nodes)
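
Building the index embeds every node via the Mistral API, so for anything beyond a toy run you may want to persist the index and reload it later instead of re-embedding. This uses LlamaIndex's standard persistence; the ./storage path is an arbitrary choice:

# Optional: persist the index to disk so embeddings are not recomputed.
index.storage_context.persist(persist_dir="./storage")

# In a later session, reload it instead of rebuilding:
from llama_index.core import StorageContext, load_index_from_storage
storage_context = StorageContext.from_defaults(persist_dir="./storage")
index = load_index_from_storage(storage_context)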

Create Query Engine

query_engine = index.as_query_engine(similarity_top_k=2)
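
as_query_engine hides the composition described earlier: a retriever plus a response synthesizer. For reference, here is a sketch of roughly the equivalent engine built explicitly; response_mode="compact" is one of the built-in synthesis modes:

from llama_index.core import get_response_synthesizer
from llama_index.core.query_engine import RetrieverQueryEngine

# Illustrative explicit composition; equivalent in spirit to as_query_engine above.
explicit_query_engine = RetrieverQueryEngine(
    retriever=index.as_retriever(similarity_top_k=2),
    response_synthesizer=get_response_synthesizer(response_mode="compact"),
)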

Querying

response = query_engine.query("What is the revenue of Uber in 2021?")
print(response)
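
The response object also carries the chunks the answer was synthesized from, which is handy for verifying that it is grounded in the filing. The page_label key below is metadata typically attached by the PDF reader:

# Inspect the source chunks behind the answer.
for source in response.source_nodes:
    print(source.score, source.node.metadata.get("page_label"))
    print(source.node.get_content()[:150])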

Index as Retriever

We can use the index we created as a retriever. A retriever fetches the chunks (nodes) most relevant to a given user query.

Create Retriever

retriever = index.as_retriever(similarity_top_k=2)

Retrieve Relevant Nodes for a Query

from llama_index.core.response.notebook_utils import display_source_node

retrieved_nodes = retriever.retrieve("What is the revenue of Uber in 2021?")

for node in retrieved_nodes:
    display_source_node(node, source_length=1000)
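
Each retrieved item is a NodeWithScore, so besides the notebook rendering above you can access the similarity score and text programmatically:

# Access retrieval scores and chunk text directly.
for node in retrieved_nodes:
    print("score:", node.score)
    print(node.node.get_content()[:200])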