Understanding Query Pipeline Links in LlamaIndex

In the ever-evolving landscape of AI and natural language processing, the ability to efficiently query and analyze large volumes of text is becoming increasingly crucial. Enter LlamaIndex, a powerful tool that's revolutionizing how we approach document analysis and question-answering systems. Today, we're going to unpack one of its most potent features: the modular query pipeline.

The Building Blocks: Understanding the Modules

Before we dive into the pipeline itself, let's familiarize ourselves with the key players:

Prompt Template: The starting point, formatting our initial question.
Language Model (LLM): The brain of the operation, processing and understanding our query.
Retriever: Our librarian, fetching relevant documents based on the query.
Reranker: The discerning critic, refining the relevance of retrieved documents.
Summarizer: The concise storyteller, distilling key information into a coherent answer.

The Pipeline in Action

Let's break down the flow of the pipeline:

p.add_link("prompt_tmpl", "llm")
- This link sends the output of the prompt template to the language model (LLM).
- The LLM uses this to understand what question it needs to answer.
p.add_link("llm", "retriever")
- The LLM's output (likely a processed query) is sent to the retriever.
- The retriever uses this to search for relevant documents.
p.add_link("retriever", "reranker", dest_key="nodes")
- The retrieved documents (nodes) are sent to the reranker.
- dest_key="nodes" specifies that these will be the input nodes for reranking.
p.add_link("llm", "reranker", dest_key="query_str")
- The original query from the LLM is also sent to the reranker.
- dest_key="query_str" indicates this will be used as the query for reranking.
p.add_link("reranker", "summarizer", dest_key="nodes")
- The reranked nodes are sent to the summarizer.
- These will be the input nodes for summarization.
p.add_link("llm", "summarizer", dest_key="query_str")
- The original query is sent to the summarizer.
- This helps the summarizer understand what information is most relevant.

Visual Representation

prompt_tmpl ---> llm ---> retriever ---> reranker ---> summarizer
     |         ^          ^    |
     |         |          |    |
     |         |          |    v
     +---------|----------|----+
               |          |
               +----------+

Full Code Implementation

Here's a complete example of how to implement this pipeline:

from llama_index.core.query_pipeline import QueryPipeline
from llama_index.postprocessor.cohere_rerank import CohereRerank
from llama_index.core.response_synthesizers import TreeSummarize
from llama_index.core import PromptTemplate
from llama_index.llms.openai import OpenAI
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from dotenv import load_dotenv
import os 

load_dotenv()
api_key = os.getenv("OPENAI_API_KEY")
os.environ["OPENAI_API_KEY"] = api_key

# Load documents
pdf_path = "path/to/your/document.pdf"
documents = SimpleDirectoryReader(input_files=[pdf_path]).load_data()
index = VectorStoreIndex.from_documents(documents)

# Define modules
prompt_str = "does drake brag about the artist {artist}?"
prompt_tmpl = PromptTemplate(prompt_str)
llm = OpenAI(model="gpt-3.5-turbo")
retriever = index.as_retriever(similarity_top_k=5)
reranker = CohereRerank()
summarizer = TreeSummarize(llm=llm)

# Define query pipeline
p = QueryPipeline(verbose=True)

# Add modules to the pipeline
p.add_modules({
    "llm": llm,
    "prompt_tmpl": prompt_tmpl,
    "retriever": retriever,
    "summarizer": summarizer,
    "reranker": reranker
})

# Add links between modules
p.add_link("prompt_tmpl", "llm")
p.add_link("llm", "retriever")
p.add_link("retriever", "reranker", dest_key="nodes")
p.add_link("llm", "reranker", dest_key="query_str")
p.add_link("reranker", "summarizer", dest_key="nodes")
p.add_link("llm", "summarizer", dest_key="query_str")

# Run the pipeline
output = p.run(topic="michael jackson")
print(output)

Conclusion

LlamaIndex's modular query pipeline offers a powerful and flexible way to process and analyze large volumes of text data. By breaking down the process into distinct modules and linking them together, we can create sophisticated question-answering systems that can handle complex queries with ease.

This approach allows for easy customization and optimization at each stage of the process, from initial query formulation to final answer synthesis. Whether you're building a chatbot, a research assistant, or any other NLP application, understanding and leveraging these query pipeline links can significantly enhance your system's capabilities.

For more information and advanced usage, check out the LlamaIndex documentation.