Build a RAG pipeline with RAGStack, {db-serverless}, and LlamaIndex.
This example demonstrates loading and parsing a PDF document with LlamaParse into an {db-serverless} vector store, then querying the index with LlamaIndex.
You will need a vector-enabled {db-serverless} database.
- Create an Astra vector database.
- Within your database, create an Astra DB Access Token with Database Administrator permissions.
- Get your {db-serverless} API endpoint:
  https://<ASTRA_DB_ID>-<ASTRA_DB_REGION>.apps.astra.datastax.com
- Create an API key at LlamaIndex.ai.

Install the following dependencies:
pip install ragstack-ai
Create a .env file in your application directory with the following environment variables:
LLAMA_CLOUD_API_KEY=llx-...
ASTRA_DB_API_ENDPOINT=https://<ASTRA_DB_ID>-<ASTRA_DB_REGION>.apps.astra.datastax.com
ASTRA_DB_APPLICATION_TOKEN=AstraCS:...
OPENAI_API_KEY=sk-...
If you’re using Google Colab, you’ll be prompted for these values in the Colab environment.
See the Prerequisites page for more details.
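If you are running elsewhere and prefer not to keep a .env file, you can prompt for any missing values at runtime instead. A minimal sketch, assuming the same four variable names as above (the loop itself is illustrative):

import getpass
import os

# Prompt for any variable that isn't already set (sketch; names match the .env above)
for name in ("LLAMA_CLOUD_API_KEY", "ASTRA_DB_API_ENDPOINT",
             "ASTRA_DB_APPLICATION_TOKEN", "OPENAI_API_KEY"):
    if not os.getenv(name):
        os.environ[name] = getpass.getpass(f"{name}: ")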
- Import dependencies and load environment variables.
import os

import requests
from dotenv import load_dotenv
from llama_parse import LlamaParse
from llama_index.vector_stores.astra_db import AstraDBVectorStore
from llama_index.core.node_parser import SimpleNodeParser
from llama_index.core import VectorStoreIndex, StorageContext, Settings
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding

load_dotenv()

llama_cloud_api_key = os.getenv("LLAMA_CLOUD_API_KEY")
api_endpoint = os.getenv("ASTRA_DB_API_ENDPOINT")
token = os.getenv("ASTRA_DB_APPLICATION_TOKEN")
openai_api_key = os.getenv("OPENAI_API_KEY")
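Before continuing, it can help to fail fast if any of these values is unset. A minimal sketch, assuming all four variables are required by the steps below:

# Fail fast if any required environment variable is missing (sketch)
required = {
    "LLAMA_CLOUD_API_KEY": llama_cloud_api_key,
    "ASTRA_DB_API_ENDPOINT": api_endpoint,
    "ASTRA_DB_APPLICATION_TOKEN": token,
    "OPENAI_API_KEY": openai_api_key,
}
missing = [name for name, value in required.items() if not value]
if missing:
    raise EnvironmentError(f"Missing environment variables: {', '.join(missing)}")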
- Configure global settings for LlamaIndex. (As of LlamaIndex v0.10.0, the Settings component replaces ServiceContext. For more information, see the LlamaIndex documentation.)

Settings.llm = OpenAI(model="gpt-4", temperature=0.1)
Settings.embed_model = OpenAIEmbedding(
    model="text-embedding-3-small",
    embed_batch_size=100,
)
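Because Settings is global, the index and query engine created later pick up this LLM and embedding model automatically. As an optional check that the embedding size matches the collection you create below, you can embed a test string (a sketch; text-embedding-3-small produces 1536-dimensional vectors by default):

# Optional check: the embedding dimension should match the Astra DB collection below
vector = Settings.embed_model.get_text_embedding("dimension check")
print(len(vector))  # expected: 1536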
- Download a PDF about attention mechanisms in transformer model architectures.

url = "https://arxiv.org/pdf/1706.03762.pdf"
file_path = "./attention.pdf"

response = requests.get(url, timeout=30)
if response.status_code == 200:
    with open(file_path, "wb") as file:
        file.write(response.content)
    print("Download complete.")
else:
    print("Error downloading the file.")
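If you re-run the script, you may not want to fetch the PDF again. A small optional guard, assuming the same url and file_path as above:

from pathlib import Path

# Optional: only download if attention.pdf is not already present (sketch)
if not Path(file_path).exists():
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    Path(file_path).write_bytes(response.content)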
- Load the downloaded PDF with LlamaParse as a text Document for indexing. LlamaParse also supports Markdown-type Documents with result_type="markdown".

documents = LlamaParse(result_type="text").load_data(file_path)
print(documents[0].get_content()[10000:11000])
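For example, to parse the same file into Markdown instead (a sketch; the md_documents name is illustrative):

# Alternative: parse into Markdown-type Documents (sketch)
md_documents = LlamaParse(result_type="markdown").load_data(file_path)
print(md_documents[0].get_content()[:500])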
- Create an {db-serverless} vector store instance. The embedding_dimension of 1536 matches the output dimension of the text-embedding-3-small model configured above.

astra_db_store = AstraDBVectorStore(
    token=token,
    api_endpoint=api_endpoint,
    collection_name="astra_v_table_llamaparse",
    embedding_dimension=1536,
)
- Parse the Documents into nodes and set up a storage context that uses {db-serverless}.

node_parser = SimpleNodeParser()
nodes = node_parser.get_nodes_from_documents(documents)
print(nodes[0].get_content())

storage_context = StorageContext.from_defaults(vector_store=astra_db_store)
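SimpleNodeParser chunks each Document with its default settings. If you want to control chunking, from_defaults accepts explicit values; a sketch with illustrative, untuned numbers:

# Optional: customize chunking (chunk_size and chunk_overlap values are illustrative)
node_parser = SimpleNodeParser.from_defaults(chunk_size=512, chunk_overlap=50)
nodes = node_parser.get_nodes_from_documents(documents)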
- Create a vector store index and query engine from your nodes and storage context.

index = VectorStoreIndex(nodes=nodes, storage_context=storage_context)
query_engine = index.as_query_engine(similarity_top_k=15)
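The query engine retrieves the 15 most similar nodes and passes them to the LLM for answer synthesis. To inspect retrieval on its own, without the LLM step, you can use a retriever directly (a sketch using as_retriever; the query string is just an example):

# Inspect raw retrieval results without LLM synthesis (sketch)
retriever = index.as_retriever(similarity_top_k=15)
for node_with_score in retriever.retrieve("What is Multi-Head Attention?")[:3]:
    print(node_with_score.score, node_with_score.node.get_content()[:80])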
- Query the {db-serverless} vector store with a question your document can answer; this query should return a relevant response.

query = "What is Multi-Head Attention also known as?"
response_1 = query_engine.query(query)
print("\n***********New LlamaParse+ Basic Query Engine***********")
print(response_1)
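To see which chunks grounded the answer, the response object exposes the retrieved nodes through source_nodes (a short sketch):

# Show the top retrieved chunks that grounded the answer (sketch)
for source_node in response_1.source_nodes[:3]:
    print(source_node.score, source_node.node.get_content()[:100])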
- Query the {db-serverless} vector store with a question your document cannot answer. This query should return something like "The context does not provide information about the color of the sky" because your document contains nothing about the color of the sky.

query = "What is the color of the sky?"
response_2 = query_engine.query(query)
print("\n***********New LlamaParse+ Basic Query Engine***********")
print(response_2)