RAG with LlamaParse and {db-serverless}

Build a RAG pipeline with RAGStack, {db-serverless}, and LlamaIndex.

This example demonstrates loading and parsing a PDF document with LlamaParse into an {db-serverless} vector store, then querying the index with LlamaIndex.

Prerequisites

You will need a vector-enabled {db-serverless} database.

  • Create an Astra vector database.

  • Within your database, create an Astra DB Access Token with Database Administrator permissions.

  • Get your {db-serverless} API Endpoint:

    • https://<ASTRA_DB_ID>-<ASTRA_DB_REGION>.apps.astra.datastax.com

  • Create an API key at LlamaIndex.ai.

Install the following dependencies:

pip install ragstack-ai

See the Prerequisites page for more details.

Set up your local environment

Create a .env file in your application directory with the following environment variables:

LLAMA_CLOUD_API_KEY=llx-...
ASTRA_DB_API_ENDPOINT=https://<ASTRA_DB_ID>-<ASTRA_DB_REGION>.apps.astra.datastax.com
ASTRA_DB_APPLICATION_TOKEN=AstraCS:...
OPENAI_API_KEY=sk-...

If you’re using Google Colab, you’ll be prompted for these values in the Colab environment.
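If you prefer not to use a .env file, a minimal sketch like the following sets the same variables interactively from a notebook cell. The prompt strings are only illustrative:

import os
from getpass import getpass

# Prompt for secrets instead of loading them from a .env file.
os.environ["LLAMA_CLOUD_API_KEY"] = getpass("LlamaCloud API key: ")
os.environ["ASTRA_DB_API_ENDPOINT"] = input("Astra DB API endpoint: ")
os.environ["ASTRA_DB_APPLICATION_TOKEN"] = getpass("Astra DB application token: ")
os.environ["OPENAI_API_KEY"] = getpass("OpenAI API key: ")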

See the Prerequisites page for more details.

Create RAG pipeline

  1. Import dependencies and load environment variables.

    import os
    import requests
    from dotenv import load_dotenv
    from llama_parse import LlamaParse
    from llama_index.vector_stores.astra_db import AstraDBVectorStore
    from llama_index.core.node_parser import SimpleNodeParser
    from llama_index.core import VectorStoreIndex, StorageContext, Settings
    from llama_index.llms.openai import OpenAI
    from llama_index.embeddings.openai import OpenAIEmbedding
    
    load_dotenv()
    
    llama_cloud_api_key = os.getenv("LLAMA_CLOUD_API_KEY")
    api_endpoint = os.getenv("ASTRA_DB_API_ENDPOINT")
    token = os.getenv("ASTRA_DB_APPLICATION_TOKEN")
    openai_api_key = os.getenv("OPENAI_API_KEY")
  2. Configure global settings for LlamaIndex. (As of LlamaIndex v0.10.0, the Settings component replaces ServiceContext; for more information, see the LlamaIndex documentation.)

    Settings.llm = OpenAI(model="gpt-4", temperature=0.1)
    Settings.embed_model = OpenAIEmbedding(
        model="text-embedding-3-small", embed_batch_size=100
    )
  3. Download a PDF about attention mechanisms in transformer model architectures.

    url = "https://arxiv.org/pdf/1706.03762.pdf"
    file_path = "./attention.pdf"
    
    response = requests.get(url, timeout=30)
    if response.status_code == 200:
        with open(file_path, "wb") as file:
            file.write(response.content)
        print("Download complete.")
    else:
        print("Error downloading the file.")
  4. Load the downloaded PDF with LlamaParse as a text Document for indexing. LlamaParse can also produce Markdown-type Documents when you pass result_type="markdown"; see the sketch after this list.

    documents = LlamaParse(result_type="text").load_data(file_path)
    print(documents[0].get_content()[10000:11000])
  5. Create an {db-serverless} vector store instance.

    astra_db_store = AstraDBVectorStore(
        token=token,
        api_endpoint=api_endpoint,
        collection_name="astra_v_table_llamaparse",
        embedding_dimension=1536
    )
  6. Parse Documents into nodes and set up storage context to use {db-serverless}.

    node_parser = SimpleNodeParser()
    nodes = node_parser.get_nodes_from_documents(documents)
    print(nodes[0].get_content())
    
    storage_context = StorageContext.from_defaults(vector_store=astra_db_store)
  7. Create a vector store index and query engine from your nodes and storage context.

    index = VectorStoreIndex(nodes=nodes, storage_context=storage_context)
    query_engine = index.as_query_engine(similarity_top_k=15)
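If you want Markdown output instead of plain text (step 4), a minimal variant of the parsing step is shown below; the rest of the pipeline is unchanged. The markdown_documents name is only illustrative:

    # Parse the same PDF into Markdown-type Documents instead of plain text.
    markdown_documents = LlamaParse(result_type="markdown").load_data(file_path)
    print(markdown_documents[0].get_content()[:1000])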

Execute a query

  1. Query the {db-serverless} vector store with a question the document can answer. This query should return a relevant response; a sketch after this list shows how to inspect the retrieved source nodes.

    query = "What is Multi-Head Attention also known as?"
    response_1 = query_engine.query(query)
    print("\n***********New LlamaParse+ Basic Query Engine***********")
    print(response_1)
  2. Query the {db-serverless} vector store with a question the document cannot answer. Because the document contains nothing about the color of the sky, this query should return a response like "The context does not provide information about the color of the sky."

    query = "What is the color of the sky?"
    response_2 = query_engine.query(query)
    print("\n***********New LlamaParse+ Basic Query Engine***********")
    print(response_2)
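To check which parsed chunks were retrieved for an answer, you can inspect the source nodes attached to the response. This is a small sketch assuming the response_1 object from the first query:

    # Print the top retrieved chunks (source nodes) behind the first answer.
    for node_with_score in response_1.source_nodes[:3]:
        print("score:", node_with_score.score)
        print(node_with_score.node.get_content()[:200])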

Complete code

The complete code for this example is in examples:partial$llama-parse.adoc.