LlamaIndex integration guide
LlamaIndex is a data framework designed to support developers in building applications powered by Large Language Models (LLMs). It offers tools for data ingestion, structuring, retrieval, and easy integration with Anyscale Private Endpoints.
This example shows how to build a query engine: a chatbot that can answer questions about a collection of documents.
Step 0: Install dependencies
pip install "llama-index>=0.9.8"
pip install "langchain>=0.0.341"
Step 1: Loading documents
For the model to answer questions and generate insights from private data, you need to provide a folder of documents to build a vector index from. LlamaIndex's SimpleDirectoryReader uses file extensions to select the best file reader for each document.
from llama_index import SimpleDirectoryReader, VectorStoreIndex
# Specify the path to your data folder.
data_folder = 'DATA_FOLDER_PATH'
# Load documents from the directory.
documents = SimpleDirectoryReader(data_folder).load_data()
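As a quick sanity check before building the index, you can confirm that the reader picked up your files:

# Optional: confirm how many documents were loaded.
print(f"Loaded {len(documents)} documents from {data_folder}")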
Step 2: Set up a ServiceContext with Anyscale
A ServiceContext is a utility container for LlamaIndex index and query classes. It lets you specify which supported open source model from Anyscale Private Endpoints to use as the LLM, and which OpenAI text embedding model to use to generate vectors.
For a quick demo, you can paste in your API key, but for development, follow best practices for setting your API base and key.
# Your Anyscale and OpenAI API tokens
ANYSCALE_BASE_URL = "ANYSCALE_BASE_URL"
ANYSCALE_API_KEY = "ANYSCALE_API_KEY"
OPENAI_API_KEY = "OPENAI_API_KEY"
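One such practice, shown here as a minimal sketch, is to export the values as environment variables and read them at runtime rather than hard-coding them in source:

import os

# Read credentials from the environment instead of hard-coding them.
ANYSCALE_BASE_URL = os.environ["ANYSCALE_BASE_URL"]
ANYSCALE_API_KEY = os.environ["ANYSCALE_API_KEY"]
OPENAI_API_KEY = os.environ["OPENAI_API_KEY"]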
Option 1: LlamaIndex Anyscale
You can set up a ServiceContext using LlamaIndex's integration with Anyscale.
from llama_index import ServiceContext, VectorStoreIndex
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.anyscale import Anyscale

# Initialize the service context with the LLM and embedding model.
service_context = ServiceContext.from_defaults(
    llm=Anyscale(
        model="meta-llama/Llama-2-70b-chat-hf",
        api_base=ANYSCALE_BASE_URL,
        api_key=ANYSCALE_API_KEY,
    ),
    embed_model=OpenAIEmbedding(
        model="text-embedding-ada-002",
        api_key=OPENAI_API_KEY,
    ),
)
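If you'd rather not pass the context to every call, LlamaIndex can also register it as the global default; a minimal sketch, assuming llama-index 0.9.x:

from llama_index import set_global_service_context

# Register the service context globally so index and query calls
# pick it up without an explicit service_context argument.
set_global_service_context(service_context)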
Option 2: LangChain ChatAnyscale
Alternatively, you can set up a ServiceContext using LangChain's integration with Anyscale.
from langchain.chat_models import ChatAnyscale
from llama_index import LLMPredictor, ServiceContext, VectorStoreIndex
from llama_index.embeddings.openai import OpenAIEmbedding

# Initialize the service context with the LLM and embedding model.
service_context = ServiceContext.from_defaults(
    llm_predictor=LLMPredictor(
        llm=ChatAnyscale(
            anyscale_api_base=ANYSCALE_BASE_URL,
            anyscale_api_key=ANYSCALE_API_KEY,
            model_name="meta-llama/Llama-2-70b-chat-hf",
            temperature=0.7,
        )
    ),
    embed_model=OpenAIEmbedding(
        model="text-embedding-ada-002",
        api_key=OPENAI_API_KEY,
    ),
)
Step 3: Run queries
With the documents and ServiceContext prepared, you can create an index for the documents with VectorStoreIndex and execute queries against it.
# Create the index from documents using the service context
index = VectorStoreIndex.from_documents(documents, service_context=service_context)
# Convert the index into a query engine
query_engine = index.as_query_engine()
# Run a query against the index
response = query_engine.query("Sample query message.")
print(response)
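Rebuilding the index re-embeds every document, which can be slow and costly for large folders. As a sketch, assuming llama-index 0.9.x and a local ./storage directory, you can persist the index to disk and reload it on later runs:

from llama_index import StorageContext, load_index_from_storage

# Persist the vector index to disk after the first run.
index.storage_context.persist(persist_dir="./storage")

# On later runs, rebuild the index from storage instead of re-embedding.
storage_context = StorageContext.from_defaults(persist_dir="./storage")
index = load_index_from_storage(storage_context, service_context=service_context)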