Google Memorystore for Redis is a fully managed service powered by the Redis in-memory data store, used to build application caches that provide sub-millisecond data access. Extend your database application to build AI-powered experiences by leveraging Memorystore for Redis's LangChain integrations.
This notebook goes over how to use Memorystore for Redis to store vector embeddings with the RedisVectorStore class.
Learn more about the package on GitHub.
Before You Begin
To run this notebook, you will need to do the following:
Create a Google Cloud project.
Create a Memorystore for Redis instance (vector search requires Redis version 7.2 or later).
🦜🔗 Library Installation
The integration lives in its own langchain-google-memorystore-redis package, so we need to install it.
%pip install --upgrade --quiet langchain-google-memorystore-redis langchain
Colab only: Uncomment the following cell to restart the kernel or use the button to restart the kernel. For Vertex AI Workbench you can restart the terminal using the button on top.
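A minimal restart cell, shown commented out so it does not run unintentionally, might look like the following (IPython's do_shutdown is one common way to restart the kernel):
# import IPython
#
# app = IPython.Application.instance()
# app.kernel.do_shutdown(True)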
☁ Set Your Google Cloud Project
Set your Google Cloud project so that you can leverage Google Cloud resources within this notebook.
If you don't know your project ID, try the following:
Run gcloud config list.
Run gcloud projects list.
PROJECT_ID = "my-project-id"
!gcloud config set project {PROJECT_ID}
🔐 Authentication
Authenticate to Google Cloud as the IAM user logged into this notebook in order to access your Google Cloud Project.
from google.colab import auth
auth.authenticate_user()
Basic Usage
Initialize a Vector Index
import redis
from langchain_google_memorystore_redis import (
DistanceStrategy,
HNSWConfig,
RedisVectorStore,
)
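# Connect to the Redis instance (localhost is illustrative; point this at your
# Memorystore for Redis endpoint)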
redis_client = redis.from_url("redis://127.0.0.1:6379")
index_config = HNSWConfig(
name="my_vector_index", distance_strategy=DistanceStrategy.COSINE, vector_size=128
)
RedisVectorStore.init_index(client=redis_client, index_config=index_config)
Prepare Documents
Text needs processing and numerical representation before interacting with a vector store. This involves:
Loading text: TextLoader reads text data from a file (for example, state_of_the_union.txt).
Text splitting: CharacterTextSplitter breaks the text into smaller chunks so they fit the embedding model.
from langchain_community.document_loaders import TextLoader
from langchain_text_splitters import CharacterTextSplitter
loader = TextLoader("./state_of_the_union.txt")
documents = loader.load()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
docs = text_splitter.split_documents(documents)
Add Documents to the Vector Store
After text preparation and embedding generation, the following methods insert them into the Redis vector store.
Method 1: Classmethod for Direct Insertion
This approach combines embedding creation and insertion into a single step using the from_documents classmethod:
from langchain_community.embeddings.fake import FakeEmbeddings
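# FakeEmbeddings produces placeholder vectors for demonstration only; in practice,
# substitute a real embeddings model whose output dimension matches vector_size (128)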
embeddings = FakeEmbeddings(size=128)
redis_client = redis.from_url("redis://127.0.0.1:6379")
rvs = RedisVectorStore.from_documents(
docs, embedding=embeddings, client=redis_client, index_name="my_vector_index"
)
Method 2: Instance-Based Insertion
This approach offers flexibility when working with a new or existing RedisVectorStore:
rvs = RedisVectorStore(
client=redis_client, index_name="my_vector_index", embeddings=embeddings
)
ids = rvs.add_texts(
texts=[d.page_content for d in docs], metadatas=[d.metadata for d in docs]
)
Perform a Similarity Search (KNN)
With the vector store populated, it's possible to search for text semantically similar to a query. Here's how to use KNN (K-Nearest Neighbors) with default settings:
The similarity_search method finds items in the vector store closest to the query in meaning.
import pprint
query = "What did the president say about Ketanji Brown Jackson"
knn_results = rvs.similarity_search(query=query)
pprint.pprint(knn_results)
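By default, similarity_search returns the top matches (k=4 in LangChain's standard vector store interface). A minimal sketch passing k explicitly, assuming the same rvs and query as above:
# Return only the two nearest neighbors instead of the default four
knn_results_top2 = rvs.similarity_search(query=query, k=2)
pprint.pprint(knn_results_top2)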
Perform a Range-Based Similarity Search
Range queries provide more control by specifying a desired similarity threshold along with the query text:
The similarity_search_with_score method finds items from the vector store that fall within the specified similarity threshold.
rq_results = rvs.similarity_search_with_score(query=query, distance_threshold=0.8)
pprint.pprint(rq_results)
Perform a Maximal Marginal Relevance (MMR) Search
MMR queries aim to find results that are both relevant to the query and diverse from each other, reducing redundancy in search results.
The max_marginal_relevance_search method returns items that optimize the combination of relevance and diversity based on the lambda_mult setting (values closer to 1 favor relevance; values closer to 0 favor diversity).
mmr_results = rvs.max_marginal_relevance_search(query=query, lambda_mult=0.90)
pprint.pprint(mmr_results)
Use the Vector Store as a Retriever
For seamless integration with other LangChain components, a vector store can be converted into a Retriever. This offers several advantages: many LangChain tools and chains are designed to work directly with retrievers, and the as_retriever() method converts the vector store into a format that simplifies querying.
retriever = rvs.as_retriever()
results = retriever.invoke(query)
pprint.pprint(results)
Clean up
Delete Documents from the Vector Store
Occasionally, it's necessary to remove documents (and their associated vectors) from the vector store. The delete method provides this functionality.
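A minimal sketch, reusing the ids list returned by add_texts above:
# Remove the previously inserted documents (and their vectors) by id
rvs.delete(ids)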
Delete a Vector Index
There might be circumstances where the deletion of an existing vector index is necessary. Common reasons include a change to the index configuration (which requires deleting and recreating the index) or freeing up the memory the index consumes.
Caution: Vector index deletion is an irreversible operation. Be certain that the stored vectors and search functionality are no longer required before proceeding.
RedisVectorStore.drop_index(client=redis_client, index_name="my_vector_index")