This page provides an overview of capabilities offered by Cloud SQL for PostgreSQL to help you build generative AI applications. To get started with a sample application, see Get started with using Cloud SQL for generative AI applications.
Retrieval-Augmented Generation (RAG) is a technique for optimizing the output of a large language model (LLM) by referencing an authoritative knowledge base before generating a response. RAG enhances generative AI applications by improving their accuracy. Cloud SQL databases offer capabilities curated for RAG and generative AI applications, as explained in this page.
Generate vector embeddings
Vector embeddings are essential for RAG because they enable semantic understanding and efficient similarity search. These embeddings are numerical representations of text, images, audio, and video. Embedding models generate vector embeddings so that, if two pieces of content are semantically similar, their respective embeddings are located near each other in the embedding vector space.
Cloud SQL integrates with Vertex AI. You can use the models that Vertex AI hosts to generate vector embeddings by using SQL queries.
Cloud SQL extends PostgreSQL syntax with an embedding function for generating vector embeddings of text. After you generate these embeddings, you can store them in a Cloud SQL database without needing a separate vector database.
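As a minimal sketch of what generating an embedding in SQL can look like — assuming the Vertex AI integration is enabled on the instance, and using an illustrative model ID (the models available to your project may differ):

```sql
-- Hypothetical example: generate a vector embedding for a piece of text
-- with a Vertex AI-hosted embedding model, directly from a SQL query.
-- 'text-embedding-005' is an illustrative model ID.
SELECT embedding('text-embedding-005', 'What is Retrieval-Augmented Generation?');
```

The result is an array of floating-point values that you can store in a table column and later compare against other embeddings.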
You can also use Cloud SQL to store vector embeddings that are generated outside of Cloud SQL. For example, you can store vector embeddings that are generated by using pre-trained models in the Vertex AI Model Garden. You can use these vector embeddings as inputs to pgvector functions for similarity and semantic searches.
pgvector
You can store, index, and query vector embeddings in Cloud SQL by using the pgvector PostgreSQL extension. For more information about configuring this extension, see Configure PostgreSQL extensions. For more information about storing, indexing, and querying vector embeddings, see Store a generated embedding and Query and index embeddings using pgvector.
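The store-index-query workflow can be sketched in SQL as follows. This is a minimal illustration: the table and column names are hypothetical, and the three-dimensional vectors stand in for real embeddings, which typically have hundreds of dimensions.

```sql
-- Enable the extension (it must be allowed on the instance).
CREATE EXTENSION IF NOT EXISTS vector;

-- Hypothetical table that stores documents alongside their embeddings.
CREATE TABLE documents (
  id bigserial PRIMARY KEY,
  content text,
  embedding vector(3)  -- real embedding models produce 768+ dimensions
);

-- Optional: an HNSW index for faster approximate nearest-neighbor search.
CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops);

-- Retrieve the five documents most similar to a query embedding,
-- ordered by cosine distance (the <=> operator).
SELECT content
FROM documents
ORDER BY embedding <=> '[0.1, 0.2, 0.3]'
LIMIT 5;
```

In a RAG application, the query vector would itself be an embedding of the user's question, and the retrieved rows would be passed to the LLM as context.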
You can use SQL queries to invoke online predictions from models stored in the Vertex AI Model Garden.
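As a hedged sketch of what invoking an online prediction from SQL can look like — assuming the Vertex AI integration is enabled and a model endpoint is deployed; the endpoint path and request payload below are placeholders, not real identifiers:

```sql
-- Hypothetical example: call a deployed Vertex AI model endpoint from SQL.
-- PROJECT_ID and ENDPOINT_ID are placeholders; the request body must match
-- the input schema that the deployed model expects.
SELECT ml_predict_row(
  'projects/PROJECT_ID/locations/us-central1/endpoints/ENDPOINT_ID',
  '{"instances": [{"prompt": "Summarize RAG in one sentence."}]}'::jsonb
);
```

Consult the instance's extension documentation for the exact function signature and supported endpoint formats before relying on this pattern.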
Use the LangChain integration
Cloud SQL integrates with LangChain, an open-source LLM orchestration framework, to simplify the development of generative AI applications. You can use LangChain packages for vector stores, document loaders, and chat message history.
You can improve the performance of a vector search by using the following:
Data cache metrics: optimize queries based on how effectively the data cache is used in a vector search.
Cloud SQL provides the following metrics in the Metrics Explorer in Cloud Monitoring:
Metric | Description | Metric label
Data cache used | The data cache usage (in bytes) | database/data_cache/bytes_used
Data cache quota | The maximum data cache size (in bytes) | database/data_cache/quota
Data cache hit count | The total number of data cache hit read operations for an instance | database/postgresql/data_cache/hit_count
Data cache miss count | The total number of data cache miss read operations for an instance | database/postgresql/data_cache/miss_count
Data cache hit ratio | The ratio of data cache hit read operations to data cache miss read operations for an instance | database/postgresql/data_cache/hit_ratio
System Insights: provide system metrics such as CPU utilization, disk utilization, and throughput to help you monitor the health of instances and troubleshoot issues that affect the performance of your generative AI applications. To view these metrics, use the Cloud SQL System Insights dashboard.
Query Insights: detect, diagnose, and prevent query performance problems. This is helpful to improve the performance of vector search in your generative AI applications.
You can use the Cloud SQL Query Insights dashboard to observe the performance of top queries and analyze these queries by using visual query plans. You can also monitor performance at an application level and trace the source of a problematic query across the application stack to the database by using SQLcommenter, an open-source, object-relational mapping (ORM) auto-instrumentation library.
Query Insights also integrates with your existing application performance monitoring (APM) tools so that you can troubleshoot query problems using tools that you're already familiar with.
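To illustrate what SQLcommenter-style instrumentation produces, a query might arrive at the database with a trailing comment carrying application context. The attribute names and values below are hypothetical:

```sql
-- A query augmented with a SQLcommenter-style comment: the trailing
-- comment (illustrative values) lets you trace a slow query back to the
-- application component that issued it.
SELECT id, content
FROM documents
WHERE id = 42
/*application='chatbot-api',controller='search',framework='fastapi'*/;
```

Because the comment travels with the query text, the context appears alongside the query in database-side tooling without changing the query's result.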
Using Cloud SQL to build generative AI applications provides the following benefits: Cloud SQL supports pgvector and integrates with both Vertex AI and LangChain.

To get started building generative AI applications, use this sample app. The app uses Cloud SQL, Vertex AI, and either Google Kubernetes Engine (GKE) or Cloud Run. You can use the app to build a basic chatbot API that uses pgvector, asyncpg, and FastAPI. The solution includes the following:
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2025-07-09 UTC.