This page describes Vertex AI Search apps and data stores.
With Vertex AI Search, you create a search or recommendations app and connect it to a data store. A Google Cloud project can contain multiple apps.
Relationship between apps and data stores
The relationship between apps and data stores depends on the type of app:
Custom search apps have a many-to-many relationship with data stores. When multiple data stores are connected to a single custom search app, this is referred to as blended search. For information about limitations of connecting a search app to more than one data store, see About blended search.
A custom recommendations app has a one-to-one connection with its data store.
A media app has a many-to-one relationship with its data store. An app can only connect to one data store, whereas a given data store can be connected to several apps. For example, a media search app and a media recommendations app can share a data store.
A healthcare search app has a many-to-one relationship with its data store. An app can only connect to one data store, whereas a given data store can be connected to several apps. For example, a patient-facing app and a provider-facing app can connect to the same data store.
For a batch data import of healthcare data, data is imported into a data store that's within an app. For streaming data import (Preview) of healthcare data, data is imported into an entity, which is a type of data store that's within a data connector. A data connector is also a type of data store that's within an app.
After a data store is connected to an app, it can't be disconnected.
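The connection rules above can be modeled as a small data structure. This is a minimal sketch, not part of any Vertex AI Search API: the `Project` class, the `create_app` method, and the app-type labels are hypothetical names chosen for illustration, and the sharing check encodes only the one-to-one rule for custom recommendations apps.

```python
from dataclasses import dataclass, field

# Hypothetical labels for the app types described above (not API enum values).
# None means "no limit on connected data stores" (blended search).
MAX_STORES_PER_APP = {
    "custom_search": None,
    "custom_recommendations": 1,
    "media": 1,
    "healthcare_search": 1,
}


@dataclass
class Project:
    # app name -> (app type, tuple of connected data store names)
    apps: dict = field(default_factory=dict)

    def create_app(self, name, app_type, stores):
        limit = MAX_STORES_PER_APP[app_type]
        if limit is not None and len(stores) > limit:
            raise ValueError(f"{app_type} apps connect to at most {limit} data store")
        for other_type, other_stores in self.apps.values():
            shared = set(stores) & set(other_stores)
            # One-to-one: a recommendations data store is never shared.
            if shared and "custom_recommendations" in (app_type, other_type):
                raise ValueError(f"data stores {sorted(shared)} cannot be shared")
        # Stored as a tuple because connections can't be removed afterwards.
        self.apps[name] = (app_type, tuple(stores))


project = Project()
project.create_app("media-search", "media", ["movie-catalog"])
project.create_app("media-recs", "media", ["movie-catalog"])  # sharing is allowed
```

In this sketch, creating a media app with two data stores, or a custom recommendations app on an already connected data store, raises `ValueError`.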
Method of app creation and data ingestion
How you create an app and ingest data depends on the type of data you have:
For website data, you can use either the Google Cloud console or the API. To use a website data store created with the API, you must attach it to an app with Enterprise features enabled in the Google Cloud console.
For structured or unstructured data, you can use either the Google Cloud console or the API.
For healthcare data, you can use either the Google Cloud console or the API.
Each data store has one or more data records, called documents. What a document represents varies depending on the type of data in the data store:
Website. A document is a web page.
Structured data. A document is a row in a table or a JSON record that follows a particular schema. You can provide this schema yourself or you can let AI Applications derive the schema from the ingested data.
Structured data for media. A document is a row in a table or a JSON record that follows a schema that is specific to media. The documents are records pertaining to media content, such as videos, news articles, music files, and podcasts. A document contains information that describes the media item, at minimum: title, URI to the content location, categories, duration, and available date.
Unstructured data. A document is a file in HTML, PDF with embedded text, or TXT format. PPTX and DOCX formats are available in Preview.
Healthcare FHIR data. A document is a supported FHIR R4 resource. For a list of FHIR R4 resources that Vertex AI Search supports, see Healthcare FHIR R4 data schema reference.
In AI Applications, there are various kinds of data stores. A data store can contain only one type of data.
Website data
A data store with website data uses data indexed from public websites. You can provide a set of URL patterns that you want to include in your data store. The web pages that fit the URL patterns are called included web pages. You can then set up search over data crawled from the included web pages.
For example, you can provide URL patterns such as example.com/faq/* and example.com/events/* and enable search over the data crawled from the web pages that fit these patterns. This data includes text, images tagged with metadata, and other structured data such as meta tags, PageMap attributes, and schema.org data.
You can also provide URL patterns for portions of websites that you want excluded, for example, example.com/events/members-only/* or example.com/events/past-*. Excluded URLs take priority over included ones.
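The include and exclude behavior described above can be approximated locally. This is a rough sketch that assumes shell-style wildcard matching from Python's standard `fnmatch` module; the matching rules that the service actually applies may differ, and `is_included` is a hypothetical helper name.

```python
from fnmatch import fnmatchcase

# Patterns taken from the examples above.
INCLUDE = ["example.com/faq/*", "example.com/events/*"]
EXCLUDE = ["example.com/events/members-only/*", "example.com/events/past-*"]


def is_included(url, include=INCLUDE, exclude=EXCLUDE):
    """Return True if the URL fits an include pattern and no exclude pattern."""
    # Exclusions are checked first because they take priority.
    if any(fnmatchcase(url, pattern) for pattern in exclude):
        return False
    return any(fnmatchcase(url, pattern) for pattern in include)


for url in [
    "example.com/faq/billing",
    "example.com/events/summer-fair",
    "example.com/events/members-only/gala",
    "example.com/events/past-2023",
    "example.com/about",
]:
    print(url, is_included(url))
```

Checking exclusions before inclusions is one simple way to make sure that a page matching both kinds of pattern is left out.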
There are two types of website data stores:
Basic website search:
Advanced website indexing:
meta tags, PageMap attributes, and schema.org data to your web pages. You can then use this structured data to edit the data store schema as explained in Use structured data for advanced website indexing.
Structured data
A data store with structured data enables semantic search or recommendations over structured data. You can import data from BigQuery or Cloud Storage. You can also manually upload structured JSON data through the API.
For example, you can enable search or recommendations over a product catalog for your ecommerce experience or a directory of doctors for provider search or recommendations.
AI Applications auto-detects the schema from the data that you import. Optionally, you can provide a schema for your data. Providing a schema for your data typically improves the quality of results.
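To illustrate what deriving a schema from ingested records means, the following toy sketch collects the JSON types seen for each field across a few records. It is only an illustration of the idea, not the detection logic that AI Applications uses; the `infer_schema` function and the sample doctor records are hypothetical.

```python
def infer_schema(records):
    """Map each field name to the sorted JSON type names observed for it."""
    type_names = {
        bool: "boolean",
        int: "integer",
        float: "number",
        str: "string",
        list: "array",
        dict: "object",
    }
    seen = {}
    for record in records:
        for key, value in record.items():
            # Anything else (for example None) is recorded as "null".
            seen.setdefault(key, set()).add(type_names.get(type(value), "null"))
    return {key: sorted(types) for key, types in seen.items()}


# Hypothetical records for a directory of doctors.
doctors = [
    {"name": "Dr. A", "specialty": "cardiology", "rating": 4.5},
    {"name": "Dr. B", "specialty": "dermatology", "rating": 5},
]
print(infer_schema(doctors))
```

A field that shows more than one type, such as `rating` here, is the kind of ambiguity that an explicitly provided schema removes.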
Structured data for media
Media apps can only be connected to media data stores. Media data stores are structured data stores with a Google-defined schema or with your own custom schema that contains a specific set of five media-related fields. For more information about the schema, see About media documents and data stores.
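As a sketch of the five-field minimum (title, content URI, categories, duration, and available date), the following check reports which fields a media record lacks. The key names in `REQUIRED_MEDIA_FIELDS` are hypothetical placeholders, not the names in the Google-defined schema; see About media documents and data stores for the real ones.

```python
# Hypothetical key names for the five minimum media fields.
REQUIRED_MEDIA_FIELDS = ("title", "uri", "categories", "duration", "available_date")


def missing_media_fields(document):
    """Return the required fields that are absent or empty in a media record."""
    return [
        name
        for name in REQUIRED_MEDIA_FIELDS
        if document.get(name) in (None, "", [])
    ]


movie = {
    "title": "An Example Film",
    "uri": "https://example.com/films/an-example-film",
    "categories": ["drama"],
    "available_date": "2024-01-01",
}
print(missing_media_fields(movie))
```

Running a check like this before import can surface incomplete records early.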
For example, you can enable recommendations by creating a media recommendations app for a movie catalog or a news site so that your users receive suitable, personalized suggestions.
In addition to media documents, media data stores also contain the user event information that allows Vertex AI Search to customize recommendations and search for your users. User events are required for media apps. For information about user events, see Record real-time user events.
Unstructured data
An unstructured data store enables semantic search over data such as documents and images.
Unstructured data stores support documents in HTML, PDF with embedded text, and TXT format. PPTX and DOCX formats are available in Preview.
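Before uploading files, it can help to sort them by the formats listed above. This is a small sketch based on file extensions only; `classify_format` is a hypothetical helper, and an extension check cannot confirm that a PDF actually contains embedded text.

```python
from pathlib import PurePosixPath

SUPPORTED_SUFFIXES = {".html", ".pdf", ".txt"}
PREVIEW_SUFFIXES = {".pptx", ".docx"}


def classify_format(object_name):
    """Classify a file or object name as 'supported', 'preview', or 'unsupported'."""
    suffix = PurePosixPath(object_name).suffix.lower()
    if suffix in SUPPORTED_SUFFIXES:
        return "supported"
    if suffix in PREVIEW_SUFFIXES:
        return "preview"
    return "unsupported"


for name in ["research/q1-report.PDF", "slides/outlook.pptx", "images/chart.png"]:
    print(name, classify_format(name))
```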
Search provides results in the form of 10 URLs and summarized answers for natural language queries. Documents must be uploaded to a Cloud Storage bucket with appropriate access permissions. For example, a financial institution can enable search over their private corpus of financial research publications, or a biotech company can enable search or recommendations over their private repository of medical research.
Healthcare FHIR data
A healthcare search app uses FHIR R4 data imported from a Cloud Healthcare API FHIR store. For a list of FHIR R4 resources that Vertex AI Search supports, see Healthcare FHIR R4 data schema reference. A FHIR R4 store must satisfy some requirements before it can be used as a data source for a Vertex AI Search data store. For more information, see how to prepare healthcare FHIR data for ingestion.
About blended search
You can create a blended search app, where multiple data stores can be connected to a single custom search app. This feature lets you use one app to search across multiple sources and types of data.
To make a blended search app, select multiple data stores when creating a new custom search app. If you don't select multiple data stores during creation, then you can't add additional data stores later.
When getting search results, you can either search across all data stores, or filter for results from a single data store.
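The two search modes can be sketched as request bodies. The example below only builds dictionaries and makes no API call. It uses the `query`, `pageSize`, `dataStoreSpecs`, and `dataStore` field names that appear on this page, and it assumes that listing a data store in `dataStoreSpecs` restricts results to it. The `build_search_request` helper and the data store names are hypothetical.

```python
import json


def build_search_request(query, data_stores, only=None, page_size=10):
    """Build a search request body for all connected data stores or just one."""
    if only is not None and only not in data_stores:
        raise ValueError(f"{only!r} is not connected to this app")
    targets = list(data_stores) if only is None else [only]
    return {
        "query": query,
        "pageSize": page_size,
        # One spec per data store that should contribute results.
        "dataStoreSpecs": [{"dataStore": name} for name in targets],
    }


# Hypothetical data store names; real requests use full resource names.
connected = ["faq-store", "events-store"]

print(json.dumps(build_search_request("parking", connected), indent=2))
print(json.dumps(build_search_request("parking", connected, only="events-store"), indent=2))
```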
The following limitations apply:
boostSpec
contentSearchSpec
dataStoreSpecs
facetSpecs
filter
languageCode
offset
oneBoxPageSize
orderBy
query
pageSize
pageToken
relevanceScoreSpec
relevanceThreshold
session
sessionSpec
spellCorrectionSpec
userInfo
userPseudoId

dataStoreSpecs:
  dataStore
  boostSpec: If boost specs are specified for both SearchRequest and dataStoreSpecs, both boost specs are applied to search results.
  filter: If filters are specified for both SearchRequest and dataStoreSpecs, both filters are applied to search results.

boostControlIds
displayName
filterControlIds
genericConfig:
  contentSearchSpec
name
solutionType
synonymsControlIds

boostAction
synonymAction
filterAction
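A request body can be checked against a field list before it is sent. The sketch below assumes that the first list above (boostSpec through userPseudoId) enumerates the SearchRequest fields that a blended search app accepts. This page does not state that explicitly, so treat the allowlist as an assumption; `unsupported_fields` is a hypothetical helper.

```python
# Assumption: these are the SearchRequest fields accepted for blended search,
# copied from the first field list above.
BLENDED_SEARCH_REQUEST_FIELDS = frozenset({
    "boostSpec", "contentSearchSpec", "dataStoreSpecs", "facetSpecs", "filter",
    "languageCode", "offset", "oneBoxPageSize", "orderBy", "query", "pageSize",
    "pageToken", "relevanceScoreSpec", "relevanceThreshold", "session",
    "sessionSpec", "spellCorrectionSpec", "userInfo", "userPseudoId",
})


def unsupported_fields(request_body):
    """Return the top-level request fields that are not in the allowlist."""
    return sorted(set(request_body) - BLENDED_SEARCH_REQUEST_FIELDS)


request_body = {"query": "parking", "pageSize": 5, "someOtherField": True}
print(unsupported_fields(request_body))
```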
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2025-08-07 UTC.