Source: https://docs.databricks.com/en/generative-ai/create-query-vector-search.html

How to create and query a vector search index

This article describes how to create and query a vector search index using Mosaic AI Vector Search.

You can create and manage vector search components, like a vector search endpoint and vector search indices, using the UI, the Python SDK, or the REST API.

Requirements

Permission to create and manage vector search endpoints is configured using access control lists. See Vector search endpoint ACLs.

Installation

To use the vector search SDK, you must install it in your notebook. Use the following code to install the package:

%pip install databricks-vectorsearch
dbutils.library.restartPython()

Then use the following command to import VectorSearchClient:

from databricks.vector_search.client import VectorSearchClient

Authentication

See Data protection and authentication.

Create a vector search endpoint

You can create a vector search endpoint using the Databricks UI, Python SDK, or the API.

Create a vector search endpoint using the UI

Follow these steps to create a vector search endpoint using the UI.

  1. In the left sidebar, click Compute.

  2. Click the Vector Search tab and click Create.

  3. The Create endpoint form opens. Enter a name for this endpoint.

  4. In the Type field, select Standard or Storage Optimized. See Endpoint options.

  5. (Optional) Under Advanced settings, select a budget policy. See Mosaic AI Vector Search: Budget policies.

  6. Click Confirm.

Create a vector search endpoint using the Python SDK

The following example uses the create_endpoint() SDK function to create a vector search endpoint.

Python

client = VectorSearchClient()

client.create_endpoint(
    name="vector_search_endpoint_name",
    endpoint_type="STANDARD"
)

Create a vector search endpoint using the REST API

See the REST API reference documentation: POST /api/2.0/vector-search/endpoints.
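As a rough sketch, the request body for this endpoint mirrors the arguments of create_endpoint(). The following Python snippet only builds that payload; the endpoint name is a placeholder, and actually sending the request requires a workspace URL and token, which are omitted here:

```python
import json

# Payload for POST /api/2.0/vector-search/endpoints,
# mirroring the create_endpoint() arguments above.
payload = {
    "name": "vector_search_endpoint_name",  # placeholder endpoint name
    "endpoint_type": "STANDARD",            # or the storage-optimized type
}
body = json.dumps(payload)
```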

(Optional) Create and configure an endpoint to serve the embedding model

If you choose to have Databricks compute the embeddings, you can use a pre-configured Foundation Model APIs endpoint or create a model serving endpoint to serve the embedding model of your choice. See Pay-per-token Foundation Model APIs or Create foundation model serving endpoints for instructions. For example notebooks, see Notebook examples for calling an embeddings model.

When you configure an embedding endpoint, Databricks recommends that you remove the default selection of Scale to zero. Serving endpoints can take a couple of minutes to warm up, and the initial query on an index with a scaled-down endpoint might time out.

note

The vector search index initialization might time out if the embedding endpoint isn't configured appropriately for the dataset. You should only use CPU endpoints for small datasets and tests. For larger datasets, use a GPU endpoint for optimal performance.

Create a vector search index

You can create a vector search index using the UI, the Python SDK, or the REST API. The UI is the simplest approach.

There are two types of indexes:

    Delta Sync Index: Automatically syncs with a source Delta table, incrementally updating the index as the underlying data changes.

    Direct Vector Access Index: Supports direct read and write of vectors and metadata. You are responsible for updating the index using the Python SDK or the REST API.

note

The column name _id is reserved. If your source table has a column named _id, rename it before creating a vector search index.

Create index using the UI

  1. In the left sidebar, click Catalog to open the Catalog Explorer UI.

  2. Navigate to the Delta table you want to use.

  3. Click the Create button at the upper-right, and select Vector search index from the drop-down menu.

  4. Use the selectors in the dialog to configure the index.

    Name: Name to use for the online table in Unity Catalog. The name requires a three-level namespace, <catalog>.<schema>.<name>. Only alphanumeric characters and underscores are allowed.

    Primary key: Column to use as a primary key.

    Endpoint: Select the vector search endpoint that you want to use.

    Columns to sync: (Supported only for standard endpoints.) Select the columns to sync with the vector index. If you leave this field blank, all columns from the source table are synced with the index. The primary key column and embedding source column or embedding vector column are always synced. For storage-optimized endpoints, all columns from the source table are always synced.

    Embedding source: Indicate if you want Databricks to compute embeddings for a text column in the Delta table (Compute embeddings), or if your Delta table contains precomputed embeddings (Use existing embedding column).

    Sync computed embeddings: Toggle this setting to save the generated embeddings to a Unity Catalog table. For more information, see Save generated embedding table.

    Sync mode: Continuous keeps the index in sync with seconds of latency. However, it has a higher cost associated with it since a compute cluster is provisioned to run the continuous sync streaming pipeline. For standard endpoints, both Continuous and Triggered perform incremental updates, so only data that has changed since the last sync is processed. For storage-optimized endpoints, every sync fully rebuilds the vector search index. See Storage-optimized endpoints limitations.

    With Triggered sync mode, you use the Python SDK or the REST API to start the sync. See Update a Delta Sync Index.

    For storage-optimized endpoints, only Triggered sync mode is supported.

  5. When you have finished configuring the index, click Create.

Create index using the Python SDK

The following example creates a Delta Sync Index with embeddings computed by Databricks. For details, see the Python SDK reference.

Python

client = VectorSearchClient()

index = client.create_delta_sync_index(
    endpoint_name="vector_search_demo_endpoint",
    source_table_name="vector_search_demo.vector_search.en_wiki",
    index_name="vector_search_demo.vector_search.en_wiki_index",
    pipeline_type="TRIGGERED",
    primary_key="id",
    embedding_source_column="text",
    embedding_model_endpoint_name="e5-small-v2"
)

The following example creates a Delta Sync Index with self-managed embeddings, where the Delta table already contains a precomputed embedding vector column.

Python

client = VectorSearchClient()

index = client.create_delta_sync_index(
    endpoint_name="vector_search_demo_endpoint",
    source_table_name="vector_search_demo.vector_search.en_wiki",
    index_name="vector_search_demo.vector_search.en_wiki_index",
    pipeline_type="TRIGGERED",
    primary_key="id",
    embedding_dimension=1024,
    embedding_vector_column="text_vector"
)

By default, all columns from the source table are synced with the index.

On standard endpoints, you can select a subset of columns to sync using columns_to_sync. The primary key and embedding columns are always included in the index.

To sync only the primary key and the embedding column, you must specify them in columns_to_sync as shown:

Python

index = client.create_delta_sync_index(
    ...
    columns_to_sync=["id", "text_vector"]
)

To sync additional columns, specify them as shown. You do not need to include the primary key and the embedding column, as they are always synced.

Python

index = client.create_delta_sync_index(
    ...
    columns_to_sync=["revisionId", "text"]
)

The following example creates a Direct Vector Access Index.

Python

client = VectorSearchClient()

index = client.create_direct_access_index(
    endpoint_name="storage_endpoint",
    index_name=f"{catalog_name}.{schema_name}.{index_name}",
    primary_key="id",
    embedding_dimension=1024,
    embedding_vector_column="text_vector",
    schema={
        "id": "int",
        "field2": "string",
        "field3": "float",
        "text_vector": "array<float>"
    }
)

Create index using the REST API

See the REST API reference documentation: POST /api/2.0/vector-search/indexes.
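As a sketch, the request body for creating a Delta Sync Index mirrors the create_delta_sync_index() arguments shown above. The field names below are assumptions based on that mapping; consult the REST API reference for the authoritative schema. The snippet only builds the payload and does not send a request:

```python
import json

# One plausible shape for the POST /api/2.0/vector-search/indexes body,
# mirroring the create_delta_sync_index() SDK arguments above.
payload = {
    "name": "vector_search_demo.vector_search.en_wiki_index",
    "endpoint_name": "vector_search_demo_endpoint",
    "primary_key": "id",
    "index_type": "DELTA_SYNC",
    "delta_sync_index_spec": {
        "source_table": "vector_search_demo.vector_search.en_wiki",
        "pipeline_type": "TRIGGERED",
        "embedding_source_columns": [
            {"name": "text", "embedding_model_endpoint_name": "e5-small-v2"}
        ],
    },
}
body = json.dumps(payload)
```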

Save generated embedding table

If Databricks generates the embeddings, you can save the generated embeddings to a table in Unity Catalog. This table is created in the same schema as the vector index and is linked from the vector index page.

The name of the table is the name of the vector search index with _writeback_table appended. The name is not editable.

You can access and query the table like any other table in Unity Catalog. However, you should not drop or modify the table, as it is not intended to be manually updated. The table is deleted automatically if the index is deleted.
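For example, the writeback table name can be derived from the index name as follows (using the three-level index name from the examples above):

```python
index_name = "vector_search_demo.vector_search.en_wiki_index"

# The generated-embeddings table lives in the same schema,
# with _writeback_table appended to the index name.
writeback_table = f"{index_name}_writeback_table"
print(writeback_table)
# vector_search_demo.vector_search.en_wiki_index_writeback_table
```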

Update a vector search index

Update a Delta Sync Index

Indexes created with Continuous sync mode automatically update when the source Delta table changes. If you are using Triggered sync mode, you can start the sync using the UI, the Python SDK, or the REST API.

  1. In Catalog Explorer, navigate to the vector search index.

  2. On the Overview tab, in the Data Ingest section, click Sync now.

For details, see the Python SDK reference.

Python

client = VectorSearchClient()
index = client.get_index(index_name="vector_search_demo.vector_search.en_wiki_index")

index.sync()

See the REST API reference documentation: POST /api/2.0/vector-search/indexes/{index_name}/sync.

Update a Direct Vector Access Index

You can use the Python SDK or the REST API to insert, update, or delete data from a Direct Vector Access Index.

For details, see the Python SDK reference.

Python

index.upsert([
    {
        "id": 1,
        "field2": "value2",
        "field3": 3.0,
        "text_vector": [1.0] * 1024
    },
    {
        "id": 2,
        "field2": "value2",
        "field3": 3.0,
        "text_vector": [1.1] * 1024
    }
])
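The SDK can also delete rows from a Direct Vector Access Index by primary key. A minimal sketch (the ids are hypothetical, and the call itself is commented out because it requires a live index handle):

```python
# Primary keys of the rows to remove (hypothetical ids).
primary_keys = [1, 2]

# index.delete(primary_keys=primary_keys)  # requires a live Direct Vector Access Index
```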

See the REST API reference documentation: POST /api/2.0/vector-search/indexes/{index_name}/upsert-data and DELETE /api/2.0/vector-search/indexes/{index_name}/delete-data.

For production applications, Databricks recommends using service principals instead of personal access tokens. In addition to improved security and access management, using service principals can improve performance by up to 100 msec per query.

The following code example illustrates how to update an index using a service principal.

sh

export SP_CLIENT_ID=...
export SP_CLIENT_SECRET=...
export INDEX_NAME=...
export WORKSPACE_URL=https://...
export WORKSPACE_ID=...


export AUTHORIZATION_DETAILS='{"type":"unity_catalog_permission","securable_type":"table","securable_object_name":"'"$INDEX_NAME"'","operation": "WriteVectorIndex"}'

# Generate OAuth token
export TOKEN=$(curl -X POST --url $WORKSPACE_URL/oidc/v1/token -u "$SP_CLIENT_ID:$SP_CLIENT_SECRET" --data 'grant_type=client_credentials' --data 'scope=all-apis' --data-urlencode 'authorization_details=['"$AUTHORIZATION_DETAILS"']' | jq .access_token | tr -d '"')

# Get index URL
export INDEX_URL=$(curl -X GET -H 'Content-Type: application/json' -H "Authorization: Bearer $TOKEN" --url $WORKSPACE_URL/api/2.0/vector-search/indexes/$INDEX_NAME | jq -r '.status.index_url' | tr -d '"')

# Upsert data into vector search index.
curl -X POST -H 'Content-Type: application/json' -H "Authorization: Bearer $TOKEN" --url https://$INDEX_URL/upsert-data --data '{"inputs_json": "[...]"}'

# Delete data from vector search index
curl -X DELETE -H 'Content-Type: application/json' -H "Authorization: Bearer $TOKEN" --url https://$INDEX_URL/delete-data --data '{"primary_keys": [...]}'

The following code example illustrates how to update an index using a personal access token (PAT).

sh

export TOKEN=...
export INDEX_NAME=...
export WORKSPACE_URL=https://...


curl -X POST -H 'Content-Type: application/json' -H "Authorization: Bearer $TOKEN" --url $WORKSPACE_URL/api/2.0/vector-search/indexes/$INDEX_NAME/upsert-data --data '{"inputs_json": "..."}'


curl -X DELETE -H 'Content-Type: application/json' -H "Authorization: Bearer $TOKEN" --url $WORKSPACE_URL/api/2.0/vector-search/indexes/$INDEX_NAME/delete-data --data '{"primary_keys": [...]}'

Query a vector search endpoint

You can query a vector search endpoint only by using the Python SDK, the REST API, or the SQL vector_search() AI function.

note

If the user querying the endpoint is not the owner of the vector search index, the user must have the following UC privileges:

    USE CATALOG on the catalog that contains the vector search index.

    USE SCHEMA on the schema that contains the vector search index.

    SELECT on the vector search index.

The default query type is ann (approximate nearest neighbor). To perform a hybrid keyword-similarity search, set the parameter query_type to hybrid. With hybrid search, all text metadata columns are included, and a maximum of 200 results are returned.

To use the reranker in a query, see Use the reranker in a query.

For details, see the Python SDK reference.

Python


# Query using approximate nearest neighbor (ANN) search, the default query type.
results = index.similarity_search(
    query_text="Greek myths",
    columns=["id", "field2"],
    num_results=2
)

# Query using hybrid keyword-similarity search.
results3 = index.similarity_search(
    query_text="Greek myths",
    columns=["id", "field2"],
    num_results=2,
    query_type="hybrid"
)

# Query using a query vector (for indexes with self-managed embeddings).
results2 = index.similarity_search(
    query_vector=[0.9] * 1024,
    columns=["id", "text"],
    num_results=2
)

For details, see the Python SDK reference.

For storage-optimized vector search indexes, the filter interface has been redesigned to use a SQL-like filter string instead of the filter dictionary used with standard vector search endpoints.

Python

client = VectorSearchClient()
index = client.get_index(index_name="vector_search_demo.vector_search.en_wiki_index")

# Query a storage-optimized index.
results = index.similarity_search(
    query_vector=[0.2, 0.33, 0.19, 0.52],
    columns=["id", "text"],
    num_results=2
)

# Query with a SQL-like filter string.
results = index.similarity_search(
    query_vector=[0.2, 0.33, 0.19, 0.52],
    columns=["id", "text"],
    filters="language = 'en' AND country = 'us'",
    num_results=2
)

See the REST API reference documentation: POST /api/2.0/vector-search/indexes/{index_name}/query.

For production applications, Databricks recommends using service principals instead of personal access tokens. In addition to improved security and access management, using service principals can improve performance by up to 100 msec per query.

The following code example illustrates how to query an index using a service principal.

sh

export SP_CLIENT_ID=...
export SP_CLIENT_SECRET=...
export INDEX_NAME=...
export WORKSPACE_URL=https://...
export WORKSPACE_ID=...


export AUTHORIZATION_DETAILS='{"type":"unity_catalog_permission","securable_type":"table","securable_object_name":"'"$INDEX_NAME"'","operation": "ReadVectorIndex"}'
# If you are using a route-optimized embedding model endpoint, you need additional authorization details to invoke the serving endpoint
# export EMBEDDING_MODEL_SERVING_ENDPOINT_ID=...
# export AUTHORIZATION_DETAILS="$AUTHORIZATION_DETAILS"',{"type":"workspace_permission","object_type":"serving-endpoints","object_path":"/serving-endpoints/'"$EMBEDDING_MODEL_SERVING_ENDPOINT_ID"'","actions": ["query_inference_endpoint"]}'

# Generate OAuth token
export TOKEN=$(curl -X POST --url $WORKSPACE_URL/oidc/v1/token -u "$SP_CLIENT_ID:$SP_CLIENT_SECRET" --data 'grant_type=client_credentials' --data 'scope=all-apis' --data-urlencode 'authorization_details=['"$AUTHORIZATION_DETAILS"']' | jq .access_token | tr -d '"')

# Get index URL
export INDEX_URL=$(curl -X GET -H 'Content-Type: application/json' -H "Authorization: Bearer $TOKEN" --url $WORKSPACE_URL/api/2.0/vector-search/indexes/$INDEX_NAME | jq -r '.status.index_url' | tr -d '"')

# Query vector search index with a query vector.
curl -X GET -H 'Content-Type: application/json' -H "Authorization: Bearer $TOKEN" --url https://$INDEX_URL/query --data '{"num_results": 3, "query_vector": [...], "columns": [...], "debug_level": 1}'

# Query vector search index with query text.
curl -X GET -H 'Content-Type: application/json' -H "Authorization: Bearer $TOKEN" --url https://$INDEX_URL/query --data '{"num_results": 3, "query_text": "...", "columns": [...], "debug_level": 1}'

The following code example illustrates how to query an index using a personal access token (PAT).

sh

export TOKEN=...
export INDEX_NAME=...
export WORKSPACE_URL=https://...


curl -X GET -H 'Content-Type: application/json' -H "Authorization: Bearer $TOKEN" --url $WORKSPACE_URL/api/2.0/vector-search/indexes/$INDEX_NAME/query --data '{"num_results": 3, "query_vector": [...], "columns": [...], "debug_level": 1}'


curl -X GET -H 'Content-Type: application/json' -H "Authorization: Bearer $TOKEN" --url $WORKSPACE_URL/api/2.0/vector-search/indexes/$INDEX_NAME/query --data '{"num_results": 3, "query_text": "...", "columns": [...], "debug_level": 1}'

Use filters on queries

A query can define filters based on any column in the Delta table. similarity_search returns only rows that match the specified filters.

Supported filter operators include NOT, <, <=, >, >=, OR, and LIKE; a filter key with no operator matches the specified value (or list of values) exactly.

See the following code examples:

Python


# Match rows where title is "Ares" or "Athena".
results = index.similarity_search(
    query_text="Greek myths",
    columns=["id", "text"],
    filters={"title": ["Ares", "Athena"]},
    num_results=2
)

# Match rows where title or id is "Ares" or "Athena".
results = index.similarity_search(
    query_text="Greek myths",
    columns=["id", "text"],
    filters={"title OR id": ["Ares", "Athena"]},
    num_results=2
)

# Match rows where title is not "Hercules".
results = index.similarity_search(
    query_text="Greek myths",
    columns=["id", "text"],
    filters={"title NOT": "Hercules"},
    num_results=2
)

Python


# Match rows where title is "Ares" or "Athena".
results = index.similarity_search(
    query_text="Greek myths",
    columns=["id", "text"],
    filters='title IN ("Ares", "Athena")',
    num_results=2
)

# Match rows where title is "Ares" or id is "Athena".
results = index.similarity_search(
    query_text="Greek myths",
    columns=["id", "text"],
    filters='title = "Ares" OR id = "Athena"',
    num_results=2
)

# Match rows where title is not "Hercules".
results = index.similarity_search(
    query_text="Greek myths",
    columns=["id", "text"],
    filters='title != "Hercules"',
    num_results=2
)

See POST /api/2.0/vector-search/indexes/{index_name}/query.

LIKE examples

{"column LIKE": "apple"}: matches the strings "apple" and "apple pear" but does not match "pineapple" or "pear". Note that it does not match "pineapple" even though "pineapple" contains the substring "apple": LIKE looks for an exact match over whitespace-separated tokens, as in "apple pear".

{"column NOT LIKE": "apple"} does the opposite. It matches "pineapple" and "pear" but does not match "apple" or "apple pear".

Use the reranker in a query

The examples in this section show how to use the vector search reranker. When you use the reranker, you set the columns to return (columns) and the columns to use for reranking (columns_to_rerank) separately. num_results is the final number of results to return. This does not affect the number of results used for reranking.

The query debug message includes information about how long the reranking step took. For example:

Bash

'debug_info': {'response_time': 1647.0, 'ann_time': 29.0, 'reranker_time': 1573.0}

If the reranker call fails, that information is included in the debug message:

Bash

'debug_info': {'response_time': 587.0, 'ann_time': 331.0, 'reranker_time': 246.0, 'warnings': [{'status_code': 'RERANKER_TEMPORARILY_UNAVAILABLE', 'message': 'The reranker is temporarily unavailable. Results returned have not been processed by the reranker. Please try again later for reranked results.'}]}
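One way to check whether results were actually reranked is to look for reranker warnings in the debug payload. A minimal sketch over a dictionary shaped like the failure example above:

```python
# debug_info shaped like the reranker-failure example above.
debug_info = {
    "response_time": 587.0,
    "ann_time": 331.0,
    "reranker_time": 246.0,
    "warnings": [{
        "status_code": "RERANKER_TEMPORARILY_UNAVAILABLE",
        "message": "The reranker is temporarily unavailable.",
    }],
}

# Results were reranked only if no reranker warning is present.
reranked = not any(
    w.get("status_code", "").startswith("RERANKER")
    for w in debug_info.get("warnings", [])
)
print(reranked)  # False
```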

To use the reranker, make sure you have the latest version of the Python SDK installed:

Python

%pip install databricks-vectorsearch --force-reinstall
dbutils.library.restartPython()

Python

from databricks.vector_search.reranker import DatabricksReranker

results = index.similarity_search(
    query_text="How to create a Vector Search index",
    columns=["id", "text", "parent_doc_summary", "date"],
    num_results=10,
    query_type="hybrid",
    reranker=DatabricksReranker(columns_to_rerank=["text", "parent_doc_summary", "other_column"])
)

To ensure that you get latency information, set debug_level to at least 1.

Bash

export TOKEN=...
export INDEX_NAME=...
export WORKSPACE_URL=https://...

curl -X GET -H 'Content-Type: application/json' -H "Authorization: Bearer $TOKEN" --url $WORKSPACE_URL/api/2.0/vector-search/indexes/$INDEX_NAME/query --data '{"num_results": 10, "query_text": "How to create a Vector Search index", "columns": ["id", "text", "parent_doc_summary", "date"], "reranker": {"model": "databricks_reranker", "parameters": {"columns_to_rerank": ["text", "parent_doc_summary"]}}, "debug_level": 1}'

Example notebooks

The examples in this section demonstrate usage of the vector search Python SDK. For reference information, see the Python SDK reference.

LangChain examples

See How to use LangChain with Mosaic AI Vector Search for information about using Mosaic AI Vector Search as an integration with LangChain packages.

The following notebook shows how to convert your similarity search results to LangChain documents.

Vector search with the Python SDK notebook

Notebook examples for calling an embeddings model

The following notebooks demonstrate how to configure a Mosaic AI Model Serving endpoint for embeddings generation.

Call an OpenAI embeddings model using Mosaic AI Model Serving notebook

Call a GTE embeddings model using Mosaic AI Model Serving notebook

Register and serve an OSS embedding model notebook
