A RetroSearch Logo

Home - News ( United States | United Kingdom | Italy | Germany ) - Football scores

Search Query:

Showing content from http://cloud.google.com/sql/docs/mysql/search-filter-vector-embeddings below:

Search and filter with vector embeddings | Cloud SQL for MySQL

MySQL   |  PostgreSQL   |  SQL Server To use the GA version of vector search, the instance maintenance version should be version MYSQL_8_0_version.R20241208.01_00 or newer, where version is the minor version number. For information about how to upgrade your instance to a newer version that supports GA vector embeddings, see Self-service maintenance.

For information about the syntax and behavior of vector embeddings for the Public Preview version of this feature, see Work with vector embeddings (Preview).

This page describes the different ways that you can query vector embeddings. For an overview of ANN and KNN similarity searches, see vector searches.

Search approximate nearest neighbors (ANN)

To perform an ANN search, use the approx_distance function in a SELECT and ORDER BY clause. You must use a LIMIT clause in an ANN search. You can also get the distance value by putting approx_distance in a SELECT list.

Use the following syntax for ANN queries:

# Ordering by distance
SELECT title
FROM books
ORDER BY approx_distance(embedding, string_to_vector('[1,2,3]'), 'distance_measure=l2_squared')
LIMIT 4;

# Selecting the distance value
SELECT
  approx_distance(
    embedding_name,
    string_to_vector('[1,2,3]'),
    'distance_measure=cosine,num_leaves_to_search=3')
    dist
FROM table
ORDER BY dist
LIMIT limit_value;

The approx_distance function uses the following options:

The following example shows how to use approx_distance to find the top K closest rows using the l2_squared distance measure and order the results by distance.

# Ordering by distance
SELECT title
FROM books
ORDER BY approx_distance(embedding, string_to_vector('[1,2,3]'),
                         'distance_measure=l2_squared')
LIMIT 4;

# Selecting the distance value
SELECT
    approx_distance
        (embedding, string_to_vector('[1,2,3]'),
         'distance_measure=l2_squared') dist
FROM table
ORDER BY dist
LIMIT 4;
Filter results from approx_distance queries

You can use the approx_distance function with WHERE conditions that filter query results with a non-vector predicate to perform post filtering. The approx_distance function is evaluated before applying the filter which means that the number of results returned are nondeterministic.

For example, for the following query:

SELECT id FROM products WHERE price < 100
ORDER BY approx(embedding, @query_vector,'distance_measure=cosine')
LIMIT 11;

The approx_distance function returns the 11 nearest neighbors to the query vector regardless of price. In post filtering, the products with a price < 100 are selected. It's possible that all of the nearest neighbors have a price < 100, so there are 11 results to the query. Alternatively, if none of the nearest neighbors have a price < 100, there are 0 rows returned.

If you anticipate that your filter in the WHERE condition is very selective, an exact search (KNN) might be a better option to ensure that a sufficient number of rows are returned.

Check the fallback status on ANN searches

There are certain cases where an ANN search falls back to a KNN search. These include the following:

All of these cases push a warning to the client indicating that exact search was performed and the reason why.

Use the following command in the mysql client to view the fallback status:

SHOW global status LIKE '%cloudsql_vector_knn_fallback%';

If you want to use ANN and it's falling back to KNN, the query might run slower. You should find the reason it's falling back and assess whether to make changes so that ANN is used instead.

Example: Create a vector index and run an ANN query

The following example walkthrough provides steps to create a vector index and run an ANN query in Cloud SQL.

  1. Generate vector embeddings. You can create vector embeddings manually or use a text embedding API of your choice. For an example that uses Vertex AI, see Generate vector embeddings based on row data.
  2. Create a table in Cloud SQL that contains a vector embedding column with three dimensions.

    CREATE TABLE books(
    id INTEGER PRIMARY KEY AUTO_INCREMENT, title VARCHAR(60), embedding VECTOR(3) USING VARBINARY);
    
  3. Insert a vector embedding into the column.

    INSERT INTO books VALUES ((1, 'book title', string_to_vector('[1,2,3]')));
    
  4. Commit the changes.

    commit;
    
  5. Create the vector index using the L2_squared function to measure distance.

    CREATE
      VECTOR INDEX vectorIndex
    ON dbname.books(embeddings)
    USING SCANN QUANTIZER = SQ8 DISTANCE_MEASURE = l2_squared;
    
  6. Use the following syntax to perform an ANN search with a LIMIT of 4 search results:

    SELECT title
    FROM books
    ORDER BY approx_distance(embedding, string_to_vector('[1,2,3]'), 'distance_measure=l2_squared')
    LIMIT 4;
    
    SELECT approx_distance(embedding, string_to_vector('[1,2,3]'), 'distance_measure=cosine') dist
    FROM books
    ORDER BY dist
    LIMIT 4;
    
Search K-nearest neighbors (KNN)

To perform a K-nearest neighbor search, use the vector_distance function with a distance measure option and a vector conversion function (string_to_vector or vector_to_string) in a SELECT statement. Use the following syntax:

SELECT vector_distance(string_to_vector('[1,2,3]'),
                      string_to_vector('[1,2,3]'),
                      'Distance_Measure=dot_product');

Replace the values [1,2,3] with the embedding values of your data.

The following example shows how to use this query with the cosine_distance function and the string_to_vector vector conversion function.

SELECT id,cosine_distance(embedding, string_to_vector('[1,2,3]')) dist
FROM books
ORDER BY distance
LIMIT 10;
Get the Cosine distance in a KNN query

Use the Cloud SQL cosine_distance function to calculate the distance using cosine.

SELECT cosine_distance(embedding, string_to_vector('[3,1,2]')) AS distance FROM books WHERE id = 10;
Get the Dot Product distance in a KNN query

Use the Cloud SQL dot_product function to calculate the distance using the dot product.

SELECT dot_product(embedding, string_to_vector('[3,1,2]')) AS distance FROM books WHERE id = 10;
Get the L2-squared distance in a KNN query

Use the Cloud SQL l2_squared_distance function to calculate the distance using L2 squared.

SELECT
  l2_squared_distance(embedding, string_to_vector('[3,1,2]'))
    AS distance
FROM books
WHERE id = 10;
What's next

RetroSearch is an open source project built by @garambo | Open a GitHub Issue

Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo

HTML: 3.2 | Encoding: UTF-8 | Version: 0.7.4