Beta
This feature is in Beta and is available in the following regions: us-east-1, us-west-2, eu-west-1, ap-southeast-1, ap-southeast-2, eu-central-1, us-east-2, and ap-south-1.
Databricks Online Feature Stores are a high-performance, scalable solution for serving feature data to online applications and real-time machine learning models. Powered by Databricks Lakebase, they provide low-latency access to feature data at high scale while maintaining consistency with your offline feature tables.
The primary use cases for Online Feature Stores include serving features to real-time machine learning models and other online applications.
Databricks Online Feature Stores require Databricks Runtime 16.4 LTS ML or above. You can also use serverless compute.
To use Databricks Online Feature Stores, you must first install the databricks-feature-engineering package. Run the following commands at the start of each notebook session:
Python
%pip install --pre "databricks-feature-engineering>=0.13.0a4"
dbutils.library.restartPython()
Create an online store
To create a new online feature store:
Python
from databricks.feature_engineering import FeatureEngineeringClient
fe = FeatureEngineeringClient()
fe.create_online_store(
    name="my-online-store",
    capacity="CU_2"
)
The capacity options correspond to different performance tiers: "CU_1", "CU_2", "CU_4", and "CU_8". Each capacity unit allocates approximately 16 GB of RAM to the database instance, along with the associated CPU and local SSD resources. Scaling up increases these resources linearly. For more details, see Manage instance capacity.
Manage online stores
The following code shows how to retrieve and update online stores:
Python
store = fe.get_online_store(name="my-online-store")
if store:
    print(f"Store: {store.name}, State: {store.state}, Capacity: {store.capacity}")

updated_store = fe.update_online_store(
    name="my-online-store",
    capacity="CU_4"
)
Add read replicas to an online store
When creating or updating an online store, you can add read replicas by specifying the read_replica_count parameter, as shown in the sketch below. Read traffic is automatically distributed across read replicas, reducing latency and improving performance and scalability for high-concurrency workloads.
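For example, a minimal sketch that adds a read replica to an existing store, assuming read_replica_count is passed as a keyword argument to update_online_store as described above:
Python
# Sketch: add one read replica to an existing online store.
# read_replica_count is the parameter described above; adjust the count as needed.
updated_store = fe.update_online_store(
    name="my-online-store",
    read_replica_count=1
)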
After your online store is in the AVAILABLE state, you can publish feature tables to make them available for low-latency access.
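A minimal sketch that waits for the store to become available, using the state attribute shown in the earlier example (the exact state representation is an assumption here):
Python
import time

# Sketch: poll until the online store reaches the AVAILABLE state.
# state may be a string or an enum, so the comparison below is kept loose on purpose.
store = fe.get_online_store(name="my-online-store")
while "AVAILABLE" not in str(store.state):
    time.sleep(30)
    store = fe.get_online_store(name="my-online-store")
print(f"Online store {store.name} is available")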
note
Change data feed must be enabled on the table before it can be published to an online store.
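If change data feed is not yet enabled on the source table, a minimal sketch to enable it (the three-level table name is a placeholder, and spark is the notebook's SparkSession):
Python
# Sketch: enable change data feed on the offline feature table before publishing.
spark.sql("""
    ALTER TABLE catalog_name.schema_name.feature_table_name
    SET TBLPROPERTIES (delta.enableChangeDataFeed = true)
""")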
To publish a feature table to an online store:
Python
from databricks.ml_features.entities.online_store import DatabricksOnlineStore
online_store = fe.get_online_store(name="my-online-store")
fe.publish_table(
    online_store=online_store,
    source_table_name="catalog_name.schema_name.feature_table_name",
    online_table_name="catalog_name.schema_name.online_feature_table_name"
)
The publish_table operation creates an online table in the online store and syncs data from the offline feature table into it.
If publish_table is called with streaming=True, the online table is set up with a streaming pipeline that continuously updates the online store as new data arrives in the offline feature table.
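A minimal sketch of a streaming publish, assuming streaming is passed as a keyword argument to publish_table as described above (table names are placeholders):
Python
# Sketch: publish with a streaming pipeline so the online table stays continuously in sync.
fe.publish_table(
    online_store=online_store,
    source_table_name="catalog_name.schema_name.feature_table_name",
    online_table_name="catalog_name.schema_name.online_feature_table_name",
    streaming=True
)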
To periodically update features in an online table, create a scheduled Lakeflow Job that runs publish_table. The job automatically refreshes the table and incrementally updates the online features. See Lakeflow Jobs.
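A minimal sketch of the notebook body such a scheduled job could run (store and table names are placeholders reused from the examples above):
Python
# Sketch: notebook body for a scheduled Lakeflow Job that refreshes the online table.
# Each run re-publishes the table, which incrementally updates the online features.
from databricks.feature_engineering import FeatureEngineeringClient

fe = FeatureEngineeringClient()
online_store = fe.get_online_store(name="my-online-store")
fe.publish_table(
    online_store=online_store,
    source_table_name="catalog_name.schema_name.feature_table_name",
    online_table_name="catalog_name.schema_name.online_feature_table_name"
)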
After the published table's status shows as "AVAILABLE", you can explore and query the feature data in several ways:
Unity Catalog UI: Navigate to the online table in Unity Catalog to view sample data and explore the schema directly in the UI. This provides a convenient way to inspect your feature data and verify that the publishing process completed successfully.
SQL Editor: For more advanced querying and data exploration, you can use the SQL editor to run PostgreSQL queries against your online feature tables. This allows you to perform complex queries, joins, and analysis on your feature data. For detailed instructions on using the SQL editor with online stores, see Access a database instance from the SQL editor.
Use online features in real-time applications
To serve features to real-time applications and services, create a feature serving endpoint. See Feature Serving endpoints.
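As one possible starting point, a minimal sketch that defines a feature spec an endpoint can use to look up features by key (the spec name, lookup key, and table name are placeholders; see Feature Serving endpoints for the full workflow):
Python
# Sketch: define a feature spec that a Feature Serving endpoint can look up at request time.
from databricks.feature_engineering import FeatureEngineeringClient, FeatureLookup

fe = FeatureEngineeringClient()
fe.create_feature_spec(
    name="catalog_name.schema_name.my_feature_spec",
    features=[
        FeatureLookup(
            table_name="catalog_name.schema_name.feature_table_name",
            lookup_key="customer_id"
        )
    ]
)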
Models that are trained using features from Databricks automatically track lineage to the features they were trained on. When deployed as endpoints, these models use Unity Catalog to find appropriate features in online stores. For details, see Use features in online workflows.
Delete an online store
To delete an online store:
Python
fe.delete_online_store(name="my-online-store")
note
Deleting an online published table can lead to unexpected failures in downstream dependencies. Before you delete a table, you should ensure that its online features are no longer used by model serving or feature serving endpoints.
Limitations
publish_table does not support the following parameters: filter_condition, checkpoint_location, mode, trigger, and features.
The following notebook shows an example of how to set up and access a Databricks Online Feature Store using Databricks Lakebase.
Online feature store with Lakebase notebook