This page covers feature engineering and serving capabilities for workspaces that are enabled for Unity Catalog. If your workspace is not enabled for Unity Catalog, see Workspace Feature Store (legacy).
Why use Databricks as your feature store?
With the Databricks Data Intelligence Platform, the entire model training workflow takes place on a single platform:
In addition, the platform provides the following:
If your workspace does not meet these requirements, see Workspace Feature Store (legacy) for how to use the legacy Workspace Feature Store.
How does feature engineering on Databricks work?
The typical machine learning workflow using feature engineering on Databricks follows this path:
You can now use the model to make predictions on new data. For batch use cases, the model automatically retrieves the features it needs from Feature Store.
For real-time serving use cases, publish the features to an online feature store.
At inference time, the model reads pre-computed features from the online store and joins them with the data provided in the client request to the model serving endpoint.
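The inference-time join described above can be pictured with a minimal, purely conceptual sketch. It does not use any Databricks API: the in-memory dictionary `ONLINE_STORE`, the helper `build_model_input`, and the column and feature names are all hypothetical stand-ins chosen for illustration.

```python
from typing import Any, Dict

# Hypothetical in-memory stand-in for an online feature store:
# primary key -> pre-computed feature values.
ONLINE_STORE: Dict[int, Dict[str, Any]] = {
    101: {"avg_trip_distance_7d": 3.2, "trip_count_7d": 14},
    102: {"avg_trip_distance_7d": 1.1, "trip_count_7d": 3},
}


def build_model_input(request: Dict[str, Any], key_column: str = "customer_id") -> Dict[str, Any]:
    """Join pre-computed features with the fields sent in the client request."""
    key = request[key_column]
    features = ONLINE_STORE.get(key)
    if features is None:
        raise KeyError(f"no pre-computed features for {key_column}={key!r}")
    # Assumption for this sketch: request fields win if a name collides
    # with a stored feature.
    return {**features, **request}


# The client sends only the lookup key and the values known at request time.
model_input = build_model_input({"customer_id": 101, "pickup_hour": 18})
print(model_input)
```

The point of the sketch is that the client request carries only the lookup key and real-time values, while the remaining model inputs come from the store.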
Start using feature engineering: example notebooks
To get started, try these example notebooks. The basic notebook shows how to create a feature table, use it to train a model, and run batch scoring using automatic feature lookup. It also shows the Feature Engineering UI, which you can use to search for features and understand how features are created and used.
Basic Feature Engineering in Unity Catalog example notebook
The taxi example notebook illustrates the process of creating features, updating them, and using them for model training and batch inference.
Feature Engineering in Unity Catalog taxi example notebook
Supported data types
Feature engineering in Unity Catalog and legacy Workspace Feature Store support the following PySpark data types:
IntegerType
FloatType
BooleanType
StringType
DoubleType
LongType
TimestampType
DateType
ShortType
ArrayType
BinaryType [1]
DecimalType [1]
MapType [1]
StructType [2]

[1] BinaryType, DecimalType, and MapType are supported in all versions of Feature Engineering in Unity Catalog and in Workspace Feature Store v0.3.5 or above.
[2] StructType is supported in Feature Engineering v0.6.0 or above.
The data types listed above support feature types that are common in machine learning applications. For example:
ArrayType
MapType
StringType
When published to online stores, ArrayType and MapType features are stored in JSON format.
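As a rough illustration of what JSON storage means for array-valued and map-valued features, the sketch below serializes one of each with Python's standard `json` module. The feature values and variable names are made up, and the exact serialization an online store applies is not specified here, so treat this as an approximation of the idea only.

```python
import json

# Hypothetical feature values: an array-valued and a map-valued feature.
embedding = [0.12, -0.5, 0.33]
sparse_counts = {"token_17": 2, "token_904": 1}

# Serialize each value to a JSON string, as a store might before writing it.
embedding_json = json.dumps(embedding)
sparse_json = json.dumps(sparse_counts)

print(embedding_json)  # [0.12, -0.5, 0.33]
print(sparse_json)     # {"token_17": 2, "token_904": 1}

# Reading the value back restores the original Python structures.
restored_embedding = json.loads(embedding_json)
restored_counts = json.loads(sparse_json)
```

One thing to keep in mind with this kind of encoding: JSON object keys are always strings, so a map with non-string keys does not round-trip unchanged through `json.dumps` and `json.loads`.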
The Feature Store UI displays metadata on feature data types:
More information
For more information on best practices, download The Comprehensive Guide to Feature Stores.