This article explains how to create a compute resource assigned to a group using the Dedicated access mode.
Dedicated group access mode allows users to get the operational efficiency of a standard access mode cluster while also securely supporting languages and workloads that are not supported by standard access mode, such as Databricks Runtime for ML, RDD APIs, and R.
Requirements
To use the dedicated group access mode:
- The assigned group must have CAN MANAGE permissions on a workspace folder where they can keep notebooks, ML experiments, and other workspace artifacts used by the group cluster.
Dedicated access mode is the latest version of single-user access mode. With dedicated access, a compute resource can be assigned to a single user or group, so that only the assigned user or the members of the assigned group can use the compute resource.
When a user is connected to a compute resource dedicated to a group (a group cluster), the user's permissions are automatically scoped down to the group's permissions, allowing the user to securely share the resource with the other members of the group.
Create a compute resource dedicated to a group
Because user permissions are scoped down to the group when using group clusters, Databricks recommends creating a /Workspace/Groups/<groupName> folder for each group you plan to use with a group cluster. Then, assign CAN MANAGE permissions on the folder to the group. This allows groups to avoid permission errors. All of the group's notebooks and workspace assets should be managed in the group folder.
You must also modify the following workloads to run on group clusters:
- MLflow: Run mlflow.set_tracking_uri("/Workspace/Groups/<groupName>").
- AutoML: Set the experiment_dir parameter to "/Workspace/Groups/<groupName>" for your AutoML runs.
- dbutils.notebook.run: Ensure the group has READ permission on the notebook being executed.
All commands, queries, and other actions performed on a group cluster use the permissions assigned to the group, not the individual user.
Individual user permissions cannot be enforced because all group members have full access to the Spark APIs and shared compute environment. If user-based permissions were applied, one member could query restricted data, and another member without access could still retrieve the results through the shared environment. Therefore, the group itself, not the user who is a member of the group, must have the necessary permissions to successfully perform the action.
For example, the group needs explicit permission to query a table, access a secret scope or secret, use a Unity Catalog connection credential, access a Git folder, or create a workspace object.
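The rule described above can be pictured with a toy model. The sketch below is not how Unity Catalog evaluates privileges (it ignores inheritance, ownership, and everything else); it only illustrates that on a group cluster the group, not the user, is the principal checked against grants. All names, users, and grants are made up for illustration.

```python
from typing import Optional, Set, Tuple

# A grant is (principal, privilege, object); a principal is a user or a group.
Grant = Tuple[str, str, str]


def effective_principal(user: str, cluster_group: Optional[str]) -> str:
    """On a group cluster the group authorizes the action; otherwise the user does."""
    return cluster_group if cluster_group is not None else user


def is_allowed(grants: Set[Grant], user: str, cluster_group: Optional[str],
               privilege: str, obj: str) -> bool:
    """Check the grant table against the effective principal only."""
    principal = effective_principal(user, cluster_group)
    return (principal, privilege, obj) in grants


# Hypothetical starting point: the user can read the table, the group cannot.
grants = {("alice@example.com", "SELECT", "main.sales.orders")}
table = "main.sales.orders"

# Compute dedicated to the user herself (no group): her own grant applies.
print(is_allowed(grants, "alice@example.com", None, "SELECT", table))        # True
# Group cluster: her personal grant is not consulted, so the query is denied.
print(is_allowed(grants, "alice@example.com", "analysts", "SELECT", table))  # False

# Granting the privilege to the group itself makes the action succeed.
grants.add(("analysts", "SELECT", table))
print(is_allowed(grants, "alice@example.com", "analysts", "SELECT", table))  # True
```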
Example group permissions
When you create a data object using the group cluster, the group is assigned as the object's owner.
For example, if you have a notebook attached to a group cluster and run the following command:
SQL
use catalog main;
create schema group_cluster_group_schema;
Then run this query to check the owner of the schema:
SQL
describe schema group_cluster_group_schema;
Auditing group dedicated compute activity
Two key identities are involved when a group cluster runs a workload: the user who runs the workload on the group cluster, and the group whose permissions are used to perform the workload's actions.
The audit log system table records these identities under the following parameters:
- identity_metadata.run_by: The authenticating user who performs the action.
- identity_metadata.run_as: The authorizing group whose permissions are used for the action.
The following example query pulls up the identity metadata for an action taken with the group cluster:
SQL
select action_name, event_time, user_identity.email, identity_metadata
from system.access.audit
where user_identity.email = "uc-group-cluster-group" AND service_name = "unityCatalog"
order by event_time desc limit 100;
For more example queries, see Audit log system table reference.
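Once audit rows are in hand, the two identities can be summarized per group. The sketch below assumes each row has already been converted to a plain dictionary with an action_name key and a nested identity_metadata dictionary holding run_by and run_as; the function name and the sample rows are invented for illustration and are not real audit data.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List, Mapping


def actions_by_identity(
    rows: Iterable[Mapping[str, Any]]
) -> Dict[str, Dict[str, List[str]]]:
    """Map each authorizing group (run_as) to its users (run_by) and their actions."""
    summary: Dict[str, Dict[str, List[str]]] = defaultdict(lambda: defaultdict(list))
    for row in rows:
        meta = row.get("identity_metadata") or {}
        run_as, run_by = meta.get("run_as"), meta.get("run_by")
        if not run_as or not run_by:
            continue  # skip rows that do not record both identities
        summary[run_as][run_by].append(row["action_name"])
    # Convert nested defaultdicts to plain dicts for easier printing and comparison.
    return {group: dict(users) for group, users in summary.items()}


# Made-up rows shaped like the columns selected in the query above.
sample_rows = [
    {"action_name": "getTable",
     "identity_metadata": {"run_by": "alice@example.com", "run_as": "uc-group-cluster-group"}},
    {"action_name": "createSchema",
     "identity_metadata": {"run_by": "bob@example.com", "run_as": "uc-group-cluster-group"}},
    {"action_name": "getTable", "identity_metadata": None},
    {"action_name": "getSchema",
     "identity_metadata": {"run_by": "alice@example.com", "run_as": "uc-group-cluster-group"}},
]

print(actions_by_identity(sample_rows))
```

In a notebook, the rows might come from the query result, for example by calling asDict(recursive=True) on each PySpark Row; that wiring is left out so the sketch runs anywhere.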
Known issues
Workspace files and folders created from group clusters are assigned the object owner Unknown. Subsequent operations on those objects, such as read, write, and delete, fail with permission-denied errors.
Dedicated group access has the following limitations:
- [...] run_as parameter only supports a single user or service principal.
- [...] identity_metadata.run_as (the authorizing group) or identity_metadata.run_by (the authenticating user) for workloads that run on a group cluster.
- [...] identity_metadata.run_as (the authorizing group) or identity_metadata.run_by (the authenticating user) for workloads that run on a group cluster. You must use the system.access.audit table to view the identity metadata.
- The %run command and other actions executed in the notebook context always use the user's permissions rather than the group's permissions. This is because these actions are handled by the notebook environment, not the cluster's environment. Alternative commands such as dbutils.notebook.run() run on the cluster and therefore use the group's permissions.
- The is_member(<group>) function returns false when invoked on a group cluster because the group is not a member of itself. To correctly check membership across both group clusters and other access modes, use is_member(<group>) OR current_user() == <group>.
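The membership workaround above can be wrapped in a small helper so the same predicate is used everywhere. This is only a sketch: membership_predicate is a hypothetical name, the group name is made up, and the helper simply rejects names containing quotes or backslashes rather than attempting to escape them.

```python
def membership_predicate(group: str) -> str:
    """Build the SQL predicate recommended above for a given group name.

    The predicate is true either when the caller is a member of the group
    (other access modes) or when the current identity is the group itself
    (group clusters).
    """
    if not group or any(ch in group for ch in ("'", '"', "\\")):
        raise ValueError(f"unsupported group name: {group!r}")
    literal = f"'{group}'"
    return f"is_member({literal}) OR current_user() == {literal}"


# Example with a made-up group name.
print(membership_predicate("analysts"))
```

The resulting string could be placed in a WHERE clause or a view definition that needs to behave the same on group clusters and on other access modes.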