This page provides an overview of hierarchical namespace, including its key features, common use cases, benefits, and limitations to consider.
Overview
Hierarchical namespace is a capability offered by Cloud Storage that lets you organize objects into folders and store your data in a logical file system structure. Storing your data in a file system structure enhances performance, ensures consistency, and simplifies the management of data-intensive and file-oriented workloads.
Folder operations provide reliability and management capabilities, including creating, deleting, listing, and renaming folders. The hierarchical organization of objects simplifies data organization and streamlines data management tasks. A folder in a bucket with hierarchical namespace enabled can contain objects, other folders, or a combination of both.
To use folders in a bucket, you must enable hierarchical namespace when you create the bucket. Your bucket's hierarchical namespace setting can't be changed after the bucket is created. For information about enabling hierarchical namespace for your bucket, see Create and manage buckets with hierarchical namespace enabled.
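As a rough illustration, the following Python sketch assembles a bucket-creation request body in the shape used by the Cloud Storage JSON API, with hierarchical namespace turned on together with uniform bucket-level access, which the Limitations section lists as a requirement. The helper name, bucket name, and location are placeholders invented for this example, and the sketch sends no request.

```python
import json


def build_hns_bucket_body(name: str, location: str = "US") -> dict:
    """Assemble a bucket resource with hierarchical namespace enabled.

    Hypothetical helper: it only builds the request body and sends nothing.
    """
    return {
        "name": name,
        "location": location,
        # Must be set when the bucket is created; it can't be changed later.
        "hierarchicalNamespace": {"enabled": True},
        # Uniform bucket-level access is required for hierarchical namespace.
        "iamConfiguration": {"uniformBucketLevelAccess": {"enabled": True}},
    }


body = build_hns_bucket_body("my-hns-bucket")  # placeholder bucket name
print(json.dumps(body, indent=2))
```

A body like this would be the payload of a bucket creation request. Client libraries and command-line tools typically expose the same two settings through their own options.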
The following diagram shows an example of a bucket with hierarchical namespace enabled where objects are organized in a hierarchical structure of folders.
Figure 1. Bucket hierarchy with folders and objects.
Key features
Hierarchical namespace provides the following features:
Higher initial queries per second (QPS): Buckets with hierarchical namespace enabled offer up to 8 times higher initial QPS limits for reading and writing objects compared to buckets without hierarchical namespace enabled. The higher initial QPS makes it easier to scale data-intensive workloads and provides enhanced throughput. For information about performance optimization methods while using folders in buckets with hierarchical namespace enabled, see Folder management.
Folders: Folders act as containers for objects and other folders, with support for operations such as creating, deleting, and getting folders.
Rename folders: The rename folders operation lets you atomically rename the path of a folder and its underlying folders without deleting any objects. This technique is efficient and time-saving, especially for large folders with many objects.
List folders: The list folders operation lists all folders in the bucket or underneath a specific folder, helping you manage and understand the structure of the data stored within a bucket.
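The rename and list operations can be pictured with a toy in-memory model. The sketch below is not the Cloud Storage implementation or API. It assumes folders are stored as a set of path strings ending in "/", and the function names are made up for illustration.

```python
def rename_folder(folders: set, source: str, destination: str) -> set:
    """Return a new folder set with `source` and everything beneath it moved.

    The whole subtree is rewritten in one step, mirroring the idea that a
    folder rename applies to the folder and its underlying folders together.
    """
    if source not in folders:
        raise KeyError(f"folder not found: {source}")
    if destination in folders:
        raise ValueError(f"destination already exists: {destination}")
    renamed = set()
    for path in folders:
        if path.startswith(source):
            renamed.add(destination + path[len(source):])
        else:
            renamed.add(path)
    return renamed


def list_folders(folders: set, parent: str = "") -> list:
    """List every folder in the bucket, or every folder beneath `parent`."""
    return sorted(p for p in folders if p.startswith(parent) and p != parent)


folders = {"raw/", "raw/2024/", "raw/2024/01/", "tmp/"}
after = rename_folder(folders, "raw/", "archive/")
print(list_folders(after))
print(list_folders(after, "archive/"))
```

Because `rename_folder` returns a new set rather than mutating the input, a caller in this model sees either the old layout or the new one, never a half-renamed tree.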
You should consider enabling hierarchical namespace when using applications that expect a file system-like hierarchy and semantics. Hierarchical namespace is beneficial for data-intensive tasks like analytics, AI, and ML workloads. Here are some common scenarios where you should consider using hierarchical namespace:
Hadoop-based processing: Hadoop and Spark workloads traditionally expect a file system-like storage structure and time-based naming for files and folders. Hierarchical namespace integrates with the Cloud Storage connector to provide enhanced throughput and atomic folder renames, improving data integrity and consistency for many data processing pipelines.
File-oriented workloads processing: Workloads such as batch analytics processing, financial services, or high performance computing are structured into partitions based on a hierarchy of folders and files. Hierarchical namespace helps to manage these environments with a dedicated API for folder management. Additionally, hierarchical namespace simplifies managing folders that contain other folders and objects. With a single API command, you can swiftly rename a folder along with all its contents, saving valuable time and resources.
AI and ML processing: AI and ML tools such as TensorFlow, Pandas, and PyTorch expect file system-like access and semantics. Hierarchical namespace, especially when combined with Cloud Storage FUSE, delivers increased throughput and efficient data access. As a result, hierarchical namespace enhances the performance and reliability of ML model iteration.
Before enabling hierarchical namespace for your bucket, you should consider the limitations of hierarchical namespace. For information about hierarchical namespace limitations, see Limitations.
Benefits of hierarchical namespace
When you enable hierarchical namespace for your buckets, you can do the following:
Optimize organization: You can organize your data into a hierarchical folder structure that helps you manage and locate files or datasets.
Establish a file system-like ecosystem: Hierarchical namespace introduces file system-like features such as folders, folder renaming, and folder listing, which are beneficial for file-oriented applications, including the Hadoop ecosystem and AI and ML workloads.
Performance improvement: By scaling data-intensive workloads to handle higher throughput, you can enhance the overall performance of your application.
Buckets with hierarchical namespace support the following Cloud Storage platform capabilities:
All Cloud Storage object APIs and widely-used Cloud Storage features. For details about unsupported features, see Limitations.
Data transfer from a standard bucket to a bucket with hierarchical namespace using Storage Transfer Service.
Integration with the following products:
Cloud Storage Connector, maintained by Dataproc for Hadoop workloads. For more information, see Use hierarchical namespace enabled buckets for Hadoop workloads.
Cloud Storage FUSE for file system-like bucket access using clients.
Buckets with hierarchical namespace enabled have the following interactions with other Cloud Storage operations:
Object operations
Buckets with hierarchical namespace enabled handle object operations in the following ways:
Upload, Rewrite, and Compose automatically create any missing parent folders, as long as you have the necessary permissions. As a result, you don't need to pre-create folders before uploading objects.
To delete a folder, use the DeleteFolder operation.
When you call the ListObjects operation with the delimiter parameter, buckets return each child folder as a prefix. However, empty folders are excluded by default. To include empty folders, similar to a typical file system listing, you must set the includeFoldersAsPrefixes parameter. For information about performance optimization methods while listing objects in buckets with hierarchical namespace enabled, see Listing objects.
Buckets with hierarchical namespace enabled handle managed folder operations in the following ways:
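The ListObjects behavior described under object operations can be approximated with a small in-memory model. This is only a sketch of the documented semantics, not the service's code. The function name and its arguments are hypothetical, with `include_folders_as_prefixes` standing in for the includeFoldersAsPrefixes parameter.

```python
def list_prefixes(object_names, folder_paths, prefix="", delimiter="/",
                  include_folders_as_prefixes=False):
    """Model the prefixes returned by a delimiter-based object listing.

    By default only folders that contain at least one object appear.
    With the flag set, empty folders are reported as prefixes too.
    """
    prefixes = set()
    for name in object_names:
        if not name.startswith(prefix):
            continue
        rest = name[len(prefix):]
        if delimiter in rest:
            # The object sits below a child folder: report that folder.
            prefixes.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
    if include_folders_as_prefixes:
        for folder in folder_paths:
            if folder.startswith(prefix) and folder != prefix:
                rest = folder[len(prefix):]
                prefixes.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
    return sorted(prefixes)


objects = ["data/a/file1.csv", "data/readme.txt"]
folders = ["data/", "data/a/", "data/empty/"]
default_listing = list_prefixes(objects, folders, prefix="data/")
full_listing = list_prefixes(objects, folders, prefix="data/",
                             include_folders_as_prefixes=True)
print(default_listing)  # ['data/a/']
print(full_listing)     # ['data/a/', 'data/empty/']
```

In this model, `data/empty/` shows up only in the second call, which matches the file system-style listing that the parameter is meant to provide.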
Object Lifecycle Management
Object Lifecycle Management lets you automate actions on objects based on conditions, such as age or prefix. However, Object Lifecycle Management rules can behave differently in buckets with hierarchical namespace and in buckets with a flat namespace due to the RenameFolder operation:
Object Lifecycle Management rules for buckets with a flat namespace: Renaming a folder requires tools to rename every object by copying it to the destination location and deleting the original from the source location. As a result, new objects are created at the destination with new creation times. Age-based Object Lifecycle Management rules that apply to the destination location don't take effect on these objects immediately, because their creation times are reset.
Object Lifecycle Management rules for buckets with hierarchical namespace enabled: Renaming a folder operates at the folder level, without having to rename every single object. As a result, the creation time of the objects is preserved, meaning the age-based Object Lifecycle Management rules are applied to renamed objects immediately if they meet the age criteria.
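The difference between the two cases can be made concrete with a small simulation. The sketch below is a simplified model, not Cloud Storage code: objects are represented as a dictionary from name to creation time, and the function names, paths, and the 30-day rule are assumptions chosen for the example.

```python
from datetime import datetime, timedelta, timezone


def rename_by_copy(objects, source, destination, now):
    """Flat namespace: each object is copied, then the original is deleted,
    so every moved object gets a fresh creation time."""
    result = {}
    for name, created in objects.items():
        if name.startswith(source):
            result[destination + name[len(source):]] = now
        else:
            result[name] = created
    return result


def rename_folder(objects, source, destination):
    """Hierarchical namespace: the path changes, the creation time does not."""
    result = {}
    for name, created in objects.items():
        if name.startswith(source):
            result[destination + name[len(source):]] = created
        else:
            result[name] = created
    return result


def matches_age_rule(objects, min_age_days, now):
    """Names of objects at least `min_age_days` old, as an age rule would select."""
    return sorted(n for n, c in objects.items() if (now - c).days >= min_age_days)


now = datetime(2024, 6, 1, tzinfo=timezone.utc)
objects = {"logs/app.log": now - timedelta(days=45)}
flat = rename_by_copy(objects, "logs/", "archive/", now)
hns = rename_folder(objects, "logs/", "archive/")
print(matches_age_rule(flat, 30, now))  # []
print(matches_age_rule(hns, 30, now))   # ['archive/app.log']
```

With a 30-day age condition, the 45-day-old object qualifies right after a folder rename in the hierarchical model, but in the flat model it qualifies only once 30 days have passed since the copy.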
You can list all buckets with hierarchical namespace enabled, regardless of their storage layout. A bucket's storage layout describes how objects are arranged within a bucket, either in a flat namespace or a hierarchical namespace. For instructions on viewing a bucket's storage layout, see Get a bucket's storage layout. To list all buckets, follow the instructions detailed in List buckets.
You can delete a bucket with hierarchical namespace enabled in the same manner as any other bucket. For the purposes of deletion, if a bucket with hierarchical namespace enabled only contains empty folders and no objects or managed folders, then the bucket is considered empty. For instructions about deleting buckets, see Delete buckets.
Pricing
For pricing information, refer to Cloud Storage pricing.
Limitations
The following are the limitations of hierarchical namespace:
You must choose whether or not to use hierarchical namespace when you create the bucket; your bucket's hierarchical namespace setting can't be changed after the bucket is created.
To enable hierarchical namespace, a bucket must also have uniform bucket-level access enabled.
The following Cloud Storage capabilities are not supported for buckets that use hierarchical namespace: