Showing content from https://cloud.google.com/storage/docs/hns-hadoop-workloads below:

Use hierarchical namespace enabled buckets for Hadoop workloads | Cloud Storage

This page describes how to use hierarchical namespace enabled buckets for Hadoop workloads.

Overview

When you use a Cloud Storage bucket with hierarchical namespace, you can configure the Cloud Storage connector to use the rename folder operation for workloads such as Hadoop, Spark, and Hive.

In a bucket without hierarchical namespace, a rename operation in Hadoop, Spark, or Hive involves multiple object copy and delete operations, which affects both performance and consistency. In a bucket with hierarchical namespace, renaming a folder through the Cloud Storage connector improves performance and ensures consistency when the folder contains a large number of objects.
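The difference can be illustrated with a toy model. The sketch below is not the connector's implementation: it assumes a flat bucket can be modeled as a dictionary keyed by full object name, and a hierarchical bucket as a dictionary of folder nodes. The function names `rename_flat` and `rename_hierarchical` are hypothetical.

```python
def rename_flat(objects: dict[str, bytes], src: str, dst: str) -> int:
    """Rename every object under prefix `src` to prefix `dst`.

    Each object needs its own copy and its own delete, so the number of
    operations grows with the number of objects under the prefix.
    """
    operations = 0
    for name in [n for n in objects if n.startswith(src)]:
        objects[dst + name[len(src):]] = objects[name]  # copy to the new name
        operations += 1
        del objects[name]  # delete the old name
        operations += 1
    return operations


def rename_hierarchical(folders: dict[str, dict[str, bytes]], src: str, dst: str) -> int:
    """Rename a folder node: a single operation, whatever the folder holds."""
    folders[dst] = folders.pop(src)
    return 1


# Toy data: 1000 objects under "logs/".
flat = {f"logs/part-{i}": b"x" for i in range(1000)}
flat_ops = rename_flat(flat, "logs/", "archive/")

tree = {"logs/": {f"part-{i}": b"x" for i in range(1000)}}
tree_ops = rename_hierarchical(tree, "logs/", "archive/")

print(flat_ops, tree_ops)
```

In this model, an interruption partway through `rename_flat` would leave objects split between the old and new prefixes, which is the kind of consistency problem a single folder rename avoids.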

Before you begin

To use features of hierarchical namespace buckets, use the following Cloud Storage connector versions:

Connector version 3.0.0 and versions older than 2.2.23 have limitations. For more information, see Compatibility with Cloud Storage connector version 3.0.0 or versions older than 2.2.23.
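As a rough aid, the version rule stated above can be written as a small check. This is a sketch that assumes the connector version is available as a plain `MAJOR.MINOR.PATCH` string. The helper names `parse_version` and `has_limitations` are hypothetical and are not part of the connector.

```python
import re


def parse_version(version: str) -> tuple[int, int, int]:
    """Parse a leading MAJOR.MINOR.PATCH triple from a version string."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return (major, minor, patch)


def has_limitations(version: str) -> bool:
    """Return True for version 3.0.0 or any version older than 2.2.23."""
    parsed = parse_version(version)
    return parsed == (3, 0, 0) or parsed < (2, 2, 23)


for candidate in ["2.2.22", "2.2.23", "3.0.0", "3.0.1"]:
    print(candidate, has_limitations(candidate))
```

Tuple comparison keeps the check numeric, so `2.2.9` is correctly treated as older than `2.2.23`, which a plain string comparison would get wrong.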

Enable the Cloud Storage connector on a cluster

This section describes how to enable the Cloud Storage connector on a Dataproc cluster and a self-managed Hadoop cluster.

Dataproc

You can use the Google Cloud CLI to create a Dataproc cluster with the Cloud Storage connector configured to perform folder operations.

  1. Create a Dataproc cluster using the following command:

      gcloud dataproc clusters create CLUSTER_NAME \
          --properties=core:fs.gs.hierarchical.namespace.folders.enable=true,core:fs.gs.http.read-timeout=30000
      

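    Where CLUSTER_NAME is the name of the cluster, fs.gs.hierarchical.namespace.folders.enable turns on folder operations for buckets with hierarchical namespace, and fs.gs.http.read-timeout sets the read timeout in milliseconds.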
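If the cluster is created from a script, the same command can be assembled programmatically. The following sketch only builds the argument list and prints it. The cluster name `my-cluster` and the helper `build_create_command` are hypothetical, and the property values are the ones shown in the command above.

```python
import shlex

# Property values taken from the command above; the "core:" prefix targets core-site.xml.
HNS_PROPERTIES = {
    "core:fs.gs.hierarchical.namespace.folders.enable": "true",
    "core:fs.gs.http.read-timeout": "30000",
}


def build_create_command(cluster_name: str, properties: dict[str, str]) -> list[str]:
    """Build the gcloud argument list with all properties in one comma-separated flag."""
    joined = ",".join(f"{key}={value}" for key, value in properties.items())
    return [
        "gcloud", "dataproc", "clusters", "create", cluster_name,
        f"--properties={joined}",
    ]


command = build_create_command("my-cluster", HNS_PROPERTIES)  # hypothetical cluster name
print(shlex.join(command))
```

Passing the list to `subprocess.run(command, check=True)` would execute it. The point of the sketch is that the properties travel as a single argument with no whitespace after the comma.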

Self-managed Hadoop

You can configure the Cloud Storage connector on your self-managed Hadoop cluster to perform folder operations.

  1. Add the following properties to the core-site.xml configuration file:

        <property>
          <name>fs.gs.hierarchical.namespace.folders.enable</name>
          <value>true</value>
        </property>
        <property>
          <name>fs.gs.http.read-timeout</name>
          <value>30000</value>
        </property>
      

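    Where fs.gs.hierarchical.namespace.folders.enable turns on folder operations for buckets with hierarchical namespace, and fs.gs.http.read-timeout sets the read timeout in milliseconds.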
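One way to confirm that the setting landed in the file is to parse it. The sketch below assumes the usual Hadoop layout, with `<property>` elements under a `<configuration>` root. The helpers `read_properties` and `folder_operations_enabled` are hypothetical, and treating a missing property as "not enabled" is an assumption of this sketch.

```python
import xml.etree.ElementTree as ET

FOLDER_OPS_KEY = "fs.gs.hierarchical.namespace.folders.enable"


def read_properties(xml_text: str) -> dict[str, str]:
    """Collect name/value pairs from every <property> element."""
    root = ET.fromstring(xml_text)
    properties = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None and value is not None:
            properties[name.strip()] = value.strip()
    return properties


def folder_operations_enabled(xml_text: str) -> bool:
    """Return True only if the folder-operations property is set to true."""
    # Assumption: a missing property counts as not enabled.
    value = read_properties(xml_text).get(FOLDER_OPS_KEY, "false")
    return value.lower() == "true"


# Sample content mirroring the properties shown above.
SAMPLE = """<configuration>
  <property>
    <name>fs.gs.hierarchical.namespace.folders.enable</name>
    <value>true</value>
  </property>
  <property>
    <name>fs.gs.http.read-timeout</name>
    <value>30000</value>
  </property>
</configuration>"""

print(folder_operations_enabled(SAMPLE))
```

To check a real file, read its contents and pass the text to `folder_operations_enabled`.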

Compatibility with Cloud Storage connector version 3.0.0 or versions older than 2.2.23

Using the Cloud Storage connector version 3.0.0 or versions older than 2.2.23 or disabling folder operations for hierarchical namespace can lead to the following limitations:

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2025-10-02 UTC.
