This article shows you how to install and configure BlobFuse2, mount an Azure blob container, and access data in the container. The basic steps are:
How to install BlobFuse2
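You have two options for installing BlobFuse2: install the binaries from the Microsoft software repositories for Linux (Option 1), or build the binaries from source code (Option 2).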
To see supported distributions, see BlobFuse2 releases.
For information about libfuse support, see the BlobFuse2 README.
To check your version of Linux, run the following command:
cat /etc/*-release
If no binaries are available for your distribution, you can build the binaries from source code (Option 2).
To install BlobFuse2 from the repositories:
Configure the Microsoft package repository
Configure the Linux Package Repository for Microsoft Products.
As an example, on a Red Hat Enterprise Linux 8 distribution:
sudo rpm -Uvh https://packages.microsoft.com/config/rhel/8/packages-microsoft-prod.rpm
Similarly, change the URL to .../rhel/7/... to point to a Red Hat Enterprise Linux 7 distribution.
Another example on an Ubuntu 20.04 distribution:
sudo wget https://packages.microsoft.com/config/ubuntu/20.04/packages-microsoft-prod.deb
sudo dpkg -i packages-microsoft-prod.deb
sudo apt-get update
sudo apt-get install libfuse3-dev fuse3
Similarly, change the URL to .../ubuntu/16.04/... or .../ubuntu/18.04/... to reference another Ubuntu version.
Another example on a SUSE Linux Enterprise Server 15 distribution:
sudo rpm -Uvh https://packages.microsoft.com/config/sles/15/packages-microsoft-prod.rpm
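The repository URLs in these examples share one pattern: the distribution name, then the release, then a package whose extension depends on the package format. As a minimal sketch, assuming that pattern also holds for the release you need (check the BlobFuse2 releases list first), a hypothetical helper named ms_repo_url can build the URL:

```shell
#!/bin/sh
# Hypothetical helper: builds the repository configuration URL from a
# distribution name and release, following the pattern of the examples above.
# Only the three distributions shown in this article are handled.
ms_repo_url() {
    distro="$1"
    release="$2"
    case "$distro" in
        ubuntu)    ext="deb" ;;
        rhel|sles) ext="rpm" ;;
        *) echo "unsupported distribution: $distro" >&2; return 1 ;;
    esac
    echo "https://packages.microsoft.com/config/${distro}/${release}/packages-microsoft-prod.${ext}"
}

ms_repo_url ubuntu 20.04
ms_repo_url rhel 8
```

The helper only prints the URL. Downloading and installing the package still uses the rpm or dpkg commands shown above.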
Install BlobFuse2
On a Red Hat Enterprise Linux distribution:
sudo yum install blobfuse2
To install a specific version, change the package name to blobfuse2-<version>.
On an Ubuntu distribution:
sudo apt-get install blobfuse2
To install a specific version, change the package name to blobfuse2=<version>.
On a SUSE Linux Enterprise Server distribution:
sudo zypper install blobfuse2
To install a specific version, change the package name to blobfuse2-<version>.
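The version separator is easy to get wrong because it differs by package manager: apt-get uses an equals sign, while yum and zypper use a hyphen. The following sketch prints the matching command through a hypothetical helper named blobfuse2_install_cmd. It does not run the installation.

```shell
#!/bin/sh
# Hypothetical helper: prints the install command for a package manager,
# optionally pinned to a version. apt-get separates name and version with "=",
# yum and zypper with "-", as in the commands above.
blobfuse2_install_cmd() {
    manager="$1"
    version="${2:-}"
    case "$manager" in
        apt-get)    sep="=" ;;
        yum|zypper) sep="-" ;;
        *) echo "unknown package manager: $manager" >&2; return 1 ;;
    esac
    if [ -n "$version" ]; then
        echo "sudo $manager install blobfuse2${sep}${version}"
    else
        echo "sudo $manager install blobfuse2"
    fi
}

blobfuse2_install_cmd yum
blobfuse2_install_cmd apt-get "<version>"
```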
To build the BlobFuse2 binaries from source code:
Install the dependencies:
Install Git:
sudo apt-get install git
Install BlobFuse2 dependencies.
On Ubuntu:
sudo apt-get install libfuse3-dev fuse3 -y
Clone the repository:
git clone https://github.com/Azure/azure-storage-fuse/
cd ./azure-storage-fuse
git checkout main
Build BlobFuse2:
go get
go build -tags=fuse3
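The build steps assume that Git and the Go toolchain are already on your PATH. As a small sketch, a hypothetical helper named missing_tools can confirm that before you start:

```shell
#!/bin/sh
# Prints the name of each listed tool that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

missing="$(missing_tools git go)"
if [ -n "$missing" ]; then
    # Unquoted on purpose so that several names print on one line.
    echo "Install these before building:" $missing
else
    echo "git and go are available"
fi
```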
You can configure BlobFuse2 by using various settings. Some of the typical settings include:
The settings can be configured in a YAML configuration file, using environment variables, or as parameters passed to the BlobFuse2 commands. The preferred method is to use the configuration file.
For details about each of the configuration parameters for BlobFuse2 and how to specify them, see these articles:
To configure BlobFuse2 for mounting:
BlobFuse2 provides native-like performance by using local file-caching techniques. The caching configuration and behavior vary, depending on whether you're streaming large files or accessing smaller files.
Configure caching for streaming large files
BlobFuse2 supports streaming for read and write operations as an alternative to disk caching for files. In streaming mode, BlobFuse2 caches blocks of large files in memory for both reading and writing. The configuration settings related to caching for streaming are under the stream: settings in your configuration file:
stream:
    block-size-mb:
        For read-only mode, the size of each block to be cached in memory while streaming (in MB)
        For read/write mode, the size of newly created blocks
    max-buffers: The total number of buffers to store blocks in
    buffer-size-mb: The size for each buffer
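Because streaming keeps blocks in memory, it helps to estimate the footprint before you choose values. The sketch below assumes that the upper bound is roughly max-buffers multiplied by buffer-size-mb. That follows from the setting descriptions above but is not a documented formula. The helper name and the values are illustrative, not recommendations.

```shell
#!/bin/sh
# Rough upper bound on buffer memory in MB.
# Assumption: memory is about (number of buffers) x (size of each buffer).
stream_memory_mb() {
    max_buffers="$1"
    buffer_size_mb="$2"
    echo $((max_buffers * buffer_size_mb))
}

# Illustrative values only.
echo "Estimated buffer memory: $(stream_memory_mb 16 8) MB"
```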
Configure caching for smaller files
Smaller files are cached to a temporary path that's specified under file_cache: in the configuration file:
file_cache:
    path: <path to local disk cache>
Note
BlobFuse2 stores all open file contents in the temporary path. Make sure you have enough space to contain all open files.
You have three common options to configure the temporary path for file caching:
Use a local high-performing disk
If you use an existing local disk for file caching, choose a disk that provides the best performance possible, such as a solid-state disk (SSD).
Use a RAM disk
The following example creates a RAM disk of 16 GB and a directory for BlobFuse2. Choose a size that meets your requirements. BlobFuse2 uses the RAM disk to open files that are up to 16 GB in size.
sudo mkdir /mnt/ramdisk
sudo mount -t tmpfs -o size=16g tmpfs /mnt/ramdisk
sudo mkdir /mnt/ramdisk/blobfuse2tmp
sudo chown <youruser> /mnt/ramdisk/blobfuse2tmp
Use an SSD
In Azure, you can use the SSD ephemeral disks that are available on your VMs to provide a low-latency buffer for BlobFuse2. Depending on the provisioning agent you use, mount the ephemeral disk on /mnt for cloud-init or /mnt/resource for Microsoft Azure Linux Agent (waagent) VMs.
Make sure that your user has access to the temporary path:
sudo mkdir /mnt/resource/blobfuse2tmp -p
sudo chown <youruser> /mnt/resource/blobfuse2tmp
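Whichever option you choose, the temporary path must hold all open files, so it's worth checking the free space before you mount. The following sketch reads the available space from df. The helper name, the path, and the threshold are placeholders to adapt.

```shell
#!/bin/sh
# Prints the space available at a path in MB, using POSIX df output
# (1024-byte blocks, available space in the fourth column).
available_mb() {
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024) }'
}

cache_path="/tmp"     # placeholder: replace with your temporary path
required_mb=1024      # illustrative threshold
free_mb="$(available_mb "$cache_path")"
if [ "$free_mb" -lt "$required_mb" ]; then
    echo "Only ${free_mb} MB free at ${cache_path}; need ${required_mb} MB"
else
    echo "${free_mb} MB free at ${cache_path}"
fi
```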
Create an empty directory to mount the blob container
To create an empty directory to mount the blob container:
mkdir ~/mycontainer
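If you reuse an existing directory instead of creating a new one, you can confirm that it's empty first. This is a minimal sketch with a hypothetical helper named is_empty_dir. A temporary directory stands in for ~/mycontainer.

```shell
#!/bin/sh
# Succeeds if the directory exists and contains no entries, including dotfiles.
is_empty_dir() {
    [ -d "$1" ] && [ -z "$(ls -A "$1")" ]
}

mount_dir="$(mktemp -d)"     # stand-in for ~/mycontainer
if is_empty_dir "$mount_dir"; then
    echo "$mount_dir is empty and can be used as a mount point"
fi
rmdir "$mount_dir"
```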
You must grant access to the storage account for the user who mounts the container. The most common ways to grant access are by using one of the following options:
You can provide authorization information in a configuration file or in environment variables. For more information, see Configure settings for BlobFuse2.
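As an illustration of the configuration-file approach, the following sketch writes a small YAML file that combines a file cache path with storage account settings. The file_cache and path keys come from this article. The components list and the keys under azstorage are assumptions based on the BlobFuse2 sample configuration, so verify them against the configuration reference. The helper name, account name, and container name are hypothetical, and the account key is left as a placeholder.

```shell
#!/bin/sh
# Writes a minimal configuration file.
# Assumption: the components list and the azstorage key names follow the
# BlobFuse2 sample configuration; check them against the reference.
write_blobfuse2_config() {
    config_file="$1"
    account="$2"
    container="$3"
    cache_path="$4"
    cat > "$config_file" <<EOF
components:
  - libfuse
  - file_cache
  - attr_cache
  - azstorage

file_cache:
  path: ${cache_path}

azstorage:
  type: block
  account-name: ${account}
  container: ${container}
  mode: key
  account-key: <your-account-key>
EOF
    chmod 600 "$config_file"   # the file will hold a credential
}

# Hypothetical names; the temporary location keeps the example side-effect free.
config_file="${TMPDIR:-/tmp}/blobfuse2-config.yaml"
write_blobfuse2_config "$config_file" mystorageaccount mycontainer /mnt/resource/blobfuse2tmp
echo "Wrote $config_file"
```

Restricting the file to mode 600 keeps other local users from reading the key once you fill it in.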
How to mount a blob container
Important
BlobFuse2 doesn't support overlapping mount paths. If you run multiple instances of BlobFuse2, make sure that each instance has a unique and non-overlapping mount point.
BlobFuse2 doesn't support coexistence with NFS on the same mount path. The results of running BlobFuse2 on the same mount path as NFS are undefined and might result in data corruption.
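If you script several mounts, a simple prefix comparison can catch overlapping mount points before you run them. This sketch assumes that both paths are absolute and already normalized (no symbolic links or .. segments). The helper name and the paths in the usage lines are hypothetical.

```shell
#!/bin/sh
# Succeeds if one path equals the other or is nested inside it.
# A trailing slash is appended so that /mnt/a and /mnt/ab do not match.
paths_overlap() {
    a="${1%/}/"
    b="${2%/}/"
    case "$a" in "$b"*) return 0 ;; esac
    case "$b" in "$a"*) return 0 ;; esac
    return 1
}

if paths_overlap /home/user/mycontainer /home/user/mycontainer/logs; then
    echo "overlapping mount points"
fi
if ! paths_overlap /home/user/container1 /home/user/container2; then
    echo "separate mount points"
fi
```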
To mount an Azure block blob container by using BlobFuse2, run the following command. The command mounts the container specified in ./config.yaml onto the location ~/mycontainer:
sudo blobfuse2 mount ~/mycontainer --config-file=./config.yaml
You should now have access to your block blobs through the Linux file system and related APIs. To test your deployment, try creating a new directory and file:
cd ~/mycontainer
mkdir test
echo "hello world" > test/blob.txt
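To confirm that the directory is actually a mount point, and not a plain local directory that silently accepted the test file, you can look it up in the kernel's mount table. This is a minimal sketch that assumes a Linux system with /proc/mounts. The helper name is hypothetical, and the optional second argument exists only so the check can be tried against a sample file.

```shell
#!/bin/sh
# Succeeds if the given absolute path appears as a mount point, which is the
# second field of each line in the mounts table. Defaults to /proc/mounts.
is_mounted() {
    path="${1%/}"
    mounts_file="${2:-/proc/mounts}"
    awk -v p="${path:-/}" '$2 == p { found = 1 } END { exit !found }' "$mounts_file"
}

mount_dir="$HOME/mycontainer"
if is_mounted "$mount_dir"; then
    echo "$mount_dir is mounted"
else
    echo "$mount_dir is not mounted"
fi
```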
How to access data
Generally, you can work with the BlobFuse2-mounted storage like you would work with the native Linux file system. It uses the virtual directory scheme with a forward slash (/) as a delimiter in the file path and supports basic file system operations such as mkdir, opendir, readdir, rmdir, open, read, create, write, close, unlink, truncate, stat, and rename.
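Most of these operations can be exercised with ordinary shell commands. The following sketch runs a short sequence in a directory and stops at the first failure. It uses a temporary local directory so that it runs anywhere. To try it against mounted storage, pass your mount point instead. The helper name and the file names are hypothetical.

```shell
#!/bin/sh
# Runs a few basic operations in a directory and stops at the first failure.
exercise_fs() {
    base="$1"
    mkdir "$base/optest" &&                               # mkdir
    echo "hello world" > "$base/optest/a.txt" &&          # create, write, close
    [ "$(cat "$base/optest/a.txt")" = "hello world" ] &&  # open, read
    mv "$base/optest/a.txt" "$base/optest/b.txt" &&       # rename
    : > "$base/optest/b.txt" &&                           # truncate to zero length
    [ ! -s "$base/optest/b.txt" ] &&                      # stat (size check)
    rm "$base/optest/b.txt" &&                            # unlink
    rmdir "$base/optest" &&                               # rmdir
    echo "all operations succeeded"
}

work_dir="$(mktemp -d)"      # stand-in for a mount point such as ~/mycontainer
exercise_fs "$work_dir"
rmdir "$work_dir"
```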
However, you should be aware of some key differences in functionality:
Feature support
This table shows how this feature is supported in your account and the effect on support when you enable certain capabilities:
1 Azure Data Lake Storage, Network File System (NFS) 3.0 protocol, and SSH File Transfer Protocol (SFTP) support all require a storage account with a hierarchical namespace enabled.