In this guide, you can learn how to store and retrieve large files in MongoDB by using GridFS. The GridFS storage system splits files into chunks when storing them and reassembles those files when retrieving them. The driver's implementation of GridFS is an abstraction that manages the operations and organization of the file storage.
Use GridFS if the size of any of your files exceeds the BSON document size limit of 16 MB. For more detailed information about whether GridFS is suitable for your use case, see GridFS in the MongoDB Server manual.
GridFS organizes files in a bucket, a group of MongoDB collections that contain the chunks of files and information describing them. The bucket contains the following collections:
chunks: Stores the binary file chunks
files: Stores the file metadata
The driver creates the GridFS bucket, if it doesn't already exist, when you first write data to it. The bucket contains the chunks and files collections prefixed with the default bucket name fs, unless you specify a different name. To ensure efficient retrieval of the files and related metadata, the driver creates an index on each collection. The driver ensures that these indexes exist before performing read and write operations on the GridFS bucket.
For more information about GridFS indexes, see GridFS Indexes in the MongoDB Server manual.
When using GridFS to store files, the driver splits the files into smaller chunks, each represented by a separate document in the chunks collection. It also creates a document in the files collection that contains a file ID, file name, and other file metadata.
The following diagram shows how GridFS splits files when they are uploaded to a bucket:
When retrieving files, GridFS fetches the metadata from the files collection in the specified bucket and uses the information to reconstruct the file from documents in the chunks collection.
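The chunking described above comes down to simple arithmetic. The following sketch is illustrative only: the GridFsChunkMath class and its method names are hypothetical helpers, not part of the driver. It assumes the 255 KB default chunk size that this guide gives for ChunkSizeBytes, and it assumes that an empty file produces no chunk documents.

```csharp
using System;

// Hypothetical helper for illustration; not part of the driver API.
public static class GridFsChunkMath
{
    // 255 KB expressed in bytes (255 * 1024 = 261,120).
    public const int DefaultChunkSizeBytes = 255 * 1024;

    // Number of chunk documents needed to hold a file of the given length.
    public static long ChunkCount(long fileLength, int chunkSizeBytes = DefaultChunkSizeBytes)
    {
        if (fileLength < 0) throw new ArgumentOutOfRangeException(nameof(fileLength));
        if (chunkSizeBytes <= 0) throw new ArgumentOutOfRangeException(nameof(chunkSizeBytes));
        return (fileLength + chunkSizeBytes - 1) / chunkSizeBytes; // ceiling division
    }

    // Size in bytes of the final chunk, which can be smaller than the others.
    public static long LastChunkLength(long fileLength, int chunkSizeBytes = DefaultChunkSizeBytes)
    {
        if (fileLength <= 0) return 0;
        long remainder = fileLength % chunkSizeBytes;
        return remainder == 0 ? chunkSizeBytes : remainder;
    }
}
```

For example, a 1,048,576-byte file with the default chunk size needs five chunks: four full chunks of 261,120 bytes and a final chunk of 4,096 bytes.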
To begin using GridFS to store or retrieve files, create a new instance of the GridFSBucket class, passing in an IMongoDatabase object that represents your database. The resulting object gives you access to an existing bucket or, if none exists, to a new bucket that the driver creates when you first write data to it.
The following example creates a new instance of the GridFSBucket class for the db database:
using MongoDB.Driver;
using MongoDB.Driver.GridFS;

var client = new MongoClient("<connection string>");
var database = client.GetDatabase("db");
var bucket = new GridFSBucket(database);
You can customize the GridFS bucket configuration by passing an instance of the GridFSBucketOptions class to the GridFSBucket() constructor. The following table describes the properties in the GridFSBucketOptions class:
Field (data type): Description
BucketName (string): The bucket name to use as a prefix for the files and chunks collections. The default value is "fs".
ChunkSizeBytes (integer): The size of the chunks that GridFS splits files into. The default value is 255 KB.
ReadConcern (ReadConcern): The read concern to use for bucket operations. The default value is the database's read concern.
ReadPreference (ReadPreference): The read preference to use for bucket operations. The default value is the database's read preference.
WriteConcern (WriteConcern): The write concern to use for bucket operations. The default value is the database's write concern.
The following example creates a bucket named "myCustomBucket" by passing an instance of the GridFSBucketOptions class to the GridFSBucket() constructor:
var options = new GridFSBucketOptions { BucketName = "myCustomBucket" };
var customBucket = new GridFSBucket(database, options);
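Several of the properties in the table can be set together. The following sketch is one possible combination, not a recommended configuration: the CustomBucketExample class and its method names are hypothetical, and the 1 MB chunk size, majority write concern, and secondary read preference are arbitrary values chosen for illustration.

```csharp
using MongoDB.Driver;
using MongoDB.Driver.GridFS;

// Hypothetical helper for illustration; not part of the driver API.
public static class CustomBucketExample
{
    // Builds bucket options that override several defaults at once.
    public static GridFSBucketOptions BuildOptions()
    {
        return new GridFSBucketOptions
        {
            BucketName = "myCustomBucket",             // collections become myCustomBucket.files and myCustomBucket.chunks
            ChunkSizeBytes = 1048576,                  // 1 MB chunks instead of the 255 KB default
            WriteConcern = WriteConcern.WMajority,     // illustrative choice
            ReadPreference = ReadPreference.Secondary  // illustrative choice
        };
    }

    // Creates a bucket object for the given database by using the options above.
    public static GridFSBucket Create(IMongoDatabase database)
    {
        return new GridFSBucket(database, BuildOptions());
    }
}
```

Any property that you leave unset keeps the default listed in the table.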
You can upload files to a GridFS bucket by using the following methods:
OpenUploadStream() or OpenUploadStreamAsync(): Opens a new upload stream to which you can write file contents
UploadFromStream() or UploadFromStreamAsync(): Uploads the contents of an existing stream to a GridFS file
The following sections describe how to use these methods.
Use the OpenUploadStream() or OpenUploadStreamAsync() method to create an upload stream for a given file name. These methods accept the following parameters:
Parameter (data type): Description
filename (string): The name of the file to upload.
options (GridFSUploadOptions): Optional. An instance of the GridFSUploadOptions class that specifies the configuration for the upload stream. The default value is null.
cancellationToken (CancellationToken): Optional. A token that you can use to cancel the operation.
This code example demonstrates how to open an upload stream by performing the following steps:
Calls the OpenUploadStream() method to open a writable GridFS stream for a file named "my_file"
Calls the Write() method to write data to my_file
Calls the Close() method to close the stream that points to my_file
The synchronous version appears first, followed by the asynchronous version:
using (var uploader = bucket.OpenUploadStream("my_file"))
{
    byte[] bytes = { 72, 101, 108, 108, 111, 87, 111, 114, 108, 100 };
    uploader.Write(bytes, 0, bytes.Length);
    uploader.Close();
}
using (var uploader = await bucket.OpenUploadStreamAsync("my_file"))
{
    byte[] bytes = { 72, 101, 108, 108, 111, 87, 111, 114, 108, 100 };
    await uploader.WriteAsync(bytes, 0, bytes.Length);
    await uploader.CloseAsync();
}
To customize the upload stream configuration, pass an instance of the GridFSUploadOptions class to the OpenUploadStream() or OpenUploadStreamAsync() method. The GridFSUploadOptions class contains the following properties:
Property (data type): Description
BatchSize (int?): The number of chunks to upload in each batch. The default value is 16 MB divided by the value of the ChunkSizeBytes property.
ChunkSizeBytes (int?): The size of each chunk except the last, which can be smaller. The default value is 255 KB.
Metadata (BsonDocument): Metadata to store with the file, including the following elements: the _id of the file, the name of the file, the length and size of the file, the upload date and time, and a metadata document in which you can store other information. The default value is null.
The following example performs the same steps as the preceding example, but also uses the ChunkSizeBytes option to specify the size of each chunk. The synchronous version appears first, followed by the asynchronous version.
var options = new GridFSUploadOptions
{
    ChunkSizeBytes = 1048576
};
using (var uploader = bucket.OpenUploadStream("my_file", options))
{
    byte[] bytes = { 72, 101, 108, 108, 111, 87, 111, 114, 108, 100 };
    uploader.Write(bytes, 0, bytes.Length);
    uploader.Close();
}
var options = new GridFSUploadOptions
{
    ChunkSizeBytes = 1048576
};
using (var uploader = await bucket.OpenUploadStreamAsync("my_file", options))
{
    byte[] bytes = { 72, 101, 108, 108, 111, 87, 111, 114, 108, 100 };
    await uploader.WriteAsync(bytes, 0, bytes.Length);
    await uploader.CloseAsync();
}
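The Metadata property takes a BsonDocument, so you can attach your own fields to a file when you upload it. The following is a minimal sketch: the UploadOptionsExample class, the WithMetadata() method, and the contentType and owner field names are all hypothetical and chosen only for illustration.

```csharp
using MongoDB.Bson;
using MongoDB.Driver.GridFS;

// Hypothetical helper for illustration; not part of the driver API.
public static class UploadOptionsExample
{
    // Builds upload options that carry a small custom metadata document.
    // The field names "contentType" and "owner" are arbitrary examples.
    public static GridFSUploadOptions WithMetadata(string contentType, string owner)
    {
        return new GridFSUploadOptions
        {
            Metadata = new BsonDocument
            {
                { "contentType", contentType },
                { "owner", owner }
            }
        };
    }
}
```

You could then pass the result as the options argument, for example bucket.OpenUploadStream("my_file", UploadOptionsExample.WithMetadata("text/plain", "docs-team")).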
Use the UploadFromStream() or UploadFromStreamAsync() method to upload the contents of a stream to a new GridFS file. These methods accept the following parameters:
Parameter (data type): Description
filename (string): The name of the file to upload.
source (Stream): The stream from which to read the file contents.
options (GridFSUploadOptions): Optional. An instance of the GridFSUploadOptions class that specifies the configuration for the upload stream. The default value is null.
cancellationToken (CancellationToken): Optional. A token that you can use to cancel the operation.
This code example demonstrates how to upload the contents of an existing stream by performing the following steps:
Opens a file located at /path/to/input_file as a stream in binary read mode
Calls the UploadFromStream() method to write the contents of the stream to a GridFS file named "new_file"
The synchronous version appears first, followed by the asynchronous version.
using (var fileStream = new FileStream("/path/to/input_file", FileMode.Open, FileAccess.Read))
{
    bucket.UploadFromStream("new_file", fileStream);
}
using (var fileStream = new FileStream("/path/to/input_file", FileMode.Open, FileAccess.Read))
{
    await bucket.UploadFromStreamAsync("new_file", fileStream);
}
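The source parameter accepts any Stream, not only a FileStream. The following sketch uploads in-memory text by wrapping it in a MemoryStream and keeps the _id value that UploadFromStream() returns for the new file. The UploadTextExample class and its method names are hypothetical, and the UTF-8 encoding is an assumption made for this example.

```csharp
using System.IO;
using System.Text;
using MongoDB.Bson;
using MongoDB.Driver.GridFS;

// Hypothetical helper for illustration; not part of the driver API.
public static class UploadTextExample
{
    // Wraps a string in a readable in-memory stream (UTF-8 is an assumption).
    public static MemoryStream ToStream(string text)
    {
        return new MemoryStream(Encoding.UTF8.GetBytes(text));
    }

    // Uploads the text as a GridFS file and returns the _id of the new file.
    public static ObjectId UploadText(GridFSBucket bucket, string filename, string text)
    {
        using (var source = ToStream(text))
        {
            return bucket.UploadFromStream(filename, source);
        }
    }
}
```

Keeping the returned _id lets you download or delete the file later without first querying by file name.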
You can download files from a GridFS bucket by using the following methods:
OpenDownloadStream() or OpenDownloadStreamAsync(): Opens a new download stream from which you can read file contents
DownloadToStream() or DownloadToStreamAsync(): Writes the contents of a GridFS file to an existing stream
The following sections describe these methods in more detail.
Use the OpenDownloadStream() or OpenDownloadStreamAsync() method to create a download stream. These methods accept the following parameters:
Parameter (data type): Description
id (BsonValue): The _id value of the file to download.
options (GridFSDownloadOptions): Optional. An instance of the GridFSDownloadOptions class that specifies the configuration for the download stream. The default value is null.
cancellationToken (CancellationToken): Optional. A token that you can use to cancel the operation.
The following code example demonstrates how to open a download stream by performing the following steps:
Retrieves the _id value of the GridFS file named "new_file"
Calls the OpenDownloadStream() method and passes the _id value to open the file as a readable GridFS stream
Creates a buffer byte array to store the file contents
Calls the Read() method to read the file contents from the downloader stream into the byte array
The synchronous version appears first, followed by the asynchronous version.
var filter = Builders<GridFSFileInfo>.Filter.Eq(x => x.Filename, "new_file");
var doc = bucket.Find(filter).FirstOrDefault();
if (doc != null)
{
    using (var downloader = bucket.OpenDownloadStream(doc.Id))
    {
        var buffer = new byte[downloader.Length];
        downloader.Read(buffer, 0, buffer.Length);
    }
}
var filter = Builders<GridFSFileInfo>.Filter.Eq(x => x.Filename, "new_file");
var cursor = await bucket.FindAsync(filter);
var fileInfoList = await cursor.ToListAsync();
var doc = fileInfoList.FirstOrDefault();
if (doc != null)
{
    using (var downloader = await bucket.OpenDownloadStreamAsync(doc.Id))
    {
        var buffer = new byte[downloader.Length];
        await downloader.ReadAsync(buffer, 0, buffer.Length);
    }
}
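In .NET, a single call to Stream.Read() is allowed to return fewer bytes than requested. This guide does not say whether a GridFS download stream ever does so, so the following sketch is a defensive pattern rather than a required one. The StreamHelpers class and the ReadFully() method are hypothetical names; the helper works on any Stream.

```csharp
using System.IO;

// Hypothetical helper for illustration; not part of the driver API.
public static class StreamHelpers
{
    // Reads from the stream until the buffer is full or the stream ends.
    // Returns the total number of bytes placed in the buffer.
    public static int ReadFully(Stream source, byte[] buffer)
    {
        int total = 0;
        while (total < buffer.Length)
        {
            int read = source.Read(buffer, total, buffer.Length - total);
            if (read == 0)
            {
                break; // end of stream reached before the buffer was filled
            }
            total += read;
        }
        return total;
    }
}
```

In the preceding examples, you could replace the single Read() call with StreamHelpers.ReadFully(downloader, buffer).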
To customize the download stream configuration, pass an instance of the GridFSDownloadOptions class to the OpenDownloadStream() method. The GridFSDownloadOptions class contains the following property:
Property (data type): Description
Seekable (bool?): Indicates whether the stream supports seeking, the ability to query and change the current position in a stream. The default value is false.
The following example performs the same steps as the preceding example, but also sets the Seekable option to true to specify that the stream is seekable.
The synchronous version appears first, followed by the asynchronous version.
var filter = Builders<GridFSFileInfo>.Filter.Eq(x => x.Filename, "new_file");
var doc = bucket.Find(filter).FirstOrDefault();
if (doc != null)
{
    var options = new GridFSDownloadOptions
    {
        Seekable = true
    };
    using (var downloader = bucket.OpenDownloadStream(doc.Id, options))
    {
        var buffer = new byte[downloader.Length];
        downloader.Read(buffer, 0, buffer.Length);
    }
}
var filter = Builders<GridFSFileInfo>.Filter.Eq(x => x.Filename, "new_file");
var cursor = await bucket.FindAsync(filter);
var fileInfoList = await cursor.ToListAsync();
var doc = fileInfoList.FirstOrDefault();
if (doc != null)
{
    var options = new GridFSDownloadOptions
    {
        Seekable = true
    };
    using (var downloader = await bucket.OpenDownloadStreamAsync(doc.Id, options))
    {
        var buffer = new byte[downloader.Length];
        await downloader.ReadAsync(buffer, 0, buffer.Length);
    }
}
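The examples above set Seekable to true but still read the file from the start. A seekable stream becomes useful when you want only part of a file. The following sketch reads a byte range from any seekable Stream. The RangeReader class and the ReadRange() method are hypothetical names, and the sketch assumes that a download stream opened with Seekable = true reports CanSeek as true.

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration; not part of the driver API.
public static class RangeReader
{
    // Moves to the given offset and reads up to count bytes.
    // Returns fewer bytes if the stream ends first.
    public static byte[] ReadRange(Stream stream, long offset, int count)
    {
        if (!stream.CanSeek)
        {
            throw new NotSupportedException("The stream does not support seeking.");
        }
        stream.Seek(offset, SeekOrigin.Begin);

        var buffer = new byte[count];
        int total = 0;
        while (total < count)
        {
            int read = stream.Read(buffer, total, count - total);
            if (read == 0)
            {
                break; // end of stream
            }
            total += read;
        }
        if (total < count)
        {
            Array.Resize(ref buffer, total); // trim unused space
        }
        return buffer;
    }
}
```

With the ten-byte "HelloWorld" payload used in the upload examples, RangeReader.ReadRange(downloader, 5, 5) would return the bytes for "World".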
Use the DownloadToStream() or DownloadToStreamAsync() method to download the contents of a GridFS file to an existing stream. These methods accept the following parameters:
Parameter (data type): Description
id (BsonValue): The _id value of the file to download.
destination (Stream): The stream that the .NET/C# Driver downloads the GridFS file to. The value of this parameter must be an object that derives from the Stream class.
options (GridFSDownloadOptions): Optional. An instance of the GridFSDownloadOptions class that specifies the configuration for the download stream. The default value is null.
cancellationToken (CancellationToken): Optional. A token that you can use to cancel the operation.
The following code example demonstrates how to download to an existing stream by performing the following actions:
Opens a file located at /path/to/output_file as a stream in binary write mode
Retrieves the _id value of the GridFS file named "new_file"
Calls the DownloadToStream() method and passes the _id value to download the contents of "new_file" to a stream
The synchronous version appears first, followed by the asynchronous version.
var filter = Builders<GridFSFileInfo>.Filter.Eq(x => x.Filename, "new_file");
var doc = bucket.Find(filter).FirstOrDefault();
if (doc != null)
{
    using (var outputFile = new FileStream("/path/to/output_file", FileMode.Create, FileAccess.Write))
    {
        bucket.DownloadToStream(doc.Id, outputFile);
    }
}
var filter = Builders<GridFSFileInfo>.Filter.Eq(x => x.Filename, "new_file");
var cursor = await bucket.FindAsync(filter);
var fileInfoList = await cursor.ToListAsync();
var doc = fileInfoList.FirstOrDefault();
if (doc != null)
{
    using (var outputFile = new FileStream("/path/to/output_file", FileMode.Create, FileAccess.Write))
    {
        await bucket.DownloadToStreamAsync(doc.Id, outputFile);
    }
}
To find files in a GridFS bucket, call the Find() or FindAsync() method on your GridFSBucket instance. These methods accept the following parameters:
Parameter (data type): Description
filter (FilterDefinition<GridFSFileInfo>): A query filter that specifies the entries to match in the files collection. For more information, see the API documentation for the Find() method.
options (GridFSFindOptions): Optional. An instance of the GridFSFindOptions class that specifies the configuration for the find operation. The default value is null.
cancellationToken (CancellationToken): Optional. A token that you can use to cancel the operation.
The following code example shows how to retrieve and print file metadata from files in a GridFS bucket. The Find() method returns an IAsyncCursor<GridFSFileInfo> instance from which you can access the results. The synchronous example uses a foreach loop to iterate through the returned cursor, and the asynchronous example uses the ForEachAsync() method. Both print the metadata of the files uploaded in the Upload Files examples.
The synchronous version and its output appear first, followed by the asynchronous version and its output.
var filter = Builders<GridFSFileInfo>.Filter.Empty;
var files = bucket.Find(filter);
foreach (var file in files.ToEnumerable())
{
    Console.WriteLine(file.ToJson());
}

{ "_id" : { "$oid" : "..." }, "length" : 13, "chunkSize" : 261120, "uploadDate" : { "$date" : ... }, "filename" : "new_file" }
{ "_id" : { "$oid" : "..." }, "length" : 50, "chunkSize" : 1048576, "uploadDate" : { "$date" : ... }, "filename" : "my_file" }
var filter = Builders<GridFSFileInfo>.Filter.Empty;
var files = await bucket.FindAsync(filter);
await files.ForEachAsync(file => Console.Out.WriteLineAsync(file.ToJson()));

{ "_id" : { "$oid" : "..." }, "length" : 13, "chunkSize" : 261120, "uploadDate" : { "$date" : ... }, "filename" : "new_file" }
{ "_id" : { "$oid" : "..." }, "length" : 50, "chunkSize" : 1048576, "uploadDate" : { "$date" : ... }, "filename" : "my_file" }
To customize the find operation, pass an instance of the GridFSFindOptions class to the Find() or FindAsync() method. The GridFSFindOptions class contains the following properties:
Property (data type): Description
Sort (SortDefinition<GridFSFileInfo>): The sort order of the results. If you don't specify a sort order, the method returns the results in the order in which they were inserted. For more information, see the API documentation for the Sort property.
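The Sort property is useful when several files share a file name and you want the most recent one. The following sketch sorts by upload time in descending order and returns the first match. The FindNewestExample class and its method names are hypothetical, and the sketch assumes that the UploadDateTime property of GridFSFileInfo corresponds to the uploadDate field shown in the output above.

```csharp
using MongoDB.Driver;
using MongoDB.Driver.GridFS;

// Hypothetical helper for illustration; not part of the driver API.
public static class FindNewestExample
{
    // Builds find options that return the most recently uploaded files first.
    public static GridFSFindOptions NewestFirst()
    {
        return new GridFSFindOptions
        {
            Sort = Builders<GridFSFileInfo>.Sort.Descending(x => x.UploadDateTime)
        };
    }

    // Returns the newest file with the given name, or null if none matches.
    public static GridFSFileInfo FindNewest(GridFSBucket bucket, string filename)
    {
        var filter = Builders<GridFSFileInfo>.Filter.Eq(x => x.Filename, filename);
        using (var cursor = bucket.Find(filter, NewestFirst()))
        {
            return cursor.FirstOrDefault();
        }
    }
}
```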
To delete files from a GridFS bucket, call the Delete() or DeleteAsync() method on your GridFSBucket instance. This method removes a file's metadata document and its associated chunks from your bucket.
The Delete() and DeleteAsync() methods accept the following parameters:
Parameter (data type): Description
id (BsonValue): The _id of the file to delete.
cancellationToken (CancellationToken): Optional. A token that you can use to cancel the operation.
The following code example shows how to delete a file named "new_file" by passing its _id value to the Delete() method. The example performs the following steps:
Uses the Builders class to create a filter that matches the file named "new_file"
Uses the Find() method to find the file named "new_file"
Passes the _id value of the file to the Delete() method to delete the file
The synchronous version appears first, followed by the asynchronous version.
var filter = Builders<GridFSFileInfo>.Filter.Eq(x => x.Filename, "new_file");
var doc = bucket.Find(filter).FirstOrDefault();
if (doc != null)
{
    bucket.Delete(doc.Id);
}
var filter = Builders<GridFSFileInfo>.Filter.Eq(x => x.Filename, "new_file");
var cursor = await bucket.FindAsync(filter);
var fileInfoList = await cursor.ToListAsync();
var doc = fileInfoList.FirstOrDefault();
if (doc != null)
{
    await bucket.DeleteAsync(doc.Id);
}
Note File Revisions
The Delete() and DeleteAsync() methods support deleting only one file at a time. To delete each file revision, or files with different upload times that share the same file name, collect the _id values of each revision. Then, pass each _id value in separate calls to the Delete() or DeleteAsync() method.
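The collect-then-delete approach in this note can be sketched as follows. The DeleteRevisionsExample class and the DeleteAllRevisions() method are hypothetical names. The sketch gathers every matching _id value before it deletes anything, which is one reasonable ordering rather than a requirement stated in this guide.

```csharp
using System.Collections.Generic;
using System.Linq;
using MongoDB.Bson;
using MongoDB.Driver;
using MongoDB.Driver.GridFS;

// Hypothetical helper for illustration; not part of the driver API.
public static class DeleteRevisionsExample
{
    // Deletes every file that has the given name and returns how many were deleted.
    public static int DeleteAllRevisions(GridFSBucket bucket, string filename)
    {
        var filter = Builders<GridFSFileInfo>.Filter.Eq(x => x.Filename, filename);

        // Collect the _id value of each revision.
        List<ObjectId> ids = bucket.Find(filter)
            .ToList()
            .Select(file => file.Id)
            .ToList();

        // Pass each _id value to Delete() in a separate call.
        foreach (var id in ids)
        {
            bucket.Delete(id);
        }
        return ids.Count;
    }
}
```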
To learn more about the classes used on this page, see the following API documentation:
To learn more about the methods in the GridFSBucket
class used on this page, see the following API documentation: