MongoDB Concise Tutorial

MongoDB - GridFS

GridFS is the MongoDB specification for storing and retrieving large files such as images, audio files, and video files. It works like a file system, but its data is stored in MongoDB collections. GridFS can store files that exceed the 16 MB BSON document size limit.

GridFS divides a file into chunks and stores each chunk in a separate document. By default, each chunk holds at most 255 KB (261,120 bytes) of data; the last chunk is only as large as it needs to be.
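To make the chunking step concrete, here is a minimal Python sketch that splits a byte string into pieces of at most 255 KB. It only illustrates the idea and is not how any MongoDB driver is implemented; the names `DEFAULT_CHUNK_SIZE` and `split_into_chunks` are made up for this example.

```python
from typing import List

# 255 KB = 261120 bytes, the chunkSize value shown in the sample documents.
DEFAULT_CHUNK_SIZE = 255 * 1024


def split_into_chunks(data: bytes, chunk_size: int = DEFAULT_CHUNK_SIZE) -> List[bytes]:
    """Split data into consecutive pieces of at most chunk_size bytes.

    Hypothetical helper for illustration; it is not part of any MongoDB API.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    # Every piece is full-sized except possibly the last one.
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
```

For example, `split_into_chunks(bytes(600000))` yields three pieces of 261120, 261120, and 77760 bytes.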

By default, GridFS uses two collections, fs.files and fs.chunks, to store the file's metadata and its chunks. Each chunk is identified by its unique _id field of type ObjectId. A document in fs.files serves as the parent document, and the files_id field in each fs.chunks document links the chunk to its parent.

Following is a sample document from the fs.files collection:

{
   "filename": "test.txt",
   "chunkSize": NumberInt(261120),
   "uploadDate": ISODate("2014-04-13T11:32:33.557Z"),
   "md5": "7b762939321e146569b07f72c62cca4f",
   "length": NumberInt(646)
}

The document specifies the file name, chunk size, upload date, MD5 checksum, and length in bytes.
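As a rough illustration of where these values come from, the Python sketch below derives the same fields from a file's raw bytes. The function name `describe_file` is hypothetical, and the result is a plain dictionary rather than a real fs.files document (there is no _id and nothing is written to MongoDB). Note also that newer versions of the GridFS specification deprecate the md5 field, so a current driver may not store it.

```python
import hashlib
from datetime import datetime, timezone
from typing import Any, Dict


def describe_file(filename: str, data: bytes, chunk_size: int = 255 * 1024) -> Dict[str, Any]:
    """Build a dictionary shaped like the sample fs.files document.

    Hypothetical helper for illustration only; nothing is sent to a database.
    """
    return {
        "filename": filename,
        "chunkSize": chunk_size,                  # 261120 by default
        "uploadDate": datetime.now(timezone.utc),  # time the metadata was built
        "md5": hashlib.md5(data).hexdigest(),     # checksum of the whole file
        "length": len(data),                      # file size in bytes
    }
```

Calling `describe_file("test.txt", content)` on a 646-byte file would give a length of 646, matching the sample above.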

Following is a sample document from the fs.chunks collection:

{
   "files_id": ObjectId("534a75d19f54bfec8a2fe44b"),
   "n": NumberInt(0),
   "data": "Mongo Binary Data"
}
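The relationship between the two kinds of documents can be sketched in plain Python: each chunk carries the parent's id in files_id and its position in n, so the file is rebuilt by selecting the matching chunks and sorting them by n. This is an in-memory model under stated assumptions, not driver code. The function names are invented for the example, and random hex strings stand in for ObjectId values.

```python
import uuid
from typing import Any, Dict, List, Tuple


def to_gridfs_documents(
    filename: str, data: bytes, chunk_size: int = 255 * 1024
) -> Tuple[Dict[str, Any], List[Dict[str, Any]]]:
    """Model one parent document and its chunk documents as dictionaries."""
    file_id = uuid.uuid4().hex  # stand-in for an ObjectId (assumption)
    files_doc = {
        "_id": file_id,
        "filename": filename,
        "chunkSize": chunk_size,
        "length": len(data),
    }
    chunk_docs = [
        {
            "_id": uuid.uuid4().hex,   # every chunk has its own id
            "files_id": file_id,       # link back to the parent document
            "n": n,                    # zero-based position of the chunk
            "data": data[start:start + chunk_size],
        }
        for n, start in enumerate(range(0, len(data), chunk_size))
    ]
    return files_doc, chunk_docs


def reassemble(files_doc: Dict[str, Any], all_chunks: List[Dict[str, Any]]) -> bytes:
    """Rebuild a file from a mixed pile of chunks using files_id and n."""
    own = sorted(
        (c for c in all_chunks if c["files_id"] == files_doc["_id"]),
        key=lambda c: c["n"],
    )
    data = b"".join(c["data"] for c in own)
    if len(data) != files_doc["length"]:
        raise ValueError("chunks do not add up to the recorded length")
    return data
```

Because the order is carried by n rather than by storage order, the chunks of several files can sit in the same collection in any order and still be reassembled correctly.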

Adding Files to GridFS

Now we will store an mp3 file in GridFS using the put command. For this, we will use the mongofiles.exe utility found in the bin folder of the MongoDB installation folder (recent MongoDB releases distribute it in the separate MongoDB Database Tools package).

Open your command prompt, navigate to the bin folder of the MongoDB installation folder that contains mongofiles.exe, and type the following command:

>mongofiles.exe -d gridfs put song.mp3

Here, gridfs is the name of the database in which the file will be stored. If the database does not exist, MongoDB creates it automatically. song.mp3 is the name of the uploaded file. To see the file's document in the database, you can run a find query from the mongo shell with gridfs selected as the current database:

>db.fs.files.find()

The above command returns the following document:

{
   _id: ObjectId('534a811bf8b4aa4d33fdf94d'),
   filename: "song.mp3",
   chunkSize: 261120,
   uploadDate: new Date(1397391643474),
   md5: "e4f53379c909f7bed2e9d631e15c1c41",
   length: 10401959
}

Using the document _id returned by the previous query, we can also list all the chunks in the fs.chunks collection that belong to the stored file:

>db.fs.chunks.find({files_id:ObjectId('534a811bf8b4aa4d33fdf94d')})

In this example, the query returned 40 documents, meaning that the mp3 file was divided into 40 chunks. This matches the metadata: a length of 10,401,959 bytes divided by a chunk size of 261,120 bytes gives about 39.8, which rounds up to 40.
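The chunk count can be checked from the length and chunkSize values in the fs.files document. The short Python sketch below does that arithmetic; `chunk_layout` is a name chosen for this example and is not a MongoDB function.

```python
from typing import Tuple


def chunk_layout(length: int, chunk_size: int) -> Tuple[int, int]:
    """Return (number of chunks, size of the last chunk) for a file.

    Hypothetical helper: pure arithmetic on the fs.files metadata.
    """
    if length < 0 or chunk_size <= 0:
        raise ValueError("length must be >= 0 and chunk_size must be > 0")
    count = (length + chunk_size - 1) // chunk_size  # ceiling division
    last = length - (count - 1) * chunk_size if count else 0
    return count, last


# Values taken from the song.mp3 document shown earlier.
print(chunk_layout(10401959, 261120))  # (40, 218279)
```

So 39 chunks are full at 261,120 bytes each and the 40th holds the remaining 218,279 bytes.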