Mongodb 简明教程
MongoDB - Aggregation
聚合操作处理数据记录并返回计算结果。聚合操作将来自多个文档的值组合在一起,并可以在组合的数据上执行各种操作以返回单个结果。在 SQL 中 count(*) 和 group by 等效于 MongoDB 聚合。
Aggregations operations process data records and return computed results. Aggregation operations group values from multiple documents together, and can perform a variety of operations on the grouped data to return a single result. In SQL count(*) and with group by is an equivalent of MongoDB aggregation.
The aggregate() Method
对于 MongoDB 中的聚合,您应该使用 aggregate() 方法。
For the aggregation in MongoDB, you should use aggregate() method.
Syntax
aggregate() 方法的基本语法如下 -
Basic syntax of aggregate() method is as follows −
>db.COLLECTION_NAME.aggregate(AGGREGATE_OPERATION)
Example
在集合中,您有以下数据 -
In the collection you have the following data −
{
_id: ObjectId(7df78ad8902c)
title: 'MongoDB Overview',
description: 'MongoDB is no sql database',
by_user: 'tutorials point',
url: 'http://www.tutorialspoint.com',
tags: ['mongodb', 'database', 'NoSQL'],
likes: 100
},
{
_id: ObjectId(7df78ad8902d)
title: 'NoSQL Overview',
description: 'No sql database is very fast',
by_user: 'tutorials point',
url: 'http://www.tutorialspoint.com',
tags: ['mongodb', 'database', 'NoSQL'],
likes: 10
},
{
_id: ObjectId(7df78ad8902e)
title: 'Neo4j Overview',
description: 'Neo4j is no sql database',
by_user: 'Neo4j',
url: 'http://www.neo4j.com',
tags: ['neo4j', 'database', 'NoSQL'],
likes: 750
},
现在,从上述集合中,如果您想显示一个列出每个用户编写的教程数量的列表,则您将使用以下 aggregate() 方法 -
Now from the above collection, if you want to display a list stating how many tutorials are written by each user, then you will use the following aggregate() method −
> db.mycol.aggregate([{$group : {_id : "$by_user", num_tutorial : {$sum : 1}}}])
{ "_id" : "tutorials point", "num_tutorial" : 2 }
{ "_id" : "Neo4j", "num_tutorial" : 1 }
>
上述用例的 SQL 等效查询将是 select by_user, count( ) from mycol group by by_user*。
Sql equivalent query for the above use case will be select by_user, count() from mycol group by by_user*.
在上面的示例中,我们按字段 by_user 对文档进行分组,并且在每次 by 用户出现时都会使前一个值加 1。下面列出了可用的聚合表达式。
In the above example, we have grouped documents by field by_user and on each occurrence of by user previous value of sum is incremented. Following is a list of available aggregation expressions.
Expression |
Description |
Example |
$sum |
Sums up the defined value from all documents in the collection. |
db.mycol.aggregate([{$group : {_id : "$by_user", num_tutorial : {$sum : "$likes"}}}]) |
$avg |
Calculates the average of all given values from all documents in the collection. |
db.mycol.aggregate([{$group : {_id : "$by_user", num_tutorial : {$avg : "$likes"}}}]) |
$min |
Gets the minimum of the corresponding values from all documents in the collection. |
db.mycol.aggregate([{$group : {_id : "$by_user", num_tutorial : {$min : "$likes"}}}]) |
$max |
Gets the maximum of the corresponding values from all documents in the collection. |
db.mycol.aggregate([{$group : {_id : "$by_user", num_tutorial : {$max : "$likes"}}}]) |
$push |
Inserts the value to an array in the resulting document. |
db.mycol.aggregate([{$group : {_id : "$by_user", url : {$push: "$url"}}}]) |
$addToSet |
Inserts the value to an array in the resulting document but does not create duplicates. |
db.mycol.aggregate([{$group : {_id : "$by_user", url : {$addToSet : "$url"}}}]) |
$first |
Gets the first document from the source documents according to the grouping. Typically this makes only sense together with some previously applied “$sort”-stage. |
db.mycol.aggregate([{$group : {_id : "$by_user", first_url : {$first : "$url"}}}]) |
$last |
Gets the last document from the source documents according to the grouping. Typically this makes only sense together with some previously applied “$sort”-stage. |
db.mycol.aggregate([{$group : {_id : "$by_user", last_url : {$last : "$url"}}}]) |
Pipeline Concept
在 UNIX 命令中,shell 管道表示在一些输入上执行操作并使用输出作为下一个命令的输入,依此类推的可能性。MongoDB 在聚合框架中也支持相同概念。有一组可能的阶段,每个阶段都将一组文档作为输入并生成一组结果文档(或管道末尾的最终结果 JSON 文档)。它随后依次可用于下一个阶段,依此类推。
In UNIX command, shell pipeline means the possibility to execute an operation on some input and use the output as the input for the next command and so on. MongoDB also supports same concept in aggregation framework. There is a set of possible stages and each of those is taken as a set of documents as an input and produces a resulting set of documents (or the final resulting JSON document at the end of the pipeline). This can then in turn be used for the next stage and so on.
聚合框架中的可能阶段如下:
Following are the possible stages in aggregation framework −
-
$project − Used to select some specific fields from a collection.
-
$match − This is a filtering operation and thus this can reduce the amount of documents that are given as input to the next stage.
-
$group − This does the actual aggregation as discussed above.
-
$sort − Sorts the documents.
-
$skip − With this, it is possible to skip forward in the list of documents for a given amount of documents.
-
$limit − This limits the amount of documents to look at, by the given number starting from the current positions.
-
$unwind − This is used to unwind document that are using arrays. When using an array, the data is kind of pre-joined and this operation will be undone with this to have individual documents again. Thus with this stage we will increase the amount of documents for the next stage.