Arangodb 简明教程

ArangoDB - AQL Example Queries

在本章中,我们将思考一个 @ {s0} 数据库上的几个 AQL 示例查询。这些查询都基于图表。

In this chapter, we will consider a few AQL Example Queries on an Actors and Movies Database. These queries are based on graphs.

Problem

给定一组演员和一组电影,以及一个 actIn 边缘集合(带有一个 year 属性)以按照如下所示的方式连接顶点 −

Given a collection of actors and a collection of movies, and an actIn edges collection (with a year property) to connect the vertex as indicated below −

@ {s1}

[Actor] ← act in → [Movie]

我们如何获得 −

How do we get −

  1. All actors who acted in "movie1" OR "movie2"?

  2. All actors who acted in both "movie1" AND "movie2”?

  3. All common movies between "actor1" and "actor2”?

  4. All actors who acted in 3 or more movies?

  5. All movies where exactly 6 actors acted in?

  6. The number of actors by movie?

  7. The number of movies by actor?

  8. The number of movies acted in between 2005 and 2010 by actor?

Solution

在解决上述查询并获取答案的过程中,我们将使用 Arangosh 创建数据集并在其上运行查询。所有 AQL 查询都是字符串,可以轻松复制到您的首选驱动程序中,而不是 Arangosh。

During the process of solving and obtaining the answers to the above queries, we will use Arangosh to create the dataset and run queries on that. All the AQL queries are strings and can simply be copied over to your favorite driver instead of Arangosh.

我们首先在 Arangosh 中创建一个测试数据集。首先,下载 @ {s2} −

Let us start by creating a Test Dataset in Arangosh. First, download this file

# wget -O dataset.js
https://drive.google.com/file/d/0B4WLtBDZu_QWMWZYZ3pYMEdqajA/view?usp=sharing

Output

...
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: ‘dataset.js’
dataset.js [ <=> ] 115.14K --.-KB/s in 0.01s
2017-09-17 14:19:12 (11.1 MB/s) - ‘dataset.js’ saved [117907]

您可以在以上输出中看到,我们下载了一个 JavaScript 文件 @ {s3}。该文件包含用于在数据库中创建数据集的 Arangosh 命令。我们不会逐个复制和粘贴这些命令,而是使用 Arangosh 上的 @ {s4} 选项来非交互式地执行多个命令。这简直是救命命令!

You can see in the output above that we have downloaded a JavaScript file dataset.js. This file contains the Arangosh commands to create the dataset in the database. Instead of copying and pasting the commands one by one, we will use the --javascript.execute option on Arangosh to execute the multiple commands non-interactively. Consider it the life saver command!

现在在 shell 上执行以下命令 −

Now execute the following command on the shell −

$ arangosh --javascript.execute dataset.js
command on the shell

当提示时提供密码,如您在上述屏幕截图中看到的。现在我们已经保存了数据,因此我们将构建 AQL 查询来回答本章开头提出的具体问题。

Supply the password when prompted as you can see in the above screenshot. Now we have saved the data, so we will construct the AQL queries to answer the specific questions raised in the beginning of this chapter.

First Question

我们来看第一个问题:@ {s5}。假设,我们想要找到在“TheMatrix”或“TheDevilsAdvocate”中演出的所有演员的姓名 −

Let us take the first question: All actors who acted in "movie1" OR "movie2". Suppose, we want to find the names of all the actors who acted in "TheMatrix" OR "TheDevilsAdvocate" −

我们将每次处理一部电影,以获取演员姓名 −

We will start with one movie at a time to get the names of the actors −

127.0.0.1:8529@_system> db._query("FOR x IN ANY 'movies/TheMatrix' actsIn
OPTIONS {bfs: true, uniqueVertices: 'global'} RETURN x._id").toArray();

Output

我们将接收以下输出 −

We will receive the following output −

[
   "actors/Hugo",
   "actors/Emil",
   "actors/Carrie",
   "actors/Keanu",
   "actors/Laurence"
]
first question

现在,我们继续形成两个 NEIGHBORS 查询的 UNION_DISTINCT,这将是解决方案 −

Now we continue to form a UNION_DISTINCT of two NEIGHBORS queries which will be the solution −

127.0.0.1:8529@_system> db._query("FOR x IN UNION_DISTINCT ((FOR y IN ANY
'movies/TheMatrix' actsIn OPTIONS {bfs: true, uniqueVertices: 'global'} RETURN
y._id), (FOR y IN ANY 'movies/TheDevilsAdvocate' actsIn OPTIONS {bfs: true,
uniqueVertices: 'global'} RETURN y._id)) RETURN x").toArray();

Output

[
   "actors/Charlize",
   "actors/Al",
   "actors/Laurence",
   "actors/Keanu",
   "actors/Carrie",
   "actors/Emil",
   "actors/Hugo"
]
first question 2

Second Question

现在,我们考虑第二个问题: All actors who acted in both "movie1" AND "movie2" 。这几乎与上述问题完全相同。但这次,我们不感兴趣于 UNION,而感兴趣于 INTERSECTION −

Let us now consider the second question: All actors who acted in both "movie1" AND "movie2". This is almost identical to the question above. But this time we are not interested in a UNION but in an INTERSECTION −

127.0.0.1:8529@_system> db._query("FOR x IN INTERSECTION ((FOR y IN ANY
'movies/TheMatrix' actsIn OPTIONS {bfs: true, uniqueVertices: 'global'} RETURN
y._id), (FOR y IN ANY 'movies/TheDevilsAdvocate' actsIn OPTIONS {bfs: true,
uniqueVertices: 'global'} RETURN y._id)) RETURN x").toArray();

Output

我们将接收以下输出 −

We will receive the following output −

[
   "actors/Keanu"
]
second question

Third Question

现在,我们考虑第三个问题: All common movies between "actor1" and "actor2" 。这实际上与电影 1 和电影 2 中的共同演员问题完全相同。我们只需要更改起始顶点即可。例如,让我们找出让休·杰克曼(Hugo)和基努·里维斯共同主演的所有电影 −

Let us now consider the third question: All common movies between "actor1" and "actor2". This is actually identical to the question about common actors in movie1 and movie2. We just have to change the starting vertices. As an example, let us find all the movies where Hugo Weaving ("Hugo") and Keanu Reeves are co-starring −

127.0.0.1:8529@_system> db._query(
   "FOR x IN INTERSECTION (
      (
         FOR y IN ANY 'actors/Hugo' actsIn OPTIONS
         {bfs: true, uniqueVertices: 'global'}
          RETURN y._id
      ),

      (
         FOR y IN ANY 'actors/Keanu' actsIn OPTIONS
         {bfs: true, uniqueVertices:'global'} RETURN y._id
      )
   )
   RETURN x").toArray();

Output

我们将接收以下输出 −

We will receive the following output −

[
   "movies/TheMatrixReloaded",
   "movies/TheMatrixRevolutions",
   "movies/TheMatrix"
]
third question

Fourth Question

现在,我们考虑第四个问题。 All actors who acted in 3 or more movies 。这个问题有所不同;我们不能在此处使用邻居函数。相反,我们将利用 AQL 的边缘索引和 COLLECT 语句进行分组。基本思想是按 startVertex (在该数据集始终为演员)对所有边缘进行分组。然后,我们从结果中删除电影数量少于 3 的所有演员,因为此处我们已纳入演员参演的电影数量 −

Let us now consider the fourth question. All actors who acted in 3 or more movies. This question is different; we cannot make use of the neighbors function here. Instead we will make use of the edge-index and the COLLECT statement of AQL for grouping. The basic idea is to group all edges by their startVertex (which in this dataset is always the actor). Then we remove all actors with less than 3 movies from the result as here we have included the number of movies an actor has acted in −

127.0.0.1:8529@_system> db._query("FOR x IN actsIn COLLECT actor = x._from WITH
COUNT INTO counter FILTER counter >= 3 RETURN {actor: actor, movies:
counter}"). toArray()

Output

[
   {
      "actor" : "actors/Carrie",
      "movies" : 3
   },

   {
      "actor" : "actors/CubaG",
      "movies" : 4
   },

   {
      "actor" : "actors/Hugo",
      "movies" : 3
   },

   {
      "actor" : "actors/Keanu",
      "movies" : 4
   },

   {
      "actor" : "actors/Laurence",
      "movies" : 3
   },

   {
      "actor" : "actors/MegR",
      "movies" : 5
   },

   {
      "actor" : "actors/TomC",
      "movies" : 3
   },

   {
      "actor" : "actors/TomH",
      "movies" : 3
   }
]
fourth question

对于剩余的问题,我们将讨论查询形成,并仅提供查询。读者应在 Arangosh 终端上自行运行查询。

For the remaining questions, we will discuss the query formation, and provide the queries only. The reader should run the query themselves on the Arangosh terminal.

Fifth Question

现在,我们考虑第五个问题: All movies where exactly 6 actors acted in 。思路与之前的查询相同,但使用等式筛选器。然而,现在我们需要电影而不是演员,所以我们返回 _to attribute

Let us now consider the fifth question: All movies where exactly 6 actors acted in. The same idea as in the query before, but with the equality filter. However, now we need the movie instead of the actor, so we return the _to attribute

db._query("FOR x IN actsIn COLLECT movie = x._to WITH COUNT INTO counter FILTER
counter == 6 RETURN movie").toArray()

根据电影的演员数量?

The number of actors by movie?

我们记得在我们的数据集 _to 中,边缘对应于电影,所以我们统计 _to 出现的次数。这是演员数量。该查询几乎与之前的查询相同,但 without the FILTER after COLLECT

We remember in our dataset _to on the edge corresponds to the movie, so we count how often the same _to appears. This is the number of actors. The query is almost identical to the ones before but without the FILTER after COLLECT

db._query("FOR x IN actsIn COLLECT movie = x._to WITH COUNT INTO counter RETURN
{movie: movie, actors: counter}").toArray()

Sixth Question

现在,我们考虑第六个问题: The number of movies by an actor

Let us now consider the sixth question: The number of movies by an actor.

我们找到上述查询解决方法的方式也将帮助您找到此查询的解决方案。

The way we found solutions to our above queries will help you find the solution to this query as well.

db._query("FOR x IN actsIn COLLECT actor = x._from WITH COUNT INTO counter
RETURN {actor: actor, movies: counter}").toArray()