MongoDB Concise Tutorial

MongoDB - Regular Expression

Regular expressions are used in most programming languages to search for a pattern or word within a string. MongoDB supports regular-expression pattern matching on strings through the $regex operator. MongoDB uses PCRE (Perl Compatible Regular Expressions) as its regular expression language.

Unlike text search, regular expressions require no configuration or setup command before they can be used.

Assume we have inserted a document into a collection named posts, as shown below:

> db.posts.insert(
{
   "post_text": "enjoy the mongodb articles on tutorialspoint",
   "tags": [
      "mongodb",
      "tutorialspoint"
   ]
})
WriteResult({ "nInserted" : 1 })

Using regex Expression

The following regex query finds all posts whose post_text contains the string tutorialspoint:

> db.posts.find({post_text:{$regex:"tutorialspoint"}}).pretty()
{
	"_id" : ObjectId("5dd7ce28f1dd4583e7103fe0"),
	"post_text" : "enjoy the mongodb articles on tutorialspoint",
	"tags" : [
		"mongodb",
		"tutorialspoint"
	]
}
{
	"_id" : ObjectId("5dd7d111f1dd4583e7103fe2"),
	"post_text" : "enjoy the mongodb articles on tutorialspoint",
	"tags" : [
		"mongodb",
		"tutorialspoint"
	]
}
>

The same query can also be written with a regular expression literal:

>db.posts.find({post_text:/tutorialspoint/})
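
To make the equivalence of the two forms concrete, here is a minimal in-memory sketch in plain JavaScript that runs in Node.js without a MongoDB server. The helpers toRegExp and find are hypothetical names introduced for illustration; they only imitate how an unanchored pattern matches anywhere in a string field and are not MongoDB's implementation.

```javascript
// Hypothetical helper: turn either query form into a JavaScript RegExp.
// This imitates the idea only; it is not how MongoDB evaluates $regex.
function toRegExp(condition) {
  if (condition instanceof RegExp) return condition;
  return new RegExp(condition.$regex, condition.$options || "");
}

// Hypothetical helper: apply a single-field filter to an array of documents.
function find(docs, filter) {
  const [field, condition] = Object.entries(filter)[0];
  const re = toRegExp(condition);
  return docs.filter((doc) => re.test(doc[field]));
}

// Sample data: the first document mirrors the one inserted earlier.
const posts = [
  { post_text: "enjoy the mongodb articles on tutorialspoint" },
  { post_text: "a post about something else" },
];

// The two query forms from the text.
const operatorForm = { post_text: { $regex: "tutorialspoint" } };
const literalForm = { post_text: /tutorialspoint/ };

const viaOperator = find(posts, operatorForm);
const viaLiteral = find(posts, literalForm);

console.log(viaOperator.length, viaLiteral.length); // prints: 1 1
```

Both forms select the same document because the pattern is unanchored, so it may match at any position in post_text.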

Using regex Expression with Case Insensitivity

To make the search case insensitive, use the $options parameter with the value i. The following command finds strings containing the word tutorialspoint regardless of letter case:

>db.posts.find({post_text:{$regex:"tutorialspoint",$options:"i"}})

One of the results returned by this query is the following document, which contains the word tutorialspoint in a different letter case:

{
   "_id" : ObjectId("53493d37d852429c10000004"),
   "post_text" : "hey! this is my post on TutorialsPoint",
   "tags" : [ "tutorialspoint" ]
}
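
The effect of the case-insensitive flag can be checked outside the database. The sketch below uses JavaScript's built-in RegExp, whose i flag has the same meaning, on the two post_text values shown in this tutorial. It is an illustration of the matching behavior only, not of MongoDB internals.

```javascript
// The two post_text values that appear in this tutorial.
const texts = [
  "enjoy the mongodb articles on tutorialspoint",
  "hey! this is my post on TutorialsPoint",
];

// Same pattern, without and with the case-insensitive flag.
const caseSensitive = new RegExp("tutorialspoint");
const caseInsensitive = new RegExp("tutorialspoint", "i");

const sensitiveMatches = texts.filter((t) => caseSensitive.test(t));
const insensitiveMatches = texts.filter((t) => caseInsensitive.test(t));

console.log(sensitiveMatches.length);   // prints: 1
console.log(insensitiveMatches.length); // prints: 2
```

Without the flag, only the all-lowercase text matches; with it, the mixed-case TutorialsPoint matches as well.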

Using regex for Array Elements

Regular expressions can also be applied to array fields, which is particularly useful when implementing tags. For example, to find all posts that have a tag beginning with the word tutorial (such as tutorial, tutorials, tutorialpoint, or tutorialphp), anchor the pattern to the start of the string with ^:

>db.posts.find({tags:{$regex:"^tutorial"}})
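
On an array field, a document matches when at least one element matches the pattern. The in-memory sketch below, in plain JavaScript, illustrates this and contrasts an unanchored pattern with one anchored by ^. The helper findByTag, the title field, and the sample tags are hypothetical and exist only for this illustration.

```javascript
// Hypothetical sample data; the title field is only a label for the output.
const taggedPosts = [
  { title: "A", tags: ["mongodb", "tutorialspoint"] },
  { title: "B", tags: ["php", "video-tutorial"] },
  { title: "C", tags: ["database"] },
];

// Hypothetical helper: a document matches when at least one tag matches.
function findByTag(docs, pattern) {
  const re = new RegExp(pattern);
  return docs.filter((doc) => doc.tags.some((tag) => re.test(tag)));
}

// Unanchored: "tutorial" may appear anywhere inside a tag.
const anywhere = findByTag(taggedPosts, "tutorial").map((d) => d.title);

// Anchored: the tag must begin with "tutorial".
const prefixOnly = findByTag(taggedPosts, "^tutorial").map((d) => d.title);

console.log(anywhere);   // prints: [ 'A', 'B' ]
console.log(prefixOnly); // prints: [ 'A' ]
```

Post B is found only by the unanchored pattern, because its tag video-tutorial contains the word but does not begin with it.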

Optimizing Regular Expression Queries

  1. If the queried field is indexed, the query can match the regular expression against the indexed values. This is usually much faster than applying the regular expression to every document in the collection.

  2. If the regular expression is a prefix expression, every match must start with a fixed sequence of characters. For example, if the regex is ^tut, the query only has to examine strings that begin with tut.
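
The second point can be pictured as a range lookup over sorted values. The sketch below, in plain JavaScript, is a simplified model and not MongoDB's index implementation: it treats a sorted array as a stand-in for an index and shows that the prefix expression ^tut selects the same values as the range from "tut" (inclusive) to "tuu" (exclusive). The helpers upperBound and lowerIndex and the sample values are hypothetical, and the model assumes a non-empty prefix and plain code-unit string ordering.

```javascript
// Hypothetical helper: the smallest string that sorts after every string
// starting with `prefix` (assumes a non-empty prefix whose last character
// is not the maximum code unit).
function upperBound(prefix) {
  const last = prefix.charCodeAt(prefix.length - 1);
  return prefix.slice(0, -1) + String.fromCharCode(last + 1);
}

// Hypothetical helper: binary search for the first position whose value
// is greater than or equal to `target`.
function lowerIndex(sorted, target) {
  let lo = 0;
  let hi = sorted.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (sorted[mid] < target) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// A sorted array standing in for an index on a string field.
const sortedValues = [
  "mongodb", "php", "tutorial", "tutorialspoint", "tutu", "tux", "video",
].sort();

const prefix = "tut";
const start = lowerIndex(sortedValues, prefix);           // first value >= "tut"
const end = lowerIndex(sortedValues, upperBound(prefix)); // first value >= "tuu"
const byRange = sortedValues.slice(start, end);

// Reference result: test every value against the prefix expression.
const byRegex = sortedValues.filter((v) => /^tut/.test(v));

console.log(byRange); // prints: [ 'tutorial', 'tutorialspoint', 'tutu' ]
console.log(byRegex); // prints: [ 'tutorial', 'tutorialspoint', 'tutu' ]
```

The range lookup needs only two binary searches to find its bounds, whereas a pattern without a fixed prefix would have to be tested against every value.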