JavaScript Concise Tutorial

Regular Expressions and RegExp Object

A regular expression (RegExp) in JavaScript is an object that describes a pattern of characters. A pattern can contain letters, digits, and special characters, and it can consist of a single character or many.

The JavaScript RegExp class represents regular expressions, and both String and RegExp define methods that use regular expressions to perform pattern matching and search-and-replace operations on text.

A regular expression is used to search a string for a particular pattern or to replace that pattern with a new string.

There are two ways to construct a regular expression in JavaScript.

  1. Using the RegExp() constructor.

  2. Using the regular expression literal.

Syntax

A regular expression can be defined with the RegExp() constructor, as follows:

var pattern = new RegExp(pattern, attributes);
or simply
var pattern = /pattern/attributes;

Parameters

Here is the description of the parameters:

  1. pattern − A string that specifies the pattern of the regular expression or another regular expression.

  2. attributes − An optional string containing any of the "g", "i", and "m" attributes that specify global, case-insensitive, and multi-line matches, respectively.
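The two forms can be compared side by side. The following is a minimal sketch; the sample strings and variable names are assumptions made for this illustration only.

```javascript
// Two equivalent ways to build the same case-insensitive pattern.
const fromLiteral = /hello/i;
const fromConstructor = new RegExp("hello", "i");

// Both objects carry the same pattern text and flags.
console.log(fromLiteral.source, fromLiteral.flags);         // hello i
console.log(fromConstructor.source, fromConstructor.flags); // hello i

// The constructor is useful when the pattern is only known at run time.
const word = "world"; // hypothetical run-time value
const dynamic = new RegExp(word, "g");
const hits = "Hello world, world!".match(dynamic);
console.log(hits.length); // 2
```

The literal form is usually easier to read for fixed patterns, while the constructor form suits patterns assembled from variables.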

Before looking at examples, let's go through regular expression modifiers, quantifiers, literal characters, and so on.

Modifiers

Several modifiers are available that change how a pattern is matched, such as ignoring case or searching across multiple lines.

  1. i - Performs case-insensitive matching.

  2. m - Performs multi-line matching: the ^ and $ operators match at the start and end of each line (at newline or carriage return boundaries) instead of only at the start and end of the whole string.

  3. g - Performs a global match; that is, finds all matches rather than stopping after the first match.
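The effect of each modifier can be seen on a small multi-line string. This is an illustrative sketch; the sample text is an assumption chosen so that each flag changes the result.

```javascript
const text = "Cat\ncat\nCAT";

// No flags: case-sensitive, and only the first match is reported.
const plain = text.match(/cat/);
console.log(plain.index); // 4

// i ignores case; g collects every match instead of stopping at the first.
const allCats = text.match(/cat/gi);
console.log(allCats); // [ 'Cat', 'cat', 'CAT' ]

// Without m, ^ anchors only at the start of the whole string.
const startOnly = text.match(/^cat/gi);
console.log(startOnly.length); // 1

// With m, ^ also anchors right after each line break.
const eachLine = text.match(/^cat/gim);
console.log(eachLine.length); // 3
```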

Brackets

Brackets ([]) have a special meaning in regular expressions. They define a character class, which matches any one character from a set or range.

  1. [...] - Any one character between the brackets.

  2. [^...] - Any one character not between the brackets.

  3. [0-9] - Matches any decimal digit from 0 through 9.

  4. [a-z] - Matches any character from lowercase a through lowercase z.

  5. [A-Z] - Matches any character from uppercase A through uppercase Z.

  6. [a-zA-Z] - Matches any letter from lowercase a through uppercase Z. The range must be written as two ranges; [a-Z] is not a valid range because a comes after Z in character order.

The ranges shown above are general; you could also use the range [0-3] to match any decimal digit from 0 through 3, or the range [b-v] to match any lowercase character from b through v.
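A few character classes applied to one string show what each range selects. The sample string below is an assumption made for illustration.

```javascript
const sample = "Room 42b, Floor 3";

const digits = sample.match(/[0-9]/g);        // [ '4', '2', '3' ]
const capitals = sample.match(/[A-Z]/g);      // [ 'R', 'F' ]
const lowDigits = sample.match(/[0-3]/g);     // [ '2', '3' ]
const others = sample.match(/[^a-zA-Z0-9]/g); // three spaces and one comma
console.log(digits, capitals, lowDigits, others.length);

// A range must run from a lower to a higher character code, so a reversed
// range such as [a-Z] is rejected when the pattern is compiled.
let rangeError = null;
try {
  new RegExp("[a-Z]");
} catch (err) {
  rangeError = err;
}
console.log(rangeError instanceof SyntaxError); // true
```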

Quantifiers

The frequency or position of bracketed character sequences and single characters can be denoted by a special character, each with a specific meaning. The +, *, and ? quantifiers and the $ anchor follow the character sequence they apply to, while the ^ anchor precedes it.

  1. p+ - Matches one or more consecutive p's.

  2. p* - Matches zero or more consecutive p's.

  3. p? - Matches zero or one p.

  4. p{N} - Matches a sequence of exactly N p's.

  5. p{2,3} - Matches a sequence of two or three p's.

  6. p{2,} - Matches a sequence of at least two p's. No space is allowed inside the braces.

  7. p$ - Matches p at the end of the string.

  8. ^p - Matches p at the beginning of the string.

  9. (?!p) - Negative lookahead: matches at a position that is not followed by p.
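The quantifiers can be compared by running each against the same small set of strings. This is a sketch under assumed inputs; the candidates array and the pick helper are hypothetical names introduced for the example, and each pattern is anchored with ^ and $ so the whole string must fit.

```javascript
const candidates = ["a", "ap", "app", "appp"];

// Hypothetical helper: keep the candidates that a pattern accepts.
const pick = (re) => candidates.filter((s) => re.test(s));

const oneOrMore = pick(/^ap+$/);      // [ 'ap', 'app', 'appp' ]
const zeroOrMore = pick(/^ap*$/);     // [ 'a', 'ap', 'app', 'appp' ]
const zeroOrOne = pick(/^ap?$/);      // [ 'a', 'ap' ]
const exactlyTwo = pick(/^ap{2}$/);   // [ 'app' ]
const twoOrThree = pick(/^ap{2,3}$/); // [ 'app', 'appp' ]
const atLeastTwo = pick(/^ap{2,}$/);  // [ 'app', 'appp' ]

// Negative lookahead: an "a" at the start that is not followed by "p".
const notFollowedByP = pick(/^a(?!p)/); // [ 'a' ]

console.log(oneOrMore, zeroOrMore, zeroOrOne);
console.log(exactlyTwo, twoOrThree, atLeastTwo, notFollowedByP);
```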

Examples

The following examples explain more about matching characters.

  1. [^a-zA-Z] - Matches any single character that is not a letter from a through z or A through Z.

  2. p.p - Matches p, followed by any character, in turn followed by another p.

  3. ^.{2}$ - Matches any string containing exactly two characters.

  4. <b>(.*)</b> - Matches any text enclosed within <b> and </b>.

  5. p(hp)* - Matches a p followed by zero or more instances of the sequence hp.
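These patterns can be tried directly with test(), search(), and match(). The inputs below are assumptions made for illustration. In a regular expression literal the slash of </b> has to be escaped as <\/b>.

```javascript
// Index of the first character that is not a letter.
const firstNonLetter = "abc1".search(/[^a-zA-Z]/);
console.log(firstNonLetter); // 3

// p, any one character, then p.
const pop = /p.p/.test("pop"); // true
const pp = /p.p/.test("pp");   // false: nothing between the two p's

// Exactly two characters from start to end.
const twoChars = /^.{2}$/.test("ab");    // true
const threeChars = /^.{2}$/.test("abc"); // false

// Capture the text between <b> and </b>.
const bold = "<b>important</b>".match(/<b>(.*)<\/b>/);
console.log(bold[1]); // important

// A p followed by zero or more "hp" sequences.
const repeated = /^p(hp)*$/.test("phphp"); // true
console.log(pop, pp, twoChars, threeChars, repeated);
```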

Literal characters

Letters and digits match themselves. Certain other characters are written with a backslash (\) escape, which makes it possible to place characters such as tab, NUL, or a Unicode code unit in a regular expression.

  1. Alphanumeric character - Matches itself.

  2. \0 - The NUL character (\u0000).

  3. \t - Tab (\u0009).

  4. \n - Newline (\u000A).

  5. \v - Vertical tab (\u000B).

  6. \f - Form feed (\u000C).

  7. \r - Carriage return (\u000D).

  8. \xnn - The Latin character specified by the hexadecimal number nn; for example, \x0A is the same as \n.

  9. \uxxxx - The Unicode character specified by the hexadecimal number xxxx; for example, \u0009 is the same as \t.

  10. \cX - The control character ^X; for example, \cJ is equivalent to the newline character \n.
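The escapes in the list can be checked against strings that contain the corresponding characters. This is a small sketch with assumed sample strings. The last two lines also show a backslash turning the special character . into a literal dot.

```javascript
const tab = /\t/.test("a\tb");                   // true
const hexNewline = /\x0A/.test("line1\nline2");  // true: \x0A is the newline
const unicodeTab = /\u0009/.test("a\tb");        // true: \u0009 is the tab
const controlJ = /\cJ/.test("\n");               // true: ^J is the newline
const nul = /\0/.test("\u0000");                 // true
console.log(tab, hexNewline, unicodeTab, controlJ, nul);

// Escaped dot matches only a literal "."; unescaped dot matches any character.
const dotIndex = "3.14".search(/\./);
const anyIndex = "3.14".search(/./);
console.log(dotIndex, anyIndex); // 1 0
```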

Metacharacters

A metacharacter is simply an alphabetical character preceded by a backslash that acts to give the combination a special meaning.

For instance, you can search for a large sum of money using the \d metacharacter: /([\d]+)000/. Here \d matches any single digit, so the pattern matches a run of one or more digits followed by 000.

The following table lists a set of metacharacters that can be used in Perl-style regular expressions.

  1. . - Any single character except a line terminator.

  2. \s - A whitespace character (space, tab, newline).

  3. \S - A non-whitespace character.

  4. \d - A digit (0-9).

  5. \D - A non-digit.

  6. \w - A word character (a-z, A-Z, 0-9, _).

  7. \W - A non-word character.

  8. [\b] - A literal backspace (special case).

  9. [aeiou] - Matches a single character in the given set.

  10. [^aeiou] - Matches a single character outside the given set.

  11. (foo|bar|baz) - Matches any of the alternatives specified.
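The metacharacters can be applied to one line of text to see what each selects. The sample line is an assumption made for illustration; the money pattern is the /([\d]+)000/ example from the text.

```javascript
const line = "Order #1520: 3 items, total 45000 USD";

// \d+ : every run of digits.
const numbers = line.match(/\d+/g);
console.log(numbers); // [ '1520', '3', '45000' ]

// \w+ : every run of word characters.
const words = line.match(/\w+/g);
console.log(words.length); // 7

// \s+ : split on runs of whitespace.
const fields = line.split(/\s+/);
console.log(fields[1]); // #1520:

// Digits followed by 000; the group captures the digits before the zeros.
const money = line.match(/([\d]+)000/);
console.log(money[0], money[1]); // 45000 45

// Alternation: any one of the listed words.
const hasAlternative = /(foo|bar|baz)/.test("rebar");
console.log(hasAlternative); // true
```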

Let's learn to create regular expressions below.

let exp = /tutorialspoint/i

  1. /tutorialspoint/ - It finds a match for the 'tutorialspoint' string.

  2. i - It ignores the case of the characters while matching the pattern with the string. So, it matches 'TutorialsPoint', 'TUTORIALSpoint', etc.

let exp = /\d+/

  1. \d - It matches any single digit from 0 to 9.

  2. + - It repeats the preceding \d, so the pattern matches one or more digits.

let exp = /^Hi/

  1. ^ - It matches the start of the text.

  2. Hi - It checks whether the text contains 'Hi' at the start.

let exp = /^[a-zA-Z0-9]+@[a-zA-Z]+\.[a-zA-Z]{2,3}$/

The above regular expression performs a simplified email validation. It looks complex, but each part is easy to understand.

  1. ^ - Start of the email address.

  2. [a-zA-Z0-9] - It should contain alphanumeric characters at the start.

  3. + - It should contain at least one alphanumeric character.

  4. @ - It must have the '@' character after the alphanumeric characters.

  5. [a-zA-Z]+ - After the '@' character, it must contain at least one alphabetical character.

  6. \. - It must contain a dot after that.

  7. [a-zA-Z] - After the dot, the email should contain alphabetical characters.

  8. {2,3} - After the dot, it should contain only 2 or 3 alphabetical characters. It specifies the length.

  9. $ - It represents the end of the pattern.
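The breakdown above can be exercised against a few sample addresses. This is a sketch; the addresses are made up for illustration. Note that the pattern is a simplified check: it rejects many addresses that are valid in practice, such as those with a dot in the name part or a longer top-level domain.

```javascript
const emailPattern = /^[a-zA-Z0-9]+@[a-zA-Z]+\.[a-zA-Z]{2,3}$/;

const accepted = emailPattern.test("abcd@gmail.com");          // true
const missingAt = emailPattern.test("abcdgmail.com");          // false: no '@'
const dottedName = emailPattern.test("first.last@example.org"); // false: '.' before '@'
const longSuffix = emailPattern.test("user@example.info");      // false: 4 letters after the dot

console.log(accepted, missingAt, dottedName, longSuffix);
```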

Since the search() and replace() methods also accept a plain string as the argument, why are regular expressions needed at all? The example below answers that question.

Example

In the example below, we used a regular expression literal to define the regular expression. The pattern matches the 'tutorialspoint' string regardless of the case of its characters.

In the first case, the string search() method searches for the 'tutorialspoint' string with a case-sensitive match. So, it returns -1.

In the second case, we passed the regular expression as the argument of the search() method. It performs a case-insensitive match. So, it returns 11, the index at which the pattern is found.

<html>
<head>
   <title> JavaScript - Regular Expression </title>
</head>
<body>
   <p id = "output"> </p>
   <script>
      const output = document.getElementById("output");
      let pattern = /tutorialspoint/i;
      let str = "Welcome to TuTorialsPoint! It is a good website!";
      let res = str.search('tutorialspoint');
      output.innerHTML += "Searching using the string : " + res + "<br>";
      res = str.search(pattern);
      output.innerHTML += "Searching using the regular expression : " + res;
   </script>
</body>
</html>

Execute the program to see the desired results.
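The same comparison can be run outside the browser, for example in Node.js. The sketch below reuses the string from the example and adds one assumed input, "a.b", to show that search() converts a string argument into a regular expression, so special characters in that string keep their pattern meaning.

```javascript
const site = "Welcome to TuTorialsPoint! It is a good website!";

const byString = site.search("tutorialspoint"); // case-sensitive
const byRegex = site.search(/tutorialspoint/i); // case-insensitive
console.log(byString, byRegex); // -1 11

// search(".") treats the dot as "any character" and matches at index 0,
// while indexOf(".") looks for a literal dot.
const dotAsPattern = "a.b".search(".");
const dotLiteral = "a.b".indexOf(".");
console.log(dotAsPattern, dotLiteral); // 0 1
```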

Example

In the example below, we used the replace() method to match the pattern and replace it with the '100' string.

Here, the pattern matches each run of one or more digits, and the g flag applies the replacement to every match. The output shows that each number in the string is replaced with '100'. You may also add alphabetical characters to the string; they are left unchanged.

<html>
<head>
   <title> JavaScript - Regular expression </title>
</head>
<body>
   <p id = "output"> </p>
   <script>
      let pattern = /\d+/g; // Matches each run of one or more digits
      let str = "10, 20, 30, 40, 50";

      let res = str.replace(pattern, "100");
      document.getElementById("output").innerHTML =
		"String after replacement : " + res;
   </script>
</body>
</html>

Execute the program to see the desired results.
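The role of the g flag in replace() becomes clearer when the same call is made with and without it. The sketch below reuses the string from the example; the doubling callback is an assumption added to illustrate that replace() also accepts a function.

```javascript
const prices = "10, 20, 30, 40, 50";

// Without g, only the first match is replaced.
const firstOnly = prices.replace(/\d+/, "100");
console.log(firstOnly); // 100, 20, 30, 40, 50

// With g, every match is replaced.
const everyMatch = prices.replace(/\d+/g, "100");
console.log(everyMatch); // 100, 100, 100, 100, 100

// A function can compute each replacement from the matched text.
const doubled = prices.replace(/\d+/g, (match) => String(Number(match) * 2));
console.log(doubled); // 20, 40, 60, 80, 100
```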

Example (Email validation)

In the example below, we used the RegExp() constructor function with the 'new' keyword to create a regular expression. We passed the pattern to the constructor as a string.

Here, we validate the email using the regular expression. In the first case, the email is valid. In the second case, the email doesn't contain the '@' character, so the test() method returns false.

<html>
<body>
   <p id = "output"> </p>
   <script>
      // Backslashes are doubled because the pattern is written as a string literal.
      const pattern = new RegExp('^[a-zA-Z0-9]+@[a-zA-Z]+\\.[a-zA-Z]{2,3}$');
      document.getElementById("output").innerHTML =
		"abcd@gmail.com is valid? : " + pattern.test('abcd@gmail.com') + "<br>" +
      "abcdgmail.com is valid? : " + pattern.test('abcdgmail.com');
</script>
</body>
</html>
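When a pattern is passed to RegExp() as a string, each backslash has to be doubled, because the string literal consumes a single backslash before the regular expression is built. The sketch below compares a singly escaped and a doubly escaped version of the email pattern; the variable names and the test address are assumptions for illustration.

```javascript
// '\.' in a string literal is just '.', so the dot matches any character.
const loose = new RegExp('^[a-zA-Z0-9]+@[a-zA-Z]+\.[a-zA-Z]{2,3}$');
console.log(loose.source); // ^[a-zA-Z0-9]+@[a-zA-Z]+.[a-zA-Z]{2,3}$

// '\\.' reaches the RegExp as '\.', which matches only a literal dot.
const strict = new RegExp('^[a-zA-Z0-9]+@[a-zA-Z]+\\.[a-zA-Z]{2,3}$');
console.log(strict.source); // ^[a-zA-Z0-9]+@[a-zA-Z]+\.[a-zA-Z]{2,3}$

const looseResult = loose.test("abcd@gmailXcom");   // true: 'X' stands in for the dot
const strictResult = strict.test("abcd@gmailXcom"); // false
const strictValid = strict.test("abcd@gmail.com");  // true
console.log(looseResult, strictResult, strictValid);
```

A regular expression literal avoids this issue, since /\./ needs no extra escaping.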

So, a regular expression can be used to find a particular pattern in text and to perform operations such as replacement.

RegExp Properties

Here is a list of the properties associated with RegExp and their descriptions.

  1. constructor - Specifies the function that creates an object's prototype.

  2. global - Specifies if the "g" modifier is set.

  3. ignoreCase - Specifies if the "i" modifier is set.

  4. lastIndex - The index at which to start the next match.

  5. multiline - Specifies if the "m" modifier is set.

  6. source - The text of the pattern.
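The properties can be read directly from a RegExp object. In the sketch below, the pattern and the input string are assumptions chosen for illustration. lastIndex changes only for patterns with the g flag (or the sticky y flag, not covered here), where each successful test() moves it past the match and a failed one resets it to 0.

```javascript
const re = /ab+c/gi;

console.log(re.source);                 // ab+c
console.log(re.global, re.ignoreCase);  // true true
console.log(re.multiline);              // false
console.log(re.constructor === RegExp); // true

const input = "xxABBC abc";
const indexes = [re.lastIndex]; // starts at 0

re.test(input);                 // matches "ABBC" at index 2
indexes.push(re.lastIndex);     // 6

re.test(input);                 // matches "abc" at index 7
indexes.push(re.lastIndex);     // 10

re.test(input);                 // no further match
indexes.push(re.lastIndex);     // reset to 0

console.log(indexes); // [ 0, 6, 10, 0 ]
```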

In the following sections, we will have a few examples to demonstrate the usage of RegExp properties.

RegExp Methods

Here is a list of the methods associated with RegExp along with their descriptions.

  1. exec() - Executes a search for a match in its string parameter.

  2. test() - Tests for a match in its string parameter.

  3. toSource() - Returns an object literal representing the specified object; you can use this value to create a new object. This method is non-standard and is not available in most current JavaScript engines.

  4. toString() - Returns a string representing the specified object.
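The standard methods in the list can be tried on a single pattern. This is a minimal sketch; the date pattern and the sample sentences are assumptions made for illustration. toSource() is left out because it is non-standard.

```javascript
const datePattern = /(\d{4})-(\d{2})-(\d{2})/;

// exec() returns an array with the full match, the captured groups,
// and the index of the match, or null when nothing matches.
const found = datePattern.exec("Released on 2024-03-15.");
console.log(found[0]);    // 2024-03-15
console.log(found[1]);    // 2024
console.log(found.index); // 12

const missing = datePattern.exec("no date here");
console.log(missing); // null

// test() only reports whether a match exists.
const hasDate = datePattern.test("Released on 2024-03-15.");
console.log(hasDate); // true

// toString() returns the pattern in literal form.
const asText = datePattern.toString();
console.log(asText); // /(\d{4})-(\d{2})-(\d{2})/
```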

In the following sections, we will have a few examples to demonstrate the usage of RegExp methods.