Beautiful Soup: A Concise Tutorial

Beautiful Soup - find_all_next() Method

Method Description

The find_all_next() method in Beautiful Soup finds all PageElements that match the given criteria and appear after this element in the document. It returns Tag or NavigableString objects and takes the same parameters as find_all(), except that it has no recursive argument.

Syntax

find_all_next(name, attrs, string, limit, **kwargs)

Parameters

  1. name − A filter on tag name.

  2. attrs − A dictionary of filters on attribute values.

  3. string − A filter on the text of NavigableString objects. When used without name, the method returns the matching strings instead of tags.

  4. limit − Stop looking after the specified number of matches has been found.

  5. kwargs − Additional keyword arguments, each treated as a filter on the attribute of the same name.
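
The parameters above can be tried one at a time. The following is a minimal sketch on a made-up HTML snippet (the markup and variable names are assumptions for illustration, not part of the tutorial's index.html); it starts from the <h1> tag and applies each kind of filter in turn.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, made up for this sketch
html = """
<h1>Title</h1>
<p class="intro">First</p>
<p>Second</p>
<p class="intro">Third</p>
"""
soup = BeautifulSoup(html, "html.parser")
start = soup.h1

# name: keep only <p> tags that come after <h1>
by_name = start.find_all_next("p")

# attrs: keep only elements whose class attribute includes "intro"
by_attrs = start.find_all_next(attrs={"class": "intro"})

# string: match text nodes; the result holds NavigableString objects
by_string = start.find_all_next(string="Second")

# limit: stop after the first two matching <p> tags
limited = start.find_all_next("p", limit=2)

# kwargs: keyword filter on an attribute; class_ avoids the reserved word
by_kwarg = start.find_all_next(class_="intro")

print([p.get_text() for p in by_name])
print([p.get_text() for p in by_attrs])
print([str(s) for s in by_string])
print([p.get_text() for p in limited])
print([p.get_text() for p in by_kwarg])
```

Note that the search starts after <h1>, so the <h1> tag itself is never among the results.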

Return Value

This method returns a ResultSet containing PageElements (Tags or NavigableString objects).
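
To make the return type concrete, here is a small sketch on an assumed one-line document (the markup is hypothetical). It shows that the result behaves like a list and that a search with no matches gives an empty result rather than None.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, made up for this sketch
html = "<div><p>one</p><p>two</p></div>"
soup = BeautifulSoup(html, "html.parser")

found = soup.div.find_all_next("p")        # two <p> tags follow <div>
missing = soup.div.find_all_next("table")  # nothing matches

print(type(found).__name__)   # class name of the returned container
print(len(found), found[0])   # supports len() and indexing like a list
print(len(missing))           # an empty result, not None
```

Because the result is list-like, it can be looped over, sliced, or tested for emptiness with a plain `if found:` check.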

Example 1

Using index.html as the HTML document for this example, we first locate the <form> tag and then collect all the elements after it with the find_all_next() method.

from bs4 import BeautifulSoup

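# Open the file in a context manager so it is closed after parsing
with open("index.html") as fp:
    soup = BeautifulSoup(fp, 'html.parser')

tag = soup.form               # first <form> tag in the document
tags = tag.find_all_next()    # every tag that starts after the <form> opening tag
print(tags)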

Output

[<input id="nm" name="name" type="text"/>, <input id="age" name="age" type="text"/>, <input id="marks" name="marks" type="text"/>]

Example 2

Here, we pass a filter to the find_all_next() method to collect only the tags after <form> whose id is either nm or age.

from bs4 import BeautifulSoup

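# Open the file in a context manager so it is closed after parsing
with open("index.html") as fp:
    soup = BeautifulSoup(fp, 'html.parser')

tag = soup.form
# A list of values matches any tag whose id equals one of them
tags = tag.find_all_next(id=['nm', 'age'])
print(tags)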

Output

[<input id="nm" name="name" type="text"/>, <input id="age" name="age" type="text"/>]
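
The list filter used above matches a tag if its id equals any value in the list. As a hedged, self-contained sketch, the snippet below uses an assumed stand-in for the form in index.html (reconstructed from the outputs shown, so the real file may differ) and shows an equivalent filter written as a compiled regular expression.

```python
import re
from bs4 import BeautifulSoup

# Assumed stand-in for the form in index.html, reconstructed from the outputs
html = """<form>
<input id="nm" name="name" type="text"/>
<input id="age" name="age" type="text"/>
<input id="marks" name="marks" type="text"/>
</form>"""
soup = BeautifulSoup(html, "html.parser")
form = soup.form

# A list matches any of the listed id values
by_list = form.find_all_next(id=["nm", "age"])

# A compiled regular expression is matched against each id value
by_regex = form.find_all_next(id=re.compile(r"^(nm|age)$"))

print([t["id"] for t in by_list])
print([t["id"] for t in by_regex])
```

Both calls select the same two inputs; the regular expression form is more convenient when the wanted ids follow a pattern rather than a fixed list.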

Example 3

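If we collect the tags that follow the <body> tag, the result includes the <h1> tag and the <form> tag, followed by the three input elements that the form contains. The inputs therefore appear twice in the output: once inside the printed <form> and once as separate elements.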

from bs4 import BeautifulSoup

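# Open the file in a context manager so it is closed after parsing
with open("index.html") as fp:
    soup = BeautifulSoup(fp, 'html.parser')

tag = soup.body
tags = tag.find_all_next()
# Print each matched tag on its own line
for t in tags:
    print(t)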

Output

<h1>TutorialsPoint</h1>
<form>
<input id="nm" name="name" type="text"/>
<input id="age" name="age" type="text"/>
<input id="marks" name="marks" type="text"/>
</form>
<input id="nm" name="name" type="text"/>
<input id="age" name="age" type="text"/>
<input id="marks" name="marks" type="text"/>
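
The repeated inputs in this output follow from document order: find_all_next() visits every tag that starts after the given element, including tags nested inside later elements. A minimal sketch, using an assumed two-input stand-in for index.html (reconstructed from the outputs shown, so the real file may differ), lists only the tag names to make the order visible.

```python
from bs4 import BeautifulSoup

# Assumed stand-in for index.html, reconstructed from the outputs shown
html = """<html><body>
<h1>TutorialsPoint</h1>
<form>
<input id="nm" name="name" type="text"/>
<input id="age" name="age" type="text"/>
</form>
</body></html>"""
soup = BeautifulSoup(html, "html.parser")

# Tag names in the order find_all_next() returns them, starting after <body>
names = [t.name for t in soup.body.find_all_next()]
print(names)

# The starting element itself is never part of the result
form = soup.form
after_form = form.find_all_next()
print(any(t is form for t in after_form))
```

Here <form> is listed once as a tag in its own right, and each <input> inside it is then listed again as a separate match, which is why printing the full tags shows the inputs twice.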