Beautiful Soup: A Concise Tutorial

Beautiful Soup - Find Elements by Attribute

The find() and find_all() methods locate the first tag, or all tags, in a document that match the arguments passed to them. Both accept an attrs argument, whose value must be a dictionary mapping one or more attribute names to the values they should have.

To see how these methods behave, we will use the following HTML document (index.html):

<html>
   <head>
      <title>TutorialsPoint</title>
   </head>
   <body>
      <form>
         <input type = 'text' id = 'nm' name = 'name'>
         <input type = 'text' id = 'age' name = 'age'>
         <input type = 'text' id = 'marks' name = 'marks'>
      </form>
   </body>
</html>

Using find_all()

The following program returns a list of all tags whose type attribute is "text".

Example

from bs4 import BeautifulSoup

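# Parse index.html; the with-block closes the file once parsing is done.
with open("index.html") as fp:
    soup = BeautifulSoup(fp, 'html.parser')

# Collect every tag whose type attribute equals "text".
obj = soup.find_all(attrs={"type": "text"})
print(obj)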

Output

[<input id="nm" name="name" type="text"/>, <input id="age" name="age" type="text"/>, <input id="marks" name="marks" type="text"/>]
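Because attrs may hold more than one attribute, a short sketch of that case may help. It assumes the form from index.html is supplied as an inline string (the variable name html is only illustrative) so that it runs without the file. It also passes a tag name alongside attrs, which find_all() accepts as its first argument.

```python
from bs4 import BeautifulSoup

# Inline copy of the form from index.html, so the sketch runs without the file.
html = """
<form>
   <input type='text' id='nm' name='name'>
   <input type='text' id='age' name='age'>
   <input type='text' id='marks' name='marks'>
</form>
"""
soup = BeautifulSoup(html, "html.parser")

# Every key/value pair in attrs must match for a tag to be returned.
matches = soup.find_all(attrs={"type": "text", "name": "age"})
print(matches)

# A tag name can be combined with attrs to narrow the search further.
inputs = soup.find_all("input", attrs={"id": "nm"})
print(inputs)
```

Each call returns a one-element list here, since only one input satisfies both conditions in each case.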

Using find()

The find() method returns the first tag in the parsed document that has the given attributes.

obj = soup.find(attrs={"name":'marks'})
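As a minimal sketch of what find() hands back, the snippet below uses a one-line inline document (an assumption made only to keep it self-contained) and an attribute value, "grade", that deliberately matches nothing. It shows that the result is a single tag rather than a list, and that find() returns None when no tag matches.

```python
from bs4 import BeautifulSoup

# Hypothetical one-field document standing in for index.html.
html = "<form><input type='text' id='marks' name='marks'></form>"
soup = BeautifulSoup(html, "html.parser")

# find() returns a single Tag object, not a list.
obj = soup.find(attrs={"name": "marks"})
print(obj)

# Attribute values of a tag are read with dictionary-style access.
print(obj["id"])  # marks

# When nothing matches, find() returns None instead of raising an error.
missing = soup.find(attrs={"name": "grade"})
print(missing)  # None
```

Checking the result against None before indexing into it avoids a TypeError when the document lacks the expected tag.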

Using select()

The select() method takes a CSS selector string. To match on an attribute, write an attribute selector in square brackets, such as [type] or [name='marks']. It returns a list of all tags that match the selector.

In the following code, select() returns all tags that have a type attribute, whatever its value.

Example

from bs4 import BeautifulSoup

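# Parse index.html; the with-block closes the file once parsing is done.
with open("index.html") as fp:
    soup = BeautifulSoup(fp, 'html.parser')

# "[type]" is a CSS attribute selector: any tag that has a type attribute.
obj = soup.select("[type]")
print(obj)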

Output

[<input id="nm" name="name" type="text"/>, <input id="age" name="age" type="text"/>, <input id="marks" name="marks" type="text"/>]
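Attribute selectors can also test the attribute's value. The sketch below is an illustration rather than part of the original example: it assumes the same form supplied as an inline string, and uses the standard CSS forms [attr='value'] for an exact match and [attr^='prefix'] for a prefix match.

```python
from bs4 import BeautifulSoup

# Inline copy of the form from index.html, so the sketch runs without the file.
html = """
<form>
   <input type='text' id='nm' name='name'>
   <input type='text' id='age' name='age'>
   <input type='text' id='marks' name='marks'>
</form>
"""
soup = BeautifulSoup(html, "html.parser")

# Exact match on the value, restricted to <input> tags.
exact = soup.select("input[name='age']")
print(exact)

# Prefix match: tags whose id starts with "m" (only "marks" in this form).
prefixed = soup.select("[id^='m']")
print(prefixed)

# A selector that matches nothing yields an empty list.
none_found = soup.select("[name='grade']")
print(none_found)  # []
```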

Using select_one()

The select_one() method is similar, except that it returns only the first tag that matches the given selector.

obj = soup.select_one("[name='marks']")

Output

<input id="marks" name="marks" type="text"/>
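To relate select_one() back to find(), here is a small sketch under the same assumption of an inline copy of the form. For a simple attribute match the two calls locate the same tag, and both return None when nothing matches; the name "grade" is a made-up value chosen to miss.

```python
from bs4 import BeautifulSoup

# Inline copy of the form from index.html, so the sketch runs without the file.
html = """
<form>
   <input type='text' id='nm' name='name'>
   <input type='text' id='age' name='age'>
   <input type='text' id='marks' name='marks'>
</form>
"""
soup = BeautifulSoup(html, "html.parser")

# The same lookup written as a CSS selector and as an attrs dictionary.
by_selector = soup.select_one("[name='marks']")
by_attrs = soup.find(attrs={"name": "marks"})
print(by_selector == by_attrs)  # True

# With several matches, select_one() returns the first one in document order.
first_text = soup.select_one("[type='text']")
print(first_text["id"])  # nm

# Like find(), select_one() returns None when no tag matches.
print(soup.select_one("[name='grade']"))  # None
```

Which form to use is largely a matter of taste: the dictionary form avoids quoting issues in attribute values, while the selector form extends naturally to more complex CSS queries.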