Beautiful Soup: A Concise Tutorial
Beautiful Soup - Find Elements by Class
CSS (Cascading Style Sheets) is a tool for designing the appearance of HTML elements. CSS rules control different aspects of an HTML element, such as size, color, and alignment. Applying styles is more effective than setting presentation attributes on each HTML element. You can apply styling rules to any HTML element. Instead of styling each element individually, CSS classes apply the same styling to a group of HTML elements, which gives the web page a uniform appearance. In BeautifulSoup, it is possible to find tags that are styled with a CSS class. In this chapter, we shall use the following methods to search for elements with a specified CSS class:
- find_all() and find() methods
- select() and select_one() methods
Class in CSS
A class in CSS is a collection of properties that specify appearance-related features, such as font type, size and color, background color, and alignment. The name of the class is prefixed with a dot (.) when it is declared.
.class {
css declarations;
}
A CSS class may be defined inside the HTML document itself (in a style element), or in a separate CSS file that the HTML document links to. A typical example of a CSS class could be as follows:
.blue-text {
color: blue;
font-weight: bold;
}
You can search for HTML elements that carry a certain CSS class with the help of the BeautifulSoup methods described below.
For the purpose of this chapter, we shall use the following HTML page, saved as index.html:
<html>
<head>
<title>TutorialsPoint</title>
</head>
<body>
<h2 class="heading">Departmentwise Employees</h2>
<ul>
<li class="mainmenu">Accounts</li>
<ul>
<li class="submenu">Anand</li>
<li class="submenu">Mahesh</li>
</ul>
<li class="mainmenu">HR</li>
<ul>
<li class="submenu">Rani</li>
<li class="submenu">Ankita</li>
</ul>
</ul>
</body>
</html>
Using find() and find_all()
To search for elements that use a certain CSS class, pass a dictionary as the attrs argument of find_all(), as follows:
Example
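from bs4 import BeautifulSoup

# Parse the sample page saved as index.html
with open("index.html") as fp:
    soup = BeautifulSoup(fp, 'html.parser')

# Match every tag whose class attribute includes "mainmenu"
obj = soup.find_all(attrs={"class": "mainmenu"})
print(obj)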
Output
[<li class="mainmenu">Accounts</li>, <li class="mainmenu">HR</li>]
The result is a list of all the elements with the mainmenu class.
To fetch the list of elements that have any of several CSS classes, give a list of class names in the attrs argument by changing the find_all() statement to:
obj = soup.find_all(attrs={"class": ["mainmenu", "submenu"]})
This results in a list of all the elements that have any of the CSS classes named above.
[
<li class="mainmenu">Accounts</li>,
<li class="submenu">Anand</li>,
<li class="submenu">Mahesh</li>,
<li class="mainmenu">HR</li>,
<li class="submenu">Rani</li>,
<li class="submenu">Ankita</li>
]
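The find() method named in this section's heading accepts the same filters but returns only the first match, or None when nothing matches. The following is a minimal sketch rather than part of the original tutorial: it parses a shortened copy of the sample page from an inline string, and the names html, first_main, and subs are illustrative. It also uses the class_ keyword argument, which Beautiful Soup offers as a shorthand for filtering on the class attribute.

```python
from bs4 import BeautifulSoup

# Shortened copy of the sample page, inlined so the snippet runs on its own
html = """
<ul>
<li class="mainmenu">Accounts</li>
<ul>
<li class="submenu">Anand</li>
<li class="submenu">Mahesh</li>
</ul>
<li class="mainmenu">HR</li>
</ul>
"""
soup = BeautifulSoup(html, "html.parser")

# find() returns only the first matching tag
first_main = soup.find(attrs={"class": "mainmenu"})
print(first_main)

# class_ is a keyword shorthand for filtering on the class attribute
subs = soup.find_all("li", class_="submenu")
print([tag.get_text() for tag in subs])

# A class that does not occur in the page makes find() return None
print(soup.find(class_="footer"))
```

The trailing underscore in class_ is needed because class is a reserved word in Python.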
Using select() and select_one()
You can also use the select() method with a CSS selector as the argument. A dot (.) followed by the name of the class serves as the CSS selector.
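As a rough illustration of the class selector described above, the sketch below runs select() and select_one() against a shortened copy of the sample page held in an inline string. The variable names (html, submenu_items, first_main, both) are assumptions made for this example.

```python
from bs4 import BeautifulSoup

# Shortened copy of the sample page, inlined so the snippet runs on its own
html = """
<h2 class="heading">Departmentwise Employees</h2>
<ul>
<li class="mainmenu">Accounts</li>
<ul>
<li class="submenu">Anand</li>
<li class="submenu">Mahesh</li>
</ul>
<li class="mainmenu">HR</li>
</ul>
"""
soup = BeautifulSoup(html, "html.parser")

# select() returns a list of every tag matching the CSS selector
submenu_items = soup.select(".submenu")
print([tag.get_text() for tag in submenu_items])

# select_one() returns just the first matching tag
first_main = soup.select_one(".mainmenu")
print(first_main)

# A comma-separated selector matches tags that have either class
both = soup.select(".mainmenu, .submenu")
print(len(both))
```

The comma-separated selector plays the same role as passing a list of class names to find_all().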