Beautiful Soup: A Concise Tutorial
Beautiful Soup - parent Property
Method Description
The parent property in the Beautiful Soup library returns the immediate parent of a PageElement. For a tag or string nested inside another tag, the value returned by the parent property is a Tag object. For a top-level tag, the parent is the BeautifulSoup object itself, and the parent of the BeautifulSoup object is None.
Return value
The parent property returns a Tag object when the element is nested inside another tag. For a top-level element it returns the BeautifulSoup object, and for the BeautifulSoup object itself it returns None.
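The return values at the top of the tree can be checked with a short sketch. The HTML string and variable names here are made up for illustration; only the .parent and .name attributes come from Beautiful Soup.

```python
from bs4 import BeautifulSoup

# Illustrative document; any small HTML string would do.
soup = BeautifulSoup("<html><body><p>Hi</p></body></html>", "html.parser")

# A nested tag has a Tag as its parent.
print(type(soup.p.parent).__name__)   # Tag

# The top-level <html> tag has the BeautifulSoup object as its parent.
top_parent = soup.html.parent
print(type(top_parent).__name__)      # BeautifulSoup
print(top_parent.name)                # [document]

# The BeautifulSoup object itself has no parent.
print(soup.parent)                    # None
```

Checking for None in this way is useful when code walks upward through a document and needs to know when it has reached the root.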
Example 1
This example uses the .parent property to find the immediate parent element of the first <p> tag in the example HTML string.
html = """
<html>
<head>
<title>TutorialsPoint</title>
</head>
<body>
<p>Hello World</p>
"""
from bs4 import BeautifulSoup
soup = BeautifulSoup(html, 'html.parser')
tag = soup.p
print(tag.parent.name)  # prints: body
Example 2
In the following example, the <title> tag is enclosed inside a <head> tag. Hence, the parent property of the <title> tag returns the <head> tag.
html = """
<html>
<head>
<title>TutorialsPoint</title>
</head>
<body>
<p>Hello World</p>
"""
from bs4 import BeautifulSoup
soup = BeautifulSoup(html, 'html.parser')
tag = soup.title
print(tag.parent)  # prints the <head> tag together with its <title> child
Example 3
Python's built-in HTML parser behaves a little differently from the html5lib and lxml parsers. The built-in parser does not try to build a complete document out of the string provided: it does not add parent tags such as body or html if they are missing from the string. The html5lib and lxml parsers, on the other hand, add these tags so that the result is a complete HTML document. As a result, the parent of the same <p> tag depends on the parser: with the built-in parser it is the BeautifulSoup object, whose name is [document], while with html5lib it is the <body> tag.
html = """
<p><b>Hello World</b></p>
"""
from bs4 import BeautifulSoup
soup = BeautifulSoup(html, 'html.parser')
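print(soup.p.parent.name)  # prints: [document]
# html5lib is a separate package and must be installed for the next lines to work
soup = BeautifulSoup(html, 'html5lib')
print(soup.p.parent.name)  # prints: body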
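To see the full chain of parents that a given parser produces, one option is to follow .parent repeatedly until it returns None. The sketch below uses the built-in parser only; ancestor_names is a hypothetical helper written for this illustration, not part of Beautiful Soup.

```python
from bs4 import BeautifulSoup

def ancestor_names(element):
    """Hypothetical helper: names of all ancestors, nearest first."""
    names = []
    current = element.parent
    # Follow .parent upward; the BeautifulSoup object's parent is None.
    while current is not None:
        names.append(current.name)
        current = current.parent
    return names

html = "<p><b>Hello World</b></p>"
soup = BeautifulSoup(html, "html.parser")
print(ancestor_names(soup.b))  # ['p', '[document]']
```

With a parser that inserts the missing tags, the same call would be expected to list body and html between p and [document].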