Beautiful Soup Concise Tutorial
Beautiful Soup - parents Property
Property Description
The parents property in the Beautiful Soup library yields every ancestor of a PageElement, walking up the tree one parent at a time. It returns a generator, so the ancestors can be iterated over (or collected into a list) in bottom-up order, from the nearest parent to the top of the document.
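To make the bottom-up order concrete, here is a minimal sketch. The markup is a hypothetical sample invented for illustration and is not part of the tutorial's own examples. It collects the generator into a list so the order is visible at a glance.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, used only for this illustration.
html = "<div><p><b>text</b></p></div>"
soup = BeautifulSoup(html, "html.parser")

# .parents is lazy: nothing is walked until we iterate over it.
ancestors = soup.b.parents

# Collect the ancestor names; the nearest parent comes first.
names = [element.name for element in ancestors]
print(names)
```

Because the value is a generator, it can be consumed only once. Access tag.parents again if a second pass is needed.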
Example 1
This example uses .parents to travel from a <p> tag to the very top of the document. The code below tracks the parents of the first <p> tag in the example HTML string.
html = """
<html><head><title>TutorialsPoint</title></head>
<body>
<p>Hello World</p>
"""
from bs4 import BeautifulSoup
soup = BeautifulSoup(html, 'html.parser')
tag = soup.p
for element in tag.parents:
    print(element.name)
Output
body
html
[document]
Note that the last element produced is the BeautifulSoup object itself, whose name is [document].
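The top of the chain can be checked directly. The sketch below uses a closed version of the sample markup (an assumption made for tidiness; html.parser also accepts the unclosed original). It inspects the final ancestor and then the parent of the BeautifulSoup object.

```python
from bs4 import BeautifulSoup

html = "<html><body><p>Hello World</p></body></html>"
soup = BeautifulSoup(html, "html.parser")

# The last item yielded by .parents is the top of the tree.
top = list(soup.p.parents)[-1]

print(top is soup)   # the final ancestor is the BeautifulSoup object
print(top.name)      # its name is the special string "[document]"
print(soup.parent)   # nothing sits above it, so the walk stops here
```

The generator ends at this point because the BeautifulSoup object has no parent of its own.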
Example 2
In the following example, the <b> tag is enclosed in a <p> tag, and the two div tags above it each have an id attribute. We print only those parents that have an id attribute, using the has_attr() method to check for it.
html = """
<div id="outer">
<div id="inner">
<p>Hello<b>World</b></p>
</div>
</div>
"""
from bs4 import BeautifulSoup
soup = BeautifulSoup(html, 'html.parser')
tag = soup.b
for parent in tag.parents:
    if parent.has_attr("id"):
        print(parent["id"])
Output
inner
outer
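As an alternative sketch, the same filtering can likely be done with find_parents(), which accepts attribute filters as keyword arguments. Passing id=True is assumed here to mean "any ancestor that has an id attribute". The ids variable is introduced only for illustration.

```python
from bs4 import BeautifulSoup

html = """
<div id="outer">
<div id="inner">
<p>Hello<b>World</b></p>
</div>
</div>
"""
soup = BeautifulSoup(html, "html.parser")

# id=True keeps only ancestors that carry an id attribute, whatever its value.
ids = [parent["id"] for parent in soup.b.find_parents(id=True)]
print(ids)
```

Unlike .parents, find_parents() returns a list up front, which is convenient when the result is reused. The generator is the lighter choice when the loop may stop early.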