Beautiful Soup: A Concise Tutorial
Beautiful Soup - next_sibling Property
Property Description
Elements that share the same parent are called siblings; in neatly formatted HTML they appear at the same indentation level. The next_sibling property of a PageElement returns the node that immediately follows it under the same parent. That node is not always a tag: text between two tags, including whitespace, is also a sibling.
Return type
The next_sibling property returns a Tag or a NavigableString object, both of which are PageElement subclasses. It returns None when the element is the last child of its parent.
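The three possible results can be seen in a short sketch. The markup here is a made-up one-line document, not part of the tutorial's own examples; it places a text node and a tag after <b> so that each return type appears once.

```python
from bs4 import BeautifulSoup, NavigableString, Tag

# Illustrative markup (an assumption): <b> is followed by text, then by <i>
soup = BeautifulSoup("<p><b>Hello</b> world<i>Python</i></p>", "html.parser")

first = soup.b.next_sibling    # the text node " world"
second = first.next_sibling    # the <i> tag
last = second.next_sibling     # nothing follows <i> inside <p>

print(type(first).__name__)
print(type(second).__name__)
print(last)
```

Checking for None before using the result avoids an AttributeError when the element turns out to be the last child.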
Example 1
The index.html page consists of an HTML form with three input elements, each with a name attribute. The following example locates the next sibling of the input tag whose id attribute is nm.
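from bs4 import BeautifulSoup

# Parse the form page; the with block closes the file afterwards
with open("index.html") as fp:
    soup = BeautifulSoup(fp, 'html.parser')

# Locate the input whose id is "nm", then take the node right after it
tag = soup.find('input', {'id': 'nm'})
sib = tag.next_sibling
print(sib)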
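Because the contents of index.html are not shown here, the following self-contained sketch uses a hypothetical stand-in for the form. The attribute values are assumptions, and the inputs are written with no whitespace between them so that the next sibling is a tag rather than a newline string.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for index.html; the real file may differ
html = (
    '<form>'
    '<input type="text" id="nm" name="name">'
    '<input type="text" id="age" name="age">'
    '<input type="text" id="marks" name="marks">'
    '</form>'
)
soup = BeautifulSoup(html, 'html.parser')

# The node right after the "nm" input is the second input element
tag = soup.find('input', {'id': 'nm'})
sib = tag.next_sibling
print(sib)
```

If each input sits on its own line in the real file, next_sibling would return the line break between them instead, which is the situation Example 3 deals with.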
Example 2
In the next example, the HTML document has two tags inside a <p> tag. The next_sibling property of the <b> tag returns the <i> tag that follows it.
from bs4 import BeautifulSoup

soup = BeautifulSoup("<p><b>Hello</b><i>Python</i></p>", 'html.parser')
tag1 = soup.b
print("next:", tag1.next_sibling)
Example 3
Consider the HTML string in the following document. It has two <p> tags at the same level. One might expect the next_sibling of the first <p> tag to be the second <p> tag.
from bs4 import BeautifulSoup

html = '''
<p><b>Hello</b><i>Python</i></p>
<p>TutorialsPoint</p>
'''
soup = BeautifulSoup(html, 'html.parser')
tag1 = soup.p
print("next:", tag1.next_sibling)
Output
next:
The blank line after the word next: may look unexpected. It appears because the first <p> tag is followed by a \n character, which Beautiful Soup parses as a NavigableString and treats as a sibling. Change the print statement as shown below to step past that string and reach the second <p> tag.
tag1 = soup.p
# Skip the "\n" string to reach the second <p> tag
print("next:", tag1.next_sibling.next_sibling)
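Chaining next_sibling twice works only when exactly one text node separates the two tags. As an alternative, not part of the original example, Beautiful Soup's find_next_sibling() method can be asked for the next sibling that is a <p> tag, whatever whitespace lies in between. The sketch below reuses the same HTML string.

```python
from bs4 import BeautifulSoup

html = '''
<p><b>Hello</b><i>Python</i></p>
<p>TutorialsPoint</p>
'''
soup = BeautifulSoup(html, 'html.parser')
tag1 = soup.p

# repr() makes the whitespace-only sibling visible
print("raw next_sibling:", repr(tag1.next_sibling))

# Ask for the next sibling that is a <p> tag; the newline string is skipped
next_p = tag1.find_next_sibling('p')
print("next tag:", next_p)
```

Both approaches reach the same element here; find_next_sibling() is simply less sensitive to how the source HTML is formatted.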