Beautiful Soup Concise Tutorial
Beautiful Soup - smooth() Method
Method Description
After several operations that modify the parse tree, two or more NavigableString objects may end up adjacent to each other. The smooth() method cleans up the children of the element it is called on by merging each run of consecutive strings into a single NavigableString. This makes pretty-printed output look more natural after heavy modification of the tree.
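The merging described above can be seen on a very small tree. The following is a minimal sketch, assuming Beautiful Soup 4.8.0 or later (where smooth() is available); the variable names `before` and `after` are illustrative only.

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup("<p>A one</p>", "html.parser")

# Appending a plain string creates a second NavigableString
# right next to the one that is already inside <p>.
soup.p.append(", a two")
before = [str(s) for s in soup.p.contents]

# smooth() merges the two adjacent strings into one.
soup.p.smooth()
after = [str(s) for s in soup.p.contents]

print(before)  # ['A one', ', a two']
print(after)   # ['A one, a two']
```

The rendered text of the paragraph is the same before and after the call. What changes is how many string nodes hold it.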
Example 1
html = '''<html>
<head>
<title>TutorialsPoint</title>
</head>
<body>
Some Text
<div></div>
<p></p>
<div>Some more text</div>
<b></b>
<i></i> # COMMENT
</body>
</html>'''

from bs4 import BeautifulSoup

soup = BeautifulSoup(html, "html.parser")

# Replace every tag that contains no visible text with an empty string,
# then merge the adjacent strings this leaves behind in its parent.
for item in soup.find_all():
    if not item.get_text(strip=True):
        p = item.parent
        item.replace_with('')
        p.smooth()

print(soup.prettify())
Output
<html>
 <head>
  <title>
   TutorialsPoint
  </title>
 </head>
 <body>
  Some Text
  <div>
   Some more text
  </div>
  # COMMENT
 </body>
</html>
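The pretty-printed output does not show how many string nodes the body holds, so it can help to inspect the children directly. The sketch below uses a shortened, hypothetical document and a helper named `string_children` (not part of Beautiful Soup) to compare the string children of the body with and without the smooth() call.

```python
from bs4 import BeautifulSoup
from bs4.element import NavigableString

# Shortened, hypothetical input: two empty tags between two pieces of text.
html = "<body>Some Text<div></div><p></p>more</body>"

def string_children(smooth):
    """Return the string children of <body> after removing empty tags."""
    soup = BeautifulSoup(html, "html.parser")
    body = soup.body
    for tag in body.find_all():
        if not tag.get_text(strip=True):
            tag.replace_with("")  # leaves an empty string node in place of the tag
    if smooth:
        body.smooth()             # merge the run of adjacent strings
    return [str(c) for c in body.contents if isinstance(c, NavigableString)]

print(string_children(smooth=False))  # ['Some Text', '', '', 'more']
print(string_children(smooth=True))   # ['Some Textmore']
```

Without smooth(), each replaced tag leaves its own empty string node behind. With it, the whole run collapses into a single string.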