Beautiful Soup Concise Tutorial

Beautiful Soup - contents Property

Method Description

The contents property is available on the BeautifulSoup object as well as on Tag objects. It returns a list of everything contained directly inside the object: all the immediate child elements and text nodes (i.e. NavigableString objects).

Syntax

Tag.contents

Return value

The contents property returns a list of the child elements and strings in the Tag or BeautifulSoup object.
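As a minimal sketch of what "immediate children" means here, the snippet below uses a small made-up markup string (not one from this tutorial) to show that the returned value is an ordinary Python list and that nested tags are not flattened into it.

```python
from bs4 import BeautifulSoup

# Illustrative markup: <b> is nested inside <p>, which is inside <div>
soup = BeautifulSoup("<div><p>Java <b>SE</b></p>Python</div>", "html.parser")
div = soup.div

children = div.contents          # direct children only: the <p> tag and the string 'Python'
print(type(children).__name__)   # list
print(len(children))             # 2

# The nested <b> tag is a grandchild of <div>, so it shows up under <p> instead
print(div.p.contents)
```

Because the result is a plain list, it can be indexed, sliced, and measured with len() like any other list.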

Example 1

Contents of a tag object:

from bs4 import BeautifulSoup

markup = '''
   <div id="Languages">
      <p>Java</p>
      <p>Python</p>
      <p>C++</p>
   </div>
'''
soup = BeautifulSoup(markup, 'html.parser')

# Select the first <div> and list its direct children
tag = soup.div
print(tag.contents)

Output

['\n      ', <p>Java</p>, '\n      ', <p>Python</p>, '\n      ', <p>C++</p>, '\n   ']
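The whitespace-only strings in the output are the line breaks and indentation between the tags, which the parser keeps as text nodes. If only the child tags are wanted, one possible approach (a sketch, not the only way) is to filter the list by type. The names tags_only and names are illustrative.

```python
from bs4 import BeautifulSoup, Tag

markup = "<div id='Languages'>\n<p>Java</p>\n<p>Python</p>\n<p>C++</p>\n</div>"
soup = BeautifulSoup(markup, "html.parser")

# Keep only Tag children, dropping the whitespace-only text nodes
tags_only = [child for child in soup.div.contents if isinstance(child, Tag)]

# Collect the text of each remaining <p> tag
names = [t.get_text() for t in tags_only]
print(names)
```

With this markup the div has seven children (four whitespace strings and three p tags), and the filtered list keeps the three tags.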

Example 2

Contents of the entire document:

from bs4 import BeautifulSoup

markup = '''
   <div id="Languages">
      <p>Java</p> <p>Python</p> <p>C++</p>
   </div>
'''
soup = BeautifulSoup(markup, 'html.parser')

# The BeautifulSoup object itself exposes the top-level nodes of the document
print(soup.contents)

Output

['\n   ', <div id="Languages">
      <p>Java</p> <p>Python</p> <p>C++</p>
   </div>, '\n']

Example 3

Note that a NavigableString object does not have a contents property. Accessing it raises an AttributeError.

from bs4 import BeautifulSoup

markup = '''
   <div id="Languages">
      <p>Java</p> <p>Python</p> <p>C++</p>
   </div>
'''
soup = BeautifulSoup(markup, 'html.parser')
tag = soup.p
s = tag.contents[0]  # first child of <p>: the NavigableString 'Java'
print (s.contents)

Output

Traceback (most recent call last):
  File "C:\Users\user\BeautifulSoup\2.py", line 11, in <module>
    print (s.contents)
           ^^^^^^^^^^
  File "C:\Users\user\BeautifulSoup\Lib\site-packages\bs4\element.py", line 984, in __getattr__
    raise AttributeError(
AttributeError: 'NavigableString' object has no attribute 'contents'
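One way to avoid the AttributeError shown above when walking a tree of mixed nodes is to check the node type before reading contents. The sketch below assumes a hypothetical helper named safe_contents, which is not part of Beautiful Soup.

```python
from bs4 import BeautifulSoup, Tag

soup = BeautifulSoup("<p>Java</p>", "html.parser")
node = soup.p.contents[0]   # the NavigableString 'Java'

def safe_contents(node):
    """Hypothetical helper: return a node's children, or an empty list for strings."""
    # Only Tag objects (including the BeautifulSoup object, a Tag subclass) hold children
    return node.contents if isinstance(node, Tag) else []

print(safe_contents(node))     # string node: no children
print(safe_contents(soup.p))   # tag node: its list of children
```

Checking with isinstance keeps the traversal code free of try/except blocks, though catching AttributeError would work as well.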