Python Network Programming: A Concise Tutorial

Python - RSS Feed

RSS (Rich Site Summary) is a format for delivering regularly changing web content. Many news sites, weblogs, and other online publishers syndicate their content as an RSS feed to anyone who wants it. In Python, we can read and process these feeds with the feedparser package, installed as follows.

pip install feedparser
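
To make the format concrete before using feedparser, the following minimal sketch reads a small hand-written RSS 2.0 document with only the standard library. The sample feed, its titles, and the helper name read_items are assumptions made up for illustration; feedparser handles far more (other feed formats, malformed XML, date normalization) than this sketch does.

```python
import xml.etree.ElementTree as ET

# Hand-written sample feed; titles and links are made up for illustration.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example News</title>
    <link>https://example.com/</link>
    <description>Sample channel</description>
    <item>
      <title>First post</title>
      <link>https://example.com/first</link>
      <pubDate>Fri, 18 May 2018 20:13:13 GMT</pubDate>
    </item>
    <item>
      <title>Second post</title>
      <link>https://example.com/second</link>
      <pubDate>Sat, 19 May 2018 08:00:00 GMT</pubDate>
    </item>
  </channel>
</rss>"""


def read_items(rss_text):
    """Return (channel_title, list of item dicts) from an RSS 2.0 string."""
    root = ET.fromstring(rss_text)
    channel = root.find("channel")
    # Each <item> becomes a dict mapping child tag name to its text.
    items = [
        {child.tag: (child.text or "") for child in item}
        for item in channel.findall("item")
    ]
    return channel.findtext("title"), items


channel_title, items = read_items(SAMPLE_RSS)
print(channel_title)
for item in items:
    print(item["title"], "->", item["link"])
```

Each item in the channel corresponds to one post, which is what feedparser exposes as an entry in the examples that follow.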

Feed Structure

In the example below, we print the keys of a single feed entry so that we can decide which parts of the feed we want to process.

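import feedparser

# Download and parse the feed
NewsFeed = feedparser.parse("https://timesofindia.indiatimes.com/rssfeedstopstories.cms")

# Take the second entry (index 1) and list the fields it provides
entry = NewsFeed.entries[1]
print(list(entry.keys()))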

Running the program prints output similar to the following (the exact keys depend on the feed):

['summary_detail', 'published_parsed', 'links', 'title', 'summary', 'guidislink', 'title_detail', 'link', 'published', 'id']
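
Because the set of keys varies from feed to feed, it can be safer to read fields with get() rather than assume they exist. The sketch below is one way to do that; the helper name pick_fields and the sample data are hypothetical, and a plain dict stands in for a feedparser entry, which supports the same get() method.

```python
def pick_fields(entry, wanted=("title", "link", "published", "summary")):
    """Return the wanted fields from an entry, using None for missing keys."""
    return {key: entry.get(key) for key in wanted}


# Plain dict standing in for a feedparser entry (hypothetical data).
sample_entry = {
    "title": "Example headline",
    "link": "https://example.com/story",
    "published": "Fri, 18 May 2018 20:13:13 GMT",
    # No "summary" key: some feeds omit it.
}

fields = pick_fields(sample_entry)
print(fields)
```

With a real feed, the same helper could be called as pick_fields(NewsFeed.entries[1]).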

Feed Title and Posts

In the example below, we read the number of posts in the RSS feed and the title of one post.

import feedparser

NewsFeed = feedparser.parse("https://timesofindia.indiatimes.com/rssfeedstopstories.cms")

# Total number of entries (posts) in the feed
print('Number of RSS posts :', len(NewsFeed.entries))

# Title of the second entry (index 1)
entry = NewsFeed.entries[1]
print('Post Title :', entry.title)

Running the program prints output similar to the following (the feed content changes over time):

Number of RSS posts : 5
Post Title : Cong-JD(S) in SC over choice of pro tem speaker
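
The example above reads a single post. To list every post, one option is to loop over all entries, as in this sketch. The helper name format_titles and the sample entries are hypothetical; plain dicts stand in for feedparser entries so the snippet runs without network access.

```python
def format_titles(entries):
    """Return one numbered line per entry, e.g. '1. Headline'."""
    lines = []
    for number, entry in enumerate(entries, start=1):
        # Fall back to a placeholder when an entry has no title.
        title = entry.get("title", "(untitled)")
        lines.append(f"{number}. {title}")
    return lines


# Plain dicts standing in for feedparser entries (hypothetical data).
sample_entries = [
    {"title": "First headline"},
    {"title": "Second headline"},
    {"link": "https://example.com/no-title"},
]

for line in format_titles(sample_entries):
    print(line)
```

With a real feed, pass NewsFeed.entries instead of sample_entries.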

Feed Details

Based on the entry structure shown above, we can extract the details we need from the feed, as in the program below. An entry behaves like a dictionary, and its keys can also be read as attributes (for example, entry.published).

import feedparser

NewsFeed = feedparser.parse("https://timesofindia.indiatimes.com/rssfeedstopstories.cms")

# Second entry (index 1) of the feed
entry = NewsFeed.entries[1]

# Publication date, summary text, and link of the post
print(entry.published)
print("******")
print(entry.summary)
print("------News Link--------")
print(entry.link)

Running the program prints output similar to the following:

Fri, 18 May 2018 20:13:13 GMT
******
Controversy erupted on Friday over the appointment of BJP MLA K G Bopaiah as pro tem speaker for the assembly, with Congress and JD(S) claiming the move went against convention that the post should go to the most senior member of the House. The combine approached the SC to challenge the appointment. Hearing is scheduled for 10:30 am today.
------News Link--------
https://timesofindia.indiatimes.com/india/congress-jds-in-sc-over-bjp-mla-made-pro-tem-speaker-hearing-at-1030-am/articleshow/64228740.cms
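
The published value above is a plain string. If you need to sort or compare posts by date, it can be converted to a datetime object. The sketch below uses the standard library's email.utils.parsedate_to_datetime, which understands this RFC 822 style of date; the helper name parse_published is hypothetical, and the approach assumes the feed uses this date format. The entry also carries a published_parsed key (listed in the entry keys earlier) that holds an already parsed time tuple.

```python
from email.utils import parsedate_to_datetime


def parse_published(published):
    """Convert an RFC 822 style date string into a timezone-aware datetime."""
    return parsedate_to_datetime(published)


# Date string copied from the sample output above.
dt = parse_published("Fri, 18 May 2018 20:13:13 GMT")
print(dt.isoformat())
```

With a real feed, the call would be parse_published(entry.published).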