Feed Adapter

Spring Integration provides support for syndication through feed adapters. The implementation is based on the ROME Framework.

You need to include this dependency in your project:

Maven:

<dependency>
    <groupId>org.springframework.integration</groupId>
    <artifactId>spring-integration-feed</artifactId>
    <version>{project-version}</version>
</dependency>

Gradle:

implementation "org.springframework.integration:spring-integration-feed:{project-version}"

Web syndication is a way to publish material such as news stories, press releases, blog posts, and other items typically available on a website but also made available in a feed format such as RSS or ATOM.

Spring Integration provides support for web syndication through its 'feed' adapter and provides convenient namespace-based configuration for it. To configure the 'feed' namespace, include the following elements within the headers of your XML configuration file:

xmlns:int-feed="http://www.springframework.org/schema/integration/feed"
xsi:schemaLocation="http://www.springframework.org/schema/integration/feed
	https://www.springframework.org/schema/integration/feed/spring-integration-feed.xsd"

Feed Inbound Channel Adapter

Retrieving feeds requires only one adapter: an inbound channel adapter. It lets you subscribe to a particular URL. The following examples show possible configurations:

Java DSL:

@Configuration
@EnableIntegration
public class ContextConfiguration {

    @Value("org/springframework/integration/feed/sample.rss")
    private Resource feedResource;

    @Bean
    public IntegrationFlow feedFlow() {
        return IntegrationFlow
                .from(Feed.inboundAdapter(this.feedResource, "feedTest")
                                .preserveWireFeed(true),
                        e -> e.poller(p -> p.fixedDelay(100)))
                .channel(c -> c.queue("entries"))
                .get();
    }

}

Java:

@Bean
@InboundChannelAdapter(channel = "fromFeed")
public FeedEntryMessageSource feedEntrySource() throws MalformedURLException {
    return new FeedEntryMessageSource(new URL("https://feeds.bbci.co.uk/news/rss.xml"), "metadataKey");
}

XML:

<int-feed:inbound-channel-adapter id="feedAdapter"
        channel="feedChannel"
        url="https://feeds.bbci.co.uk/news/rss.xml">
    <int:poller fixed-rate="10000" max-messages-per-poll="100" />
</int-feed:inbound-channel-adapter>

In the preceding configuration, we are subscribing to a URL identified by the url attribute.

As news items are retrieved, they are converted to messages and sent to a channel identified by the channel attribute. The payload of each message is a com.rometools.rome.feed.synd.SyndEntry instance. Each one encapsulates various data about a news item (content, dates, authors, and other details).
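
As a minimal sketch of consuming these payloads, the following service activator reads a few fields from each SyndEntry. The FeedEntryLogger class name and the output format are hypothetical, and a message channel named feedChannel (as in the XML configuration above) is assumed to exist.

```java
import com.rometools.rome.feed.synd.SyndEntry;

import org.springframework.integration.annotation.ServiceActivator;
import org.springframework.stereotype.Component;

// Hypothetical consumer; the class name and output format are illustrative only.
@Component
public class FeedEntryLogger {

    // Invoked once per message; each payload is a single SyndEntry.
    @ServiceActivator(inputChannel = "feedChannel")
    public void handle(SyndEntry entry) {
        System.out.println(entry.getPublishedDate() + " | " + entry.getTitle() + " | " + entry.getLink());
    }

}
```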

The inbound feed channel adapter is a polling consumer, so you must provide a poller configuration. However, its inner workings differ slightly from those of most other polling consumers. When an inbound feed adapter is started, it does the first poll and receives a com.rometools.rome.feed.synd.SyndFeed instance. That object contains multiple SyndEntry objects. Each entry is stored in the local entry queue and is released based on the value in the max-messages-per-poll attribute, such that each message contains a single entry. If the queue becomes empty while entries are being retrieved from it, the adapter attempts to update the feed, thereby populating the queue with more entries (SyndEntry instances), if any are available. Otherwise, the next attempt to poll for a feed is determined by the trigger of the poller (every ten seconds in the preceding XML configuration).
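
The queue behavior described above can be modeled in plain Java. This is a simplified sketch, not the adapter's actual implementation: FeedQueueSketch, its methods, and the use of strings in place of SyndEntry objects are all assumptions made for illustration, and each simulated fetch is assumed to return only new entries.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Queue;

// Simplified model of the local entry queue; all names here are hypothetical.
class FeedQueueSketch {

    private final Iterator<List<String>> fetchResults;
    private final Queue<String> entryQueue = new ArrayDeque<>();
    private int fetchCount;

    // Each inner list stands for the new entries returned by one feed fetch.
    FeedQueueSketch(List<List<String>> fetchResults) {
        this.fetchResults = fetchResults.iterator();
    }

    // Returns the next queued entry, refreshing the feed once if the queue is empty.
    String receive() {
        String next = entryQueue.poll();
        if (next == null) {
            fetchCount++;
            if (fetchResults.hasNext()) {
                entryQueue.addAll(fetchResults.next());
            }
            next = entryQueue.poll();
        }
        return next;
    }

    // One poller cycle: emits at most maxMessagesPerPoll single-entry messages.
    List<String> poll(int maxMessagesPerPoll) {
        List<String> messages = new ArrayList<>();
        for (int i = 0; i < maxMessagesPerPoll; i++) {
            String entry = receive();
            if (entry == null) {
                break; // nothing new is available; wait for the next trigger
            }
            messages.add(entry);
        }
        return messages;
    }

    int fetchCount() {
        return fetchCount;
    }

    public static void main(String[] args) {
        FeedQueueSketch adapter = new FeedQueueSketch(
                List.of(List.of("a", "b", "c"), List.of("d")));
        System.out.println(adapter.poll(2)); // [a, b]
        System.out.println(adapter.poll(2)); // [c, d]
        System.out.println(adapter.poll(2)); // []
    }
}
```

In this model the second poll drains the queue after "c" and refreshes the feed within the same cycle, which is why "d" arrives without waiting for the next trigger.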

Duplicate Entries

Polling for a feed can result in entries that have already been processed (“I already read that news item, why are you showing it to me again?”). Spring Integration provides a convenient mechanism to eliminate the need to worry about duplicate entries. Each feed entry has a “published date” field. Every time a new Message is generated and sent, Spring Integration stores the value of the latest published date in an instance of the MetadataStore strategy (see Metadata Store). The metadataKey is used to persist the latest published date.
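
The idea of filtering by published date can be sketched in plain Java as follows. This is a simplified model under stated assumptions, not the adapter's code: PublishedDateFilterSketch and its Entry class are hypothetical, a Map stands in for the MetadataStore, and the date is stored once per batch rather than after each message.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified model of published-date filtering; all names here are hypothetical.
class PublishedDateFilterSketch {

    static final class Entry {
        final String title;
        final Instant published;

        Entry(String title, Instant published) {
            this.title = title;
            this.published = published;
        }
    }

    private final Map<String, String> metadataStore; // stands in for a MetadataStore
    private final String metadataKey;

    PublishedDateFilterSketch(Map<String, String> metadataStore, String metadataKey) {
        this.metadataStore = metadataStore;
        this.metadataKey = metadataKey;
    }

    // Returns entries published after the stored date, oldest first,
    // and stores the newest published date seen under metadataKey.
    List<Entry> selectNew(List<Entry> fetched) {
        long lastSeen = Long.parseLong(metadataStore.getOrDefault(metadataKey, "-1"));
        List<Entry> sorted = new ArrayList<>(fetched);
        sorted.sort(Comparator.comparing((Entry e) -> e.published));
        List<Entry> fresh = new ArrayList<>();
        long newest = lastSeen;
        for (Entry entry : sorted) {
            long publishedAt = entry.published.toEpochMilli();
            if (publishedAt > lastSeen) {
                fresh.add(entry);
                newest = Math.max(newest, publishedAt);
            }
        }
        if (newest > lastSeen) {
            metadataStore.put(metadataKey, Long.toString(newest));
        }
        return fresh;
    }

    public static void main(String[] args) {
        Map<String, String> store = new HashMap<>();
        PublishedDateFilterSketch filter = new PublishedDateFilterSketch(store, "metadataKey");
        Entry first = new Entry("first", Instant.ofEpochMilli(1000));
        Entry second = new Entry("second", Instant.ofEpochMilli(2000));
        System.out.println(filter.selectNew(List.of(first)).size());         // 1
        System.out.println(filter.selectNew(List.of(first, second)).size()); // 1
        System.out.println(store.get("metadataKey"));                        // 2000
    }
}
```

Because the stored value outlives a single fetch, an entry that appears again in a later fetch is skipped as long as the same metadataKey is used.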

Other Options

Starting with version 5.0, the deprecated com.rometools.fetcher.FeedFetcher option has been removed and an overloaded FeedEntryMessageSource constructor for an org.springframework.core.io.Resource is provided. This is useful when the feed source is not an HTTP endpoint but is any other resource (such as local or remote on FTP). In the FeedEntryMessageSource logic, such a resource (or provided URL) is parsed by the SyndFeedInput to the SyndFeed object for the processing mentioned earlier. You can also inject a customized SyndFeedInput (for example, with the allowDoctypes option) instance into the FeedEntryMessageSource.
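
A configuration along the following lines shows how a Resource-based source and a customized SyndFeedInput might be combined. It is a sketch under assumptions: the ResourceFeedConfiguration class, the feeds/sample.rss classpath location, the resourceFeedKey metadata key, and the ten-second poller are all hypothetical.

```java
import com.rometools.rome.io.SyndFeedInput;

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.io.ClassPathResource;
import org.springframework.integration.annotation.InboundChannelAdapter;
import org.springframework.integration.annotation.Poller;
import org.springframework.integration.feed.inbound.FeedEntryMessageSource;

@Configuration
public class ResourceFeedConfiguration {

    @Bean
    @InboundChannelAdapter(channel = "feedChannel", poller = @Poller(fixedDelay = "10000"))
    public FeedEntryMessageSource resourceFeedSource() {
        // Hypothetical classpath location of the feed document.
        FeedEntryMessageSource source =
                new FeedEntryMessageSource(new ClassPathResource("feeds/sample.rss"), "resourceFeedKey");
        SyndFeedInput syndFeedInput = new SyndFeedInput();
        // Permit DOCTYPE declarations in the feed XML; enable only for trusted sources.
        syndFeedInput.setAllowDoctypes(true);
        source.setSyndFeedInput(syndFeedInput);
        return source;
    }

}
```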

If the connection to the feed needs customization (for example, connection and read timeouts), use an org.springframework.core.io.UrlResource extension that overrides customizeConnection(HttpURLConnection) instead of injecting a plain URL into the FeedEntryMessageSource. For example:

@Bean
@InboundChannelAdapter("feedChannel")
FeedEntryMessageSource feedEntrySource() {
    // `url` is the feed location, defined elsewhere.
    UrlResource urlResource =
            new UrlResource(url) {

                @Override
                protected void customizeConnection(HttpURLConnection connection) throws IOException {
                    super.customizeConnection(connection);
                    connection.setConnectTimeout(10000); // milliseconds
                    connection.setReadTimeout(5000); // milliseconds
                }
            };
    return new FeedEntryMessageSource(urlResource, "myKey");
}