Java XML: A Concise Tutorial

Java SAX Parser - Overview

Java SAX (Simple API for XML) is an event-based parser for XML documents. Unlike a DOM parser, a SAX parser does not build a parse tree and does not load the entire document into memory. Instead, it reads through the XML document and notifies the client program, in the form of events, whenever it encounters elements, attributes, text content, and other data items. These events are handled by methods implemented in an event handler.

What does SAX Parser do?

A SAX parser does the following for a client program:

  1. Reads the XML document from top to bottom and identifies the tokens.

  2. Processes the tokens in the order in which they appear.

  3. Reports the nature of the tokens to the application program.

  4. Invokes the callback methods in the Event handler based on the identified tokens.
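The steps above can be illustrated with a small trace. This is a minimal sketch, assuming the JAXP parser bundled with the JDK; the class name SaxEventTrace and the method trace are hypothetical names chosen for illustration. It records each callback as the parser fires it, so the order of the events mirrors the order of the tokens in the document.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of any library.
class SaxEventTrace {

    // Parses the XML text and returns the callbacks in the order they fired.
    static List<String> trace(String xml) throws Exception {
        List<String> events = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override public void startDocument() { events.add("startDocument"); }
            @Override public void endDocument() { events.add("endDocument"); }
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                events.add("startElement:" + qName);
            }
            @Override
            public void endElement(String uri, String localName, String qName) {
                events.add("endElement:" + qName);
            }
            @Override
            public void characters(char[] ch, int start, int length) {
                // Skip whitespace-only text so the trace stays readable.
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) events.add("characters:" + text);
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return events;
    }

    public static void main(String[] args) throws Exception {
        trace("<note><to>Ann</to></note>").forEach(System.out::println);
    }
}
```

For the sample input, the trace should begin with startDocument and end with endDocument, with the events for the to element nested between the start and end events of note.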

When to Use Java SAX Parser?

You should use a SAX parser when:

  1. You want to process an XML document in a linear fashion from top to bottom.

  2. The document is not deeply nested.

  3. Your XML document is very large.

  4. The problem to be solved involves only a part of the XML document.

  5. You have streaming data (data is available as soon as it is seen by the parser).
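As an illustration of the case where only part of the document matters, the following sketch reads an InputStream and keeps nothing except the text of title elements. The class name TitleExtractor, the method titles, and the sample library/book/title markup are all assumptions made up for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of any library.
class TitleExtractor {

    // Collects the text of every <title> element and ignores everything else.
    static List<String> titles(InputStream in) throws Exception {
        List<String> found = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private StringBuilder current; // non-null only while inside <title>

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                if ("title".equals(qName)) current = new StringBuilder();
            }
            @Override
            public void characters(char[] ch, int start, int length) {
                // The parser may deliver one text node in several chunks, so append.
                if (current != null) current.append(ch, start, length);
            }
            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("title".equals(qName)) {
                    found.add(current.toString());
                    current = null;
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser().parse(in, handler);
        return found;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<library><book><title>Dune</title><pages>612</pages></book>"
                   + "<book><title>Emma</title><pages>474</pages></book></library>";
        System.out.println(titles(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8))));
    }
}
```

Because the handler stores only the titles, memory use depends on the amount of data kept rather than on the size of the input stream.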

Advantages

Following are some advantages of the Java SAX parser:

  1. It consumes less memory than a DOM parser.

  2. It is typically faster than a DOM parser, because processing can start without waiting for the entire document to be loaded into memory.

  3. It can process XML documents that are larger than the available system memory.

Disadvantages

Here are some disadvantages of the Java SAX parser:

  1. Random access to an XML document is not possible.

  2. Creating XML documents is not possible, because the API only reads XML.

  3. If you want to keep track of data that the parser has seen or change the order of items, you must write the code and store the data on your own.
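The last point can be made concrete with a sketch in which the application keeps its own state. Here a stack of open element names is maintained by hand so that each element can be reported with its full path. The class name PathTracker and the method paths are hypothetical names used only for this example.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of any library.
class PathTracker {

    // Returns the path of every element, in document order.
    static List<String> paths(String xml) throws Exception {
        List<String> result = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            // SAX does not remember ancestors, so the handler tracks them itself.
            private final Deque<String> open = new ArrayDeque<>();

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                open.addLast(qName);
                result.add("/" + String.join("/", open));
            }
            @Override
            public void endElement(String uri, String localName, String qName) {
                open.removeLast();
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return result;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(paths("<a><b><c/></b><d/></a>"));
    }
}
```

With a DOM parser the same information would be available from parent links in the tree; with SAX, the stack is the application's responsibility.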

ContentHandler Interface

ContentHandler is the main interface in the org.xml.sax package. Most application programs implement this interface to handle basic parsing events. These events include the start and end of the document, the start and end of elements, and character data. To do any work with an XML document, we must implement a handler and register it with the parser.

Several built-in classes implement the ContentHandler interface, namely DefaultHandler, DefaultHandler2, and ValidatorHandler. We can use these classes as the basis for our own user-defined handlers.
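A user-defined handler might look like the following minimal sketch. It extends DefaultHandler, overrides a single callback, and is registered through XMLReader.setContentHandler. The class name ElementCounter and the method countElements are hypothetical names invented for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: counts elements and inherits empty callbacks for everything else.
class ElementCounter extends DefaultHandler {
    private int count;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        count++;
    }

    int getCount() {
        return count;
    }

    // Registers the handler with an XMLReader and parses the given XML text.
    static int countElements(String xml) throws Exception {
        ElementCounter handler = new ElementCounter();
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setContentHandler(handler);
        reader.parse(new InputSource(new StringReader(xml)));
        return handler.getCount();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(countElements("<a><b/><b/></a>"));
    }
}
```

Extending DefaultHandler keeps the handler short, because only the callbacks of interest need to be overridden.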

This interface specifies the callback methods that the SAX parser uses to notify an application program of the components of the XML document that it has seen. Its methods include startDocument, endDocument, startElement, endElement, and characters.
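For reference, the sketch below implements ContentHandler directly instead of extending a helper class, with one comment per callback. The method list reflects the abstract methods of org.xml.sax.ContentHandler as shipped with the JDK, as far as I am aware; newer JDK versions may add default methods on top of these. The class name CallbackCounter and the method count are hypothetical.

```java
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.XMLReader;

// Hypothetical handler that counts how often each callback fires.
class CallbackCounter implements ContentHandler {
    final Map<String, Integer> calls = new TreeMap<>();

    private void hit(String name) { calls.merge(name, 1, Integer::sum); }

    // Receives an object that can report the current position in the document.
    @Override public void setDocumentLocator(Locator locator) { hit("setDocumentLocator"); }
    // Called once at the beginning and once at the end of the document.
    @Override public void startDocument() { hit("startDocument"); }
    @Override public void endDocument() { hit("endDocument"); }
    // Called when a namespace prefix mapping comes into or goes out of scope.
    @Override public void startPrefixMapping(String prefix, String uri) { hit("startPrefixMapping"); }
    @Override public void endPrefixMapping(String prefix) { hit("endPrefixMapping"); }
    // Called for every start tag and every end tag.
    @Override public void startElement(String uri, String localName, String qName, Attributes atts) { hit("startElement"); }
    @Override public void endElement(String uri, String localName, String qName) { hit("endElement"); }
    // Called for character data, possibly in several chunks.
    @Override public void characters(char[] ch, int start, int length) { hit("characters"); }
    // Called for whitespace in element content that the parser classifies as ignorable.
    @Override public void ignorableWhitespace(char[] ch, int start, int length) { hit("ignorableWhitespace"); }
    // Called for processing instructions such as <?target data?>.
    @Override public void processingInstruction(String target, String data) { hit("processingInstruction"); }
    // Called when the parser skips an entity it did not expand.
    @Override public void skippedEntity(String name) { hit("skippedEntity"); }

    static Map<String, Integer> count(String xml) throws Exception {
        CallbackCounter handler = new CallbackCounter();
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setContentHandler(handler);
        reader.parse(new InputSource(new StringReader(xml)));
        return handler.calls;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(count("<?style plain?><a><b>x</b></a>"));
    }
}
```

Implementing the interface directly forces every callback to be written out, which is why most handlers extend DefaultHandler instead.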

Attributes Interface

The Attributes interface is in the org.xml.sax package. It represents the list of XML attributes specified on an element. Its most commonly used methods include getLength, getQName, and getValue.
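As a brief illustration, the sketch below reads the attributes passed to startElement by index, using getLength, getQName(int), and getValue(int). The class name AttributeReader, the method attributesOf, and the sample book markup are assumptions made for this example.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of any library.
class AttributeReader {

    // Returns the attributes of the first element with the given name.
    static Map<String, String> attributesOf(String xml, String elementName) throws Exception {
        Map<String, String> result = new LinkedHashMap<>();
        DefaultHandler handler = new DefaultHandler() {
            private boolean done;

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                if (done || !elementName.equals(qName)) return;
                // Attributes are indexed from 0 to getLength() - 1; SAX does not
                // promise that this order matches the order in the document.
                for (int i = 0; i < atts.getLength(); i++) {
                    result.put(atts.getQName(i), atts.getValue(i));
                }
                done = true;
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return result;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<shelf><book id=\"42\" lang=\"en\">Dune</book></shelf>";
        System.out.println(attributesOf(xml, "book"));
    }
}
```

When the attribute name is known in advance, a lookup such as atts.getValue("id") avoids the loop; it returns null if the element has no such attribute.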