Java XML Concise Tutorial

Java StAX Parser - Parse XML Document

The Java StAX parser API provides classes, methods and interfaces for parsing XML documents as a sequence of events. It is a pull-based API: the client program asks the parser for the next event only when it needs one, rather than having every event pushed to it. In this chapter, we look in detail at how to parse XML documents in Java using the StAX parser API.

Parse XML Using Java StAX Parser

The following steps are used to parse a document with the Java StAX parser:

  1. Step 1: Creating XMLInputFactory instance

  2. Step 2: Reading the XML

  3. Step 3: Parsing the XML

  4. Step 4: Retrieving the Elements

Step 1: Creating XMLInputFactory instance

XMLInputFactory is an abstract class that acts as a factory for StAX readers such as XMLEventReader. To create a new instance of XMLInputFactory, we call its static newInstance() method. If an instance of the factory cannot be loaded, the method throws a FactoryConfigurationError.

XMLInputFactory factory = XMLInputFactory.newInstance();
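As a minimal sketch of this step, the factory creation can be wrapped in a helper that reports a FactoryConfigurationError instead of letting it propagate. The class name FactorySketch and the method createFactory() are illustrative names, not part of the StAX API, and setting the IS_COALESCING property is an optional choice made here so that adjacent text is reported as a single event.

```java
import javax.xml.stream.FactoryConfigurationError;
import javax.xml.stream.XMLInputFactory;

public class FactorySketch {

   // Hypothetical helper: returns a factory, or null if no StAX
   // implementation could be loaded.
   public static XMLInputFactory createFactory() {
      try {
         XMLInputFactory factory = XMLInputFactory.newInstance();
         // Optional: ask the parser to merge adjacent character data.
         factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
         return factory;
      } catch (FactoryConfigurationError e) {
         System.err.println("No StAX implementation available: " + e.getMessage());
         return null;
      }
   }

   public static void main(String[] args) {
      XMLInputFactory factory = createFactory();
      System.out.println("Factory created: " + (factory != null));
   }
}
```

Returning null keeps the sketch short; a real program might prefer to let the error propagate, since a missing StAX implementation usually cannot be recovered from.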

Step 2: Reading the XML

The FileReader class reads a stream of characters from the input file. The following statement throws a FileNotFoundException if the file does not exist or cannot be opened for reading for some other reason.

FileReader fileReader = new FileReader("src/input.txt");

Instead of reading the XML content from a file, we can also hold the content in a string and convert it into bytes as follows:

StringBuilder xmlBuilder = new StringBuilder();
xmlBuilder.append("<class>xyz</class>");
ByteArrayInputStream input = new ByteArrayInputStream(xmlBuilder.toString().getBytes("UTF-8"));
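A variant of the same idea, sketched under the assumption that the XML is already available as a String: passing StandardCharsets.UTF_8 to getBytes() avoids the checked UnsupportedEncodingException that the "UTF-8" string form can throw, and a StringReader can be used when a character stream is preferred. The class and method names below are illustrative only.

```java
import java.io.ByteArrayInputStream;
import java.io.Reader;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;

public class XmlSourceSketch {

   // Hypothetical helper: wraps the XML text in a byte stream (UTF-8 encoded).
   public static ByteArrayInputStream asStream(String xml) {
      return new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
   }

   // Hypothetical helper: wraps the XML text in a character stream.
   public static Reader asReader(String xml) {
      return new StringReader(xml);
   }

   public static void main(String[] args) {
      String xml = "<class>xyz</class>";
      ByteArrayInputStream input = asStream(xml);
      // The sample document is 18 ASCII characters, so 18 bytes in UTF-8.
      System.out.println("Bytes available: " + input.available());
   }
}
```

Either object can be handed to the factory in the next step, because createXMLEventReader() accepts both an InputStream and a Reader.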

Step 3: Parsing the XML

To parse XML events, we create an XMLEventReader from the XMLInputFactory object by passing it either the FileReader object or the input stream object. If the XMLEventReader cannot be created, the method throws an XMLStreamException.

XMLEventReader eventReader = factory.createXMLEventReader(input);
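The following sketch shows one way to create the reader, consume it, and release it again. XMLEventReader has a close() method but does not implement AutoCloseable, so a finally block is used here. The class name, the countEvents() helper and the choice to rethrow as IllegalStateException are assumptions made for illustration.

```java
import java.io.StringReader;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;

public class EventReaderSketch {

   // Hypothetical helper: counts every event the parser reports for the document.
   public static int countEvents(String xml) {
      XMLInputFactory factory = XMLInputFactory.newInstance();
      XMLEventReader eventReader = null;
      try {
         eventReader = factory.createXMLEventReader(new StringReader(xml));
         int count = 0;
         while (eventReader.hasNext()) {
            eventReader.nextEvent();
            count++;
         }
         return count;
      } catch (XMLStreamException e) {
         // Raised when the reader cannot be created or the XML is malformed.
         throw new IllegalStateException("Could not parse XML", e);
      } finally {
         if (eventReader != null) {
            try {
               eventReader.close();
            } catch (XMLStreamException ignored) {
               // Nothing useful can be done if closing fails.
            }
         }
      }
   }

   public static void main(String[] args) {
      System.out.println("Events: " + countEvents("<class>xyz</class>"));
   }
}
```

Note that closing the XMLEventReader frees the parser's resources but does not close the underlying stream or reader; that remains the caller's responsibility.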

Step 4: Retrieving the Elements

The nextEvent() method of XMLEventReader returns the next XML event as an XMLEvent object. XMLEvent provides methods such as asStartElement(), asEndElement() and asCharacters() that return the event as a StartElement, EndElement or Characters object.

XMLEvent event = eventReader.nextEvent();
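To illustrate how the events arrive, the sketch below pulls every event from a small document and records the three kinds discussed in this chapter. It uses the isStartElement(), isCharacters() and isEndElement() checks of XMLEvent as an alternative to comparing getEventType() with the XMLStreamConstants values. The class name and the describe() helper are illustrative.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

public class EventLoopSketch {

   // Hypothetical helper: lists start, text and end events in document order.
   public static List<String> describe(String xml) {
      List<String> seen = new ArrayList<>();
      try {
         XMLInputFactory factory = XMLInputFactory.newInstance();
         XMLEventReader reader = factory.createXMLEventReader(new StringReader(xml));
         while (reader.hasNext()) {
            XMLEvent event = reader.nextEvent();
            if (event.isStartElement()) {
               seen.add("START " + event.asStartElement().getName().getLocalPart());
            } else if (event.isCharacters()) {
               seen.add("TEXT " + event.asCharacters().getData());
            } else if (event.isEndElement()) {
               seen.add("END " + event.asEndElement().getName().getLocalPart());
            }
            // Other events (start and end of document, comments) are ignored.
         }
         reader.close();
      } catch (XMLStreamException e) {
         throw new IllegalStateException("Could not parse XML", e);
      }
      return seen;
   }

   public static void main(String[] args) {
      for (String line : describe("<class>xyz</class>")) {
         System.out.println(line);
      }
   }
}
```

For the sample document the helper records a start event for class, a text event carrying xyz, and an end event for class, in that order.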

Retrieving Element Name

To retrieve an element name, we first get the element from the XML document. When the event is of type XMLStreamConstants.START_ELEMENT, calling asStartElement() on the XMLEvent object returns the element as a StartElement object.

The getName() method of StartElement returns the name of the element as a QName object. Its getLocalPart() method returns the local name as a String; printing the QName of an element that has no namespace shows the same local name.
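Because getName() returns a QName, the parts of a namespaced name can be read separately. The sketch below assumes a made-up namespace URI, http://example.com/school, and a made-up prefix s purely for illustration; the class and helper names are hypothetical as well.

```java
import java.io.StringReader;
import javax.xml.namespace.QName;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

public class ElementNameSketch {

   // Hypothetical helper: returns the name of the first element, or null if none.
   public static QName firstElementName(String xml) {
      try {
         XMLEventReader reader =
            XMLInputFactory.newInstance().createXMLEventReader(new StringReader(xml));
         try {
            while (reader.hasNext()) {
               XMLEvent event = reader.nextEvent();
               if (event.isStartElement()) {
                  return event.asStartElement().getName();
               }
            }
            return null;
         } finally {
            reader.close();
         }
      } catch (XMLStreamException e) {
         throw new IllegalStateException("Could not parse XML", e);
      }
   }

   public static void main(String[] args) {
      // The namespace URI and prefix below are invented for this example.
      QName name = firstElementName(
         "<s:class xmlns:s=\"http://example.com/school\">xyz</s:class>");
      System.out.println("Local part   : " + name.getLocalPart());
      System.out.println("Namespace URI: " + name.getNamespaceURI());
      System.out.println("Prefix       : " + name.getPrefix());
   }
}
```

When only the tag name matters, getLocalPart() is usually the value to compare against, since it is unaffected by which prefix the document happens to use.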

Example

The RetrieveElementName.java program holds the XML content in a StringBuilder object and converts it into bytes. The resulting InputStream is used to create an XMLEventReader, and the element is accessed through the events reported by the parser.

import java.io.ByteArrayInputStream;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.events.StartElement;
import javax.xml.stream.events.XMLEvent;

public class RetrieveElementName {
   public static void main(String[] args) {
      try {

         //Creating XMLInputFactory instance
         XMLInputFactory factory = XMLInputFactory.newInstance();

         //Reading the XML
         StringBuilder xmlBuilder = new StringBuilder();
         xmlBuilder.append("<class>xyz</class>");
         ByteArrayInputStream input = new ByteArrayInputStream(xmlBuilder.toString().getBytes("UTF-8"));

         //Parsing the XML
         XMLEventReader eventReader =
            factory.createXMLEventReader(input);

         //Retrieving the Elements
         while(eventReader.hasNext()) {
            XMLEvent event = eventReader.nextEvent();
            if(event.getEventType()==XMLStreamConstants.START_ELEMENT) {
               StartElement startElement = event.asStartElement();
               System.out.println("Element Name: " + startElement.getName());
            }
         }
      } catch(Exception e) {
         e.printStackTrace();
      }
   }
}

Output

The name of the element is displayed on the output screen.

Element Name: class

Retrieving Text Content

To retrieve the text content of an element, we call the asCharacters() method on the XMLEvent object. This method can be used only when the event is of type XMLStreamConstants.CHARACTERS. It returns the data as a Characters object, whose getData() method returns the text content as a String.
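In an indented document, the line breaks and spaces between elements are also reported as character events. A minimal sketch of filtering them out with the isWhiteSpace() method of Characters follows; the class name and the textContent() helper are illustrative names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.Characters;
import javax.xml.stream.events.XMLEvent;

public class TextContentSketch {

   // Hypothetical helper: collects text content, skipping whitespace-only runs.
   public static List<String> textContent(String xml) {
      List<String> texts = new ArrayList<>();
      try {
         XMLEventReader reader =
            XMLInputFactory.newInstance().createXMLEventReader(new StringReader(xml));
         while (reader.hasNext()) {
            XMLEvent event = reader.nextEvent();
            if (event.isCharacters()) {
               Characters characters = event.asCharacters();
               if (!characters.isWhiteSpace()) {
                  texts.add(characters.getData());
               }
            }
         }
         reader.close();
      } catch (XMLStreamException e) {
         throw new IllegalStateException("Could not parse XML", e);
      }
      return texts;
   }

   public static void main(String[] args) {
      String xml = "<class>\n   <student>xyz</student>\n</class>";
      System.out.println("Text Content : " + textContent(xml));
   }
}
```

Without the isWhiteSpace() check, the indentation around the student element would appear as extra entries in the result.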

Example

In the previous example, we supplied the XML content as an input stream. Now, let us read the input from a file instead. Save the following XML content in a file named classData.xml:

<class>xyz</class>

In the following RetrievingTextContent.java program, we read the classData.xml file using a FileReader object and pass it as input to the XMLEventReader. Using the XMLEvent object, we obtain the text content of the element.

import java.io.FileReader;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.events.Characters;
import javax.xml.stream.events.XMLEvent;

public class RetrievingTextContent {
   public static void main(String[] args) {
      try {

         //Creating XMLInputFactory instance
         XMLInputFactory factory = XMLInputFactory.newInstance();

         //Reading the XML
         FileReader fileReader = new FileReader("classData.xml");

         //Parsing the XML
         XMLEventReader eventReader =
            factory.createXMLEventReader(fileReader);

         //Retrieving the Elements
         while(eventReader.hasNext()) {
            XMLEvent event = eventReader.nextEvent();
            if(event.getEventType()==XMLStreamConstants.CHARACTERS) {
               Characters characters = event.asCharacters();
               System.out.println("Text Content : "+ characters.getData());
            }
         }
      } catch(Exception e) {
         e.printStackTrace();
      }
   }
}

Output

The text content of the element is displayed on the output screen.

Text Content : xyz

Retrieving Attributes

The getAttributes() method of the StartElement interface returns a read-only Iterator over the attributes declared on the element. If no attributes are declared on the element, it returns an empty iterator.

The getValue() method of the Attribute interface returns the value of the attribute as a String.
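The sketch below walks the whole iterator instead of reading only the first attribute, which is safer when an element may carry several attributes or none. The section attribute in the sample input is invented for illustration, as are the class and helper names. The iterator is declared as Iterator<?> with a cast, an assumption made so that the code compiles cleanly on both older JDKs (where getAttributes() returns a raw Iterator) and newer ones.

```java
import java.io.StringReader;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.Attribute;
import javax.xml.stream.events.StartElement;
import javax.xml.stream.events.XMLEvent;

public class AttributeSketch {

   // Hypothetical helper: maps attribute names to values for the first element.
   public static Map<String, String> firstElementAttributes(String xml) {
      Map<String, String> result = new LinkedHashMap<>();
      try {
         XMLEventReader reader =
            XMLInputFactory.newInstance().createXMLEventReader(new StringReader(xml));
         while (reader.hasNext()) {
            XMLEvent event = reader.nextEvent();
            if (event.isStartElement()) {
               StartElement start = event.asStartElement();
               Iterator<?> attributes = start.getAttributes();
               while (attributes.hasNext()) {
                  Attribute attribute = (Attribute) attributes.next();
                  result.put(attribute.getName().getLocalPart(), attribute.getValue());
               }
               break; // Only the first element is inspected in this sketch.
            }
         }
         reader.close();
      } catch (XMLStreamException e) {
         throw new IllegalStateException("Could not parse XML", e);
      }
      return result;
   }

   public static void main(String[] args) {
      // The "section" attribute is made up for this example.
      String xml = "<student rollno = \"393\" section = \"A\"/>";
      System.out.println(firstElementAttributes(xml));
   }
}
```

An element without attributes simply yields an empty map, because the iterator returned by getAttributes() is empty rather than null.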

Example

The following classData.xml holds the information of three students, with their roll numbers stored as attributes. Let us retrieve this information using the StAX API in Java.

<?xml version = "1.0"?>
<class>
   <student rollno = "393">
      <firstname>dinkar</firstname>
      <lastname>kad</lastname>
      <nickname>dinkar</nickname>
      <marks>85</marks>
   </student>

   <student rollno = "493">
      <firstname>Vaneet</firstname>
      <lastname>Gupta</lastname>
      <nickname>vinni</nickname>
      <marks>95</marks>
   </student>

   <student rollno = "593">
      <firstname>jasvir</firstname>
      <lastname>singn</lastname>
      <nickname>jazz</nickname>
      <marks>90</marks>
   </student>
</class>

In the following RetrievingAttributes.java program, we use a switch statement with cases for the START_ELEMENT, CHARACTERS and END_ELEMENT constants of XMLStreamConstants to access all the information of the elements.

import java.io.FileNotFoundException;
import java.io.FileReader;
import java.util.Iterator;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.Attribute;
import javax.xml.stream.events.Characters;
import javax.xml.stream.events.EndElement;
import javax.xml.stream.events.StartElement;
import javax.xml.stream.events.XMLEvent;

public class RetrievingAttributes {
   public static void main(String[] args) {
      boolean bFirstName = false;
      boolean bLastName = false;
      boolean bNickName = false;
      boolean bMarks = false;

      try {

         //Creating XMLInputFactory instance
         XMLInputFactory factory = XMLInputFactory.newInstance();

         //Reading the XML
         FileReader fileReader = new FileReader("classData.xml");

         //Parsing the XML
         XMLEventReader eventReader =
         factory.createXMLEventReader(fileReader);

         //Retrieving the Elements
         while(eventReader.hasNext()) {
            XMLEvent event = eventReader.nextEvent();

            switch(event.getEventType()) {

               case XMLStreamConstants.START_ELEMENT:
                  StartElement startElement = event.asStartElement();
                  String qName = startElement.getName().getLocalPart();

                  if (qName.equalsIgnoreCase("student")) {
                     System.out.println("Start Element : student");
                     Iterator<Attribute> attributes = startElement.getAttributes();
                     String rollNo = attributes.next().getValue();
                     System.out.println("Roll No : " + rollNo);
                  } else if (qName.equalsIgnoreCase("firstname")) {
                     bFirstName = true;
                  } else if (qName.equalsIgnoreCase("lastname")) {
                     bLastName = true;
                  } else if (qName.equalsIgnoreCase("nickname")) {
                     bNickName = true;
                  } else if (qName.equalsIgnoreCase("marks")) {
                     bMarks = true;
                  }
                  break;

               case XMLStreamConstants.CHARACTERS:
                  Characters characters = event.asCharacters();
                  if(bFirstName) {
                     System.out.println("First Name: " + characters.getData());
                     bFirstName = false;
                  }
                  if(bLastName) {
                     System.out.println("Last Name: " + characters.getData());
                     bLastName = false;
                  }
                  if(bNickName) {
                     System.out.println("Nick Name: " + characters.getData());
                     bNickName = false;
                  }
                  if(bMarks) {
                     System.out.println("Marks: " + characters.getData());
                     bMarks = false;
                  }
                  break;

               case XMLStreamConstants.END_ELEMENT:
                  EndElement endElement = event.asEndElement();

                  if(endElement.getName().getLocalPart().equalsIgnoreCase("student")) {
                     System.out.println("End Element : student");
                     System.out.println();
                  }
                  break;
            }
         }
      } catch (FileNotFoundException e) {
         e.printStackTrace();
      } catch (XMLStreamException e) {
         e.printStackTrace();
      }
   }
}

Output

The information of all the students, along with their roll numbers, is displayed on the output screen.

Start Element : student
Roll No : 393
First Name: dinkar
Last Name: kad
Nick Name: dinkar
Marks: 85
End Element : student

Start Element : student
Roll No : 493
First Name: Vaneet
Last Name: Gupta
Nick Name: vinni
Marks: 95
End Element : student

Start Element : student
Roll No : 593
First Name: jasvir
Last Name: singn
Nick Name: jazz
Marks: 90
End Element : student