Java XML Concise Tutorial

Java DOM4J Parser - Parse XML Document

DOM4J is a Java API for parsing XML documents. It creates a DOM4J document using a SAX parser or a DOM parser. Once we have the document, we can retrieve information about elements and attributes using the methods of the DOM4J Document and Element interfaces.

In this chapter, we use getRootElement() to extract the root element of the document and the elements() method to get all of its child elements.
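The rest of this chapter uses the SAX route. For the DOM route mentioned above, the following is a minimal sketch, assuming the XML has already been parsed into a W3C DOM tree by the JDK's DocumentBuilder and that dom4j's DOMReader class is used for the conversion. The class name DomToDom4j and the helper fromDom() are hypothetical names used only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.dom4j.Document;
import org.dom4j.io.DOMReader;

class DomToDom4j {

   // Hypothetical helper: parses XML text with the JDK DOM parser,
   // then converts the resulting W3C DOM tree into a DOM4J document
   static Document fromDom(String xml) throws Exception {
      DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
      factory.setNamespaceAware(true);
      org.w3c.dom.Document w3cDocument = factory.newDocumentBuilder()
         .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

      //Converting the W3C DOM document to a DOM4J document
      return new DOMReader().read(w3cDocument);
   }

   public static void main(String[] args) throws Exception {
      Document document = fromDom("<class><student rollno=\"393\"/></class>");
      System.out.println("Root element Name :" + document.getRootElement().getName());
   }
}
```

Once converted, the document is used exactly like one produced by SAXReader.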

Parse XML Using Java DOM4J Parser

The following steps are used to parse a document with the Java DOM4J parser:

  1. Step 1: Creating a SAXReader object
  2. Step 2: Reading the XML file
  3. Step 3: Parsing the XML
  4. Step 4: Extracting the root
  5. Step 5: Retrieving elements

Step 1: Creating SAXReader object

The SAXReader class is used to create a DOM4J document from an XML file or stream. It uses a SAX parser internally to parse the input. We create a SAXReader object as follows:

SAXReader reader = new SAXReader();

Step 2: Reading the XML file

To read XML content held in a string, we can build it with the StringBuilder class and later convert it into a byte stream to create the XML document. If the XML content is available as a file, we can read it using the File class of java.io as follows:

File inputFile = new File("src/input.txt");
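When the XML is assembled in memory rather than stored in a file, the StringBuilder route mentioned above can be sketched as follows. This is only an illustration: the class name XmlStringSource and the helper toStream() are hypothetical, the sketch uses JDK classes only, and it assumes UTF-8 encoding.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

class XmlStringSource {

   // Hypothetical helper: wraps XML text in an in-memory byte stream (UTF-8 assumed)
   static ByteArrayInputStream toStream(CharSequence xml) {
      byte[] bytes = xml.toString().getBytes(StandardCharsets.UTF_8);
      return new ByteArrayInputStream(bytes);
   }

   public static void main(String[] args) {
      //Building the XML content as a string
      StringBuilder xml = new StringBuilder();
      xml.append("<class>");
      xml.append("<student rollno=\"393\"/>");
      xml.append("</class>");

      //Converting the string into a byte stream
      ByteArrayInputStream stream = toStream(xml);
      System.out.println("Bytes available : " + stream.available());
   }
}
```

A stream produced this way can be handed to the read(InputStream) overload of SAXReader in place of a File.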

Step 3: Parsing the XML

To parse the XML file, we use the SAXReader object created in step 1. Passing the File from step 2 to the read() method of SAXReader creates the DOM4J document, as follows:

Document document = reader.read(inputFile);
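The read() method throws the checked DocumentException when the input cannot be parsed, so callers have to handle it. The following is a minimal sketch of that behaviour, assuming the XML is supplied as a string through the read(Reader) overload. The class name ParseCheck and the helper isWellFormed() are hypothetical names used only for illustration.

```java
import java.io.StringReader;
import org.dom4j.DocumentException;
import org.dom4j.io.SAXReader;

class ParseCheck {

   // Hypothetical helper: reports whether SAXReader can parse the given text
   static boolean isWellFormed(String xml) {
      try {
         new SAXReader().read(new StringReader(xml));
         return true;
      } catch (DocumentException e) {
         //read() signals parse failures with DocumentException
         return false;
      }
   }

   public static void main(String[] args) {
      //Matching tags: parses successfully
      System.out.println(isWellFormed("<class><student/></class>"));

      //The student tag is never closed: parsing fails
      System.out.println(isWellFormed("<class><student></class>"));
   }
}
```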

Step 4: Extracting the Root

The root element must be extracted from the DOM4J document before any information about its elements can be obtained. The getRootElement() method of the Document interface returns the root element, as follows:

Element rootElement = document.getRootElement();

Step 5: Retrieving Elements

After the first four steps, we have the root element, through which we can reach its child elements. The following sections retrieve the root, retrieve attributes and retrieve element text from an XML document, with examples.

Retrieving the Root

The getRootElement() method of the Document interface returns the root element of the document as an Element object. To get the name of the element, we use the getName() method of the Element interface, which returns the name as a String.

Example

The studentData.xml file we need to parse is as follows:

<?xml version = "1.0"?>
<class>
   <student rollno = "393">
      <firstname>dinkar</firstname>
      <lastname>kad</lastname>
      <nickname>dinkar</nickname>
      <marks>85</marks>
   </student>

   <student rollno = "493">
      <firstname>Vaneet</firstname>
      <lastname>Gupta</lastname>
      <nickname>vinni</nickname>
      <marks>95</marks>
   </student>

   <student rollno = "593">
      <firstname>jasvir</firstname>
      <lastname>singn</lastname>
      <nickname>jazz</nickname>
      <marks>90</marks>
   </student>
</class>

The RetrieveRoot.java program reads the above studentData.xml file using a SAXReader and obtains a DOM4J document. After getting the document, we use the getRootElement() method to extract the root.

import java.io.File;
import org.dom4j.Document;
import org.dom4j.Element;
import org.dom4j.io.SAXReader;

public class RetrieveRoot {
   public static void main(String[] args) {
      try {

         //Creating SAXReader
         SAXReader reader = new SAXReader();

         //Reading the XML file
         File inputFile = new File("studentData.xml");

         //Parsing the XML
         Document document = reader.read(inputFile);

         //Extracting the root
         Element rootElement = document.getRootElement();

         //Printing the Root Element Name
         System.out.println("Root element Name :" + rootElement.getName());
      } catch(Exception e) {
         e.printStackTrace();
      }
   }
}

Output

The root element name, 'class', is displayed on the output screen.

Root element Name :class

Retrieving Attributes

The attributeValue() method of the Element interface retrieves the value of the specified attribute as a String. If the element has no such attribute, it returns null. If the attribute is present but has no value, it returns an empty string.

Example

The following RetrieveAttributes.java program uses the attributeValue() method to retrieve the roll numbers of all the student elements.

import java.io.File;
import java.util.List;
import org.dom4j.Document;
import org.dom4j.Element;
import org.dom4j.io.SAXReader;

public class RetrieveAttributes {
   public static void main(String[] args) {
      try {

         //Creating SAXReader
         SAXReader reader = new SAXReader();

         //Reading the XML file
         File inputFile = new File("studentData.xml");

         //Parsing the XML
         Document document = reader.read(inputFile);

         //Extracting the root
         Element rootElement = document.getRootElement();

         //Iterating over the List
         List<Element> elements = rootElement.elements();
         System.out.println("Student Roll numbers - ");
         for (Element ele : elements) {
            System.out.println(ele.attributeValue("rollno"));
         }
      } catch(Exception e) {
         e.printStackTrace();
      }
   }
}

Output

All the student roll numbers are displayed on the output screen.

Student Roll numbers -
393
493
593
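The null and empty-string cases described for attributeValue() can be checked with a short sketch like the one below. It assumes the XML snippets are parsed from strings with DocumentHelper.parseText() instead of from a file. The class name AttributeValueCases and the helper rollnoOf() are hypothetical names used only for illustration.

```java
import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;

class AttributeValueCases {

   // Hypothetical helper: parses a one-element XML snippet and reads its rollno attribute
   static String rollnoOf(String xml) throws DocumentException {
      Document document = DocumentHelper.parseText(xml);
      Element rootElement = document.getRootElement();
      return rootElement.attributeValue("rollno");
   }

   public static void main(String[] args) throws DocumentException {
      //Attribute present with a value
      System.out.println("Present : " + rollnoOf("<student rollno=\"393\"/>"));

      //Attribute missing: attributeValue() returns null
      System.out.println("Missing is null : " + (rollnoOf("<student/>") == null));

      //Attribute present but empty: attributeValue() returns ""
      System.out.println("Empty is empty string : " + rollnoOf("<student rollno=\"\"/>").isEmpty());
   }
}
```

Because a missing attribute yields null, code that calls String methods on the result should check for null first, or use the attributeValue(name, defaultValue) overload to supply a fallback.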

Retrieving Element Text

The elements() method of the Element interface returns the list of child elements contained in the element. The elementText() method of the Element interface returns the text content of the named child element as a String.

Example

The following DemoParse.java program prints the roll number, names and marks of every student element.

import java.io.File;
import java.util.List;
import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.Element;
import org.dom4j.io.SAXReader;

public class DemoParse {
   public static void main(String[] args) {
      try {

         //Creating SAXReader
         SAXReader reader = new SAXReader();

         //Reading the XML file
         File inputFile = new File("studentData.xml");

         //Parsing the XML
         Document document = reader.read(inputFile);

         //Extracting the root
         Element rootElement = document.getRootElement();
         System.out.println("Root Element: " + rootElement.getName());
         List<Element> elements = rootElement.elements();
         System.out.println("---------------------------------");

         //Iterating over the List
         for (Element ele : elements) {
            System.out.println("\nCurrent Element :"
               + ele.getName());
            System.out.println("Student roll no : "
               + ele.attributeValue("rollno"));
            System.out.println("First Name : "
               + ele.elementText("firstname"));
            System.out.println("Last Name : "
               + ele.elementText("lastname"));
            System.out.println("Nick Name : "
               + ele.elementText("nickname"));
            System.out.println("Marks : "
               + ele.elementText("marks"));
         }
      } catch (DocumentException e) {
         e.printStackTrace();
      }
   }
}

Output

All the student information is displayed on the output screen.

Root Element: class
---------------------------------

Current Element :student
Student roll no : 393
First Name : dinkar
Last Name : kad
Nick Name : dinkar
Marks : 85

Current Element :student
Student roll no : 493
First Name : Vaneet
Last Name : Gupta
Nick Name : vinni
Marks : 95

Current Element :student
Student roll no : 593
First Name : jasvir
Last Name : singn
Nick Name : jazz
Marks : 90
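Since elementText() always returns a String, numeric content such as marks has to be converted before it can be used in calculations. The following is a minimal sketch of that step, assuming every student element has a marks child holding an integer (a missing child would make elementText() return null and the sketch would fail). The class name AverageMarks and the helper averageMarks() are hypothetical names used only for illustration, and the XML is supplied inline instead of being read from studentData.xml.

```java
import java.util.List;
import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;

class AverageMarks {

   // Hypothetical helper: averages the <marks> text of every <student> child of the root
   static double averageMarks(String xml) throws DocumentException {
      Document document = DocumentHelper.parseText(xml);
      Element rootElement = document.getRootElement();

      //Selecting only the child elements named "student"
      List<Element> students = rootElement.elements("student");
      int total = 0;
      for (Element student : students) {
         //Converting the text content to a number (assumes <marks> is present)
         total += Integer.parseInt(student.elementText("marks").trim());
      }
      return students.isEmpty() ? 0.0 : (double) total / students.size();
   }

   public static void main(String[] args) throws DocumentException {
      String xml = "<class>"
         + "<student rollno=\"393\"><marks>85</marks></student>"
         + "<student rollno=\"493\"><marks>95</marks></student>"
         + "<student rollno=\"593\"><marks>90</marks></student>"
         + "</class>";

      //Prints "Average marks : 90.0"
      System.out.println("Average marks : " + averageMarks(xml));
   }
}
```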