Java XML Concise Tutorial
Java DOM Parser - Parse XML Document
The Java DOM parser is an API for parsing XML documents. Using the methods it provides, we can retrieve the root element, its sub elements, and their attributes.
This tutorial uses the getTagName() method to retrieve the tag name of an element, getFirstChild() to retrieve the first child of an element, and getTextContent() to get the text content of an element.
Parse XML Using Java DOM parser
Having discussed the various XML parsers available in Java, let us now see how to use the DOM parser to parse an XML file with the parse() method. Before jumping into an example, here are the steps to parse an XML document using the Java DOM parser:
Step 1: Creating a DocumentBuilder Object
Step 2: Reading the XML
Step 3: Parsing the XML Document
Step 4: Retrieving the Elements
Step 1: Creating a DocumentBuilder Object
DocumentBuilderFactory is a factory API for obtaining a parser that builds DOM trees from XML documents. Its newDocumentBuilder() method creates an instance of the DocumentBuilder class. A DocumentBuilder accepts input in the form of streams, files, URLs, and SAX InputSources.
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder docBuilder = factory.newDocumentBuilder();
Step 2: Reading the XML
The input can be a file or a stream. To read an XML file, create a File object and pass the file path as the argument.
File xmlFile = new File("input.xml");
To supply the input as a stream, we append the XML string to a StringBuilder, convert the result to bytes, and wrap the bytes in a ByteArrayInputStream. That stream is then passed to the parser as input.
StringBuilder xmlBuilder = new StringBuilder();
xmlBuilder.append("<?xml version=\"1.0\"?> <rootElement></rootElement>");
ByteArrayInputStream input = new ByteArrayInputStream( xmlBuilder.toString().getBytes("UTF-8"));
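As a minimal sketch of the stream route, the helper below wraps an XML string in a ByteArrayInputStream. The class name XmlStreamInput and the method toStream are illustrative names, not part of any API. The sketch passes StandardCharsets.UTF_8 rather than the string "UTF-8", which avoids the checked UnsupportedEncodingException that getBytes(String) declares.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class XmlStreamInput {

    // Hypothetical helper: turns an in-memory XML string into an InputStream.
    static InputStream toStream(String xml) {
        // getBytes(Charset) does not throw a checked exception,
        // unlike getBytes("UTF-8").
        return new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws Exception {
        String xml = "<?xml version=\"1.0\"?><rootElement></rootElement>";
        try (InputStream input = toStream(xml)) {
            // available() reports how many bytes are left in the array-backed stream.
            System.out.println("Bytes ready to parse: " + input.available());
        }
    }
}
```

The returned stream can be handed to DocumentBuilder.parse() in the same way as the ByteArrayInputStream in the snippet above.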
Step 3: Parsing the XML Document
The DocumentBuilder created in Step 1 parses the XML input. Its parse() method accepts a file or an input stream as a parameter and returns a DOM Document object. If the given file or input stream is null, the method throws an IllegalArgumentException.
Document xmldoc = docBuilder.parse(input);
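The behaviour described in this step can be checked with a short sketch. The class ParseInputCheck and its two helpers are hypothetical names introduced for illustration. One helper parses an XML string; the other passes a null stream to parse() and reports whether an IllegalArgumentException is thrown. The cast on null is needed only because parse() is overloaded.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class ParseInputCheck {

    // Hypothetical helper: parses an XML string and returns the DOM Document.
    static Document parseString(String xml) throws Exception {
        DocumentBuilder docBuilder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        InputStream input = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        return docBuilder.parse(input);
    }

    // Returns true when parse() rejects a null stream with IllegalArgumentException.
    static boolean rejectsNullStream() throws Exception {
        DocumentBuilder docBuilder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        try {
            // The cast selects the parse(InputStream) overload.
            docBuilder.parse((InputStream) null);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) throws Exception {
        Document xmldoc = parseString("<rootElement></rootElement>");
        System.out.println("Parsed root: " + xmldoc.getDocumentElement().getTagName());
        System.out.println("Null stream rejected: " + rejectsNullStream());
    }
}
```

Input that is not well-formed XML is a separate case: parse() reports it with a SAXException rather than an IllegalArgumentException.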
Step 4: Retrieving the Elements
The Node and Element interfaces of the org.w3c.dom package provide various methods to retrieve information about the elements of an XML document. This information includes an element's name, text content, attributes, and attribute values.
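To make the kinds of information listed above concrete, here is a small sketch that collects an element's name, attributes, and text content into one string. The class ElementInfo, the helpers describe and parseRoot, and the sample XML are assumptions made for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;

public class ElementInfo {

    // Hypothetical helper: parses an XML string and returns its root element.
    static Element parseRoot(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement();
    }

    // Hypothetical helper: summarises an element as "name[attr=value,...]: text".
    static String describe(Element element) {
        StringBuilder summary = new StringBuilder(element.getTagName()).append('[');
        NamedNodeMap attributes = element.getAttributes();
        for (int i = 0; i < attributes.getLength(); i++) {
            Node attribute = attributes.item(i);
            if (i > 0) {
                summary.append(',');
            }
            summary.append(attribute.getNodeName()).append('=').append(attribute.getNodeValue());
        }
        return summary.append("]: ").append(element.getTextContent()).toString();
    }

    public static void main(String[] args) throws Exception {
        Element department = parseRoot(
                "<department deptcode=\"DEP_CS23\">Computer Science</department>");
        // prints "department[deptcode=DEP_CS23]: Computer Science"
        System.out.println(describe(department));
    }
}
```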
Retrieving Root Element Name
An XML document is made up of many elements. In the Java DOM API, the document as a whole is represented by the Document interface, and each element of an XML/HTML document is represented by the Element interface. Element provides methods to retrieve, add, and modify the contents of an element.
We can retrieve the name of the root element using the getTagName() method of the Element interface. It returns the name as a String.
Since Element is an interface, we do not instantiate it directly. Instead, we call getDocumentElement() on the parsed Document, which returns the root element as an Element object.
In the following example, we pass a simple XML document with just one root element, 'college', using the StringBuilder class. We then retrieve the root element name and print it on the console.
import java.io.ByteArrayInputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class RetrieveRootElementName {
    public static void main(String[] args) {
        try {
            // Creating a DocumentBuilder object
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder docBuilder = factory.newDocumentBuilder();

            // Reading the XML: a document with a single root element
            StringBuilder xmlBuilder = new StringBuilder();
            xmlBuilder.append("<college></college>");
            ByteArrayInputStream input = new ByteArrayInputStream(xmlBuilder.toString().getBytes("UTF-8"));

            // Parsing the XML document
            Document xmldoc = docBuilder.parse(input);

            // Retrieving the root element name
            Element element = xmldoc.getDocumentElement();
            System.out.println("Root element name is " + element.getTagName());
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
The element name, 'college', is displayed on the output screen as shown below:
Root element name is college
Parsing Single Sub Element in XML
We can parse a simple XML document with a single element inside the root element. So far, we have seen how to retrieve the root element. Now, let us see how to get the sub element inside it.
Since there is only one sub element, we use the getFirstChild() method to retrieve it. Called on the root element, it returns the first child as a Node object. Whitespace between tags also counts as a child (text) node, so this approach assumes the sub element immediately follows the opening tag of the root.
After retrieving the child node, the getNodeName() method returns the name of the node as a String.
To get the text content, we use the getTextContent() method. It returns the text content as a String.
Let us look at the following example, which has one root element and one sub element. Here, 'college' is the root element and 'department' is its sub element. The 'department' element has the text content "Computer Science". We retrieve the name and text content of the sub element.
import java.io.ByteArrayInputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

public class SingleSubElement {
    public static void main(String[] args) {
        try {
            // Creating a DocumentBuilder object
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder docBuilder = factory.newDocumentBuilder();

            // Reading the XML
            StringBuilder xmlBuilder = new StringBuilder();
            xmlBuilder.append("<college><department>Computer Science</department></college>");
            ByteArrayInputStream input = new ByteArrayInputStream(xmlBuilder.toString().getBytes("UTF-8"));

            // Parsing the XML document
            Document xmldoc = docBuilder.parse(input);

            // Retrieving the root element
            Element element = xmldoc.getDocumentElement();

            // Retrieving the child node and its name
            Node childNode = element.getFirstChild();
            String childNodeName = childNode.getNodeName();
            System.out.println("Sub Element name : " + childNodeName);

            // Retrieving the text content of the child node
            System.out.println("Text content of Sub Element : " + childNode.getTextContent());
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
The output window displays the sub element name and its text content.
Sub Element name : department
Text content of Sub Element : Computer Science
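The example above relies on the XML having no whitespace between its tags. When a document is indented, the first child of the root is a whitespace text node rather than the sub element. The sketch below shows one way to handle that, using a hypothetical helper named firstChildElement that walks the siblings until it finds an element node. The class name and the indented sample XML are assumptions for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

public class FirstChildElement {

    // Hypothetical helper: parses an XML string and returns its root element.
    static Element parseRoot(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement();
    }

    // Hypothetical helper: returns the first child that is an element, or null if there is none.
    static Element firstChildElement(Element parent) {
        for (Node child = parent.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                return (Element) child;
            }
        }
        return null;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<college>\n  <department>Computer Science</department>\n</college>";
        Element root = parseRoot(xml);

        // The raw first child is the whitespace before <department>.
        System.out.println("getFirstChild(): " + root.getFirstChild().getNodeName()); // prints "#text"

        Element department = firstChildElement(root);
        System.out.println("First element child: " + department.getTagName());
        System.out.println("Text content: " + department.getTextContent());
    }
}
```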
Parsing Multiple Elements in XML
To parse an XML document with multiple elements, we need to use loops. The getChildNodes() method retrieves all the child nodes of an element and returns them as a NodeList. We loop through the nodes of the NodeList and retrieve the desired information about each element as we did in the previous sections.
Now, let us add two more departments to the XML file (multipleElements.xml) and try to retrieve all the department names and staff counts.
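<?xml version = "1.0"?>
<college>
   <department>
      <name>Computer Science</name>
      <staffCount>20</staffCount>
   </department>
   <department>
      <name>Electrical and Electronics</name>
      <staffCount>23</staffCount>
   </department>
   <department>
      <name>Mechanical</name>
      <staffCount>15</staffCount>
   </department>
</college>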
In the following program, we retrieve the child nodes of the root into a NodeList and iterate over the departments to get each department name and staff count.
import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class MultipleElementsXmlParsing {
    public static void main(String[] args) {
        try {
            // Input the XML file
            File inputXmlFile = new File("src/multipleElements.xml");

            // Creating the DocumentBuilder and parsing the file
            DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
            DocumentBuilder docBuilder = dbFactory.newDocumentBuilder();
            Document xmldoc = docBuilder.parse(inputXmlFile);

            // Retrieving the root element
            Element element = xmldoc.getDocumentElement();
            System.out.println("Root element name is " + element.getTagName());

            // Getting the list of child nodes
            NodeList nList = element.getChildNodes();

            // Iterating through all the child nodes of the root
            for (int temp = 0; temp < nList.getLength(); temp++) {
                Node nNode = nList.item(temp);
                // Whitespace between tags appears as text nodes, so only element nodes are processed
                if (nNode.getNodeType() == Node.ELEMENT_NODE) {
                    System.out.println("\nCurrent Element :" + nNode.getNodeName());
                    Element eElement = (Element) nNode;
                    System.out.println("Name of the department : " + eElement.getElementsByTagName("name").item(0).getTextContent());
                    System.out.println("Staff Count of the department : " + eElement.getElementsByTagName("staffCount").item(0).getTextContent());
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
All three departments are displayed with their names and staff counts.
Root element name is college
Current Element :department
Name of the department : Computer Science
Staff Count of the department : 20
Current Element :department
Name of the department : Electrical and Electronics
Staff Count of the department : 23
Current Element :department
Name of the department : Mechanical
Staff Count of the department : 15
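The same loop pattern can also aggregate values instead of printing them. The sketch below sums the staff counts of all departments. The class StaffCountTotal, the helpers parse and totalStaff, and the inline XML string (a stand-in for multipleElements.xml with the values shown in the output above) are assumptions for illustration. It uses getElementsByTagName(), which returns only matching elements, so no node type check is needed.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class StaffCountTotal {

    // Hypothetical helper: parses an XML string into a DOM Document.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    // Hypothetical helper: adds up the <staffCount> values of all <department> elements.
    static int totalStaff(Document xmldoc) {
        NodeList departments = xmldoc.getElementsByTagName("department");
        int total = 0;
        for (int i = 0; i < departments.getLength(); i++) {
            Element department = (Element) departments.item(i);
            String count = department.getElementsByTagName("staffCount").item(0).getTextContent();
            // trim() guards against whitespace around the number.
            total += Integer.parseInt(count.trim());
        }
        return total;
    }

    public static void main(String[] args) throws Exception {
        // Inline stand-in for multipleElements.xml (assumed structure).
        String xml = "<college>"
                + "<department><name>Computer Science</name><staffCount>20</staffCount></department>"
                + "<department><name>Electrical and Electronics</name><staffCount>23</staffCount></department>"
                + "<department><name>Mechanical</name><staffCount>15</staffCount></department>"
                + "</college>";
        System.out.println("Total staff count: " + totalStaff(parse(xml))); // prints 58
    }
}
```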
Parsing Attributes in XML
XML elements can have attributes, which can be retrieved using the getAttribute() method. This method takes the attribute name as a parameter and returns the corresponding attribute value as a String. It returns an empty string if the named attribute has neither a specified value nor a default value.
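Because getAttribute() returns an empty string both for a missing attribute and for an attribute whose value is empty, the two cases can be told apart with hasAttribute(). The sketch below illustrates this. The class AttributeLookup, the helper attributeOrDefault, and the 'floor' and 'building' attribute names are hypothetical and exist only for this example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;

public class AttributeLookup {

    // Hypothetical helper: parses an XML string and returns its root element.
    static Element parseRoot(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement();
    }

    // Hypothetical helper: returns the attribute value, or the fallback when the attribute is absent.
    static String attributeOrDefault(Element element, String name, String fallback) {
        return element.hasAttribute(name) ? element.getAttribute(name) : fallback;
    }

    public static void main(String[] args) throws Exception {
        Element department = parseRoot("<department deptcode=\"DEP_CS23\" floor=\"\"/>");

        System.out.println("deptcode: " + department.getAttribute("deptcode"));

        // Both calls return an empty string, although only "floor" is present.
        System.out.println("floor is empty: " + department.getAttribute("floor").isEmpty());
        System.out.println("building is empty: " + department.getAttribute("building").isEmpty());

        // hasAttribute() distinguishes the two cases.
        System.out.println("has floor: " + department.hasAttribute("floor"));
        System.out.println("has building: " + department.hasAttribute("building"));
        System.out.println("building or default: " + attributeOrDefault(department, "building", "n/a"));
    }
}
```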
Now, let us add an attribute, 'deptcode', to all the department elements in the 'attributesParsing.xml' file.
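<?xml version = "1.0"?>
<college>
   <department deptcode = "DEP_CS23">
      <name>Computer Science</name>
      <staffCount>20</staffCount>
   </department>
   <department deptcode = "DEP_EC34">
      <name>Electrical and Electronics</name>
      <staffCount>23</staffCount>
   </department>
   <department deptcode = "DEP_MC89">
      <name>Mechanical</name>
      <staffCount>15</staffCount>
   </department>
</college>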
In the following program, we retrieve the deptcode attribute along with the name and staff count of each department.
import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class AttributesXmlParsing {
    public static void main(String[] args) {
        try {
            // Input the XML file
            File inputXmlFile = new File("attributesParsing.xml");

            // Creating the DocumentBuilder and parsing the file
            DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
            DocumentBuilder docBuilder = dbFactory.newDocumentBuilder();
            Document xmldoc = docBuilder.parse(inputXmlFile);

            // Getting the root element
            System.out.println("Root element :" + xmldoc.getDocumentElement().getNodeName());

            // Getting all the department elements
            NodeList nList = xmldoc.getElementsByTagName("department");

            // Iterating through all the department elements
            for (int temp = 0; temp < nList.getLength(); temp++) {
                Node nNode = nList.item(temp);
                System.out.println("\nCurrent Element :" + nNode.getNodeName());
                if (nNode.getNodeType() == Node.ELEMENT_NODE) {
                    Element eElement = (Element) nNode;
                    System.out.println("Department Code : " + eElement.getAttribute("deptcode"));
                    System.out.println("Name of the department : " + eElement.getElementsByTagName("name").item(0).getTextContent());
                    System.out.println("Staff Count of the department : " + eElement.getElementsByTagName("staffCount").item(0).getTextContent());
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
The three departments are displayed with their corresponding department code, name and staff count.
Root element :college
Current Element :department
Department Code : DEP_CS23
Name of the department : Computer Science
Staff Count of the department : 20
Current Element :department
Department Code : DEP_EC34
Name of the department : Electrical and Electronics
Staff Count of the department : 23
Current Element :department
Department Code : DEP_MC89
Name of the department : Mechanical
Staff Count of the department : 15