Apache POI Word: A Concise Tutorial

Apache POI Word - Overview

Software applications often need to generate reference documents in Microsoft Word file format. Sometimes an application is also expected to accept Word files as input data.

Any Java programmer who wants to produce MS-Office files as output needs a predefined API to do so, because the Java standard library has no built-in support for these formats.

What is Apache POI?

Apache POI is a popular API that allows programmers to create, modify, and read MS-Office files using Java programs. It is an open source library developed and distributed by the Apache Software Foundation. It contains classes and methods to encode user input data or a file into MS-Office documents, and to decode existing documents back into data a Java program can use.

Components of Apache POI

Apache POI contains classes and methods to work on all OLE2 Compound documents of MS-Office. The components of this API are listed below −

  1. POIFS (Poor Obfuscation Implementation File System) − This component is the foundation of all other POI elements. It is used to read different files explicitly.

  2. HSSF (Horrible SpreadSheet Format) − It is used to read and write .xls format of MS-Excel files.

  3. XSSF (XML SpreadSheet Format) − It is used to read and write the .xlsx format of MS-Excel files.

  4. HPSF (Horrible Property Set Format) − It is used to extract property sets of the MS-Office files.

  5. HWPF (Horrible Word Processor Format) − It is used to read and write .doc extension files of MS-Word.

  6. XWPF (XML Word Processor Format) − It is used to read and write .docx extension files of MS-Word.

  7. HSLF (Horrible Slide Layout Format) − It is used to read, create, and edit PowerPoint presentations.

  8. HDGF (Horrible DiaGram Format) − It contains classes and methods for MS-Visio binary files.

  9. HPBF (Horrible PuBlisher Format) − It is used to read MS-Publisher files; write support is limited.
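The component list above can be read as a lookup from file type to POI component. The following is a minimal sketch of that lookup using only the JDK; the class name PoiComponentGuide and the method componentFor are hypothetical names chosen for illustration and are not part of Apache POI. The extensions .ppt, .vsd, and .pub are assumed here as the usual extensions for PowerPoint, Visio binary, and Publisher files.

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical helper (not part of Apache POI): maps a file name to the
// POI component named in the list above, based only on its extension.
final class PoiComponentGuide {

    // Extension -> component. .ppt, .vsd and .pub are assumed extensions
    // for the PowerPoint, Visio and Publisher entries in the list.
    private static final Map<String, String> COMPONENT_BY_EXTENSION = Map.of(
            "xls", "HSSF",
            "xlsx", "XSSF",
            "doc", "HWPF",
            "docx", "XWPF",
            "ppt", "HSLF",
            "vsd", "HDGF",
            "pub", "HPBF");

    // Returns the component name, or "unknown" if the extension is not listed.
    static String componentFor(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return "unknown"; // no extension present
        }
        String ext = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return COMPONENT_BY_EXTENSION.getOrDefault(ext, "unknown");
    }

    public static void main(String[] args) {
        for (String name : args) {
            System.out.println(name + " -> " + componentFor(name));
        }
    }
}
```

An extension is only a hint about the file's contents, so a lookup like this is best treated as a first guess before opening the file with the corresponding component.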

This tutorial guides you through the process of working on MS-Word files using Java. The discussion is therefore confined to the HWPF and XWPF components.

Note − Older versions of POI support only the binary file formats such as DOC, XLS, and PPT. From version 3.5 onwards, POI also supports the OOXML file formats of MS-Office such as DOCX, XLSX, and PPTX.
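The two format families in this note can be told apart by their first bytes: binary files such as DOC are OLE2 compound files and start with an 8-byte signature, while OOXML files such as DOCX are ZIP packages and start with "PK". The sketch below illustrates this check using only the JDK (Java 11 or later). The class OfficeFormatSniffer and its Kind enum are hypothetical names for illustration and are not the API of Apache POI.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper (not part of Apache POI): guesses the container
// format of an Office file from its leading bytes.
final class OfficeFormatSniffer {

    enum Kind { OLE2_BINARY, ZIP_CONTAINER, UNKNOWN }

    // Signature at the start of every OLE2 compound file (.doc, .xls, .ppt).
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // ZIP local file header; OOXML files (.docx, .xlsx, .pptx) are ZIP packages.
    private static final byte[] ZIP_MAGIC = { 'P', 'K', 0x03, 0x04 };

    // Classifies a buffer holding at least the first bytes of a file.
    static Kind sniff(byte[] header) {
        if (startsWith(header, OLE2_MAGIC)) {
            return Kind.OLE2_BINARY;
        }
        if (startsWith(header, ZIP_MAGIC)) {
            // Any ZIP archive matches here, not only OOXML documents.
            return Kind.ZIP_CONTAINER;
        }
        return Kind.UNKNOWN;
    }

    // Reads the first 8 bytes of a file and classifies them.
    static Kind sniff(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            return sniff(in.readNBytes(OLE2_MAGIC.length));
        }
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        return data.length >= prefix.length
                && Arrays.equals(Arrays.copyOf(data, prefix.length), prefix);
    }

    public static void main(String[] args) throws IOException {
        for (String arg : args) {
            System.out.println(arg + " -> " + sniff(Path.of(arg)));
        }
    }
}
```

A result of OLE2_BINARY suggests the file belongs to HWPF, and ZIP_CONTAINER suggests XWPF, but the signature identifies only the container: an XLS file or a plain ZIP archive produces the same results, so the check narrows the choice rather than confirming a Word document.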