PDFBox Concise Tutorial
PDFBox - Splitting a PDF Document
In the previous chapter, we saw how to add JavaScript to a PDF document. Let us now learn how to split a given PDF document into multiple documents.
Splitting the Pages in a PDF Document
You can split a given PDF document into multiple PDF documents using the Splitter class (org.apache.pdfbox.multipdf.Splitter).
Following are the steps to split an existing PDF document.
Step 1: Loading an Existing PDF Document
Load an existing PDF document using the static method load() of the PDDocument class. This method accepts a File object as a parameter. Because it is a static method, you invoke it using the class name, as shown below. (This is the PDFBox 2.x API; in PDFBox 3.x, loading moved to Loader.loadPDF().)
File file = new File("path of the document");
PDDocument document = PDDocument.load(file);
Step 2: Instantiate the Splitter Class
The Splitter class contains the methods to split a given PDF document. Instantiate this class as shown below.
Splitter splitter = new Splitter();
Step 3: Splitting the PDF Document
You can split the given document using the split() method of the Splitter class. This method accepts an object of the PDDocument class as a parameter.
List<PDDocument> Pages = splitter.split(document);
The split() method splits each page of the given document into an individual document and returns all of them as a list.
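By default, split() yields one document per page. The Splitter class in PDFBox 2.x also provides setSplitAtPage(int) to place several pages in each output document. The plain-Java sketch below models that grouping, with page numbers standing in for real pages. The names PageGroupingSketch and group are hypothetical, and the code does not call PDFBox, so treat it as an illustration of the idea rather than the library's implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model: page numbers stand in for PDF pages.
class PageGroupingSketch {

    // Groups pages 1..pageCount into lists of at most pagesPerDocument pages.
    static List<List<Integer>> group(int pageCount, int pagesPerDocument) {
        if (pagesPerDocument < 1) {
            throw new IllegalArgumentException("pagesPerDocument must be at least 1");
        }
        List<List<Integer>> documents = new ArrayList<>();
        List<Integer> current = null;
        for (int page = 1; page <= pageCount; page++) {
            // Start a new output "document" when none exists or the current one is full.
            if (current == null || current.size() == pagesPerDocument) {
                current = new ArrayList<>();
                documents.add(current);
            }
            current.add(page);
        }
        return documents;
    }

    public static void main(String[] args) {
        System.out.println(group(2, 1)); // prints [[1], [2]]
        System.out.println(group(5, 2)); // prints [[1, 2], [3, 4], [5]]
    }
}
```

With one page per document, a two-page input gives two single-page documents, which matches the sample1.pdf and sample2.pdf result in this chapter.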
Step 4: Creating an Iterator Object
To traverse the list of documents, get an iterator over the list acquired in the previous step using the listIterator() method, as shown below.
Iterator<PDDocument> iterator = Pages.listIterator();
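The traversal pattern itself is ordinary java.util code. As a minimal sketch, the example below walks a list with a ListIterator and derives a numbered file name for each element, using plain strings as stand-ins for PDDocument objects. The class OutputNameSketch and the method outputNames are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.ListIterator;

// Hypothetical helper: strings stand in for the split PDDocument objects.
class OutputNameSketch {

    // Returns one numbered file name per list element: prefix1.pdf, prefix2.pdf, ...
    static List<String> outputNames(List<String> parts, String prefix) {
        List<String> names = new ArrayList<>();
        ListIterator<String> iterator = parts.listIterator();
        while (iterator.hasNext()) {
            // nextIndex() is the zero-based index of the element next() will return.
            int number = iterator.nextIndex() + 1;
            iterator.next();
            names.add(prefix + number + ".pdf");
        }
        return names;
    }

    public static void main(String[] args) {
        List<String> parts = Arrays.asList("image page", "text page");
        System.out.println(outputNames(parts, "sample")); // prints [sample1.pdf, sample2.pdf]
    }
}
```

Because listIterator() returns a ListIterator, the position is available through nextIndex(), so a separate counter variable is optional.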
Example
Suppose there is a PDF document named sample.pdf in the path C:\PdfBox_Examples\ and that it contains two pages: one with an image and the other with text.
This example demonstrates how to split the above-mentioned PDF document. Here, we split the PDF document named sample.pdf into two separate documents, sample1.pdf and sample2.pdf. Save this code in a file named SplitPages.java.
import org.apache.pdfbox.multipdf.Splitter;
import org.apache.pdfbox.pdmodel.PDDocument;

import java.io.File;
import java.io.IOException;
import java.util.Iterator;
import java.util.List;

public class SplitPages {
   public static void main(String[] args) throws IOException {
      // Loading an existing PDF document (PDFBox 2.x API)
      File file = new File("C:/PdfBox_Examples/sample.pdf");
      PDDocument document = PDDocument.load(file);

      // Instantiating the Splitter class
      Splitter splitter = new Splitter();

      // Splitting the pages of the PDF document
      List<PDDocument> Pages = splitter.split(document);

      // Creating an iterator
      Iterator<PDDocument> iterator = Pages.listIterator();

      // Saving each page as an individual document
      int i = 1;
      while (iterator.hasNext()) {
         PDDocument pd = iterator.next();
         pd.save("C:/PdfBox_Examples/sample" + i++ + ".pdf");
         pd.close(); // release each split document once it is saved
      }
      System.out.println("Multiple PDF’s created");

      // Close the source document last, after all parts are saved
      document.close();
   }
}
Compile and execute the saved Java file from the command prompt using the following commands (the PDFBox JAR must be on the classpath).
javac SplitPages.java
java SplitPages
Upon execution, the above program splits the given PDF document and displays the following message.
Multiple PDF’s created
If you check the given path, you will find that two PDF files named sample1.pdf and sample2.pdf were created.