PDFBox Concise Tutorial

PDFBox - Document Properties

Like other files, a PDF document has document properties. These properties are key-value pairs, and each one gives a particular piece of information about the document.

The following table lists the properties of a PDF document.

S.No.  Property     Description
1      File         Holds the name of the file.
2      Title        Sets the title of the document.
3      Author       Sets the name of the document's author.
4      Subject      Specifies the subject of the PDF document.
5      Keywords     Lists the keywords by which the document can be searched.
6      Created      Sets the date on which the document was created.
7      Modified     Sets the date on which the document was last modified.
8      Application  Names the application that created the document (set through the Creator property in PDFBox).

The following screenshot shows the document properties table of a PDF document.

[Screenshot: PDF properties]
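
As a rough illustration of the key-value idea, the sketch below models the properties from the table as a map from the label shown in the table to the key used in a PDF document information dictionary. The key names (Title, Author, Subject, Keywords, Creator, CreationDate, ModDate) are the standard PDF ones. The class and method names are hypothetical, and pairing the Application label with the Creator key is an assumption based on how common viewers label that field. File is omitted in this sketch because the file name comes from the file itself rather than from an entry in that dictionary.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of PDFBox: pairs viewer labels with info dictionary keys.
class InfoKeysDemo {

    // Returns the labels from the table above mapped to standard PDF info dictionary keys.
    static Map<String, String> labelToInfoKey() {
        Map<String, String> keys = new LinkedHashMap<>();
        keys.put("Title", "Title");
        keys.put("Author", "Author");
        keys.put("Subject", "Subject");
        keys.put("Keywords", "Keywords");
        keys.put("Created", "CreationDate");
        keys.put("Modified", "ModDate");
        // Assumption: viewers show the Creator entry under the label "Application".
        keys.put("Application", "Creator");
        return keys;
    }

    public static void main(String[] args) {
        // Print each label next to the dictionary key it corresponds to.
        for (Map.Entry<String, String> entry : labelToInfoKey().entrySet()) {
            System.out.println(entry.getKey() + " -> " + entry.getValue());
        }
    }
}
```

This pairing is why the table speaks of an Application property while the PDFBox methods described in the next section use the name Creator.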

Setting the Document Properties

PDFBox provides a class named PDDocumentInformation, which has a set of setter and getter methods.

The setter methods of this class assign values to the various properties of a document, and the getter methods retrieve those values.

The following table lists the setter methods of the PDDocumentInformation class.

S.No.  Method                              Description
1      setAuthor(String author)            Sets the Author property of the PDF document.
2      setTitle(String title)              Sets the Title property of the PDF document.
3      setCreator(String creator)          Sets the Creator property of the PDF document.
4      setSubject(String subject)          Sets the Subject property of the PDF document.
5      setCreationDate(Calendar date)      Sets the CreationDate property of the PDF document.
6      setModificationDate(Calendar date)  Sets the ModificationDate property of the PDF document.
7      setKeywords(String keywords)        Sets the Keywords property of the PDF document.

Example

This example demonstrates how to add properties such as Author, Title, Date, and Subject to a PDF document using the methods of the PDDocumentInformation class. Here, we create a PDF document named doc_attributes.pdf, add various properties to it, and save it in the path C:/PdfBox_Examples/. Save this code in a file named AddingAttributes.java.

import java.io.IOException;
import java.util.Calendar;
import java.util.GregorianCalendar;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDDocumentInformation;
import org.apache.pdfbox.pdmodel.PDPage;

// The class name matches the file name AddingAttributes.java
public class AddingAttributes {
   public static void main(String[] args) throws IOException {

      // Create the PDF document; try-with-resources closes it even if saving fails
      try (PDDocument document = new PDDocument()) {

         // Add a blank page to the document
         document.addPage(new PDPage());

         // Get the PDDocumentInformation object that holds the properties
         PDDocumentInformation pdd = document.getDocumentInformation();

         // Set the author of the document
         pdd.setAuthor("Tutorialspoint");

         // Set the title of the document
         pdd.setTitle("Sample document");

         // Set the creator of the document
         pdd.setCreator("PDF Examples");

         // Set the subject of the document
         pdd.setSubject("Example document");

         // Set the creation date; Calendar months are zero-based (Calendar.DECEMBER == 11)
         Calendar created = new GregorianCalendar(2015, Calendar.DECEMBER, 5);
         pdd.setCreationDate(created);

         // Set the modification date (Calendar.JULY == 6)
         Calendar modified = new GregorianCalendar(2016, Calendar.JULY, 5);
         pdd.setModificationDate(modified);

         // Set the keywords for the document
         pdd.setKeywords("sample, first example, my pdf");

         // Save the document (the target directory must already exist)
         document.save("C:/PdfBox_Examples/doc_attributes.pdf");

         System.out.println("Properties added successfully");
      }
   }
}
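
One detail in the example above is easy to misread: java.util.Calendar counts months from zero, so the month value 11 (Calendar.DECEMBER) used for the creation date means December, not November. The following standard-library sketch illustrates this. The class and helper names are hypothetical and it does not use PDFBox.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;

// Hypothetical demo class showing that Calendar months are zero-based.
class CalendarMonthDemo {

    // Builds a date at midnight in the default time zone; calendarMonth is zero-based.
    static Calendar dateOf(int year, int calendarMonth, int day) {
        return new GregorianCalendar(year, calendarMonth, day);
    }

    // Converts the zero-based Calendar month to the usual 1-12 numbering.
    static int humanMonth(Calendar date) {
        return date.get(Calendar.MONTH) + 1;
    }

    public static void main(String[] args) {
        Calendar byNumber = dateOf(2015, 11, 5);
        Calendar byConstant = dateOf(2015, Calendar.DECEMBER, 5);

        System.out.println(byNumber.equals(byConstant)); // true
        System.out.println(humanMonth(byNumber));        // 12
    }
}
```

Using the named constants such as Calendar.DECEMBER instead of bare numbers tends to make the intended month obvious to the reader.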

Compile and run the saved Java file from the command prompt with the following commands (this assumes the PDFBox JAR and its dependencies are on the classpath).

javac AddingAttributes.java
java AddingAttributes

On execution, the above program adds all the specified properties to the document and displays the following message.

Properties added successfully

Now, if you go to the given path, you will find the created PDF there. Right-click the document and select the document properties option, as shown below.

[Screenshot: document properties]

This opens the document properties window, where you can see that all the properties of the document are set to the specified values.

[Screenshot: properties menu]

Retrieving the Document Properties

You can retrieve the properties of a document using the getter methods provided by the PDDocumentInformation class.

The following table lists the getter methods of the PDDocumentInformation class.

S.No.  Method                 Description
1      getAuthor()            Retrieves the Author property of the PDF document.
2      getTitle()             Retrieves the Title property of the PDF document.
3      getCreator()           Retrieves the Creator property of the PDF document.
4      getSubject()           Retrieves the Subject property of the PDF document.
5      getCreationDate()      Retrieves the CreationDate property of the PDF document.
6      getModificationDate()  Retrieves the ModificationDate property of the PDF document.
7      getKeywords()          Retrieves the Keywords property of the PDF document.

Example

This example demonstrates how to retrieve the properties of an existing PDF document. Here, we create a Java program that loads the PDF document named doc_attributes.pdf, saved in the path C:/PdfBox_Examples/, and retrieves its properties. Save this code in a file named RetrivingDocumentAttributes.java.

import java.io.File;
import java.io.IOException;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDDocumentInformation;

public class RetrivingDocumentAttributes {
   public static void main(String[] args) throws IOException {

      // Load the existing document (PDFBox 2.x API; PDFBox 3.x uses Loader.loadPDF(file) instead)
      File file = new File("C:/PdfBox_Examples/doc_attributes.pdf");

      // try-with-resources closes the document when the block ends
      try (PDDocument document = PDDocument.load(file)) {

         // Get the PDDocumentInformation object
         PDDocumentInformation pdd = document.getDocumentInformation();

         // Retrieve and print the properties of the PDF document
         System.out.println("Author of the document is :" + pdd.getAuthor());
         System.out.println("Title of the document is :" + pdd.getTitle());
         System.out.println("Subject of the document is :" + pdd.getSubject());

         System.out.println("Creator of the document is :" + pdd.getCreator());
         // The two date getters return java.util.Calendar objects
         System.out.println("Creation date of the document is :" + pdd.getCreationDate());
         System.out.println("Modification date of the document is :" +
            pdd.getModificationDate());
         System.out.println("Keywords of the document are :" + pdd.getKeywords());
      }
   }
}

Compile and run the saved Java file from the command prompt with the following commands.

javac RetrivingDocumentAttributes.java
java RetrivingDocumentAttributes

On execution, the above program retrieves all the properties of the document and displays them as shown below.

Author of the document is :Tutorialspoint
Title of the document is :Sample document
Subject of the document is :Example document
Creator of the document is :PDF Examples
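Creation date of the document is :java.util.GregorianCalendar[time=...]
Modification date of the document is :java.util.GregorianCalendar[time=...]
Keywords of the document are :sample, first example, my pdf

The two date lines are abbreviated here. Because the program prints the Calendar objects directly, each line shows the full default string form of the calendar rather than a short date.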
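
getCreationDate() and getModificationDate() return java.util.Calendar values, so concatenating them into a string prints the Calendar's default string form rather than a short month/day/year date. A minimal sketch of one way to format such a value is shown below. It uses only the standard library, the class and method names are hypothetical, and the null check rests on the assumption that a getter may return null when the property is not set.

```java
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.Locale;

// Hypothetical helper for printing Calendar values as month/day/year.
class CalendarFormatDemo {

    // Formats the date as M/d/yyyy in the Calendar's own time zone.
    static String formatDate(Calendar date) {
        if (date == null) {
            // Assumption: a missing property is reported as null by the getter.
            return "(not set)";
        }
        SimpleDateFormat format = new SimpleDateFormat("M/d/yyyy", Locale.ROOT);
        format.setTimeZone(date.getTimeZone());
        return format.format(date.getTime());
    }

    public static void main(String[] args) {
        // Stand-in for the value a call such as pdd.getCreationDate() would return.
        Calendar created = new GregorianCalendar(2015, Calendar.DECEMBER, 5);

        System.out.println("Creation date of the document is :" + formatDate(created));
        System.out.println("Modification date of the document is :" + formatDate(null));
    }
}
```

In the retrieval program, the same helper could wrap the two date getters before their results are printed.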