PDFBox Extracting Image

by Online Tutorials Library July 14, 2022

In this section, we will learn how to extract an image from an existing PDF document. The PDFBox library provides the PDFRenderer class, which renders a page of a PDF document into an AWT BufferedImage. Here, extracting an image means rendering a whole page as an image and saving it to a file.

Follow the steps below to extract an image from an existing PDF document:

Load Existing PDF Document

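We can load an existing PDF document using the static load() method of the PDDocument class, which accepts a File object as a parameter. Because the method is static, we invoke it on the class name rather than on an instance. Note that this is the PDFBox 2.x API; in PDFBox 3.x, PDDocument.load() was replaced by Loader.loadPDF().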

  File file = new File("Path of Document");
  PDDocument doc = PDDocument.load(file);

Instantiate the PDFRenderer class

The PDFRenderer class renders a page of a PDF document into an AWT BufferedImage. Its constructor takes the loaded PDDocument object as its parameter. In the example program below, this step is written as PDFRenderer renderer = new PDFRenderer(doc).

Render Image

The renderImage() method of the PDFRenderer class renders a particular page as an image. We need to pass it the index of the page to be rendered. The index is zero-based, so renderer.renderImage(0) renders the first page.
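Because the page index is zero-based while PDF viewers usually show 1-based page numbers, it is easy to be off by one. The following is a minimal sketch of a conversion helper. The class PageIndex and the method toPageIndex are hypothetical names introduced only for illustration and are not part of PDFBox.

```java
// Hypothetical helper, not part of PDFBox: converts a 1-based page number
// (as shown in PDF viewers) to the 0-based index that renderImage() expects.
final class PageIndex {

    static int toPageIndex(int pageNumber, int pageCount) {
        // Reject page numbers that do not exist in the document.
        if (pageNumber < 1 || pageNumber > pageCount) {
            throw new IllegalArgumentException(
                "Page " + pageNumber + " is outside 1.." + pageCount);
        }
        return pageNumber - 1;
    }

    public static void main(String[] args) {
        // With PDFBox this could be used roughly as:
        //   renderer.renderImage(toPageIndex(3, doc.getNumberOfPages()));
        System.out.println(toPageIndex(3, 5)); // prints 2
    }
}
```

Checking the range before rendering gives a clear error message instead of a failure from inside the renderer when the requested page does not exist.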

Writing the Image to a File

We can write the rendered image to a file using the write() method of the ImageIO class. This method takes three parameters:

The rendered image object.
String representing the type of the image (jpg or png).
File object to which we need to save the extracted image.

In the example program below, this step is written as ImageIO.write(image, "JPEG", new File("/eclipse-workspace/my_image.jpeg")).
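ImageIO.write() returns false, rather than throwing an exception, when no writer is available for the requested format, so it is worth checking the return value. The following is a rough sketch of that check using only the standard library. The class WriteImageSketch, the save method, and the temporary output file are assumptions made for illustration, and a blank BufferedImage stands in for the image that renderImage() would return.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

final class WriteImageSketch {

    // Writes the image and fails loudly if no writer exists for the format.
    static void save(BufferedImage image, String format, File target) throws IOException {
        boolean written = ImageIO.write(image, format, target);
        if (!written) {
            throw new IOException("No ImageIO writer found for format: " + format);
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the rendered page: a blank white RGB image.
        BufferedImage image = new BufferedImage(200, 100, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = image.createGraphics();
        g.setColor(Color.WHITE);
        g.fillRect(0, 0, 200, 100);
        g.dispose();

        // Hypothetical output location; a real program would use its own path.
        File target = File.createTempFile("page", ".png");
        save(image, "png", target);
        System.out.println("Wrote " + target.length() + " bytes to " + target);
    }
}
```

In the PDFBox program, the same save call would receive the BufferedImage returned by renderImage() in place of the blank stand-in image.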

Close Document

After completing the task, we need to close the PDDocument object using the close() method. PDDocument implements Closeable, so it can also be opened in a try-with-resources statement, which closes it automatically.

Example

In this example, a Java program uses the PDFBox library to render one page of an existing PDF document as an image.

Java Program

  import java.awt.image.BufferedImage;
  import java.io.File;
  import java.io.IOException;
  import javax.imageio.ImageIO;
  import org.apache.pdfbox.pdmodel.PDDocument;
  import org.apache.pdfbox.rendering.PDFRenderer;

  public class ExtractImage {
      public static void main(String[] args) throws IOException {
          // Loading an existing document
          File file = new File("/eclipse-workspace/blank.pdf");
          PDDocument doc = PDDocument.load(file);

          // Instantiating the PDFRenderer class
          PDFRenderer renderer = new PDFRenderer(doc);

          // Rendering an image from the PDF document
          // (page indexes are zero-based, so 2 is the third page)
          BufferedImage image = renderer.renderImage(2);

          // Writing the image to a file
          ImageIO.write(image, "JPEG", new File("/eclipse-workspace/my_image.jpeg"));
          System.out.println("Image created successfully.");

          // Closing the document
          doc.close();
      }
  }

Output:

After successful execution, the above program prints the following message:

  Image created successfully.

To verify the result, open the generated image file (/eclipse-workspace/my_image.jpeg in this program).
