Apache PDFBox Convert PDF to Image in Java
This tutorial demonstrates how to convert a PDF document to images in Java using Apache PDFBox.
Maven Dependencies
We use Apache Maven to manage our project dependencies. Make sure the following dependencies are on the classpath. The pdfbox-tools artifact provides the ImageIOUtil class used to write the images.
<dependency>
    <groupId>org.apache.pdfbox</groupId>
    <artifactId>pdfbox</artifactId>
    <version>2.0.8</version>
</dependency>
<dependency>
    <groupId>org.apache.pdfbox</groupId>
    <artifactId>pdfbox-tools</artifactId>
    <version>2.0.8</version>
</dependency>
Apache PDFBox Convert PDF to Image in Java
We start by loading the PDF document. Next we create a PDFRenderer instance. Then we loop over each page and render it to a BufferedImage. Finally we write each image to disk as a PNG file. Clean and simple.
package com.memorynotfound.pdf.pdfbox;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.rendering.ImageType;
import org.apache.pdfbox.rendering.PDFRenderer;
import org.apache.pdfbox.tools.imageio.ImageIOUtil;

import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;

public class PdfToImage {

    private static final String OUTPUT_DIR = "/tmp/";

    public static void main(String[] args) {
        // try-with-resources closes the document automatically, so no explicit close() is needed
        try (PDDocument document = PDDocument.load(new File("/tmp/bookmark.pdf"))) {
            PDFRenderer pdfRenderer = new PDFRenderer(document);
            // Page indexes are zero-based
            for (int page = 0; page < document.getNumberOfPages(); ++page) {
                // Render the page at 300 DPI as an RGB image
                BufferedImage bim = pdfRenderer.renderImageWithDPI(page, 300, ImageType.RGB);
                String fileName = OUTPUT_DIR + "image-" + page + ".png";
                // The image format is taken from the file extension
                ImageIOUtil.writeImage(bim, fileName, 300);
            }
        } catch (IOException e) {
            System.err.println("Exception while trying to convert PDF to images - " + e);
        }
    }
}
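The DPI value passed to the renderer determines how large each image gets. PDF page dimensions are expressed in points, with 72 points to the inch by default, so a page rendered at 300 DPI is scaled by roughly 300 / 72. The sketch below uses only the JDK to estimate the resulting pixel size. The class RenderSizeEstimate and its methods are hypothetical helpers written for illustration and are not part of PDFBox. The 4 bytes per pixel figure is an assumption about an int-packed RGB image, and the library's own rounding may differ by a pixel.

```java
// Hypothetical helper for illustration; not part of Apache PDFBox.
final class RenderSizeEstimate {

    /** Points per inch in default PDF user space. */
    static final double POINTS_PER_INCH = 72.0;

    /** Converts a page dimension in points to an approximate pixel count at the given DPI. */
    static long toPixels(double points, double dpi) {
        return Math.round(points * dpi / POINTS_PER_INCH);
    }

    /** Estimates the uncompressed image size in bytes, assuming 4 bytes per pixel. */
    static long estimatedBytes(double widthPt, double heightPt, double dpi) {
        return toPixels(widthPt, dpi) * toPixels(heightPt, dpi) * 4L;
    }

    public static void main(String[] args) {
        // US Letter is 612 x 792 points (8.5 x 11 inches).
        long width = toPixels(612, 300);
        long height = toPixels(792, 300);
        System.out.println(width + " x " + height + " px"); // prints "2550 x 3300 px"
        System.out.println(estimatedBytes(612, 792, 300) + " bytes uncompressed (estimate)");
    }
}
```

If the images come out larger than needed, lowering the DPI reduces both the pixel dimensions and the memory used while rendering. Halving the DPI cuts the pixel count to about a quarter.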