Java TAR example – compress and decompress *.tar or *.tar.gz files
This tutorial demonstrates how to compress files or directories recursively into .tar or .tar.gz format and how to decompress a .tar or .tar.gz file.
Project Structure
Let’s start by looking at the project structure.
Maven Dependencies
We use Apache Maven to manage our project dependencies. The example relies on Apache Commons Compress, so make sure the org.apache.commons:commons-compress dependency is on the class-path.
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
                             http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>com.memorynotfound.io.compression</groupId>
    <artifactId>tar</artifactId>
    <version>1.0.0-SNAPSHOT</version>
    <name>IO Compression - ${project.artifactId}</name>
    <url>https://memorynotfound.com</url>
    <packaging>jar</packaging>
    <dependencies>
        <dependency>
            <groupId>org.apache.commons</groupId>
            <artifactId>commons-compress</artifactId>
            <version>1.14</version>
        </dependency>
    </dependencies>
    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>3.7.0</version>
                <configuration>
                    <source>1.8</source>
                    <target>1.8</target>
                </configuration>
            </plugin>
        </plugins>
    </build>
</project>
Compress and Decompress *.tar files
- Compressing files in .tar format (also known as tarring): We use the TarArchiveOutputStream to write files and/or directories into a TAR archive. We add each entry with the TarArchiveOutputStream.putArchiveEntry method, passing a TarArchiveEntry built from the file and its entry name, then copy the file content to the stream and finish the entry with closeArchiveEntry.
- Decompressing a .tar archive (also known as untarring): We read the TAR archive using the TarArchiveInputStream class. We loop over the entries with the TarArchiveInputStream.getNextTarEntry() method until it returns null and copy the content of each entry to a FileOutputStream.
package com.memorynotfound.resource;

import org.apache.commons.compress.archivers.tar.TarArchiveEntry;
import org.apache.commons.compress.archivers.tar.TarArchiveInputStream;
import org.apache.commons.compress.archivers.tar.TarArchiveOutputStream;
import org.apache.commons.compress.utils.IOUtils;

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

public class TAR {

    private TAR() {
    }

    public static void compress(String name, File... files) throws IOException {
        try (TarArchiveOutputStream out = getTarArchiveOutputStream(name)) {
            for (File file : files) {
                addToArchiveCompression(out, file, ".");
            }
        }
    }

    public static void decompress(String in, File out) throws IOException {
        try (TarArchiveInputStream fin = new TarArchiveInputStream(new FileInputStream(in))) {
            TarArchiveEntry entry;
            while ((entry = fin.getNextTarEntry()) != null) {
                // directories are created on demand when their files are written
                if (entry.isDirectory()) {
                    continue;
                }
                File curfile = new File(out, entry.getName());
                File parent = curfile.getParentFile();
                if (parent != null && !parent.exists()) {
                    parent.mkdirs();
                }
                // close the output stream after each entry to avoid leaking file handles
                try (FileOutputStream fos = new FileOutputStream(curfile)) {
                    IOUtils.copy(fin, fos);
                }
            }
        }
    }

    private static TarArchiveOutputStream getTarArchiveOutputStream(String name) throws IOException {
        TarArchiveOutputStream taos = new TarArchiveOutputStream(new FileOutputStream(name));
        // TAR has an 8 gig file limit by default, this gets around that
        taos.setBigNumberMode(TarArchiveOutputStream.BIGNUMBER_STAR);
        // TAR originally didn't support long file names, so enable the support for it
        taos.setLongFileMode(TarArchiveOutputStream.LONGFILE_GNU);
        taos.setAddPaxHeadersForNonAsciiNames(true);
        return taos;
    }

    private static void addToArchiveCompression(TarArchiveOutputStream out, File file, String dir) throws IOException {
        // entry name relative to the archive root, e.g. ./dir/file.txt
        String entry = dir + File.separator + file.getName();
        if (file.isFile()) {
            out.putArchiveEntry(new TarArchiveEntry(file, entry));
            try (FileInputStream in = new FileInputStream(file)) {
                IOUtils.copy(in, out);
            }
            out.closeArchiveEntry();
        } else if (file.isDirectory()) {
            // recurse into the directory; only regular files are written as entries
            File[] children = file.listFiles();
            if (children != null) {
                for (File child : children) {
                    addToArchiveCompression(out, child, entry);
                }
            }
        } else {
            System.out.println(file.getName() + " is not supported");
        }
    }
}
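The decompress method above writes every entry to new File(out, entry.getName()), so an archive from an untrusted source could contain names such as ../../file that end up outside the output directory. The following is a minimal sketch of a guard that could be applied before writing an entry. It only uses the JDK; the class name SafeEntryResolver and the method resolve are hypothetical names chosen for illustration and are not part of Apache Commons Compress.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;

final class SafeEntryResolver {

    private SafeEntryResolver() {
    }

    // Resolves an archive entry name against the target directory and
    // rejects names that would end up outside of it (for example "../x").
    static Path resolve(Path targetDir, String entryName) throws IOException {
        Path base = targetDir.toAbsolutePath().normalize();
        Path resolved = base.resolve(entryName).normalize();
        if (!resolved.startsWith(base)) {
            throw new IOException("Entry is outside of the target directory: " + entryName);
        }
        return resolved;
    }

    public static void main(String[] args) throws IOException {
        // hypothetical output directory, used only for this demonstration
        Path target = Paths.get("/tmp/example");
        System.out.println(resolve(target, "./dir/file.txt"));
        try {
            resolve(target, "../../etc/passwd");
        } catch (IOException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

In decompress, the returned path could replace new File(out, entry.getName()), for example via SafeEntryResolver.resolve(out.toPath(), entry.getName()).toFile().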
Compress and Decompress *.tar.gz files
A plain TAR archive bundles files without reducing their size, so we can additionally compress the archive with GZIP to save some space. To write a .tar.gz file, pass a GzipCompressorOutputStream to the TarArchiveOutputStream constructor. To read a .tar.gz archive, pass a GzipCompressorInputStream to the TarArchiveInputStream constructor.
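import org.apache.commons.compress.compressors.gzip.GzipCompressorInputStream;
import org.apache.commons.compress.compressors.gzip.GzipCompressorOutputStream;

// compressing *.tar.gz format
TarArchiveOutputStream taos = new TarArchiveOutputStream(new GzipCompressorOutputStream(new FileOutputStream(name)));
// decompressing *.tar.gz files
TarArchiveInputStream fin = new TarArchiveInputStream(new GzipCompressorInputStream(new FileInputStream(in)));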
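The wrapping order matters: data written to the outer stream passes through the GZIP stream before it reaches the file, and on reading it is decompressed before the outer stream sees it. The sketch below illustrates this layering with a simple in-memory round trip. As an assumption made to keep it dependency-free, it uses the JDK's GZIPOutputStream and GZIPInputStream in place of the Commons Compress GZIP streams, and a byte array in place of a file; the class name GzipLayering is hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

final class GzipLayering {

    private GzipLayering() {
    }

    // Compresses the bytes by writing them through a GZIP stream
    // that wraps the underlying (here: in-memory) output stream.
    static byte[] gzip(byte[] data) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (OutputStream out = new GZIPOutputStream(sink)) {
            out.write(data);
        }
        // closing the GZIP stream writes the trailer, so read the sink afterwards
        return sink.toByteArray();
    }

    // Decompresses the bytes by reading them through a GZIP stream
    // that wraps the underlying input stream.
    static byte[] gunzip(byte[] compressed) throws IOException {
        ByteArrayOutputStream result = new ByteArrayOutputStream();
        try (InputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buffer = new byte[4096];
            int read;
            while ((read = in.read(buffer)) != -1) {
                result.write(buffer, 0, read);
            }
        }
        return result.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] original = "hello tar.gz".getBytes(StandardCharsets.UTF_8);
        byte[] roundTrip = gunzip(gzip(original));
        System.out.println(new String(roundTrip, StandardCharsets.UTF_8)); // prints "hello tar.gz"
    }
}
```

In the TAR class, the same idea applies with the TAR streams as the outermost layer, as in the two lines shown above.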
Java Tar and Untar Example
This program demonstrates how to create and extract tar archives with the TAR class.
package com.memorynotfound.resource;

import java.io.File;
import java.io.IOException;

public class TARProgram {

    private static final String OUTPUT_DIRECTORY = "/tmp";
    private static final String TAR_SUFFIX = ".tar";

    private static final String MULTIPLE_RESOURCES = "/example-multiple-resources";
    private static final String RECURSIVE_DIRECTORY = "/example-recursive-directory";

    private static final String MULTIPLE_RESOURCES_PATH = OUTPUT_DIRECTORY + MULTIPLE_RESOURCES + TAR_SUFFIX;
    private static final String RECURSIVE_DIRECTORY_PATH = OUTPUT_DIRECTORY + RECURSIVE_DIRECTORY + TAR_SUFFIX;

    public static void main(String... args) throws IOException {

        // class for resource classloading
        Class<?> clazz = TARProgram.class;

        // get multiple resources files to compress
        File resource1 = new File(clazz.getResource("/resource1.txt").getFile());
        File resource2 = new File(clazz.getResource("/resource2.txt").getFile());
        File resource3 = new File(clazz.getResource("/resource3.txt").getFile());

        // compress multiple resources
        TAR.compress(MULTIPLE_RESOURCES_PATH, resource1, resource2, resource3);

        // decompress multiple resources
        TAR.decompress(MULTIPLE_RESOURCES_PATH, new File(OUTPUT_DIRECTORY + MULTIPLE_RESOURCES));

        // get directory file to compress
        File directory = new File(clazz.getResource("/dir").getFile());

        // compress recursive directory
        TAR.compress(RECURSIVE_DIRECTORY_PATH, directory);

        // decompress recursive directory
        TAR.decompress(RECURSIVE_DIRECTORY_PATH, new File(OUTPUT_DIRECTORY + RECURSIVE_DIRECTORY));
    }
}
Generated Files
Here is an example of the generated tar archive.
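The entry names in the archive follow from the directory layout. As a rough way to preview them without writing an archive, the sketch below mirrors the naming rule of addToArchiveCompression (dir + File.separator + file name, starting at "."). It only uses the JDK; the class name EntryNames and the method collect are hypothetical, and the children are sorted here only to make the output stable, which the original method does not do.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class EntryNames {

    private EntryNames() {
    }

    // Collects the entry names for a file or directory, using the same
    // "dir + File.separator + name" rule as addToArchiveCompression.
    static List<String> collect(File file, String dir) {
        List<String> names = new ArrayList<>();
        String entry = dir + File.separator + file.getName();
        if (file.isFile()) {
            names.add(entry);
        } else if (file.isDirectory()) {
            File[] children = file.listFiles();
            if (children != null) {
                Arrays.sort(children); // listFiles() gives no ordering guarantee
                for (File child : children) {
                    names.addAll(collect(child, entry));
                }
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // args[0] is a path supplied by the caller; defaults to the working directory
        File root = new File(args.length > 0 ? args[0] : System.getProperty("user.dir"));
        collect(root, ".").forEach(System.out::println);
    }
}
```

Because only regular files produce entries, an empty directory does not appear in the list, just as it is not written to the archive by the TAR class.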